CSC320 Visual Computing
Assignment 2

Due: Feb 28 at 11:50 PM
Late penalty: 20% for up to 48 hours late, not accepted after that.
Hand in: Submit electronically here.
Marking: Simple, clear, concise is best. It reflects the application of the right, good ideas. Take a look at my code marking scheme and marking scheme.
Groups: Work in groups of 2
Environment: We will test your code under cslinux or systems in 1158 under Linux or in CC2140.

Please read the Questions and Answers Section at the bottom!!

For this assignment, you will expand the capabilities of Assignment 1, question 1. You can choose to use your own, your partner's, or someone else's A1 question 1. You must state whose you have used.

  1. [10 Marks] Build a tool that can be used to manipulate the Foreground Image from Assignment 1. It should support the following operations, each producing a new image. Some of the operations seem to apply only to greyscale images; you should figure out a natural way to deal with colour images. For each operation, you should determine some way of visualizing the results of the operation. Greyscale is easy; image gradient is not.
  2. [10 Marks] (Understanding 3.1 (Local Costs); see the next question) Hand in a grid with integral values (the pixel intensities of a greyscale image, 0 to 255) with examples of the following labelled. If an example is not possible, explain why. Show your work. For simplicity here, use the Laplacian instead of the LoG.
  3. [50 Marks] You will use Java to implement a variation of the Intelligent Scissors tool as outlined in
    Eric N. Mortensen and William A. Barrett, "Intelligent Scissors for Image Composition", Proc. SIGGRAPH'95, Los Angeles, CA, 1995, pp. 191-198
    Your interface should be similar to the scissoring interface in the gimp. You should aim to have a live wire generated while you move your mouse. The live wire connects your last control point to the current mouse position.

    Implement up to (and including) section 3.3 of the paper, except use the LoG instead of the Laplacian for fZ in section 3.1. You should have a default Sigma, but also allow the user the option of specifying a Sigma.

    Note that the Mortensen paper applies to greyscale images. Your application will work on colour images. One possibility is to first turn your colour image into a greyscale image (as in question 1) and then apply the technique there. An alternative would be to modify the definitions of l(p,q), fZ, fG, fD in section 3.1 so that they somehow apply to colour images.

    A word of advice: as I have done for you in question 1, it is best to start with something simple that works. Do simple things first, get them working 100%, and build on that. For example, you might want to use a simpler algorithm for 3.2 at first (all pairs shortest paths is simple) and worry about implementing 3.2 itself later. You might want to make your system work on greyscale images only to start. You might want to avoid Cursor Snap (end of 3.3) for starters. See extra features below. I also suggest that you maintain an architecture document. This ensures that you understand what you are trying to build and how the pieces fit together.

  4. [0 Marks] The application you have built so far will determine a closed path of pixels inside the Foreground image. Your job now is to determine an Alpha Matte based on this closed path. After the user completes their path, your application should modify the Foreground image (using alpha matting) so that only the interior of the closed path is visible. That is, alpha is set to 255 inside the closed path and 0 elsewhere. You can use GeneralPath (or any other part of the Java API); this part is worth 0 marks now. See the next question.
  5. [5 Marks] (Extra Features) Add an interesting feature to your system. This could be Cursor Snap, path cooling, application of the approach to RGB images, etc. You will be marked relative to your classmates on this. Please submit a document (features.html) outlining the additional features you have added. We will also count (as an extra feature) writing Java code to determine the interior of a polygon, that is, without using GeneralPath.
  6. [3 Marks] For this assignment, you should also turn in a final composition created using your program. Your composite can be derived from as many different images as you'd like. Make it interesting in some way, be it humorous, thought provoking, or artistic! You should use your own scissoring tool to cut the objects out and save them to matte files, but you can then use gimp, Photoshop or any other image editing program to process the resulting mattes (move, rotate, adjust colors, warp, etc.) and combine them into your composite. Some instructions on how to do this with gimp are provided here (see here for a Photoshop version). If you don't get the program working fully, you should still turn in a composition, using the scissoring tool in gimp instead.

    The class will vote on the best composition.
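
To make the alpha matte step in question 4 concrete, here is a minimal sketch using GeneralPath. The class and method names (MatteSketch, closedPath, applyMatte) are illustrative assumptions and not part of any handout code, and it assumes the closed path is available as parallel arrays of x and y pixel coordinates. Testing the pixel centre (x + 0.5, y + 0.5) is one possible choice for handling boundary pixels, not a requirement.

```java
import java.awt.geom.GeneralPath;
import java.awt.image.BufferedImage;

/** Sketch only: keeps the interior of a closed path visible via the alpha channel. */
class MatteSketch {

    /** Builds a closed GeneralPath from vertices given as parallel x/y arrays. */
    static GeneralPath closedPath(int[] xs, int[] ys) {
        GeneralPath path = new GeneralPath();
        path.moveTo((float) xs[0], (float) ys[0]);
        for (int i = 1; i < xs.length; i++) {
            path.lineTo((float) xs[i], (float) ys[i]);
        }
        path.closePath();
        return path;
    }

    /** Returns an ARGB copy of src with alpha 255 inside the path and 0 elsewhere. */
    static BufferedImage applyMatte(BufferedImage src, GeneralPath path) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = src.getRGB(x, y) & 0x00FFFFFF;      // drop the old alpha
                boolean inside = path.contains(x + 0.5, y + 0.5); // test the pixel centre
                out.setRGB(x, y, (inside ? 0xFF000000 : 0) | rgb);
            }
        }
        return out;
    }
}
```

A caller would build the path once the user closes the wire, then call MatteSketch.applyMatte(foreground, path) and display or save the result. Writing the inside test yourself, instead of calling path.contains, is what question 5 counts as an extra feature.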

To Do

Questions and Answers

Question:
Question 1: For the LoG of the image, you state that we need to apply some arithmetic in order to actually view the result. Isn't the viewable result of an LoG a highlight of the edges (i.e., zero crossings) found in the image? And if that's the case, then what's the use of the zero crossings button? Or am I missing some other visual interpretation for the LoG entirely?
LoG: You state that we need to sample LoG(x,y) at discrete points to obtain our mask. What exactly do you mean by this? Experimentally plug in random RGB values and use those same values to figure out sigma? Or is sigma entered manually? For that matter, for the Gaussian smoothing, is the sigma entered manually by the user?
Question 2: For fG (the gradient magnitude value), what the devil is max(G)? The paper you recommend to us doesn't define this term at all. Secondly, what does it mean when the paper states that "gradient magnitude costs are scaled by Euclidean distance"?
Answer:
Question 1: When you apply the LoG to an image, the result is an array with numeric values in it. Remember, an image consists of pixels, each of which is specified by an R,G,B (and possibly alpha) value. Each of these values is in the range 0-255. The numeric values you get from the LoG will probably not be in that range (some may even be negative).
LoG: Sigma is supplied to you (either by the user or by default). This determines the extent of the LoG mask (its dimensions). Now you need to put values in the mask; what values should you use? Use the closed form expression for the LoG for this. You should allow Sigma to be entered by the user.
Question 2: max(G) is the maximum image gradient for the image. As for scaling by Euclidean distance: basically, diagonal hops cost more than vertical/horizontal hops. The paper describes how to accomplish this.
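
The two numeric points in this answer can be sketched in Java as follows. This is an illustration under stated assumptions, not the required implementation: the names (LocalCostSketch, toDisplayRange, maxG, gradientCost) are hypothetical, the linear min/max mapping is only one of several ways to bring LoG values into 0-255 for viewing, and the scale factors (1 for a diagonal hop, 1/sqrt(2) for a horizontal or vertical hop) are how I read section 3.1 of the paper, so check them against the paper yourself.

```java
/** Sketch only: display scaling for filter output, and the fG cost with max(G). */
class LocalCostSketch {

    /** Linearly maps arbitrary (possibly negative) values onto 0..255 for display. */
    static int[][] toDisplayRange(double[][] v) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double[] row : v) {
            for (double x : row) {
                min = Math.min(min, x);
                max = Math.max(max, x);
            }
        }
        double range = max - min;
        int[][] out = new int[v.length][v[0].length];
        for (int r = 0; r < v.length; r++) {
            for (int c = 0; c < v[r].length; c++) {
                // A constant image has no range; show it as black.
                out[r][c] = range == 0 ? 0 : (int) Math.round(255 * (v[r][c] - min) / range);
            }
        }
        return out;
    }

    /** max(G): the largest gradient magnitude found anywhere in the image. */
    static double maxG(double[][] g) {
        double max = 0.0;
        for (double[] row : g) {
            for (double x : row) {
                max = Math.max(max, x);
            }
        }
        return max;
    }

    /**
     * fG for a hop onto pixel (qr, qc) from a neighbour at offset (dr, dc):
     * 1 - G/max(G), so strong edges are cheap, then scaled by hop direction.
     */
    static double gradientCost(double[][] g, double maxG, int qr, int qc, int dr, int dc) {
        double fG = maxG == 0 ? 1.0 : 1.0 - g[qr][qc] / maxG;
        boolean diagonal = dr != 0 && dc != 0;
        // Assumed scaling: diagonal hops keep the full cost, others are divided by sqrt(2).
        return diagonal ? fG : fG / Math.sqrt(2.0);
    }
}
```

With this scaling, a diagonal hop onto a pixel costs more than a horizontal or vertical hop onto the same pixel, which matches the longer distance it covers.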
Question:
Hi Professor, I have a question about the LoG function: I tried to find out how to get the values for the LoG mask but I got totally different values. When I substitute sigma = 1.4 into the LoG function for x and y equal to 0, I get -1/(pi*(1.4)^4), which is not equivalent to -40. Can you tell me what I am doing wrong? The other problem I have is how do we find the size of the LoG mask? Is it 3*(sigma), then round down the answer? Thanks
Answer:
I think you are referring to http://www.cs.toronto.edu/~arnold/320/05s/assignments/02/log.html or its source http://www.cee.hw.ac.uk/hipr/html/log.html. The question is, why do they have integral values in their mask? Their mask has been scaled and rounded so that all values in the mask are integral. Why would you want to do that? Think about how you use the LoG. About sigma, yes, sigma determines the dimensions of the mask.
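
To see the relationship between the closed form and an integral mask, here is a small Java sketch. It assumes the closed form LoG(x,y) = -1/(pi*sigma^4) * (1 - (x^2+y^2)/(2*sigma^2)) * exp(-(x^2+y^2)/(2*sigma^2)), which agrees with the value -1/(pi*(1.4)^4) quoted in the question for x = y = 0. The names (LogMaskSketch, sample, scaleAndRound) are hypothetical, and the mask radius is left as a parameter because the rule for deriving it from sigma is your design decision.

```java
/** Sketch only: samples the closed-form LoG and optionally scales it to integers. */
class LogMaskSketch {

    /** Closed-form LoG at (x, y) for the given sigma. */
    static double log(double x, double y, double sigma) {
        double s2 = sigma * sigma;
        double r = (x * x + y * y) / (2.0 * s2);
        return -1.0 / (Math.PI * s2 * s2) * (1.0 - r) * Math.exp(-r);
    }

    /** Samples the LoG at integer offsets -radius..radius in both directions. */
    static double[][] sample(double sigma, int radius) {
        int n = 2 * radius + 1;
        double[][] mask = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                mask[i][j] = log(j - radius, i - radius, sigma);
            }
        }
        return mask;
    }

    /** Scales the mask so its centre equals targetCentre, then rounds every entry. */
    static int[][] scaleAndRound(double[][] mask, double targetCentre) {
        int c = mask.length / 2;
        double scale = targetCentre / mask[c][c];
        int[][] out = new int[mask.length][mask.length];
        for (int i = 0; i < mask.length; i++) {
            for (int j = 0; j < mask.length; j++) {
                out[i][j] = (int) Math.round(mask[i][j] * scale);
            }
        }
        return out;
    }
}
```

For example, LogMaskSketch.scaleAndRound(LogMaskSketch.sample(1.4, 4), -40) gives a 9x9 integral mask with -40 at the centre. Its other entries need not match a published table entry for entry, since the scale factor and rounding used there may differ.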
Question:
Is the paper perfect?
Answer:
Probably not, you will need to understand the concepts behind the paper to implement the ideas.
Question:
Hi, Arnold. We have snapping working and we also implemented live-wire, so I was wondering if live-wire counts as an extra feature, too, since we're not required to implement it.
Answer:
Live wire is part of the requirements.
Question:
I have updated the assignment; please see the GeneralPath updates in questions 4 and 5.
Answer: