CSC418 - Notes
Topic 5: Camera Models


Key concepts & Readings

OpenGL

Positioning the camera in OpenGL

The OpenGL Utility Library (GLU) provides the call gluLookAt() to create the viewing transformation. The call gluLookAt(ex,ey,ez,rx,ry,rz,ux,uy,uz) takes the eye point (ex,ey,ez), the reference or 'lookat' point (rx,ry,rz), and the up vector (ux,uy,uz).

This function call postmultiplies the current matrix, so the easiest way to use it is:

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(ex,ey,ez,rx,ry,rz,ux,uy,uz);
/* setup modelling transformations here */

Setting up an orthographic camera in OpenGL

glMatrixMode(GL_PROJECTION);
glLoadIdentity();

followed by one of:

glOrtho(left, right, bottom, top, near, far)
gluOrtho2D(left,right,bottom,top)

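In the above, near and far are distances along the viewing direction: the clipping planes lie at z=-near and z=-far, so positive values place the planes in front of the viewer. The function call gluOrtho2D() is the same as calling glOrtho() with near=-1 and far=1.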

Setting up a perspective camera in OpenGL

glMatrixMode(GL_PROJECTION);
glLoadIdentity();

followed by one of:

glFrustum(left, right, bottom, top, near, far)
gluPerspective(fovy, aspect, near, far)

The glFrustum() call takes the same parameter list as glOrtho(): left, right, bottom, and top give the extent of the view volume on the near clipping plane, and near and far, which must both be positive, give the distances to the clipping planes. An alternative is to specify the image plane with a field of view and an aspect ratio. In gluPerspective(), fovy gives the field of view in the y-direction, measured in degrees and centred about y=0. The aspect ratio is the width of the image divided by its height, with the image centred about x=0.
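The relationship between the two calls can be made concrete. The following is a minimal sketch, not part of OpenGL or GLU, that computes the glFrustum() bounds corresponding to a gluPerspective()-style specification. The names FrustumBounds and frustum_from_perspective are hypothetical and introduced only for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type: the six arguments one would pass to glFrustum().
struct FrustumBounds {
    double left, right, bottom, top, zNear, zFar;
};

// Convert a gluPerspective()-style specification into symmetric
// glFrustum() bounds. fovyDegrees is the full vertical field of view.
FrustumBounds frustum_from_perspective(double fovyDegrees, double aspect,
                                       double zNear, double zFar)
{
    assert(zNear > 0.0 && zFar > zNear && aspect > 0.0);
    const double kPi = 3.14159265358979323846;
    double halfAngle = 0.5 * fovyDegrees * kPi / 180.0;
    double top = zNear * std::tan(halfAngle);  // half-height on the near plane
    double right = top * aspect;               // aspect = width / height
    return FrustumBounds{ -right, right, -top, top, zNear, zFar };
}
```

For example, fovy=90, aspect=2, near=4 gives top=4 and right=8. Going the other way, a call such as glFrustum(-1,1,-1,1,4,100) has a vertical field of view of 2*atan(1/4), roughly 28 degrees.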

OpenGL functions for setting up view transformations

Transformation              Matrix              Functions
viewing transformation      modelview matrix    gluLookAt()
projection transformation   projection matrix   glFrustum(), gluPerspective(), glOrtho(), gluOrtho2D()
viewport transformation     -                   glViewport()

An OpenGL Example

////////////////////////////////////////////////////
// Scene.cpp

// Template code for drawing an interesting scene.
////////////////////////////////////////////////////

#ifdef WIN32
#include <windows.h>
#endif

#include <GL/gl.h>
#include <GL/glu.h>
#include <stdio.h>
#include <stdlib.h>
#include <GL/glut.h>

void drawTable();
void draw_table_leg(float x, float y);

int Win[2];     // window (x,y) size

/*********************************************************
    PROC: glut_key_action()
    DOES: this function gets called for any keypresses
**********************************************************/

void glut_key_action(unsigned char key, int x, int y)
{
    if (key=='q') {
        exit(0);
    }
    glutPostRedisplay();
}

/*********************************************************
    PROC: myinit()
    DOES: performs most of the OpenGL initialization -- don't change
**********************************************************/

void myinit(void)
{
    GLfloat ambient[] = { 0.0, 0.0, 0.0, 1.0 };
    GLfloat diffuse[] = { 1.0, 1.0, 1.0, 1.0 };
    GLfloat specular[] = { 1.0, 1.0, 1.0, 1.0 };
    GLfloat position[] = { 0.0, 3.0, 3.0, 0.0 };
    
    GLfloat lmodel_ambient[] = { 0.2f, 0.2f, 0.2f, 1.0f };
    GLfloat local_view[] = { 0.0 };

    /**** set lighting parameters ****/
    glLightfv(GL_LIGHT0, GL_AMBIENT, ambient);
    glLightfv(GL_LIGHT0, GL_DIFFUSE, diffuse);
    glLightfv(GL_LIGHT0, GL_SPECULAR, specular);  /* matches the GL_LIGHT0 default */
    glLightfv(GL_LIGHT0, GL_POSITION, position);
    glLightModelfv(GL_LIGHT_MODEL_AMBIENT, lmodel_ambient);
    glLightModelfv(GL_LIGHT_MODEL_LOCAL_VIEWER, local_view);
    glEnable(GL_LIGHTING);
    glEnable(GL_LIGHT0);
    glEnable(GL_AUTO_NORMAL);
    glEnable(GL_NORMALIZE);
    glEnable(GL_DEPTH_TEST);
    glDepthFunc(GL_LESS);
}

/*********************************************************
    PROC: set_colour();
    DOES: sets the material colour used for subsequent drawing -- don't change
**********************************************************/

void set_colour(float r, float g, float b)
{
    float ambient = 0.2f;
    float diffuse = 0.7f;
    float specular = 0.4f;
    GLfloat mat[4];

    /**** set ambient lighting parameters ****/
    mat[0] = ambient*r;
    mat[1] = ambient*g;
    mat[2] = ambient*b;
    mat[3] = 1.0;
    glMaterialfv (GL_FRONT, GL_AMBIENT, mat);

    /**** set diffuse lighting parameters ******/
    mat[0] = diffuse*r;
    mat[1] = diffuse*g;
    mat[2] = diffuse*b;
    mat[3] = 1.0;
    glMaterialfv (GL_FRONT, GL_DIFFUSE, mat);

    /**** set specular lighting parameters *****/
    mat[0] = specular*r;
    mat[1] = specular*g;
    mat[2] = specular*b;
    mat[3] = 1.0;
    glMaterialfv (GL_FRONT, GL_SPECULAR, mat);
    glMaterialf (GL_FRONT, GL_SHININESS, 0.5);
}

/*********************************************************
    PROC: display()
    DOES: this gets called by the event handler to draw
          the scene, so this is where you need to build
          your scene -- make your changes and additions here
          Add other procedures if you like.
**********************************************************/

void display(void)
{
    /* glClearColor (red, green, blue, alpha)          */
    /* Ignore the meaning of the 'alpha' value for now */
    glClearColor(0.7f,0.7f,0.9f,1.0f);   /* set the background colour */
    /* OK, now clear the screen with the background colour */
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    /* draw a yellow teapot */
    set_colour(1, 1, 0.6f);  /* yellow */
    glutSolidTeapot(1.0);

    /* draw another white teapot y=2 units up */
    glTranslatef(0,2,0);    /* translate by y=2 */
    set_colour(1, 1, 1);    /* white */
    glutSolidTeapot(1.0);
    
    /* draw a third red teapot x= -2 units (to the left) */
    /* note that each translation is RELATIVE to the previous one */
    glTranslatef(-2,0,0);    /* translate by x = -2 */
    set_colour(1, 0, 0);     /* red */
    glutSolidTeapot(1.0);
    /* let's now undo these relative transformations */
    glTranslatef(2,-2,0);

    /* An easier way to undo relative transformations is
       enclose them between  glPushMatrix() and glPopMatrix
       calls. The glPopMatrix() restores the coordinate system 
       in effect at the time of the most recent glPushMatrix(). */
    glPushMatrix();       
    glTranslatef(-2,0,0);     
    set_colour(0,1,0.6f);  
    glutSolidSphere(1.0,10,10);
    glPopMatrix();

    glPushMatrix();       
    glTranslatef(2,-0.5 ,0);
    glRotatef(-90, 1,0,0);    /* rotate -90 deg about x */
    set_colour(0,0.6f,1);     /* mostly blue */
    glutSolidCone(1.0, 3.0,10,8);
    glPopMatrix();

    glPushMatrix();       
    glTranslatef(4,0,0);
    set_colour(0,1,0.6f);      /* mostly green */
    glutSolidCone(1.0, 3.0,10,8);
    glPopMatrix();

    glPushMatrix();
    glTranslatef(2,2,0);
    set_colour(1,0.2f,0.2f);    /* red torus */
    glutSolidTorus(0.2,0.5,16,10);
    glPopMatrix();

    glPushMatrix();
    glTranslatef(0,-5,0);      /* brown table */
    set_colour(0.8f,0.4f,0.5f);
    glRotatef(-90,1,0,0);      /* rotate table */
    glScalef(4,4,4);           /* scale by 4 */
    drawTable();
    glPopMatrix();
		
    glPushMatrix();
    glTranslatef(0,-1,2);  /* place on table-top (y=-1) and to the front */  
    glScalef(1,4,1);       /* scale in y */
    set_colour(1,0,1);     /* magenta colour */
    glTranslatef(0,0.5,0);
    glutSolidCube(1.0);
    glPopMatrix();

    glFlush();
}

/*********************************************************
    PROC: myReshape()
    DOES: handles the window being resized -- don't change
**********************************************************/

void myReshape(int w, int h)
{
    Win[0] = w;
    Win[1] = h;
    glViewport(0, 0, w, h);
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
   /*** this defines the field of view of the camera   ***/
   /*** Making the first 4 parameters larger will give ***/
   /*** a larger field of view, therefore making the   ***/
   /*** objects in the scene appear smaller            ***/
    glFrustum(-1,1,-1,1,4,100);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
   /*** this sets the virtual camera          ***/
   /*** gluLookAt( x,y,z,   x,y,z   x,y,z );  ***/
   /***            camera  look-at camera-up  ***/
   /***            pos'n    point   vector    ***/
    
   /*** place camera at (x=4, y=6, z=20), looking at origin,
          y-axis being up         ***/
    gluLookAt(4,6,20,0,0,0,0,1,0);
}

 /*********************************************************
     PROC: main()
     DOES: calls initialization, then hands over control
           to the event handler, which calls
           display() whenever the screen needs to be redrawn
 **********************************************************/

int main(int argc, char** argv)
{
    glutInit(&argc, argv);    /* initialize GLUT before creating a window */
    glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB | GLUT_DEPTH);
    glutInitWindowPosition (0, 0);
    glutInitWindowSize(300,300);
    glutCreateWindow(argv[0]);
    myinit();
    glutReshapeFunc(myReshape);
    glutKeyboardFunc(glut_key_action);

    printf("An OpenGL Still Life\n");
    printf("Press 'q' to quit\n");

    glutDisplayFunc(display);
    glutMainLoop();
    return 0;         /* never reached */
}

/*********************************************************
    PROC: drawTable()
    DOES: draws a table, built from polygons -- add more functions
          of your own like this, if you can understand what is going
          on here
**********************************************************/

void drawTable()
{
  /* draw table top */
  glBegin(GL_POLYGON);
  glNormal3f(0,0,1);
  glVertex3f(-1,-1,1);
  glVertex3f(1,-1,1);
  glVertex3f(1,1,1);
  glVertex3f(-1,1,1);
  glEnd();

  /* draw the four table legs */
  draw_table_leg(-0.6f,-0.7f);
  draw_table_leg(0.6f,-0.7f);
  draw_table_leg(0.6f,0.7f);
  draw_table_leg(-0.6f,0.7f);
}

/*********************************************************
    PROC: draw_table_leg(x,y)
    DOES: draws a table leg in the given location
**********************************************************/

void draw_table_leg(float x, float y)
{
  glPushMatrix();
  glTranslatef(x,y,0.5f);
  glScalef(0.2f,0.2f,1);
  glutSolidCube(1.0); /* a helper function in glut library that draws a unit cube at origin */
  glPopMatrix();
  return;
}

Notes

Viewing Pipeline


(The viewing pipeline)

OCS Object Coordinate System
WCS World Coordinate System
VCS View Coordinate System
NDCS Normalized Device Coordinate System
DCS Device Coordinate System
CCS Clipping Coordinate System

We can also separate projection transformation and perspective divisions into two steps. In that case, the view pipeline is:

CCS are also called homogeneous coordinates. Understanding the extended pipeline is useful for clipping.

Viewing Transformations

The modelling transformation expresses the position of objects in the world. The view transformation expresses the position of world objects with respect to the camera.

(Objects and cameras are placed in the world, i.e., with respect to the world coordinate system.)

Defining Camera Location

A common way of defining the camera position and orientation is to use an eye point e, a gaze direction g, and a view-up vector t.


Constructing Camera Frame

Vectors e, g, and t are all that is required to construct a coordinate system with origin e and uvw basis: w = -g/|g|, u = (t × w)/|t × w|, v = w × u.

These vectors can then be used to form a change-of-basis matrix, which is composed with a translation to yield the viewing transformation Mv (or world-to-camera frame transformation).
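As a minimal sketch of this construction, assuming the usual convention that e is the eye point, g the gaze direction, and t the view-up vector, with w = -g/|g|, u = (t × w)/|t × w|, and v = w × u: the code below builds the frame and applies the world-to-camera transformation to a point by translating by -e and then projecting onto the basis. The type and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b)
{
    return Vec3{a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x};
}
Vec3 normalize(Vec3 a)
{
    double len = std::sqrt(dot(a, a));
    assert(len > 0.0);
    return Vec3{a.x / len, a.y / len, a.z / len};
}

// Camera frame: origin e and orthonormal basis u, v, w.
struct CameraFrame { Vec3 e, u, v, w; };

CameraFrame make_camera_frame(Vec3 e, Vec3 g, Vec3 t)
{
    Vec3 w = normalize(Vec3{-g.x, -g.y, -g.z});  // camera looks down -w
    Vec3 u = normalize(cross(t, w));             // points to the camera's right
    Vec3 v = cross(w, u);                        // already unit length
    return CameraFrame{e, u, v, w};
}

// Equivalent to multiplying by the change-of-basis matrix (rows u, v, w)
// composed with a translation by -e.
Vec3 world_to_camera(const CameraFrame& c, Vec3 p)
{
    Vec3 d = sub(p, c.e);
    return Vec3{dot(c.u, d), dot(c.v, d), dot(c.w, d)};
}
```

With e=(0,0,5), g=(0,0,-1), and t=(0,1,0), the basis comes out as the world axes, and the world point (1,2,3) has camera coordinates (1,2,-2): it lies in front of the camera, on the negative w axis side.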

Projection Transformation

Projection transformation takes a point in viewing space to the canonical volume space (also called normalized device coordinates).

View Volumes

View volumes are used for:

A perspective view volume:

An orthographic view volume:

Canonical View Volume:

Defined by all 3D points whose Cartesian coordinates lie between -1 and +1. Clipping operations could be carried out directly on the view volumes defined above, but they are simpler on a canonical view volume. Its coordinate system is referred to as the normalized device coordinate system, or NDCS.

The transformation from VCS to NDCS can be conveniently viewed as part of the projection transformation, as we shall see shortly.

Viewport Transformation

Projection Transformation + Viewport Transformation

Aside: in world coordinates (xyz) we represent viewer or camera coordinates as (uvw); however, when we talk only about viewer or camera coordinates (as in the following case), we represent them as xyz. In the following diagrams, xyz are actually the uvw axes constructed above. As a rule, be very careful not to confuse the various coordinate frames used in graphics.


(Projection Transformation: From Viewing Coordinate System to Normalized Device Coordinates. Viewport Transformation: From NDCS to Device Coordinates)

Derivation of Projection Transformations

The general purpose of the projection transformation is to map a 3D point in VCS to a 2D point in NDCS. However, having a z-coordinate in NDCS allows us to do visibility calculations, so the point in NDCS will be 3D as well.

Orthographic Projections

Orthographic projections require only scaling and translation and are therefore the simplest. We make the following observations about the required transformation. (Keep in mind that near and far are defined on the negative z-axis, i.e., near>far.)

Point in Orthographic View Volume (VCS)    Point in Canonical View Volume (NDCS)
y=top                                      y=+1
y=bottom                                   y=-1
x=right                                    x=+1
x=left                                     x=-1
z=near                                     z=+1
z=far                                      z=-1

First translate the orthographic view volume along the x, y, and z directions so that its center coincides with the center of the canonical view volume. The center of the canonical view volume is (0,0,0) and the center of the orthographic view volume is C_orthographic=((right+left)/2,(top+bottom)/2,(far+near)/2), so translating the orthographic view volume by -C_orthographic does the trick. In matrix form, we can write

Again considering just the y-coordinate.


y-coordinate: VCS vs. NDCS before translation

y-coordinate: VCS vs. NDCS after translation. top is now top' and bottom is now bottom'; however, note that top'-bottom'=top-bottom as translation does not introduce any scaling

To find the scale factor, we employ the line equation y=mx+C. In the above figure, y (=y') is the y-coordinate of the point in NDCS, x (=y) is the y-coordinate of the corresponding point in translated VCS, and C=0. The slope of the line, m, is 2/(top'-bottom'), which equals 2/(top-bottom).

Therefore,

y' = 2 * y / (top - bottom)

Similarly,

x' = 2 * x / (right - left)

z' = 2 * z / (near - far)

Combining the two matrices gives the required orthographic projection:

so

(Note: Shirley, p. 114, refers to a matrix M_o that combines the orthographic projection and the viewport transformation. We have instead chosen to separate the two; therefore, we have one extra matrix, M_orthographic.)
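The translate-then-scale derivation can be checked numerically. The sketch below applies the two steps coordinate by coordinate rather than as an explicit 4x4 matrix; OrthoVolume and ortho_to_ndc are hypothetical names, and near and far are taken as z-coordinates on the negative z-axis (near>far), as in the derivation.

```cpp
#include <cassert>
#include <cmath>

struct Point3 { double x, y, z; };

// Hypothetical description of an orthographic view volume in VCS.
// zNear and zFar are z-coordinates on the negative z-axis, so zNear > zFar.
struct OrthoVolume { double left, right, bottom, top, zNear, zFar; };

Point3 ortho_to_ndc(const OrthoVolume& v, Point3 p)
{
    assert(v.right != v.left && v.top != v.bottom && v.zNear != v.zFar);
    // Step 1: translate the center of the view volume to the origin.
    double tx = p.x - 0.5 * (v.right + v.left);
    double ty = p.y - 0.5 * (v.top + v.bottom);
    double tz = p.z - 0.5 * (v.zFar + v.zNear);
    // Step 2: scale each axis so the volume spans [-1, +1].
    return Point3{ 2.0 * tx / (v.right - v.left),
                   2.0 * ty / (v.top - v.bottom),
                   2.0 * tz / (v.zNear - v.zFar) };
}
```

For a volume with left=-2, right=4, bottom=-1, top=3, near=-1, far=-5, the corner (4,3,-1) maps to (1,1,1), the opposite corner (-2,-1,-5) maps to (-1,-1,-1), and the center (1,1,-3) maps to the origin.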

Perspective Projections

Perspective projections are more commonly used and require some additional effort to derive.

First, let's consider the basic perspective transformation and use the following example to construct a matrix that performs a perspective transformation:

From similar triangles, we can see that y'/d = y/z
Thus, y' = yd/z
Similarly, x' = xd/z

We can express this in matrix form by altering the value of the homogeneous parameter, h

We know that

So

Which is the required result.
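The matrices for this step were figures that are not reproduced in the text. As a sketch consistent with the similar-triangles result above (a reconstruction, not necessarily the exact form used in the original figures), the projection can be written by setting the homogeneous parameter to h = z/d:

```latex
\begin{bmatrix} x_h \\ y_h \\ z_h \\ h \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 1/d & 0
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} x \\ y \\ z \\ z/d \end{bmatrix},
\qquad
(x', y', z') = \left( \frac{x_h}{h}, \frac{y_h}{h}, \frac{z_h}{h} \right)
             = \left( \frac{x\,d}{z}, \frac{y\,d}{z}, d \right)
```

Dividing by h recovers x' = xd/z and y' = yd/z, and every projected point lands on the plane z' = d.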

Perspective transformations are non-linear because of the division by h, and they can incorporate scaling. They also produce foreshortening:

Derivation of Perspective Projection (Transformation from perspective view volume to canonical view volume)

First, we transform the points in the perspective view volume to an orthographic view volume, and then use the orthographic projection derived in the previous section to transform them into the canonical view volume.

where VCS is the perspective view volume coordinate frame. Here, we want to compute the matrix M_p (Shirley, p. 119).

Let's consider the x-coordinate of a point p in perspective VCS. We know that x' = x * d / z. We want x'=x when z=near, so we can choose d=near. Thereby the perspective matrix becomes

and

That takes care of x and y coordinates. We still have to resolve the z coordinates. From the above equation,

z'=(A*z+B)*near/z (Equ. 1).

Note the following mapping for the z-coordinate.

     Z-Coordinate in Perspective View Volume    Z-Coordinate in Orthographic View Volume
1    z=near                                     z'=near
2    z=far                                      z'=far

Substituting 1 and 2 in Equ. 1, we get near = A*near + B and far = (A*far + B)*near/far, which can be solved for A and B.

A = (near+far)/near and B = -far.

So,

(See Shirley, p. 119)
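The values of A and B can be checked with a short sketch. It assumes M_p has rows (1,0,0,0), (0,1,0,0), (0,0,A,B), and (0,0,1/near,0), which is the form implied by Equ. 1 with d=near, and it takes near and far as z-coordinates on the negative z-axis. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Point3 { double x, y, z; };
struct Vec4 { double x, y, z, h; };  // homogeneous point

// Apply the assumed M_p to (x, y, z, 1), with
// A = (near + far) / near and B = -far.
Vec4 apply_perspective(Point3 p, double zNear, double zFar)
{
    assert(zNear != 0.0);
    double A = (zNear + zFar) / zNear;
    double B = -zFar;
    return Vec4{ p.x, p.y, A * p.z + B, p.z / zNear };
}

// Perspective division: divide through by the homogeneous parameter h.
Point3 perspective_divide(Vec4 q)
{
    assert(q.h != 0.0);
    return Point3{ q.x / q.h, q.y / q.h, q.z / q.h };
}
```

With near=-1 and far=-4, the point (2,1,-1) on the near plane is left unchanged, while (2,1,-4) on the far plane maps to (0.5,0.25,-4): x and y shrink by near/z and z stays on the far plane, as the mapping table requires. A point in between, such as (2,1,-2), maps to (1,0.5,-3).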

Non-linearity in Perspective Transformations

The perspective transformations produce a z-coordinate for use in visibility calculations. It is, however, a non-linear function of the original z coordinate. To illustrate this, consider a railroad viewed in perspective as follows.
  tracks
    left:  x= -1, y= -1
    right: x=  1, y= -1

  view volume
    left = -1, right = 1
    bot  = -1, top   = 1
    near =  1, far   = 4


In this scene, what happens to z in VCS and NDCS as we move along the track? The following expressions tell us what we wish to know.


It can be directly seen that z_NDCS is a non-linear function of z_VCS.
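The non-linearity is easy to see numerically. The sketch below does not reproduce the original expressions; it chains the two steps derived earlier in these notes (perspective to orthographic volume, then orthographic volume to NDCS). It assumes the planes of the example lie at z=-1 and z=-4, i.e. that near and far are passed as z-coordinates on the negative z-axis. The function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// z_NDCS as a function of z_VCS:
//   perspective step:   z_orth = (near + far) - far * near / z
//   orthographic step:  z_ndc  = (2 * z_orth - (near + far)) / (near - far)
double z_ndc_from_vcs(double z, double zNear, double zFar)
{
    assert(z != 0.0 && zNear != zFar);
    double zOrth = (zNear + zFar) - zFar * zNear / z;
    return (2.0 * zOrth - (zNear + zFar)) / (zNear - zFar);
}
```

With near=-1 and far=-4, z_VCS=-1 maps to +1 and z_VCS=-4 maps to -1, but the point halfway between them in VCS, z_VCS=-2.5, maps to -0.6 rather than 0. Most of the NDCS depth range is used up close to the near plane.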


What does this look like in the image plane? Let's determine z_VCS as a function of x_NDCS. With this we can point at the image and ask: what is the real distance of this point?


Lastly, what does the train track look like in NDCS?

Straight lines in VCS correspond to straight lines in NDCS, although the 'speed' at which one moves along them is not the same in the two coordinate systems.

Viewport Transformation (From NDCS to DCS)

We want to derive a transformation that maps a point p=(x,y,z) in the canonical view volume onto the screen. Assume that the screen's resolution is nx by ny pixels.

Note: the screen is a 2-dimensional space, so we simply throw away the z-values here. In practice, the z-values are used to determine which objects are visible and which are hidden: of two points that land on the same pixel, the one closer to the eye hides the other. With the mapping derived above, z=near maps to z=+1, so the closer point is the one with the larger z-value in NDCS.

We observe the following mappings:

(x-coordinate)

(y-coordinate)

Using y=mx+c we can calculate the following matrix

so
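The mapping itself was given in figures that are not reproduced here. As a sketch under one common convention (an assumption, not necessarily the one used in the original figures), pixel centres sit at integer coordinates, so x in [-1,+1] maps to [-0.5, nx-0.5] and y in [-1,+1] maps to [-0.5, ny-0.5]. Pixel and ndc_to_screen are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

struct Pixel { double x, y; };

// Linear map (y = m*x + c per axis) from NDCS to screen coordinates.
// Assumed convention: pixel centres at integer coordinates, so the image
// spans [-0.5, nx - 0.5] horizontally and [-0.5, ny - 0.5] vertically.
Pixel ndc_to_screen(double x, double y, int nx, int ny)
{
    assert(nx > 0 && ny > 0);
    return Pixel{ 0.5 * nx * x + 0.5 * (nx - 1),
                  0.5 * ny * y + 0.5 * (ny - 1) };
}
```

For an 800x600 screen, (-1,-1) maps to (-0.5,-0.5), (1,1) maps to (799.5,599.5), and the NDCS origin maps to (399.5,299.5). The z-coordinate is simply carried along unchanged for the visibility test.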

Putting It All Together (From OCS to DCS)

Let P be a point in OCS. Let M_world be the world transformation matrix and M_v (Shirley, p. 115) be the viewing transformation matrix.

Case 1: Orthographic Camera

Case 2: Perspective Camera

Note that in Shirley, p. 111

Pinhole Cameras

Reviewing some of the basics of optical imaging systems helps reveal some of the implicit assumptions often made in producing images with computer graphics.

Ideal pinhole camera


Real pinhole camera


Aperture with a lens system


Why is depth-of-field limited?


Camera Elements


The 'aperture' of a lens is the diameter of the lens opening divided by the focal length of the lens. For a typical 50mm lens, the aperture may range from f/2 (large aperture, a 25mm opening) to f/22 (small aperture, an opening of about 2.3mm). The aperture determines the depth of field of the resulting image: the smaller the aperture, the larger the depth of field.

Clipping in 3D

Both the Cohen-Sutherland line-clipping algorithm and the Sutherland-Hodgman polygon-clipping algorithm can be extended to 3D. We could choose to perform clipping in any one of VCS, CCS, or NDCS. As we shall explain shortly, there is a shortcoming to clipping in NDCS.

Clipping in VCS

Both the line-clipping and polygon-clipping algorithms make use of in/out tests for half-spaces. For the orthographic view volumes presented previously, the view-volume plane equations can be written in a consistent way, such that all the normals point into the view volume: if F(P)>0, then P is inside the view volume. In the equations below, near and far are positive distances, so the front and back planes lie at z=-near and z=-far.

  left:    x - left = 0
  right:  -x + right = 0
  bottom:  y - bottom = 0
  top:    -y + top = 0
  front:  -z - near = 0
  back:    z + far = 0

The same can also be done for the perspective view volume:

  left:    x + left*z/near = 0
  right:  -x - right*z/near = 0
  top:    -y - top*z/near = 0
  bottom:  y + bottom*z/near = 0
  front:  -z - near = 0
  back:    z + far = 0

The Cohen-Sutherland clipping procedure works exactly the same in 3D as it does in 2D. Vertex outcodes are generated and tested for a trivial accept or reject. If there is no trivial accept or reject, the line is clipped against one of the six view-volume planes, then tested again, and so on.

The Sutherland-Hodgman polygon clipping algorithm also works in a similar way. The polygon can be clipped against the six view-volume planes in succession.
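The outcode step for the perspective view volume can be sketched directly from the six plane equations listed above. One bit is set for each plane whose F(P) is negative. The bit names, the Frustum struct, and the function names are hypothetical, and zNear and zFar are positive distances, matching the plane equations.

```cpp
#include <cassert>

// One bit per view-volume plane (hypothetical names).
enum : unsigned {
    OUT_LEFT = 1, OUT_RIGHT = 2, OUT_BOTTOM = 4,
    OUT_TOP = 8, OUT_FRONT = 16, OUT_BACK = 32
};

// Perspective view volume in VCS; zNear and zFar are positive distances.
struct Frustum { double left, right, bottom, top, zNear, zFar; };

// Set a bit for every plane with F(P) < 0, i.e. P outside that half-space.
unsigned outcode(const Frustum& f, double x, double y, double z)
{
    assert(f.zNear > 0.0);
    unsigned code = 0;
    if ( x + f.left   * z / f.zNear < 0.0) code |= OUT_LEFT;
    if (-x - f.right  * z / f.zNear < 0.0) code |= OUT_RIGHT;
    if ( y + f.bottom * z / f.zNear < 0.0) code |= OUT_BOTTOM;
    if (-y - f.top    * z / f.zNear < 0.0) code |= OUT_TOP;
    if (-z - f.zNear < 0.0)                code |= OUT_FRONT;
    if ( z + f.zFar  < 0.0)                code |= OUT_BACK;
    return code;
}

// Both endpoints inside every half-space: the whole segment is visible.
bool trivial_accept(unsigned a, unsigned b) { return (a | b) == 0; }

// Both endpoints outside the same plane: the whole segment is invisible.
bool trivial_reject(unsigned a, unsigned b) { return (a & b) != 0; }
```

For the volume near=1, far=4, left=-1, right=1, bottom=-1, top=1, the point (0,0,-2) has outcode 0, while (0,2,0) is outside both the top and front planes. A segment joining those two points is neither trivially accepted nor trivially rejected, so it would have to be clipped against a plane and retested.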

Clipping in NDCS

NDCS provides a potentially nice coordinate system for clipping operations because the plane equations are simply defined and always remain unchanged. Furthermore, lines in VCS are lines in NDCS and therefore it would seem that correct intersections can be calculated, despite the fact that NDCS-space has a strange warp to it because it is post-perspective division.

The potential problem of clipping in NDCS is that the sign of the depth information is lost, as shown in the following example.

Clipping in CCS

We'll define the clipping region in CCS by first looking at the clipping region in NDCS:
-1 <= x/w <= 1
This means that in CCS, for w > 0, we have:
-w <= x <= w

This can be visualized as follows:

The clipping regions are analogous for y and z. The following example illustrates clipping in CCS.
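The difference between testing before and after the perspective division can be sketched as follows. The code assumes that points in front of the eye have w > 0, and the struct and function names are hypothetical.

```cpp
#include <cassert>

struct Vec4 { double x, y, z, w; };  // a point in CCS

// Inside test in CCS: -w <= x, y, z <= w, which is only meaningful for w > 0.
bool inside_ccs(const Vec4& p)
{
    return p.w > 0.0 &&
           -p.w <= p.x && p.x <= p.w &&
           -p.w <= p.y && p.y <= p.w &&
           -p.w <= p.z && p.z <= p.w;
}

// Inside test performed after the division, i.e. in NDCS.
// The sign of w is no longer available at this point.
bool inside_ndcs_after_divide(const Vec4& p)
{
    assert(p.w != 0.0);
    double x = p.x / p.w, y = p.y / p.w, z = p.z / p.w;
    return -1.0 <= x && x <= 1.0 &&
           -1.0 <= y && y <= 1.0 &&
           -1.0 <= z && z <= 1.0;
}
```

The point (0.5,-0.5,0,1) passes both tests. The point (-0.5,0.5,0,-1) has w < 0, so under the stated assumption it lies behind the eye; it is rejected in CCS, yet after division it becomes (0.5,-0.5,0) and passes the NDCS test. This is the lost-sign problem that makes clipping in CCS the safer choice.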

Exercises

Exercise 1

Given a camera with the following parameters:

P_eye=(3,2,4), P_ref=(0.5,0.5,0.5), V_up=(0,1,0), and with an image plane at z=-1 (in viewing coordinates),
  1. What is the view matrix M_WCS-VCS?
  2. On the image plane, what are the coordinates of vertices A,B,C,D and E of the following pyramid?
  3. There will be two vanishing points. What are their coordinates on the image plane?

Exercise 2

Given the following camera parameters:

  1. Construct the viewing matrix M_world-cam.
  2. Assume the following projection matrix,
    [1  0  0  0]
    [0  1  0  0]
    [0  0  1  0]
    [0  0 -1  0]
        
    and project onto the image plane the eight vertices of an axis-aligned unit cube, with corners at (0,0,0) and (1,1,1).
  3. After projection, lines which were parallel will appear to converge to a "vanishing point". The cube has three sets of parallel edges. Determine the coordinates of the vanishing points, on the image plane.

Exercise 3

Shadows are an important feature to model. An easy way to render the shadows cast by an object onto the ground plane is to draw a version of the object that has been projected onto the ground plane, in an appropriately dark colour. If the direction of the light source is specified by a vector L, compute the projected coordinates P'(x',y',z') of an object vertex P(x,y,z). Assume that you are projecting onto the xz plane.

Exercise 4

Derive the projection matrix for the oblique projection given by the view volume shown below. It should map the given view volume into the same NDC coordinate system used in OpenGL, namely one bounded by the cube -1<=x<=1, -1<=y<=1, -1<=z<=1. Assume that the view volume is not oblique when viewed from above.

Exercise 5

Suppose we wish to allow a user to translate the camera directly to the left and right, as shown in the figure below. When the user hits the left arrow key, the camera should move to the left, such that the object currently located at the center of the field of view and at a distance of 10 units should move to the right by 20 pixels. Assume a 512x512 image and a viewing frustum having a 30 degree horizontal field of view. How should the current viewing transformation be modified in order to implement the translation required by one press of the left arrow key?

Exercise 6

Suppose a perspective view volume is defined by: near=1, far=4, left=-1, right=1, bottom=-1, top=1. Clip the VCS line P1(0,2,0)P2 to this view volume. Perform your clipping computations in VCS.

Exercise 7

Assume a perspective view volume as defined for the above question. Assume a viewport image size of 800x800. Given the line AB having endpoints A(-1,-1,-4) B(0,-1,-10), determine:

  1. the midpoint of the projected line A'B'.
  2. the midpoint of the original line in terms of where it would project to in the image.