Notation

Type Font Examples
Variables (scalars) italics $a, b, x, y$
Functions upright $\mathrm{f}, \mathrm{g}(x), \mathrm{max}(x)$
Vectors bold, elements row-wise $\mathbf{a}, \mathbf{b}= \begin{pmatrix}x\\y\end{pmatrix} = (x, y)^\top,$ $\mathbf{B}=(x, y, z)^\top$
Matrices Typewriter $\mathtt{A}, \mathtt{B}= \begin{bmatrix}a & b\\c & d\end{bmatrix}$
Sets calligraphic $\mathcal{A}, \mathcal{B}=\{a, b\}, b \in \mathcal{B}$
Number systems, Coordinate spaces double-struck $\mathbb{N}, \mathbb{Z}, \mathbb{R}^2, \mathbb{R}^3$

Cameras

• To display a three-dimensional scene as a two-dimensional image, the mapping process must be described mathematically
• The mapping of 3D objects into a 2D image plane is often called projection
• To this end, in computer graphics different camera models are used

Camera Models

Figure: effects caused by lens reflections in the camera
• Often these camera models are idealized and can only approximately simulate the effects that occur when observing the world with our eyes or with a real camera
• This introductory lecture describes only those geometric projections that arise when straight lines are used as projection rays

Camera Models

• Camera models can be classified into perspective projection and parallel projection
Figure: perspective projection and parallel projection

Perspective Projection

Pinhole Camera

• The perspective projection is very familiar to us as human beings, because our eye produces such a perspective projection
• An important attribute of the perspective projection, in contrast to the parallel projection, is that objects at a larger distance to the viewer or camera are displayed smaller
• The simplest perspective projection is a mapping that uses a pinhole camera model

Pinhole Camera

• A pinhole camera consists of a camera body with a very small hole through which the light can enter
• The image is formed at the back of the camera body and is displayed upside-down
• A larger hole has the advantage that more light can enter the camera, resulting in shorter exposure times
• The disadvantage is that multiple projections overlap and the image is out of focus
Figure: pinhole camera showing the object, the camera body with the pinhole, and the image of the object; a second panel shows a larger pinhole

Pinhole Camera

• In computer graphics, usually an idealized model of a camera is used, which has an infinitely small hole
• This camera model cannot simulate defocus; that is, all objects are displayed perfectly sharp
• Furthermore, it is assumed that the image is formed on an imaginary image plane in front of the projection center, so that the image is no longer upside-down
Figure: idealized pinhole camera showing the object, the pinhole as center of projection, and the image of the object on the image plane

Perspective Projection

Figure: perspective projection of a 3D point $\mathbf{P}$ onto the point $\tilde{\mathbf{P}}$ in the image plane, which lies at the focal length $f$ from the center of projection; in the side view along the $x$- and $z$-axes, the projected coordinate is $f \frac{p_x}{p_z}$
• The formula for mapping a 3D point $\mathbf{P}=(p_x,p_y,p_z)^\top$ to a point $\tilde{\mathbf{P}}= (\tilde{p}_x,\tilde{p}_y,\tilde{p}_z)^\top$ located at the image plane of the camera is given by:

$\tilde{\mathbf{P}}= \left( f \frac{p_x}{p_z}, f \frac{p_y}{p_z}, f \right)^\top$

• This follows immediately from the figure by application of the intercept theorem, since

$\frac{\tilde{p}_x}{f} = \frac{p_x}{p_z}$ and $\frac{\tilde{p}_y}{f} = \frac{p_y}{p_z}$
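
As a minimal sketch of the mapping above, the following C++ function applies the formula directly. The names `Vec3` and `projectPinhole` are hypothetical and only serve this illustration; the camera is assumed to look along the positive $z$-axis, as in the figure.

```cpp
#include <array>

// 3D point stored as (x, y, z).
using Vec3 = std::array<double, 3>;

// Projects a point P given in camera coordinates onto the image plane,
// which lies at distance f in front of the center of projection.
// Assumes P[2] != 0 (the point is not in the plane of the camera center).
Vec3 projectPinhole(const Vec3& P, double f) {
    return Vec3{ f * P[0] / P[2],   // f * p_x / p_z
                 f * P[1] / P[2],   // f * p_y / p_z
                 f };               // all projected points lie at z = f
}
```

For example, the point $(2, 4, 8)^\top$ with $f = 2$ is mapped to $(0.5, 1, 2)^\top$.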

Perspective Projection

• Using homogeneous coordinates the perspective projection can be written as a linear mapping using a $4 \times 4$ matrix:

$\tilde{\mathbf{P}}= \begin{pmatrix} \tilde{p}_x \\ \tilde{p}_y \\ \tilde{p}_z \end{pmatrix}= \begin{pmatrix} f \frac{p_x}{p_z}\\ f \frac{p_y}{p_z}\\ f \end{pmatrix} \in \mathbb{R}^3 \longmapsto \underline{\tilde{\mathbf{P}}}= \begin{pmatrix}f \, p_x \\f \, p_y \\ f \, p_z\\ p_z\end{pmatrix} \in \mathbb{H}^3$

\begin{align}\underline{\tilde{\mathbf{P}}} & = \begin{pmatrix}f \, p_x \\f \, p_y \\ f \, p_z\\ p_z\end{pmatrix} = \underbrace{\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & f & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}}_{\mathtt{A}} \begin{pmatrix}p_x \\p_y \\ p_z\\ 1\end{pmatrix}\\ \underline{\tilde{\mathbf{P}}} &=\mathtt{A}\, \underline{\mathbf{P}} \end{align}
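
The matrix form can be sketched in C++ as follows. This is only an illustration under the assumption of a row-major `Mat4` type; the names `Vec4`, `Mat4`, `perspectiveMatrix`, and `multiply` are hypothetical.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<std::array<double, 4>, 4>;  // row-major: M[row][col]

// Builds the matrix A from the text for a camera looking along +z.
Mat4 perspectiveMatrix(double f) {
    Mat4 A{};        // all elements start at zero
    A[0][0] = f;
    A[1][1] = f;
    A[2][2] = f;
    A[3][2] = 1.0;   // copies p_z into the fourth (homogeneous) coordinate
    return A;
}

// Matrix-vector product M * v.
Vec4 multiply(const Mat4& M, const Vec4& v) {
    Vec4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            r[i] += M[i][j] * v[j];
    return r;
}
```

With $f = 2$, the homogeneous point $(2, 4, 8, 1)^\top$ is mapped to $(4, 8, 16, 8)^\top$; dividing by the last coordinate gives $(0.5, 1, 2)^\top$, the same result as the direct formula.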

Perspective Projection in OpenGL

Figure: OpenGL camera coordinate system with axes $x$, $y$, $z$ and the image plane at distance $f$
• In OpenGL, the camera is pointing in the negative $z$-direction. Therefore, we have:

\begin{align}\underline{\tilde{\mathbf{P}}} & = \begin{pmatrix}f \, p_x \\f \, p_y \\ f \, p_z\\ -p_z\end{pmatrix} = \underbrace{\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & f & 0 \\ 0 & 0 & -1 & 0 \end{bmatrix}}_{\mathtt{A}} \begin{pmatrix}p_x \\p_y \\ p_z\\ 1\end{pmatrix}\\ \underline{\tilde{\mathbf{P}}} &=\mathtt{A}\, \underline{\mathbf{P}} \end{align}

Perspective Projection in OpenGL

Figure: view in the $x$-$z$ plane showing the displayed area between the near-plane at $-z_n$ and the far-plane at $-z_f$
• In OpenGL, there is a so-called near- and a far-clipping plane
• The near-plane and far-plane are located parallel to the image plane
• Points are only displayed if their $z$-coordinate lies within the range defined by the near- and far-plane
• To this end, a new linear mapping is defined, such that for points with a $z$-coordinate on the near-plane it holds:

$p_z=-z_n \quad \mapsto \quad \tilde{p}_z=-1$

and for points on the far-plane:

$p_z=-z_f \quad \mapsto \quad \tilde{p}_z=1$

• In order to accomplish this, two new parameters $\alpha$ and $\beta$ are added to the linear transformation matrix

Perspective Projection in OpenGL

$\underline{\tilde{\mathbf{P}}} = \begin{pmatrix}f \, p_x \\f \, p_y \\ \alpha \, p_z + \beta \\ -p_z\end{pmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & \alpha & \beta \\ 0 & 0 & -1 & 0 \end{bmatrix} \begin{pmatrix}p_x \\p_y \\ p_z\\ 1\end{pmatrix} \in \mathbb{H}^3$

• Thus, the projected point in Cartesian coordinates is:

$\tilde{\mathbf{P}}=(\tilde{p}_x,\tilde{p}_y,\tilde{p}_z)^\top = \left( f \frac{p_x}{-p_z}, f \frac{p_y}{-p_z}, -\alpha \, + \frac{\beta}{-p_z} \right)^\top \in \mathbb{R}^3$

• Now $\alpha$ and $\beta$ can be determined from the conditions for the mapping of the $z$-coordinate:
\begin{align}p_z=-z_n \,&\mapsto \, \tilde{p}_z=-1 \quad \Rightarrow -\alpha \, + \frac{\beta}{z_n} = -1 \\ p_z=-z_f \, &\mapsto \, \tilde{p}_z=\ 1 \,\,\, \,\quad \Rightarrow -\alpha \, + \frac{\beta}{z_f} = 1\end{align}

Perspective Projection in OpenGL

• Solving the equation system for $\alpha$ and $\beta$ provides:

\begin{align} \alpha &= \frac{z_f+z_n}{z_n-z_f}\\ \beta & = \frac{2 z_f \, z_n}{z_n-z_f}\end{align}

• Thus, for the new projection matrix we have:

\begin{align} \underline{\tilde{\mathbf{P}}} & = \underbrace{\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & \frac{z_f+z_n}{z_n-z_f} & \frac{2 z_f \, z_n}{z_n-z_f} \\ 0 & 0 & -1 & 0 \end{bmatrix}}_{\mathtt{A}} \begin{pmatrix}p_x \\p_y \\ p_z\\ 1\end{pmatrix}\\ \underline{\tilde{\mathbf{P}}} &=\mathtt{A}\, \underline{\mathbf{P}} \end{align}

Perspective Projection in OpenGL

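Figure: view in the $y$-$z$ plane of two configurations A and B with different focal lengths and image heights but the same opening angle $\Theta$; the image plane extends from $-1$ to $1$ in $y$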
• Until now we have only defined the focal length $f$ as the distance between the image plane and the camera center, but nothing was stated about the size of the image plane in $x$- and $y$-direction
• In the end, only the ratio between the size of the image plane and the focal length is important, which is uniquely defined by the opening angle $\Theta$. All configurations with the same opening angle result in the same image (only with scaled $x$- and $y$-coordinates).
• In OpenGL, the size of the image plane is always chosen such that the resulting $x$- and $y$-coordinates are in the range $[-1; 1]$.
• For a given opening angle the focal length is therefore obtained by (compare figure):

$\frac{f}{1} = \frac{\cos( 0.5 \, \Theta)}{\sin( 0.5 \, \Theta)} \Leftrightarrow f = \mathrm{cotan}( 0.5 \, \Theta)$
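
As a small sketch of this relation, the focal length can be computed from an opening angle given in degrees. The function name `focalLengthFromFovy` is hypothetical.

```cpp
#include <cmath>

// Returns f = cotan(0.5 * Theta) for an opening angle Theta in degrees,
// assuming an image plane that extends from -1 to 1.
double focalLengthFromFovy(double fovyDegrees) {
    const double pi = std::acos(-1.0);
    const double halfAngle = 0.5 * fovyDegrees * pi / 180.0;  // in radians
    return 1.0 / std::tan(halfAngle);
}
```

An opening angle of $90^\circ$ gives $f = 1$, and $60^\circ$ gives $f = \sqrt{3} \approx 1.732$; a smaller opening angle therefore corresponds to a larger focal length.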

Transformation Matrices in OpenGL

• For the projection from the camera coordinate system into the image plane the GL_PROJECTION matrix is used.
• The manipulation of this matrix is activated by
glMatrixMode(GL_PROJECTION);
• All functions for matrix manipulation, such as glLoadIdentity, glLoadMatrix, glMultMatrix, glRotate, glScale, glTranslate, glPushMatrix, glPopMatrix, gluPerspective are then executed on the GL_PROJECTION matrix.
• The state of the GL_PROJECTION matrix at the time an object is drawn determines how the object is transformed; later changes do not affect objects that have already been drawn (OpenGL as a state machine)

Perspective Projection in OpenGL

Creating a perspective projection matrix in OpenGL:

Figure: mapping by $\mathtt{A}$ between two coordinate systems with axes $x$, $y$, $z$
glMatrixMode(GL_PROJECTION);
glLoadIdentity(); // reset first, because gluPerspective multiplies onto the current matrix
gluPerspective(fovy, aspect, near, far);


$\mathtt{A} = \begin{bmatrix} \frac{f}{\mathrm{aspect}} & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & \frac{\mathrm{far}+\mathrm{near}}{\mathrm{near}-\mathrm{far}} & \frac{2 \ast \mathrm{far} \ast \mathrm{near}}{\mathrm{near}-\mathrm{far}} \\ 0 & 0 & -1 & 0 \end{bmatrix}$

with $f = \mathrm{cotan}( 0.5 \ast \mathrm{fovy})$

and $\mathrm{aspect}= \mathrm{w} / \mathrm{h}$
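
The matrix above can also be assembled by hand. The following sketch stores it in the column-major layout that OpenGL uses for matrices; the function name `perspectiveColumnMajor` is hypothetical, and the code illustrates the formula rather than reproducing the GLU implementation.

```cpp
#include <array>
#include <cmath>

// Builds the matrix A from the text in column-major order:
// the element in (row, col) is stored at index col * 4 + row.
std::array<double, 16> perspectiveColumnMajor(double fovyDegrees, double aspect,
                                              double zNear, double zFar) {
    const double pi = std::acos(-1.0);
    const double f = 1.0 / std::tan(0.5 * fovyDegrees * pi / 180.0);

    std::array<double, 16> m{};                            // all zeros
    m[0 * 4 + 0] = f / aspect;                             // row 0, col 0
    m[1 * 4 + 1] = f;                                      // row 1, col 1
    m[2 * 4 + 2] = (zFar + zNear) / (zNear - zFar);        // row 2, col 2
    m[3 * 4 + 2] = 2.0 * zFar * zNear / (zNear - zFar);    // row 2, col 3
    m[2 * 4 + 3] = -1.0;                                   // row 3, col 2
    return m;
}
```

Such an array could be passed to glLoadMatrixd via `m.data()` after glMatrixMode(GL_PROJECTION) has been called.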

Example: "Dolly Zoom" or "Vertigo Effect"

• The idea of the "Dolly Zoom" effect is to compensate for a camera translation in $z$-direction ("Dolly") by a change in focal length ("Zoom")
• Mathematically it is easy to see from the projection equation that achieving the compensation is possible, because for the $y$-coordinate of a projected point we have:

$\tilde{p}_y = f \frac{p_y}{-p_z}$

• Since there is only one focal length $f$ but typically many 3D points with different depth values $p_z$ in the scene, the compensation can only be achieved for one selected depth value. This creates an interesting perspective effect.
• A well-known example is the movie Vertigo (1958) by Alfred Hitchcock, who used this effect to convey the dizziness of the protagonist

Example: "Dolly Zoom" or "Vertigo Effect"

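Figure: dolly zoom geometry in the $y$-$z$ plane at time $t=t_0=1$ (focal length $f$, opening angle $\Theta$) and at time $t > 1$ (focal length $f'$, opening angle $\Theta'$); the camera is at distance $3\,t$ from the object (e.g. unit cube), and the points $\mathbf{P}$ and $\mathbf{P}'$ are marked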

Example: Dolly Zoom in OpenGL

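#include <cmath>   // tan, atan, fabs, M_PI
#include <cstdlib> // rand, RAND_MAX
#include <vector>
// the OpenGL and GLU headers are also required (platform-specific)

class Renderer {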

public:
float t; //time
const float d0; // initial distance

public:
Renderer() : t(1.0), d0(3.0), width(0), height(0) {}

public:
void display() {
glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

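glMatrixMode(GL_PROJECTION);
glLoadIdentity(); // reset first, because gluPerspective multiplies onto the current matrix
gluPerspective (dollyZoomFovy(),
(float)width/(float)height,
0.1, 50.0);

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
// move the scene away from the camera by t*d0 units
glTranslatef(0.0f, 0.0f, -t*d0);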

// draw a cube in the local coordinate system
drawCube();
// draw random lines
drawRandomLines();
}

void init() {
glEnable(GL_DEPTH_TEST);

// create random values between -1.0 and 1.0
for(unsigned i=0; i < 1000; i++) {
int r = rand();
randVals.push_back(2.0*float(r)/float(RAND_MAX)-1.0f);
}
}

void resize(int w, int h) {
// ignore this for now
glViewport(0, 0, w, h);
width = w;
height = h;
}

float dollyZoomFovy() {
float fovyInit = 60.0f; // initial field of view
float theta = fovyInit / 180.0f * M_PI; // degree to rad
float f = 1.0f / tan(theta/2.0f);
// keep the projected size of the cube's front face (at distance d0*t-1) constant
float fNew = f * (d0*t-1) / (d0-1);
float thetaNew = atan(1.0f / fNew) * 2.0f;
float val = 180.0 * thetaNew / M_PI; //rad to degree
return val;
}

private:
int width;
int height;
std::vector<float> randVals;

private:
void drawCube() {

glColor3f(1.0f, 1.0f, 1.0f);
glLineWidth(3.0f);
glBegin(GL_LINE_LOOP);
glVertex3f(-1.0f, 1.0f, 1.0f);
glVertex3f( 1.0f, 1.0f, 1.0f);
glVertex3f( 1.0f,-1.0f, 1.0f);
glVertex3f(-1.0f,-1.0f, 1.0f);
glEnd();
glBegin(GL_LINE_LOOP);
glVertex3f(-1.0f, 1.0f,-1.0f);
glVertex3f( 1.0f, 1.0f,-1.0f);
glVertex3f( 1.0f,-1.0f,-1.0f);
glVertex3f(-1.0f,-1.0f,-1.0f);
glEnd();

glBegin(GL_LINE_LOOP);
glVertex3f( 1.0f, 1.0f,-1.0f);
glVertex3f( 1.0f, 1.0f, 1.0f);
glVertex3f( 1.0f,-1.0f, 1.0f);
glVertex3f( 1.0f,-1.0f,-1.0f);
glEnd();

glBegin(GL_LINE_LOOP);
glVertex3f(-1.0f, 1.0f,-1.0f);
glVertex3f(-1.0f, 1.0f, 1.0f);
glVertex3f(-1.0f,-1.0f, 1.0f);
glVertex3f(-1.0f,-1.0f,-1.0f);
glEnd();
glLineWidth(1.0);
}

void drawRandomLines() {
if(randVals.size() % 5) return;
unsigned i = 0;
while(i < randVals.size()) {
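// read the color components one by one, because the evaluation order
// of function arguments is not fixed
float r = fabs(randVals[i++]);
float g = fabs(randVals[i++]);
float b = fabs(randVals[i++]);
glColor3f(r, g, b);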
float x = randVals[i++];
float y = randVals[i++];
glBegin(GL_LINES);
glVertex3f(x, y, -1.0f);
glVertex3f(x, y,  1.0f);
glEnd();
}
}
};


Transformation of the Camera

Figure: world, local, and camera coordinate systems, related by the transformations $\mathtt{T}_{\mathrm{\small cam}}$ and $\mathtt{T}_{\mathrm{\small obj}}$
• Until now, it was assumed that the projection center of the camera is located at the origin of the global world coordinate system
• If a transformation $\mathtt{T}_{\mathrm{\small cam}}$ is applied to the camera, the projection of a point $\mathbf{P}$ defined in a local object coordinate system is given by:

$\underline{\tilde{\mathbf{P}}} = \mathtt{A} \, \mathtt{T}_{\mathrm{\small cam}}^{-1} \, \mathtt{T}_{\mathrm{\small obj}} \, \underline{\mathbf{P}}$

Transformation of the Camera

Mapping equation for homogeneous points:

$\underline{\tilde{\mathbf{P}}} = \mathtt{A} \, \mathtt{T}_{\mathrm{\small cam}}^{-1} \, \mathtt{T}_{\mathrm{\small obj}} \, \underline{\mathbf{P}}$

where the $4 \times 4$ matrix

• $\mathtt{T}_{\mathrm{\small obj}}$ describes the transformation from the local coordinate system to the world coordinate system
• $\mathtt{T}_{\mathrm{\small cam}}^{-1}$ describes the transformation from the world coordinate system to the camera coordinate system
• $\mathtt{A}$ describes the transformation from the camera coordinate system into the image plane

Transformation of the Camera

Figure: world coordinate system with basis vectors $\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z$; camera coordinate system with origin $\mathbf{C}_a$ and basis vectors $\tilde{\mathbf{a}}_x, \tilde{\mathbf{a}}_y, \tilde{\mathbf{a}}_z$; local coordinate system with origin $\mathbf{C}_b$ and basis vectors $\tilde{\mathbf{b}}_x, \tilde{\mathbf{b}}_y, \tilde{\mathbf{b}}_z$
• The transformation matrices are given by the basis vectors of the coordinate systems (as discussed in the chapters before)
• The transformation matrix $\mathtt{T}_{\mathrm{\small obj}}$ transforms a point from the local to the global coordinate system

$\mathtt{T}_{\mathrm{\small obj}} = \begin{bmatrix}\tilde{\mathbf{b}}_x & \tilde{\mathbf{b}}_y & \tilde{\mathbf{b}}_z & \mathbf{C}_b\\0 & 0 & 0 & 1\end{bmatrix}$

• The transformation matrix $\mathtt{T}_{\mathrm{\small cam}}$ transforms a point from the camera to the world coordinate system

\begin{align} \mathtt{T}_{\mathrm{\small cam}} & = \begin{bmatrix}\tilde{\mathbf{a}}_x & \tilde{\mathbf{a}}_y & \tilde{\mathbf{a}}_z & \mathbf{C}_a\\0 & 0 & 0 & 1\end{bmatrix} \\ & = \begin{bmatrix} \mathtt{R}_a & \mathbf{C}_a\\ \mathbf{0}^\top & 1\end{bmatrix} \end{align}

Transformation of the Camera

• For the inverse transformation $\mathtt{T}_{\mathrm{\small cam}}^{-1}$ from the world into the camera coordinate system we have (with $\mathtt{R}_a^{-1}= \mathtt{R}_a^\top$):

$\mathtt{T}_{\mathrm{\small cam}}^{-1} = \begin{bmatrix} \mathtt{R}_a & \mathbf{C}_a\\ \mathbf{0}^\top & 1\end{bmatrix}^{-1} = \begin{bmatrix} \mathtt{R}_a^{\top} & -\mathtt{R}_a^{\top} \mathbf{C}_a\\ \mathbf{0}^\top & 1\end{bmatrix}$
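
The closed-form inverse can be sketched in C++ as follows. The types `Vec3`, `Mat3`, and `RigidTransform` and the functions `apply` and `inverse` are hypothetical names for this illustration; the shortcut is only valid under the stated assumption that the rotation part is orthonormal.

```cpp
#include <array>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;   // row-major: R[row][col]

// Rigid transformation [R C; 0 1] with an orthonormal rotation matrix R.
struct RigidTransform {
    Mat3 R;
    Vec3 C;
};

// Applies the transformation to a point: R * p + C.
Vec3 apply(const RigidTransform& T, const Vec3& p) {
    Vec3 q = T.C;
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            q[i] += T.R[i][j] * p[j];
    return q;
}

// Returns the inverse [R^T  -R^T C; 0 1] without a general matrix inversion.
RigidTransform inverse(const RigidTransform& T) {
    RigidTransform inv{};                       // zero-initialized
    for (std::size_t i = 0; i < 3; ++i) {
        for (std::size_t j = 0; j < 3; ++j) {
            inv.R[i][j] = T.R[j][i];            // transpose of R
            inv.C[i] -= T.R[j][i] * T.C[j];     // accumulates -R^T C
        }
    }
    return inv;
}
```

Applying `inverse(T)` to a point that was transformed with `T` returns the original point, which is a simple way to test such a routine.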

Transformation of the Camera in OpenGL

Mapping equation for homogeneous points:

$\underline{\tilde{\mathbf{P}}} = \mathtt{A} \, \underbrace{\mathtt{T}_{\mathrm{\small cam}}^{-1} \, \mathtt{T}_{\mathrm{\small obj}}}_{\mathtt{T}_{\mathrm{\small modelview}}} \, \underline{\mathbf{P}}$

• In OpenGL, all transformations, except for the projection matrix $\mathtt{A}$, are combined into a so-called GL_MODELVIEW matrix
• Thus, the GL_MODELVIEW matrix directly describes the transformation from the respective local coordinate system to the camera coordinate system
• The GL_PROJECTION matrix $\mathtt{A}$ describes the mapping from the camera coordinate system into the image plane

gluLookAt

• To simplify the definition of the matrix $\mathtt{T}_{\mathrm{\small cam}}^{-1}$ there is the GLU function
gluLookAt(eyex, eyey, eyez, refx, refy, refz, upx, upy, upz);
• By setting up an eye point $\mathbf{C}_{\mathrm{\small eye}}$, a targeted reference point  $\mathbf{P}_{\mathrm{\small ref}}$, and a vector $\mathbf{v}_{\mathrm{\small up}}$ (which defines the direction in which the $y$-coordinate of the camera is pointing) the basis vectors of the camera coordinate system can be computed:
Figure: eye point $\mathbf{C}_{\mathrm{\small eye}}$, reference point $\mathbf{P}_{\mathrm{\small ref}}$, up vector $\mathbf{v}_{\mathrm{\small up}}$, and the resulting camera basis vectors $\tilde{\mathbf{a}}_x$, $\tilde{\mathbf{a}}_y$, $\tilde{\mathbf{a}}_z$

\begin{align} \mathbf{d} & = \mathbf{C}_{\mathrm{\small eye}} - \mathbf{P}_{\mathrm{\small ref}}\\ \tilde{\mathbf{a}}_z &= \frac{\mathbf{d}}{|\mathbf{d}|}, \quad \mathbf{v}' = \frac{\mathbf{v}_{\mathrm{\small up}}}{|\mathbf{v}_{\mathrm{\small up}}|} \\ \tilde{\mathbf{a}}_x &= \frac{\mathbf{v}'\times \tilde{\mathbf{a}}_z}{|\mathbf{v}'\times \tilde{\mathbf{a}}_z|} \\ \tilde{\mathbf{a}}_y &= \tilde{\mathbf{a}}_z \times \tilde{\mathbf{a}}_x\\ \mathtt{R}_{a} & = \begin{bmatrix}\tilde{\mathbf{a}}_x & \tilde{\mathbf{a}}_y & \tilde{\mathbf{a}}_z \end{bmatrix} \end{align}

This results in:
$\mathtt{T}_{\mathrm{\small cam}}^{-1} = \begin{bmatrix} \mathtt{R}_a^{\top} & -\mathtt{R}_a^{\top} \mathbf{C}_{\mathrm{\small eye}}\\ \mathbf{0}^\top & 1\end{bmatrix}$
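
The construction can be sketched in C++ as follows. This is an illustration of the equations, not the GLU implementation; all type and function names are hypothetical. The cross product for $\tilde{\mathbf{a}}_x$ is normalized here, so that the basis stays orthonormal even if the up vector is not perpendicular to the viewing direction.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

Vec3 sub(const Vec3& a, const Vec3& b) {
    return Vec3{a[0] - b[0], a[1] - b[1], a[2] - b[2]};
}
double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]};
}
Vec3 normalize(const Vec3& v) {
    const double len = std::sqrt(dot(v, v));
    return Vec3{v[0] / len, v[1] / len, v[2] / len};
}

// World-to-camera transformation [R_a^T  -R_a^T C_eye; 0 1].
struct ViewTransform {
    Vec3 ax, ay, az;   // camera basis vectors = rows of R_a^T
    Vec3 t;            // translation part -R_a^T * C_eye
};

// Assumes eye != ref and an up vector that is not parallel to eye - ref.
ViewTransform lookAt(const Vec3& eye, const Vec3& ref, const Vec3& up) {
    ViewTransform V;
    V.az = normalize(sub(eye, ref));               // camera looks along -az
    V.ax = normalize(cross(normalize(up), V.az));
    V.ay = cross(V.az, V.ax);
    V.t  = Vec3{-dot(V.ax, eye), -dot(V.ay, eye), -dot(V.az, eye)};
    return V;
}

// Transforms a point from world coordinates to camera coordinates.
Vec3 toCamera(const ViewTransform& V, const Vec3& p) {
    return Vec3{dot(V.ax, p) + V.t[0],
                dot(V.ay, p) + V.t[1],
                dot(V.az, p) + V.t[2]};
}
```

For an eye point $(0, 0, 10)^\top$, the origin as reference point, and the up vector $(0, 1, 0)^\top$, the world origin has the camera coordinates $(0, 0, -10)^\top$, i.e., it lies 10 units in front of the camera on the negative $z$-axis.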

Example: gluLookAt

class Renderer {
...
void resize(int w, int h) {
glViewport(0, 0, w, h);
glMatrixMode(GL_PROJECTION);
glLoadIdentity(); // reset first, because gluPerspective multiplies onto the current matrix
gluPerspective (30.0, (float)w/(float)h, 2.0, 20.0);
}
void display() {
glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
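glMatrixMode(GL_MODELVIEW);
glLoadIdentity();

// camera orbits in the y=10 plane
// and looks at origin
double rad = M_PI / 180.0f * t;
gluLookAt(10.0*cos(rad), 10.0, 10.0*sin(rad), // eye (orbit radius 10.0 is an assumed value)
0.0, 0.0, 0.0, // look at
0.0, 1.0, 0.0); // up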

//draw cube at origin
drawCube();

glRotatef(45.0f, 0.0f, 0.0f, 1.0f);
glTranslatef(2.5f, 0.0f, 0.0f );
glScalef(0.5f, 0.5f, 0.5f);

//draw transformed cube
drawCube();
}
...
};


Example: gluLookAt

• Which transformations are applied to the vertices of the smaller cube?
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluPerspective (...);

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(...);

glRotatef(...);
glTranslatef(...);
glScalef(...);


$\mathtt{T}_{\mathrm{\small projection}}= \mathtt{I}$
$\mathtt{T}_{\mathrm{\small projection}}= \mathtt{I} \, \mathtt{A}$

$\mathtt{T}_{\mathrm{\small modelview}}= \mathtt{I}$
$\mathtt{T}_{\mathrm{\small modelview}}= \mathtt{I}\,\mathtt{T}_{\mathrm{\small cam}}^{-1}$

$\mathtt{T}_{\mathrm{\small modelview}}= \mathtt{I}\,\mathtt{T}_{\mathrm{\small cam}}^{-1} \,\mathtt{T}_r$
$\mathtt{T}_{\mathrm{\small modelview}}= \mathtt{I}\,\mathtt{T}_{\mathrm{\small cam}}^{-1} \,\mathtt{T}_r\,\mathtt{T}_t$
$\mathtt{T}_{\mathrm{\small modelview}}= \mathtt{I}\,\mathtt{T}_{\mathrm{\small cam}}^{-1} \,\mathtt{T}_r\,\mathtt{T}_t\,\mathtt{T}_s$

\begin{align} \underline{\tilde{\mathbf{P}}} &= \mathtt{T}_{\mathrm{\small projection}} \mathtt{T}_{\mathrm{\small modelview}} \, \underline{\mathbf{P}}\\ &= \mathtt{A} \, \mathtt{T}_{\mathrm{\small cam}}^{-1} \,\mathtt{T}_r\,\mathtt{T}_t\,\mathtt{T}_s \,\underline{\mathbf{P}} \end{align}

Example: gluLookAt and chains of transformations

class Renderer {
public:
float t;

public:
Renderer() : t(0.0) {}

public:
void display() {
glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

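glMatrixMode(GL_MODELVIEW);
glLoadIdentity();

// camera orbits in the y=10 plane
// and looks at origin
double rad = M_PI / 180.0f * t;
gluLookAt(10.0*cos(rad), 10.0, 10.0*sin(rad), // eye (orbit radius 10.0 is an assumed value)
0.0, 0.0, 0.0, // look at
0.0, 1.0, 0.0); // up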

//draw model at origin
drawCubeHierarchy(0, 4);
}

void init() {
glEnable(GL_DEPTH_TEST);
}

void resize(int w, int h) {
glViewport(0, 0, w, h);
glMatrixMode(GL_PROJECTION);
glLoadIdentity(); // reset first, because gluPerspective multiplies onto the current matrix
gluPerspective (30.0, (float)w/(float)h, 0.1, 50.0);
}

private:
void drawCube() {
...
}

void drawCubeHierarchy(int depth, int neighbors) {
drawCube(); // draw parent
depth +=1;
if (depth < 6){
for (int n = 0; n < neighbors; n++){
glPushMatrix();
glRotatef(n*90.0f-90.0f, 0.0f, 0.0f, 1.0f);
glTranslatef(2.5f, 0.0f, 0.0f );
glScalef(0.5f, 0.5f, 0.5f);
drawCubeHierarchy(depth, 3); // draw children
glPopMatrix();
}
}
}
};


Per-Vertex Operations

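When using the fixed-function pipeline, the transformations discussed in this chapter are applied to the vertex data in this order: modelview transformation, projection, clipping, perspective division, and viewport transformation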

Perspective Division

• The so-called "Perspective Division" transfers the projected points in homogeneous coordinates into the Cartesian coordinate system by dividing by the last coordinate:

$\underline{\mathbf{P}} = \begin{pmatrix}p_x\\p_y\\p_z\\p_w\end{pmatrix} \in \mathbb{H}^3 \quad \longmapsto \quad \mathbf{P}= \begin{pmatrix}\frac{p_x}{p_w}\\\frac{p_y}{p_w}\\\frac{p_z}{p_w} \end{pmatrix} \in \mathbb{R}^3$

Clipping

• The projection matrix was designed such that after projection and perspective division all $x$, $y$ and $z$-coordinates within the visible volume are mapped to the range $-1$ to $1$
• Primitives that lie completely outside this range do not need to be drawn
• By testing for the range $[-1;1]$ it would be easy to implement clipping after the perspective division step
• In OpenGL, the clipping is carried out before the perspective division. Why?
• Instead of testing the range $[-1;1]$ the range $[-p_w;p_w]$ can be checked just as quickly

$-p_w < p_x < p_w \quad \longmapsto \quad -1 < \frac{p_x}{p_w} < 1$

• This has the advantage that
• for the case $p_w=0$ no special treatment is needed and
• the division computation for clipped coordinates is no longer needed
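
A minimal sketch of such a test on homogeneous coordinates could look as follows. The names `Vec4` and `insideClipVolume` are hypothetical; the comparison includes the boundary here, which is an assumption of this sketch.

```cpp
#include <array>

using Vec4 = std::array<double, 4>;   // (p_x, p_y, p_z, p_w)

// Returns true if the homogeneous point lies inside the visible volume.
// The test is carried out before the perspective division:
// -p_w <= p_x, p_y, p_z <= p_w.
bool insideClipVolume(const Vec4& p) {
    const double w = p[3];
    return -w <= p[0] && p[0] <= w &&
           -w <= p[1] && p[1] <= w &&
           -w <= p[2] && p[2] <= w;
}
```

Points with a negative $p_w$, which lie behind the camera, fail this test automatically, because the interval $[-p_w; p_w]$ is then empty.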

Viewport Transformation

Figure: coordinates in the range $[-1;1]$ in $x$ and $y$ are mapped to a screen region with the given width and height
• In a final transformation step, the coordinates in the range $[-1;1]$ are scaled to screen coordinates
• To this end, OpenGL provides the command:
glViewport(int ix, int iy, int width, int height)
• The variables ix and iy define the lower-left corner of the viewport, and width and height define its size (all in pixels)
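
The scaling can be sketched as follows. The struct `WindowCoord` and the function `viewportTransform` are hypothetical names; only the $x$- and $y$-coordinates are considered.

```cpp
// Position in window coordinates (pixels, origin at the lower-left corner).
struct WindowCoord {
    double x;
    double y;
};

// Maps coordinates in the range [-1, 1] to the viewport defined by its
// lower-left corner (ix, iy) and its size (width, height).
WindowCoord viewportTransform(double xNdc, double yNdc,
                              int ix, int iy, int width, int height) {
    WindowCoord w;
    w.x = ix + (xNdc + 1.0) * 0.5 * width;
    w.y = iy + (yNdc + 1.0) * 0.5 * height;
    return w;
}
```

For a viewport with corner $(0, 0)$ and size $800 \times 600$, the point $(-1, -1)$ is mapped to $(0, 0)$, the point $(0, 0)$ to $(400, 300)$, and the point $(1, 1)$ to $(800, 600)$.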

Example: glViewport

class Renderer {

public:
float t;

public:
Renderer() : t(0.0), width(0), height(0) {}

public:
void display() {
glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

// top right viewport (look from front)
glViewport(width/2, height/2, width/2, height/2);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity(); // reset, so that the camera of this viewport starts from identity
drawFrame();
// set camera (look from positive x-direction)
gluLookAt(10.0, 0.0, 0.0,
0.0, 0.0, 0.0,
0.0, 0.0, 1.0);
// draw scene
drawSceneGrid();
drawRotatingPyramid();

// bottom left viewport (look from left)
glViewport(0, 0, width/2, height/2);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity(); // reset, so that the previous viewport's transformations do not accumulate
drawFrame();
// set camera (look from negative y-direction)
gluLookAt(0.0, -10.0, 0.0,
0.0,   0.0, 0.0,
0.0,   0.0, 1.0);
// draw scene
drawSceneGrid();
drawRotatingPyramid();

// top left viewport (look from top)
glViewport(0, height/2, width/2, height/2);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity(); // reset, so that the previous viewport's transformations do not accumulate
drawFrame();
// set camera (look from positive z-direction)
gluLookAt(0.0, 0.0, 10.0,
0.0, 0.0,  0.0,
-1.0, 0.0,  0.0);
// draw scene
drawSceneGrid();
drawRotatingPyramid();

// bottom right viewport (perspective)
glViewport(width/2, 0, width/2, height/2);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity(); // reset, so that the previous viewport's transformations do not accumulate
drawFrame();
// set camera
gluLookAt(8.0, -2.0, 5.0,
0.0,  0.0, 0.0,
0.0,  0.0, 1.0);
// draw scene
drawSceneGrid();
drawRotatingPyramid();
}

void init() {
glEnable(GL_DEPTH_TEST);
//glEnable(GL_CULL_FACE);
}

void resize(int w, int h) {
width = w;
height = h;
glMatrixMode(GL_PROJECTION);
glLoadIdentity(); // reset first, because gluPerspective multiplies onto the current matrix
gluPerspective (30.0,
(float)width/(float)height,
0.1, 50.0);
}

private:
int width;
int height;

private:
void drawFrame() {
glLineWidth(2.0);
glMatrixMode(GL_PROJECTION);
glPushMatrix();
glLoadIdentity(); // draw the frame directly in the range [-1;1]
glColor3f(1.0f, 1.0f, 1.0f);
glBegin(GL_LINE_LOOP);
glVertex3f(-1.0f, 1.0f, 0.0f);
glVertex3f( 1.0f, 1.0f, 0.0f);
glVertex3f( 1.0f,-1.0f, 0.0f);
glVertex3f(-1.0f,-1.0f, 0.0f);
glEnd();
glPopMatrix();
glMatrixMode(GL_MODELVIEW);
glLineWidth(1.0);
}

void drawSceneGrid() {
glColor3f(0.3f, 0.3f, 0.3f);
glBegin(GL_LINES);
for(unsigned i=0; i<=10; i++) {
glVertex3f(-5.0f+i, -5.0f,   0.0f);
glVertex3f(-5.0f+i,  5.0f,   0.0f);
glVertex3f(-5.0f,   -5.0f+i, 0.0f);
glVertex3f( 5.0f,   -5.0f+i, 0.0f);
}
glEnd();

glColor3f(0.0f, 0.0f, 1.0f);
drawCoordinateAxisZ();
glColor3f(0.0f, 1.0f, 0.0f);
drawCoordinateAxisY();
glColor3f(1.0f, 0.0f, 0.0f);
drawCoordinateAxisX();
}

void drawCoordinateAxisZ() {
glLineWidth(2.0);
glBegin(GL_LINES);
glVertex3f(0.0f, 0.0f, 0.0f); // z-axis
glVertex3f(0.0f, 0.0f, 2.0f);
glEnd();
glLineWidth(1.0);

// z-axis tip
glBegin(GL_TRIANGLES);
glVertex3f( 0.0f, 0.0f, 2.0f);
glVertex3f(-0.05f, 0.05f, 1.9f);
glVertex3f( 0.05f, 0.05f, 1.9f);
glVertex3f( 0.0f,  0.0f, 2.0f);
glVertex3f( 0.05f, -0.05f, 1.9f);
glVertex3f(-0.05f, -0.05f, 1.9f);
glVertex3f( 0.0f,  0.0f, 2.0f);
glVertex3f( 0.05f,  0.05f, 1.9f);
glVertex3f( 0.05f, -0.05f, 1.9f);
glVertex3f( 0.0f,  0.0f, 2.0f);
glVertex3f(-0.05f, -0.05f, 1.9f);
glVertex3f(-0.05f,  0.05f, 1.9f);
glEnd();
glBegin(GL_POLYGON);
glVertex3f( 0.05f, -0.05f, 1.9f);
glVertex3f( 0.05f,  0.05f, 1.9f);
glVertex3f(-0.05f,  0.05f, 1.9f);
glVertex3f(-0.05f, -0.05f, 1.9f);
glEnd();
}

void drawCoordinateAxisX() {
glPushMatrix();
glRotatef(90.0f, 0.0f, 1.0f, 0.0f);
drawCoordinateAxisZ();
glPopMatrix();
}

void drawCoordinateAxisY() {
glPushMatrix();
glRotatef(-90.0f, 1.0f, 0.0f, 0.0f);
drawCoordinateAxisZ();
glPopMatrix();
}

void drawRotatingPyramid() {
glRotatef(t, 0.0f, 0.0f, 1.0f);
drawPyramid();
}

void drawPyramid() {
glColor3f(1.0,0.0,0.0);
glBegin(GL_TRIANGLES);
glVertex3f( 0.0f, 0.0f, 1.5f);
glVertex3f(-1.0f, 1.0f, 0.0f);
glVertex3f( 1.0f, 1.0f, 0.0f);
glEnd();
glColor3f(0.0,1.0,0.0);
glBegin(GL_TRIANGLES);
glVertex3f( 0.0f,  0.0f, 1.5f);
glVertex3f( 1.0f, -1.0f, 0.0f);
glVertex3f(-1.0f, -1.0f, 0.0f);
glEnd();
glColor3f(0.0,0.0,1.0);
glBegin(GL_TRIANGLES);
glVertex3f( 0.0f,  0.0f, 1.5f);
glVertex3f( 1.0f,  1.0f, 0.0f);
glVertex3f( 1.0f, -1.0f, 0.0f);
glEnd();
glColor3f(1.0,1.0,0.0);
glBegin(GL_TRIANGLES);
glVertex3f( 0.0f,  0.0f, 1.5f);
glVertex3f(-1.0f, -1.0f, 0.0f);
glVertex3f(-1.0f,  1.0f, 0.0f);
glEnd();
glColor3f(0.0,1.0,1.0);
glBegin(GL_POLYGON);
glVertex3f( 1.0f, -1.0f, 0.0f);
glVertex3f( 1.0f,  1.0f, 0.0f);
glVertex3f(-1.0f,  1.0f, 0.0f);
glVertex3f(-1.0f, -1.0f, 0.0f);
glEnd();
}
};

Vanishing Points

• Under a perspective projection, parallel lines in 3D space are generally mapped to non-parallel lines in the 2D image plane
• The point in the image plane at which these lines intersect is called the vanishing point
• Each spatial direction can have its own (or no) vanishing point
• Depending on how many vanishing points exist, the projection is called a 1-, 2-, or 3-point perspective
Source: wikipedia.org; Author: Wolfram Gothe 2009; public domain

Z-Buffer

Depth Test

• In the previous examples glEnable(GL_DEPTH_TEST) and glClear(GL_DEPTH_BUFFER_BIT) were used without discussing their functionality
• The function call glEnable(GL_DEPTH_TEST) activates the depth test in OpenGL
• If the depth test is disabled, the primitives are written into the framebuffer in the order in which they are passed into the OpenGL pipeline
• This means that primitives drawn later cover those drawn earlier
• This is typically not the desired behavior
• Instead, primitives that are closer to the camera should cover more distant ones, regardless of the order of drawing
• Ideally, this decision should be made individually for each pixel in the framebuffer, because primitives can penetrate each other
• In OpenGL the Z-Buffer method is employed

Z-Buffer Method

Figure: the matrix $\mathtt{A}$ maps camera coordinates ($x$, $y$, $z$) to normalized device coordinates ($x$, $y$, $z$)
• Although the $z$-coordinate in the camera coordinate system is the one that actually matters, the depth test can be carried out after the perspective division, since the depth order is preserved
• However, in "Normalized device coordinates" the $z$-axis is reversed with respect to the camera coordinate system, i.e., more distant points have a larger $z$ (note that this is a left-handed coordinate system)
• For points on the near-plane in the camera coordinate system we now have $\tilde{p}_z=-1$, and for points on the far-plane $\tilde{p}_z=1$

Z-Buffer Method

• The Z-Buffer method requires (in addition to the usual framebuffer which contains the color information) a depth buffer of the same dimensions, which contains the depth values
Figure: framebuffer (left) and depth buffer (right)

Z-Buffer Method

• At the beginning of the rendering process, the depth buffer is initialized with the z-values of the far-plane. This is done in OpenGL using the command glClear(GL_DEPTH_BUFFER_BIT)
• Writing a pixel in the frame- and depth-buffer occurs during the per-fragment operations in the OpenGL pipeline
• The depth value for each pixel is interpolated by the rasterizer using the transformed vertex information
• If the depth value for the pixel is smaller than the currently stored one in the depth buffer the color value is written into the framebuffer and the depth value into the depth buffer, otherwise both remain unchanged
FOR each primitive
FOR each pixel of primitive at position (x,y) with colour c and depth d
IF d < depthbuffer(x,y)
framebuffer(x,y) = c
depthbuffer(x,y) = d
END IF
END FOR
END FOR
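
The inner part of this pseudocode can be sketched in C++ as follows. The types `Fragment` and `ZBuffer` are hypothetical and only mimic on the CPU what OpenGL does during the per-fragment operations; depth values are assumed to lie in $[-1; 1]$, with $1$ corresponding to the far-plane.

```cpp
#include <cstddef>
#include <vector>

// One pixel candidate produced by rasterizing a primitive.
struct Fragment {
    int x;
    int y;
    double depth;     // depth after perspective division, in [-1, 1]
    unsigned color;   // packed color value
};

class ZBuffer {
public:
    // Both buffers have the same dimensions; the depth buffer is
    // initialized with the depth of the far-plane (1.0).
    ZBuffer(int w, int h)
        : width(w), height(h),
          color(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), 0u),
          depth(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), 1.0) {}

    // Writes the fragment only if it is closer than the stored depth.
    // Returns true if the buffers were changed.
    bool write(const Fragment& f) {
        if (f.x < 0 || f.x >= width || f.y < 0 || f.y >= height) return false;
        const std::size_t i = index(f.x, f.y);
        if (f.depth < depth[i]) {
            color[i] = f.color;
            depth[i] = f.depth;
            return true;
        }
        return false;
    }

    unsigned colorAt(int x, int y) const { return color[index(x, y)]; }
    double depthAt(int x, int y) const { return depth[index(x, y)]; }

private:
    std::size_t index(int x, int y) const {
        return static_cast<std::size_t>(y) * static_cast<std::size_t>(width)
             + static_cast<std::size_t>(x);
    }

    int width;
    int height;
    std::vector<unsigned> color;
    std::vector<double> depth;
};
```

With this sketch, the final image does not depend on the order in which fragments with different depth values are written: a fragment with depth $0.8$ that arrives after one with depth $0.5$ at the same position is discarded.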

Z-Fighting

Figure: depth resolution along the $z$-axis
• The depth buffer has only limited accuracy; typically, it stores an integer value with 16, 24, or 32 bits of precision
• The interval [-1.0; 1.0] is mapped to [0.0; 1.0] and then to [0, MAX_INT], e.g., [0, 65535] for 16 bits
• The value is rounded to the nearest integer
• Because the "Normalized device coordinates" have already been divided by $p_w$, the rounding errors for objects close to the camera are smaller (and consequently their depth accuracy is higher)
• Therefore, so-called "Z-Fighting" can sometimes be observed for distant primitives that are close together: due to rounding inaccuracies in the z-values, sometimes one primitive and sometimes the other is shown
• To reduce Z-fighting, it is important to choose the near- and far-plane with care, since they define the z-range over which the available integer depth values are spread
• Therefore, the near- and far-plane should be placed as close together as possible, such that they just enclose the depicted 3D scene
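
The loss of depth resolution for distant points can be illustrated with a small sketch that follows the mapping steps listed above. The function name `quantizedDepth` is hypothetical, and the rounding of an actual depth buffer may differ in detail.

```cpp
#include <cmath>
#include <cstdint>

// Integer depth value for a point at distance d in front of the camera
// (zn <= d <= zf), for a depth buffer with the given number of bits (<= 32).
std::uint32_t quantizedDepth(double d, double zn, double zf, int bits) {
    const double alpha = (zf + zn) / (zn - zf);
    const double beta  = 2.0 * zf * zn / (zn - zf);
    const double ndc   = -alpha + beta / d;              // depth in [-1, 1] for p_z = -d
    const double unit  = 0.5 * ndc + 0.5;                // mapped to [0, 1]
    const double maxInt = std::ldexp(1.0, bits) - 1.0;   // 2^bits - 1
    return static_cast<std::uint32_t>(std::llround(unit * maxInt));
}
```

With a near-plane at 0.1, a far-plane at 50, and 16 bits, the distances 40.0 and 40.01 are mapped to the same integer depth value, so two surfaces at these distances cannot be separated reliably. Close to the camera, the distances 0.2 and 0.2001 still receive different values.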

Example: Z-Fighting

class Renderer {

public:
float t;
int width, height;
double nearPlane, farPlane;
int depthBits;

public:
Renderer() : t(0.0), width(0), height(0), nearPlane(2.0), farPlane(20.0), depthBits(0) {}

public:

void resize(int w, int h) {
glViewport(0, 0, w, h);
width = w;
height = h;
}

void display() {
glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

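glMatrixMode(GL_PROJECTION);
glLoadIdentity(); // reset first, because gluPerspective multiplies onto the current matrix
gluPerspective (30.0, (float)width/(float)height, nearPlane, farPlane);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();

// camera orbits in the y=10 plane
// and looks at origin
double rad = M_PI / 180.0f * t;
gluLookAt(10.0*cos(rad), 10.0, 10.0*sin(rad), // eye (orbit radius 10.0 is an assumed value)
0.0, 0.0, 0.0, // look at
0.0, 1.0, 0.0); // up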

//draw cube at origin
drawCube();
}

void init() {
glEnable(GL_DEPTH_TEST);
glGetIntegerv (GL_DEPTH_BITS, &depthBits);
}
private:
void drawCube() {
...
}
};


