# 3D projection

3D projection is any method of mapping three-dimensional points to a two-dimensional plane. As most current methods for displaying graphical data are based on planar two-dimensional media, the use of this type of projection is widespread, especially in computer graphics, engineering and drafting.

## Orthographic projection

Orthographic projections are a small set of transforms often used to show the profile, detail, or precise measurements of a three-dimensional object. Common names for orthographic projections include plan, cross-section, bird's-eye, and elevation.

If the normal of the viewing plane (the camera direction) is parallel to one of the 3D axes, the mathematical transformation is as follows. To project the 3D point $a_x, a_y, a_z$ onto the 2D point $b_x, b_y$ using an orthographic projection parallel to the y axis (profile view), the following equations can be used:

$b_x = s_x a_x + c_x$
$b_y = s_z a_z + c_z$

where the vector $\mathbf{s}$ is an arbitrary scale factor and $\mathbf{c}$ is an arbitrary offset. These constants are optional and can be used to align the viewport. The projection can also be written in matrix notation (introducing a temporary vector $\mathbf{d}$ for clarity):

$\begin{bmatrix} {d_x } \\ {d_y } \\ \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix}\begin{bmatrix} {a_x } \\ {a_y } \\ {a_z } \\ \end{bmatrix}$
$\begin{bmatrix} {b_x } \\ {b_y } \\ \end{bmatrix} = \begin{bmatrix} {s_x } & 0 \\ 0 & {s_z } \\ \end{bmatrix}\begin{bmatrix} {d_x } \\ {d_y } \\ \end{bmatrix} + \begin{bmatrix} {c_x } \\ {c_z } \\ \end{bmatrix}.$
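The two matrix steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `orthographic_xz` and its default arguments are assumptions made for this example.

```python
def orthographic_xz(a, s=(1.0, 1.0), c=(0.0, 0.0)):
    """Orthographic projection along the y axis (hypothetical helper).

    a is the 3D point (a_x, a_y, a_z), s is the scale (s_x, s_z)
    and c is the offset (c_x, c_z).
    """
    ax, ay, az = a
    # Step 1: d = [[1, 0, 0], [0, 0, 1]] a keeps a_x and a_z and drops a_y.
    dx, dy = ax, az
    # Step 2: scale each component, then add the offset.
    bx = s[0] * dx + c[0]
    by = s[1] * dy + c[1]
    return (bx, by)


print(orthographic_xz((3.0, 7.0, 5.0)))                                # (3.0, 5.0)
print(orthographic_xz((3.0, 7.0, 5.0), s=(2.0, 2.0), c=(1.0, -1.0)))   # (7.0, 9.0)
```

Note that the result does not depend on $a_y$ at all: points at any depth along the viewing direction land on the same 2D position.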

While orthographically projected images represent the three-dimensional nature of the object projected, they do not represent the object as it would be recorded photographically or perceived by a viewer observing it directly. In particular, parallel lengths at all points in an orthographically projected image are drawn at the same scale regardless of whether they are near to or far from the virtual viewer. As a result, lengths are not foreshortened with distance as they would be in a perspective projection.

## Perspective projection

The perspective projection requires a more involved definition. A conceptual aid to understanding the mechanics of this projection is to treat the 2D projection as being viewed through a camera viewfinder. The camera's position, orientation, and field of view control the behavior of the projection transformation. The following variables are defined to describe this transformation:

• $\mathbf{a}_{x,y,z}$ - the point in 3D space that is to be projected.
• $\mathbf{c}_{x,y,z}$ - the location of the camera.
• $\mathbf{\theta}_{x,y,z}$ - the rotation of the camera. When $\mathbf{c}_{x,y,z} = \langle 0,0,0 \rangle$ and $\mathbf{\theta}_{x,y,z} = \langle 0,0,0 \rangle$, the 3D vector $\langle 1,2,0 \rangle$ is projected to the 2D vector $\langle 1,2 \rangle$.
• $\mathbf{e}_{x,y,z}$ - the viewer's position relative to the display surface. [1]

These yield:

• $\mathbf{b}_{x,y}$ - the 2D projection of $\mathbf{a}$.

First, we define a point $\mathbf{d}_{x,y,z}$ as a translation of point $\mathbf{a}$ into a coordinate system defined by $\mathbf{c}$. This is achieved by subtracting $\mathbf{c}$ from $\mathbf{a}$ and then applying a vector rotation matrix using $-\mathbf{\theta}$ to the result. This transformation is often called a camera transform (note that these calculations assume a left-handed system of axes): [2] [3]

$\begin{bmatrix} \mathbf{d}_x \\ \mathbf{d}_y \\ \mathbf{d}_z \\ \end{bmatrix}=\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(-\mathbf{\theta}_x) & \sin(-\mathbf{\theta}_x) \\ 0 & -\sin(-\mathbf{\theta}_x) & \cos(-\mathbf{\theta}_x) \\ \end{bmatrix}\begin{bmatrix} \cos(-\mathbf{\theta}_y) & 0 & -\sin(-\mathbf{\theta}_y) \\ 0 & 1 & 0 \\ \sin(-\mathbf{\theta}_y) & 0 & \cos(-\mathbf{\theta}_y) \\ \end{bmatrix}\begin{bmatrix} \cos(-\mathbf{\theta}_z) & \sin(-\mathbf{\theta}_z) & 0 \\ -\sin(-\mathbf{\theta}_z) & \cos(-\mathbf{\theta}_z) & 0 \\ 0 & 0 & 1 \\ \end{bmatrix}\left( \begin{bmatrix} \mathbf{a}_x \\ \mathbf{a}_y \\ \mathbf{a}_z \\ \end{bmatrix} - \begin{bmatrix} \mathbf{c}_x \\ \mathbf{c}_y \\ \mathbf{c}_z \\ \end{bmatrix} \right)$

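Or, for those less comfortable with matrix multiplication, the transform can be written out component by component. Note that the signs of the angles in this form are inconsistent with the matrix form: these expressions are what the matrix product gives when each $-\theta$ is replaced by $\theta$.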

$\begin{array}{lcl} d_x &= &\cos \theta_y\cdot(\sin \theta_z\cdot(a_y-c_y)+\cos \theta_z\cdot(a_x-c_x))-\sin \theta_y\cdot(a_z-c_z) \\ d_y &= &\sin \theta_x\cdot(\cos \theta_y\cdot(a_z-c_z)+\sin \theta_y\cdot(\sin \theta_z\cdot(a_y-c_y)+\cos \theta_z\cdot(a_x-c_x)))+\cos \theta_x\cdot(\cos \theta_z\cdot(a_y-c_y)-\sin \theta_z\cdot(a_x-c_x)) \\ d_z &= &\cos \theta_x\cdot(\cos \theta_y\cdot(a_z-c_z)+\sin \theta_y\cdot(\sin \theta_z\cdot(a_y-c_y)+\cos \theta_z\cdot(a_x-c_x)))-\sin \theta_x\cdot(\cos \theta_z\cdot(a_y-c_y)-\sin \theta_z\cdot(a_x-c_x)) \\ \end{array}$
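The camera transform can be sketched in Python to make the sign discrepancy between the two forms concrete. This is an illustrative sketch only; the function names `camera_transform` and `camera_transform_expanded` and the helper `mat_vec` are assumptions made for this example. The first function follows the matrix form (angles negated inside the matrices), the second follows the written-out expressions literally.

```python
import math


def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def camera_transform(a, c, theta):
    """Matrix form: d = Rx(-theta_x) Ry(-theta_y) Rz(-theta_z) (a - c)."""
    tx, ty, tz = (-t for t in theta)  # the matrices take the negated angles
    v = [a[i] - c[i] for i in range(3)]
    rz = [[math.cos(tz), math.sin(tz), 0.0],
          [-math.sin(tz), math.cos(tz), 0.0],
          [0.0, 0.0, 1.0]]
    ry = [[math.cos(ty), 0.0, -math.sin(ty)],
          [0.0, 1.0, 0.0],
          [math.sin(ty), 0.0, math.cos(ty)]]
    rx = [[1.0, 0.0, 0.0],
          [0.0, math.cos(tx), math.sin(tx)],
          [0.0, -math.sin(tx), math.cos(tx)]]
    # The rightmost matrix (rz) is applied first.
    return mat_vec(rx, mat_vec(ry, mat_vec(rz, v)))


def camera_transform_expanded(a, c, theta):
    """Written-out form, transcribed term by term from the expressions above."""
    x, y, z = (a[i] - c[i] for i in range(3))
    sx, sy, sz = (math.sin(t) for t in theta)
    cx, cy, cz = (math.cos(t) for t in theta)
    dx = cy * (sz * y + cz * x) - sy * z
    dy = sx * (cy * z + sy * (sz * y + cz * x)) + cx * (cz * y - sz * x)
    dz = cx * (cy * z + sy * (sz * y + cz * x)) - sx * (cz * y - sz * x)
    return [dx, dy, dz]


a, c, theta = (1.0, 2.0, 3.0), (0.5, -1.0, 0.25), (0.3, -0.5, 0.8)
negated = tuple(-t for t in theta)
# The two forms agree once the angle signs are flipped in one of them.
print(camera_transform(a, c, theta))
print(camera_transform_expanded(a, c, negated))
```

With all angles zero, both functions reduce to the plain translation $\mathbf{a} - \mathbf{c}$.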

This transformed point can then be projected onto the 2D plane using the following formula (here, x/y is used as the projection plane; the literature may also use x/z): [4]

$\begin{array}{lcl} \mathbf{b}_x &= &(\mathbf{d}_x - \mathbf{e}_x) (\mathbf{e}_z / \mathbf{d}_z) \\ \mathbf{b}_y &= &(\mathbf{d}_y - \mathbf{e}_y) (\mathbf{e}_z / \mathbf{d}_z) \\ \end{array}$

Or, in matrix form using homogeneous coordinates:

$\begin{bmatrix} \mathbf{f}_x \\ \mathbf{f}_y \\ \mathbf{f}_z \\ \mathbf{f}_w \\ \end{bmatrix}=\begin{bmatrix} 1 & 0 & 0 & -\mathbf{e}_x \\ 0 & 1 & 0 & -\mathbf{e}_y \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1/\mathbf{e}_z & 0 \\ \end{bmatrix}\begin{bmatrix} \mathbf{d}_x \\ \mathbf{d}_y \\ \mathbf{d}_z \\ 1 \\ \end{bmatrix}$

and

$\begin{array}{lcl} \mathbf{b}_x &= &\mathbf{f}_x / \mathbf{f}_w \\ \mathbf{b}_y &= &\mathbf{f}_y / \mathbf{f}_w \\ \end{array}$
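As a quick check that the direct formula and the homogeneous form describe the same projection, both can be sketched in Python. The function names `project_direct` and `project_homogeneous` are assumptions made for this example, and the sketch assumes $\mathbf{d}_z \neq 0$ and $\mathbf{e}_z \neq 0$.

```python
def project_direct(d, e):
    """b = (d_xy - e_xy) * (e_z / d_z), for a camera-space point d."""
    scale = e[2] / d[2]  # assumes d_z != 0
    return ((d[0] - e[0]) * scale, (d[1] - e[1]) * scale)


def project_homogeneous(d, e):
    """Same projection via the 4x4 matrix and a divide by f_w."""
    m = [[1.0, 0.0, 0.0, -e[0]],
         [0.0, 1.0, 0.0, -e[1]],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0 / e[2], 0.0]]
    dh = [d[0], d[1], d[2], 1.0]  # homogeneous coordinates of d
    f = [sum(m[i][j] * dh[j] for j in range(4)) for i in range(4)]
    # Perspective divide: f_w equals d_z / e_z.
    return (f[0] / f[3], f[1] / f[3])


d = (2.0, 1.0, 4.0)
e = (0.0, 0.0, 1.0)
print(project_direct(d, e))       # (0.5, 0.25)
print(project_homogeneous(d, e))  # (0.5, 0.25)
```

The homogeneous form defers the division to a single final step, which is why it composes conveniently with other matrix transforms.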

The distance of the viewer from the display surface, $\mathbf{e}_z$, directly relates to the field of view, where $\alpha=2 \cdot \tan^{-1}(1/\mathbf{e}_z)$ is the viewed angle. (Note: this assumes that the points (-1,-1) and (1,1) are mapped to the corners of the viewing surface.)
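The relation between $\mathbf{e}_z$ and the viewed angle can be evaluated in both directions. The following is a small sketch under the same assumption about the corners of the viewing surface; the function names `field_of_view` and `viewer_distance` are hypothetical and chosen for this example.

```python
import math


def field_of_view(e_z):
    """Viewed angle alpha (radians) for viewer distance e_z."""
    return 2.0 * math.atan(1.0 / e_z)


def viewer_distance(alpha):
    """Inverse relation: the e_z that yields viewed angle alpha (radians)."""
    return 1.0 / math.tan(alpha / 2.0)


# A viewer at distance 1 sees an angle of about 90 degrees;
# moving further from the surface narrows the angle.
for e_z in (0.5, 1.0, 2.0):
    print(e_z, math.degrees(field_of_view(e_z)))
```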

Subsequent clipping and scaling operations may be necessary to map the 2D plane onto any particular display medium.

## Diagram

To determine which screen x coordinate corresponds to a point at $A_x, A_z$, multiply the point's x coordinate by the ratio of the eye-to-screen distance to the eye-to-point distance:

$\text{screen x coordinate}(B_x) = \text{model x coordinate}(A_x) \times \frac{\text{distance from eye to screen}(B_z)}{\text{distance from eye to point}(A_z)}$

The same works for the screen y coordinate:

$\text{screen y coordinate}(B_y) = \text{model y coordinate}(A_y) \times \frac{\text{distance from eye to screen}(B_z)}{\text{distance from eye to point}(A_z)}$

(where $A_x$ and $A_y$ are the coordinates occupied by the object before the perspective transform)
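As a worked example with illustrative values (assumed here, not taken from any figure): for a point with model coordinates 2 and 1 at distance 4 from the eye, and a screen at distance 1 from the eye,

$B_x = 2 \times \frac{1}{4} = 0.5, \qquad B_y = 1 \times \frac{1}{4} = 0.25$

Doubling the point's distance to 8 halves both screen coordinates, which is the foreshortening that distinguishes perspective from orthographic projection.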

## References

1. Carlbom, Ingrid; Paciorek, Joseph (December 1978). Planar Geometric Projections and Viewing Transformations. ACM Computing Surveys (CSUR), vol. 10, no. 4. pp. 465–502. doi:10.1145/356744.356750.
2. Riley, K F (2006). Mathematical Methods for Physics and Engineering. Cambridge University Press. pp. 931, 942. doi:10.2277/0521679710. ISBN 0521679710.
3. Goldstein, Herbert (1980). Classical Mechanics (2nd ed.). Reading, Mass.: Addison-Wesley Pub. Co. pp. 146–148. ISBN 0201029189.
4. Sonka, M; Hlavac, V; Boyle, R (1995). Image Processing, Analysis & Machine Vision (2nd ed.). Chapman and Hall. p. 14. ISBN 0412455706.