The concepts presented in this document are based mostly on the theory developed in Criminisi's thesis and related publications, and they rest on a solid mathematical foundation: projective geometry.
In particular, 2D-2D homographic transformations and more general 3D-2D projectivities are used. The algorithms presented in the later sections require no knowledge of the camera's internal parameters (focal length, aspect ratio, principal point) or external ones (position and orientation). Camera calibration is replaced by the use of scene constraints such as planarity of points and parallelism of lines and planes.
In this section we describe the basic equations and mathematical structures for making measurements on the image. They are used in the algorithms presented in section 4.
The camera model we use is the pinhole camera (central projection). Every point in 3D space is projected onto the image plane by a straight visual ray that connects it with the camera center, which is also the center of projection (figure 2).
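The projection can be formulated mathematically with the help of a $3 \times 4$ matrix, called the projection matrix and denoted $P$.
Given the projection matrix $P$, we can map a world point $\mathbf{X}$ to an image point $\mathbf{x}$, both expressed in homogeneous coordinates, using equation (1):

$$\mathbf{x} = P\,\mathbf{X} \qquad (1)$$

The camera model is fully defined once the matrix $P$ is known.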
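As a minimal sketch of this world-to-image mapping, the following Python snippet multiplies a homogeneous world point by a $3 \times 4$ projection matrix and dehomogenizes the result. The function name `project` and the numeric entries of `P` are illustrative assumptions, not values taken from this work.

```python
def project(P, X):
    """Map a homogeneous world point X (4 values) to an image point (x, y)."""
    # x = P X: each homogeneous image coordinate is a row of P dotted with X
    xh = [sum(p * c for p, c in zip(row, X)) for row in P]
    if xh[2] == 0:
        raise ValueError("point projects to infinity")
    # Divide by the third coordinate to obtain inhomogeneous image coordinates
    return (xh[0] / xh[2], xh[1] / xh[2])


# Hypothetical projection matrix, chosen only for illustration
P = [
    [2.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

image_point = project(P, [1.0, 2.0, 4.0, 1.0])
```

Because the coordinates are homogeneous, multiplying `P` by any non-zero scalar gives the same image point.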
In single view reconstruction we use a specialization of the world-to-image mapping described above. Given a known plane in the world, we can map each point on this plane to the corresponding point on the image. Note that the coordinates of the world point are relative to the mapped plane and not to the world coordinate system. This mapping is called a planar homography and is performed using a $3 \times 3$ matrix called the homography matrix, denoted $H$.
This mapping is useful in our case because the scene contains a plane with known attributes: the ground plane. Suppose that the coordinate system is set up so that the vertical direction is the $Z$ axis. In a three-dimensional scene, the ground plane can then be defined as the plane whose points all have the same $Z$ coordinate.
The homography matrix can be computed from the relative positioning of the two planes and the camera center, together with the camera's internal parameters [5, p. 33]. It can also be computed from at least four image-to-world point correspondences [5, p. 48]. Since in most cases of SVR the camera parameters are unknown, and image-to-world correspondences cannot always be acquired (e.g. when trying to reconstruct a painting), these techniques are not easily applicable. Fortunately, the matrix can be computed using another method.
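For reference, the four-correspondence approach can be sketched as follows. This is a simplified illustration rather than the method used in this work: it assumes exactly four correspondences in general position and fixes the bottom-right entry of the matrix to 1, which fails if that entry is actually zero. The helper names `solve_linear` and `homography_from_points` are hypothetical.

```python
def solve_linear(A, b):
    """Solve A h = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: degenerate correspondences")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def homography_from_points(world, image):
    """Estimate H with image ~ H * world from exactly four point pairs."""
    A, b = [], []
    for (X, Y), (x, y) in zip(world, image):
        # From x = (h11 X + h12 Y + h13) / (h31 X + h32 Y + 1), and same for y
        A.append([X, Y, 1, 0, 0, 0, -x * X, -x * Y])
        b.append(x)
        A.append([0, 0, 0, X, Y, 1, -y * X, -y * Y])
        b.append(y)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


# Corners of a unit square on the plane and their (made-up) image positions
world = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
image = [(0.0, 0.0), (2.0 / 3.0, 0.0), (2.0 / 3.0, 2.0 / 3.0), (0.0, 1.0)]
H = homography_from_points(world, image)
```

With more than four correspondences, a least-squares formulation would normally be preferred over this exact solve.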
In section 2.2 the concepts of vanishing points and vanishing lines were introduced. These geometric cues convey a lot of information about the direction of lines and the orientation of planes. Given the vanishing line of the ground plane for a scene, as well as the vertical vanishing point, we can obtain an up-to-scale version of the homography matrix [8].
Once the matrix $H$ is known, we can map any ground point on the image back to the real world by computing the inverse matrix $H^{-1}$ and applying equation (4):

$$\mathbf{X} = H^{-1}\,\mathbf{x} \qquad (4)$$
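The back-projection of an image point onto the ground plane can be sketched in a few lines of Python. The homography entries below are invented for illustration, and the function names `invert_3x3` and `image_to_ground` are assumptions introduced here.

```python
def invert_3x3(H):
    """Invert a 3x3 matrix using the adjugate divided by the determinant."""
    (a, b, c), (d, e, f), (g, h, i) = H
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("H is singular and cannot be inverted")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def image_to_ground(H, x, y):
    """Map image point (x, y) to plane coordinates (X, Y) via the inverse of H."""
    Hinv = invert_3x3(H)
    # Homogeneous plane point: Hinv times (x, y, 1)
    Xh = [row[0] * x + row[1] * y + row[2] for row in Hinv]
    if Xh[2] == 0:
        raise ValueError("image point lies on the vanishing line of the plane")
    return (Xh[0] / Xh[2], Xh[1] / Xh[2])


# Hypothetical plane-to-image homography, chosen only for illustration
H = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]]
ground_point = image_to_ground(H, 2.0 / 3.0, 2.0 / 3.0)
```

An image point whose third homogeneous coordinate becomes zero lies on the vanishing line of the plane and has no finite ground position, which is why the sketch rejects it.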
In this work we make use of the cross-ratio invariant for lines.
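Given four points on a line, their cross-ratio is preserved under projective transformation, though the ratio of distances is not preserved. For four collinear points $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4$, the cross-ratio is defined by

$$\mathrm{Cr}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4) = \frac{|\mathbf{x}_1\mathbf{x}_2|\,|\mathbf{x}_3\mathbf{x}_4|}{|\mathbf{x}_1\mathbf{x}_3|\,|\mathbf{x}_2\mathbf{x}_4|}$$

where $|\mathbf{x}_i\mathbf{x}_j|$ denotes the signed distance between $\mathbf{x}_i$ and $\mathbf{x}_j$ along the line.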
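The invariance can be checked numerically with a small sketch. It assumes that the four points are described by signed 1D coordinates along their line and that the cross-ratio is taken in the ordering $(p_1 - p_2)(p_3 - p_4) / ((p_1 - p_3)(p_2 - p_4))$; other orderings appear in the literature. The coefficients of the projective map are arbitrary illustrative values.

```python
def cross_ratio(p1, p2, p3, p4):
    """Cross-ratio of four collinear points given by signed 1D coordinates."""
    return ((p1 - p2) * (p3 - p4)) / ((p1 - p3) * (p2 - p4))


def projective_1d(t, a, b, c, d):
    """1D projective map t -> (a t + b) / (c t + d), requiring a d - b c != 0."""
    return (a * t + b) / (c * t + d)


# Four points on a line, and their images under an arbitrary projective map
points = [0.0, 1.0, 2.0, 4.0]
mapped = [projective_1d(t, 2.0, 1.0, 0.5, 3.0) for t in points]

# The cross-ratio agrees before and after the map (up to rounding)
cr_before = cross_ratio(*points)
cr_after = cross_ratio(*mapped)

# A plain ratio of two segment lengths, by contrast, changes under the map
ratio_before = (points[1] - points[0]) / (points[2] - points[1])
ratio_after = (mapped[1] - mapped[0]) / (mapped[2] - mapped[1])
```

In this example both cross-ratios evaluate to approximately 1/3, while the simple length ratio is 1 before the map and roughly 4/3 after it.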
The two points and correspond to points in the real scene and are of the form
The length of the segment is