jSVR is a Java implementation of the single view reconstruction technique.
Its main functionality is the reconstruction of a 3D scene from an image, provided there is enough perspective information in it.
All the algorithms described previously are implemented in jSVR.
In this section the implementation details and the usage of the application are described.
External packages used in the process are presented, as well as a brief description of the classes and the way
the source code is organized. Finally there is a step by step demonstration of reconstructing a scene using the program.
jSVR was developed in a research-test-integrate fashion. In the first stages there was no intention of integrating all the algorithms into a complete program.
The techniques described in various publications were implemented, tested and finally integrated with the previous results.
This code base was then used for the next stage of the process. This is one of the reasons why, in the final result, more actions are available
to the user than are needed just for the reconstruction.
The development platform chosen is Java2 (www.java.sun.com). Java is a versatile language with very good object oriented features.
It is excellent for building small modules and then integrating them into a central application. Java is also platform independent and has a vast code base available as open source,
which is helpful when ``reinventing the wheel'' is not the purpose of the project.
The implementation uses the following external packages:
- minpack
- This is the minimization package rewritten in Java
[22]. It is a translation of the original FORTRAN package.
Minpack contains an implementation of the Levenberg-Marquardt algorithm
that was used in the ML estimation of a straight edge from a set of pixels.
- Java3d
- This package is part of the Java-media APIs developed by SUN.
It is used for displaying the results of the reconstruction in an interactive way [18,23].
- CyberVRML97
- This library provides an API to convert Java3d models to VRML files [11].
- canny, convolution, matrix
- The canny edge detector, the supporting
Convolution class and the Matrix class are modified versions from various open source archives.
The whole project was developed using the Eclipse IDE [19].
The source code is separated into three main packages:
- svr.gui
- This package, as the name implies, holds all the graphical user interface classes.
- SVRui
class is the main class of the application and is also the main frame (window) of the user interface.
It allows the user to execute any of the implemented algorithms, as well as to define their execution
variables, as shown below in the application demonstration section.
- ImagePane
class is the area where the image transformations are shown before the reconstruction phase.
- ReconstructionUI.
This class is the user interface for the scene reconstruction.
This UI implements the functionality needed for the interactive
definition of lines and planes on the image. It also allows the
modification of the scale factors calculated in the previous phases.
- ImDeconstructionPane
is the part of the ReconstructionUI where the image is displayed.
- svr.gui.j3d.
This sub-package of svr.gui holds the classes that generate and display the 3d model of the scene in Java3d.
- SVR3Dview. This class implements the window where the model is displayed.
It also contains the methods that create the models in the scene from the data produced
in the reconstruction process.
- Ground. This is a utility class for generating the ground plane of the model in Java3d.
- svr.logic
- All algorithm implementations are contained in this package. The most important classes of this package are:
- VanishingPointCalculator.
This class has the implementations (as static methods) of the algorithms
needed for determining the vanishing points.
- Reconstruction. The equations for planar measurements are in this class as static methods.
- LmderWrapperBestLine. This class implements the ML estimation
discussed in section 4.1.2 using the minpack implementation of lmder.
- svr.util
- Utility classes used by the classes in svr.logic and the svr.gui packages are stored here.
- ImageReconstruction. Everything relevant to the reconstruction of an
image is stored in an instance of ImageReconstruction. It also holds the user defined lines and planes
with their textures. It is used for saving to (and loading from) the `.svr' files, which are serializations of
this class.
- Plane. This is the superclass of all the classes named `XPlane'. They are used for storing
the various plane types.
- Matrix. This class was originally written by Mick Flanagan and is used with very small modifications.
- MatrixAlgebra.
All methods for matrix manipulation are contained in the MatrixAlgebra class in the svr.util.math package.
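The `.svr' save/load mechanism mentioned for ImageReconstruction can be sketched with standard Java serialization; the state class and its single field below are illustrative stand-ins, not the real ImageReconstruction:

```java
import java.io.*;

// Minimal sketch of the `.svr' persistence: the whole reconstruction
// state is one Serializable object written to disk. The field shown is
// illustrative; the real class also holds lines, planes and textures.
public class SvrFile {
    public static class ImageReconstructionState implements Serializable {
        private static final long serialVersionUID = 1L;
        public String imagePath;   // illustrative field
    }

    public static void save(ImageReconstructionState s, File f) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(f))) {
            out.writeObject(s);
        }
    }

    public static ImageReconstructionState load(File f)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(f))) {
            return (ImageReconstructionState) in.readObject();
        }
    }
}
```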
We will now go through the reconstruction process, starting from a single image and finally producing a 3D model. The steps are divided into subgroups for better presentation.
At the end of this section some more examples of 3D reconstruction are presented. The image we will reconstruct (shown in figure 4) is a photograph of a building
from the ruins of the palace of Knossos in Heraklion, Crete.
All photographs from the palace were shot with a common digital camera, at a resolution of 1024×768 pixels.
Figure 4:
Photograph from the ruins of Knossos palace.
This is done in three phases. First we detect sets of straight edges in the scene. Secondly we try to identify the vanishing points. Finally the user enters characteristic lengths and heights in the scene.
Figure 5:
The SVR application user interface.
(a) Main SVR window. All actions are initiated from the buttons in this window.
The red buttons control basic application functions (i.e. load, save, exit, etc.).
The green buttons allow access to algorithms used in the early stages of development and
are no longer used in the process of reconstruction.
The black buttons are for initiating each step of the process.
The orange check boxes open windows with the results of the executed algorithms,
as well as the main image window and the console.
The text areas at the bottom of the window allow the user to change the thresholds of the algorithms.
(b) Main image window. The results of each stage of the process are displayed here.
(c) Console window. It is opened by the ``show SVR console'' check box.
All the output from the executed algorithms is displayed here (i.e. execution time, results, reconstruction information, etc.).
- We start the application and load the image through the user interface (figure 5).
- The ``Canny Edge Detection'' button on the SVR main window executes the smoothing7
algorithm and then sends the output to the canny edge detector, which also breaks the detected edges in the image at points of high curvature.
The result of this process is shown in figure 6. The edge pixels found are colored to distinguish the edges.
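As an illustration of the smoothing step, here is a minimal sketch of Gaussian convolution; the kernel radius, sigma and border handling are assumptions, and the application's Convolution class may differ:

```java
// Sketch of the Gaussian smoothing that precedes edge detection.
public class GaussianSmooth {
    // Build a normalized 1-D Gaussian kernel of the given radius and sigma.
    public static double[] kernel(int radius, double sigma) {
        double[] k = new double[2 * radius + 1];
        double sum = 0;
        for (int i = -radius; i <= radius; i++) {
            k[i + radius] = Math.exp(-(i * i) / (2 * sigma * sigma));
            sum += k[i + radius];
        }
        for (int i = 0; i < k.length; i++) k[i] /= sum;  // normalize to sum 1
        return k;
    }

    // Convolve one row of gray values with the kernel (clamped borders).
    public static double[] smoothRow(double[] row, double[] k) {
        int r = k.length / 2;
        double[] out = new double[row.length];
        for (int x = 0; x < row.length; x++) {
            double acc = 0;
            for (int i = -r; i <= r; i++) {
                int xi = Math.min(Math.max(x + i, 0), row.length - 1);
                acc += row[xi] * k[i + r];
            }
            out[x] = acc;
        }
        return out;
    }
}
```

A 2-D Gaussian is separable, so smoothing an image amounts to applying this 1-D pass along rows and then along columns.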
Figure 6:
Result of canny edge detection. The adjacent edges are merged if they are closer than a threshold.
In the picture, adjacent pixels with the same color belong to the same detected edge.
- The next step is to calculate the straight edges from the above pixel sets.
The button labeled ``Line MLE with lmder'' starts the execution of the MLE algorithm (described in the previous section)
and fits the best line to each pixel set (figure 7).
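The application fits each line with the iterative Levenberg-Marquardt MLE via minpack; for the special case of isotropic pixel noise, the maximum likelihood line is the orthogonal (total) least squares fit, which has a closed form, sketched here for illustration:

```java
public class LineFit {
    // Fit a line n_x*x + n_y*y = d to a pixel set, minimizing orthogonal
    // distances. The normal is the eigenvector of the smallest eigenvalue
    // of the 2x2 scatter matrix, obtained here in closed form.
    public static double[] fit(double[] xs, double[] ys) {
        int n = xs.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) { mx += xs[i]; my += ys[i]; }
        mx /= n; my /= n;
        double sxx = 0, sxy = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = xs[i] - mx, dy = ys[i] - my;
            sxx += dx * dx; sxy += dx * dy; syy += dy * dy;
        }
        // Principal direction of the scatter matrix [[sxx,sxy],[sxy,syy]];
        // the line's normal is perpendicular to it.
        double theta = 0.5 * Math.atan2(2 * sxy, sxx - syy);
        double nx = -Math.sin(theta), ny = Math.cos(theta);
        return new double[] { nx, ny, nx * mx + ny * my }; // (n_x, n_y, d)
    }
}
```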
Figure 7:
Result of MLE line fitting on the detected edges.
In order to find the correct three vanishing points we use the algorithms described in section
4.2.
This step is not automatic.
When the user presses the ``Calculate Vanishing points'' button, the
most supported vanishing point is found.
Then the user has to decide if the proposed point is correct, as shown in figure
8.
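A minimal sketch of the geometry behind this step, assuming lines and points are represented as homogeneous 3-vectors: two image lines intersect at the cross product of their line vectors, so the images of two parallel scene lines yield a candidate vanishing point:

```java
public class Homogeneous {
    // Cross product of two homogeneous 3-vectors: gives the line through
    // two points, or the intersection point of two lines.
    public static double[] cross(double[] a, double[] b) {
        return new double[] {
            a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]
        };
    }

    // Intersection of two image lines, each given by two points (x, y).
    // Parallel scene lines imaged under perspective meet at the vanishing point.
    public static double[] vanishingPoint(double[] p1, double[] p2,
                                          double[] q1, double[] q2) {
        double[] l1 = cross(new double[]{p1[0], p1[1], 1}, new double[]{p2[0], p2[1], 1});
        double[] l2 = cross(new double[]{q1[0], q1[1], 1}, new double[]{q2[0], q2[1], 1});
        double[] v = cross(l1, l2);
        return new double[] { v[0] / v[2], v[1] / v[2] }; // may lie far outside the image
    }
}
```

The ground vanishing line can be computed the same way, as the cross product of the two homogeneous vanishing points that lie on it.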
Figure 8:
Selecting the correct VPs.
The supporting lines for the detected point are shown on the image.
The intersection point may lie outside the image (as is the case in our example).
The coordinates of the point are shown in the choice window on the bottom right of the image.
The user can either accept the point (by clicking Yes),
discard it (No), or discard the point and
remove the supporting lines from the set of straight edges (by clicking Ignore).
In the latter case, the lines will not affect the computation of the next VP.
Finally, the user can stop the selection process by clicking Stop (i.e. if all three VPs are detected).
After all three VPs are selected, the user must also define which two lie on the vanishing line (figure 9).
Figure 9:
Identifying the vanishing line.
Two of the three VPs lie on the ground vanishing line.
In our example these are the first and the second (selection 1-2).
The result of the above process is shown in figure 10.
Figure 10:
Vanishing line. (a) The three sets of parallel lines that give the three vanishing points.
A small number of lines is enough to give a correct reconstruction,
but identifying the correct parallel lines in a big set can be very difficult.
This is why human input is necessary.
(b) The ground vanishing line is shown in red.
The other two vanishing lines lie outside of the image.
The three green lines are the perpendiculars from each vanishing point to the corresponding vanishing line.
The application now has enough information to calculate an up-to-scale homography matrix.
In order to produce a 3D model of the scene with logical proportions,
as well as to make measurements on the image, the user must supply the program with a reference width and height.
The buttons ``Set reference height'', ``Set Reference Length on X axis'' and ``Set reference Length on Y axis'' allow this operation. All three buttons work in a similar way.
When clicked, the program expects the user to select a point on the image by left clicking on it8.
The second point is restricted to the corresponding direction (vertical, X, or Y) for each button.
When the second point is clicked, a dialog asks the user to enter the length that corresponds to the distance between the two points in the world (figure 11).
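A sketch of how an up-to-scale homography can be used once it is known, assuming H maps ground-plane coordinates to image coordinates: the inverse mapping recovers world coordinates, and the unknown overall scale of H cancels in the final division.

```java
// Applying the (up to scale) plane-to-image homography: world ground-plane
// coordinates are recovered from image coordinates via the inverse mapping.
public class HomographyMap {
    public static double[] imageToGround(double[][] H, double x, double y) {
        double[][] A = invert3x3(H);
        double X = A[0][0] * x + A[0][1] * y + A[0][2];
        double Y = A[1][0] * x + A[1][1] * y + A[1][2];
        double W = A[2][0] * x + A[2][1] * y + A[2][2];
        return new double[] { X / W, Y / W };  // the scale of H cancels here
    }

    // Direct 3x3 inverse via the adjugate matrix.
    static double[][] invert3x3(double[][] m) {
        double a = m[0][0], b = m[0][1], c = m[0][2],
               d = m[1][0], e = m[1][1], f = m[1][2],
               g = m[2][0], h = m[2][1], i = m[2][2];
        double det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
        return new double[][] {
            {(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det},
            {(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det},
            {(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det}
        };
    }
}
```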
Figure 11:
Setting reference height
For example, if we select the ``Set reference height'' button we perform the following steps:
- click on a point that is the image of a ground point in the scene.
- click on a point that is at a known height from the ground in the scene.
- enter the value of that height in the dialog box.
Now that we have enough information,
we can make various measurements in the image, using the theory described in section 4.2.
As shown in figure 12(a and b),
this can be done using the buttons ``Find relevant Length'' and ``Find relevant Height''.
The user may also choose to continue the reconstruction process by clicking the ``Start Reconstruction''
button (figure 12c). The next section describes how the user interface of this phase works.
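The height measurements rest on a cross ratio with the vertical vanishing point. The sketch below handles only the special case where the base point, the reference top, the unknown top and the vanishing point all lie on the same image line; the general case, which the application covers with its measurement equations, additionally transfers points between vertical lines via the vanishing line.

```java
public class HeightFromImage {
    static double d(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    // Height of the point imaged at t, given: the vertical vanishing point v,
    // a ground (base) point b, and a reference point r at known height zRef,
    // where b, r, t and v all lie on the same image line. Heights map to this
    // line projectively (height 0 at b, infinity at v), so the cross ratio of
    // image distances equals the ratio of world heights.
    public static double height(double[] b, double[] r, double[] t,
                                double[] v, double zRef) {
        return zRef * (d(b, t) * d(v, r)) / (d(b, r) * d(v, t));
    }
}
```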
Figure 12:
Measurements on the image. (a) Find relevant height.
The user selects two points on the image.
The first should be on the ground.
Equation 6 is used to calculate the distance.
Various length measurements are shown in the image.
(b) Find relevant length.
It works similarly to (a) and uses equations 7 and 8.
(c) Start reconstruction. This button opens the ``Reconstruction Interface'' window. Its usage is demonstrated in section 5.4.2.
User assisted reconstruction
The second phase of the reconstruction begins by clicking the ``Start Reconstruction'' button.
This opens the ``Reconstruction Interface'' window which allows the user to define planes on the image.
As described earlier in this document, this process requires a rather complicated interface.
During development it became clear that, in order to keep the application as user friendly as
possible, the controls for this phase should be in a separate window (figure 13).
Note that despite the fact that the next steps are done through the "Reconstruction Interface" window,
the buttons in the main SVR window are still functional. The user can save his/her work at any time.
Figure 13:
Reconstruction user interface.
The action buttons allow the user to define primitive objects on the image.
The ground mode button sets the model either indoors or outdoors.
The text areas in the bottom right of the window allow the user to
change the scale factors calculated in the previous steps.
This is necessary because Java3D has its own scale factors and it
is possible to get a very big or very small model, depending on the reference lengths entered.
The single view reconstruction techniques described so far provide a way of calculating
the metrics of the scene. This allows the mapping of a point on the image plane to a point in the world.
Although this seems, at first, to be enough to make a complete reconstruction of the scene, it is not.
Given a point on the image, without some kind of shape and pattern recognition there is no way of deciding
whether this point belongs to the ground or to an object in the scene (e.g. a person in a photograph).
The main problem at this point is to find an easy way to let the user define the primitive shapes discussed
in section 4.2.3. In order to reconstruct the model with Java3d we need the following:
- at least one plane vertical to the ground.
- planes not vertical to the ground (if any), with at least 3 points on vertical or parallel planes.
- textures for each plane.
The user can provide the above information through the UI of figure 13.
- Lines vertical to the ground can be defined by clicking the ``Vertical Lines'' button.
The first point clicked on the image, must be on the ground level (figure 14(a)).
- Define free lines (``Free Lines'' button).
A free line is a line on a vertical plane (figure 14(b)).
If we know that a line is on a vertical plane, we can map every point of the line
back to the real world using a simple algorithm, shown in table 6.
- Finally the user must define ``Ground lines''.
The constraint on this line category is that they can only begin and end on a ground point of a vertical line (figure 14(c)).
Table 6:
Computing world points from a line and a vertical plane.
Figure 14:
Defining Lines on the image.
(a) The defined vertical lines are shown in red.
(b) The free lines are colored blue.
Note that the set starts and ends at the highest point of a vertical line.
The user interface will not allow the definition of a free line set if it is not connected in this way.
(c) The ground line is shown in green.
The user can abort a line selection by right clicking on the image plane.
The above line sets must now be grouped into planes. For the SVR application there are three categories of planes.
- Vertical planes. By pressing the ``Vertical Planes'' button the user is expected
to select a subset of the previously defined lines on the image which form a closed polygon.
This polygon must consist of:
- one ground line
- two vertical lines, which start at the endpoints of the above ground line.
- a set of free lines connected in a way that the first line starts at the highest point of one vertical line, and the last ends at the highest point of the other vertical line.
- Parallel planes. These planes consist of a vertical line, which defines the distance of the plane from the ground,
and a free line. They were intended to be used as supporting structures, in order to allow the user to define
base planes other than the ground
(i.e. in order to calculate the matrix from equation 9 or to help define free planes).
Since they are only for support, these planes are always set invisible in this version of the application.
- Free planes. In order to define a plane we need three known points. A plane on the image can be defined by
points on other `known' planes (vertical or parallel). The only problem is to define the appropriate polygon on the image,
in order to extract a correct texture from it.
In the SVR application a free plane is defined by at least two edges on vertical planes (an example with a free plane is shown in figure 18).
Figure 15 shows the image with three vertical planes defined.
Figure 15:
All planes defined. The reconstruction process is completed. We have defined three vertical planes on the image.
The wall on the left, the wall on the right and the door.
Outdoors ground mode is selected and a tile on the ground is selected as the texture of the ground plane.
When a plane is defined, SVR extracts the texture of the polygon from the image and stores it,
with all the other information, in a `plane' object. The `SVR3DView' class then uses these objects to create the corresponding Java3D models and render them on the screen.
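Texture extraction for a defined plane can be sketched with the standard java.awt API: clip the source image to the user's polygon and copy the region into a new image. This only crops to the polygon; any texture rectification the application performs is omitted here.

```java
import java.awt.*;
import java.awt.image.BufferedImage;

// Sketch of extracting a plane's texture: copy the image region inside
// the user-defined polygon into a new image (pixels outside the polygon
// stay transparent).
public class TextureExtractor {
    public static BufferedImage extract(BufferedImage src, Polygon poly) {
        Rectangle r = poly.getBounds();
        BufferedImage out = new BufferedImage(r.width, r.height,
                                              BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = out.createGraphics();
        g.translate(-r.x, -r.y);   // polygon coords -> texture coords
        g.setClip(poly);           // only pixels inside the polygon are drawn
        g.drawImage(src, 0, 0, null);
        g.dispose();
        return out;
    }
}
```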
In order to make the scene more realistic we can define a ground texture by clicking the ``Ground Tile''
button and selecting two points on the image. The image part in the square defined by these two points will be used as texture for the ground plane. We can also define the size of the ground plane.
If the scene is ``outdoors'', then clicking on the ground mode button will instruct the `SVR3DView' class
to create a ground plane bigger than the model. If the scene is inside a room, then we can use the ``indoors'' mode for the ground, which will render a floor in the interior of the model.
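The ``outdoors'' ground sizing can be sketched as growing the model's footprint on the ground plane by a margin factor; the factor and the method are illustrative assumptions, not taken from the actual SVR3DView class.

```java
// Sketch of sizing the "outdoors" ground plane: take the model's bounding
// box on the ground (XZ) plane and grow it by a margin factor, so the
// ground extends beyond the reconstructed walls.
public class GroundPlane {
    // Returns {minX, minZ, maxX, maxZ} of the enlarged ground quad.
    public static double[] outdoorsBounds(double[] xs, double[] zs, double factor) {
        double minX = xs[0], maxX = xs[0], minZ = zs[0], maxZ = zs[0];
        for (int i = 1; i < xs.length; i++) {
            minX = Math.min(minX, xs[i]); maxX = Math.max(maxX, xs[i]);
            minZ = Math.min(minZ, zs[i]); maxZ = Math.max(maxZ, zs[i]);
        }
        double mx = (maxX - minX) * (factor - 1) / 2;  // margin on each side in X
        double mz = (maxZ - minZ) * (factor - 1) / 2;  // margin on each side in Z
        return new double[] { minX - mx, minZ - mz, maxX + mx, maxZ + mz };
    }
}
```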
The last step is to create the model. The button ``Create model'' opens the 3D view window which allows the user to navigate inside the scene.
Figure 16:
Views of the reconstructed wall. (a) looking straight through the door.
(b) front view.
(c) looking behind the wall with the door.
Note that the door in the image opens to the inside of the right wall.
The model can be saved in the application format, or exported to VRML using the ``Export scene to VRML97 file'' button at the bottom of the 3D view window.
Figure 17:
Reconstruction of the queen's room in the palace of Knossos.
(a) The original image.
(b) The model viewed from an angle.
Figure 18:
Reconstruction of a shed. This picture is a very good subject for 3D reconstruction. It is used as a sample in many publications about reconstruction from single and multiple views.
The resolution of the picture used for the reconstruction is 700×438 pixels.
(a) The original image.
(b) The plane of the roof of the shed is modeled as a free plane.
(c) A side view of the shed.
Footnotes
- ... smoothing7
- The Gaussian smoothing algorithm can also be executed separately, from the corresponding button.
- ... it8
- The user can cancel the process by right-clicking on the image.