
Three-dimensional models consisting of the geometry and texture of urban surfaces could aid applications such as urban planning, training, disaster simulation, and virtual-heritage conservation. A standard technique for creating large-scale city models automatically or semi-automatically is to apply stereo vision to aerial or satellite imagery. 1 In recent years, advances in resolution and accuracy have also made airborne laser scanners suitable for generating digital surface models (DSM) and 3D models. 2 Although you can perform edge detection more accurately with aerial photos, airborne laser scans require no camera-parameter estimation or feature detection to obtain 3D geometry. Previous work has attempted to reconstruct polygonal models by using a library of predefined building shapes or by combining the DSM with digital ground plans or aerial images. While you can achieve sub-meter resolution with this technique, you can only capture the building roofs and not their facades.

Several research projects have attempted to create models from ground-based views at a high level of detail to enable virtual exploration of city environments. While most of these approaches result in visually pleasing models, they involve an enormous amount of manual work, such as importing the geometry obtained from construction plans, selecting primitive shapes and correspondence points for image-based modeling, or complex data acquisition. Other attempts to acquire close-range data in an automated fashion have used image-based 3 or 3D laser scanner approaches. 4,5 However, these approaches don't scale to more than a few buildings because they acquire data in a slow, stop-and-go fashion.

In previous work, we proposed an automated method that would rapidly acquire 3D geometry and texture data for an entire city. 6,7 This method uses a vehicle equipped with 2D laser scanners and a digital camera to acquire data while driving at normal speeds on public roads. Other researchers proposed a similar system using 2D laser scanners and line cameras. 8 In both systems, the researchers acquire data continuously and quickly. In another article, we presented automated methods to process this type of data efficiently to obtain a detailed model of the building facades in downtown Berkeley. 9 However, these facade models don't provide information about roofs or terrain shape because they consist only of surfaces visible from the ground level.

In this article, we describe an approach to register and merge our detailed facade models with a complementary airborne model. Figure 1 shows the data-flow diagram of our method. The airborne modeling process on the left in Figure 1 provides a half-meter-resolution model with a bird's-eye view of the entire area, containing terrain profile and building tops. The ground-based modeling process on the right results in a detailed model of the building facades. Using the DSM obtained from airborne laser scans, we localize the acquisition vehicle and register the ground-based facades to the airborne model by means of Monte Carlo localization (MCL). We merge the two models with different resolutions to obtain a 3D model.

Textured surface mesh from airborne laser scans

We use the DSM for localizing the ground-based data-acquisition vehicle and for adding roofs and terrain to the ground-based facade models. In contrast to previous approaches, we don't explicitly extract geometric primitives from the DSM. While we use aerial laser scans to create the DSM, it's equally feasible to use a DSM obtained from other sources, such as stereo vision.

Scan point resampling and DSM generation
During airborne laser scan acquisition using a 2D scanner mounted on an airplane, the unpredictable roll and tilt motion of the plane destroyed the inherent row-column order of the scans. Thus, the scans might be interpreted as an unstructured set of 3D vertices in space, with the x, y coordinates specifying the geographical location

    3D Reconstruction and Visualization

Constructing 3D City Models by Merging Aerial and Ground Views

Christian Früh and Avideh Zakhor, University of California, Berkeley

The authors present an approach to automatically creating textured 3D city models using laser scans and images taken from ground level and a bird's-eye perspective.

52 November/December 2003 Published by the IEEE Computer Society 0272-1716/03/$17.00 © 2003 IEEE


and the z coordinate the altitude. To process the scans efficiently, it's helpful to resample the scan points to a row-column structure, even though this step could reduce the spatial resolution. To transfer the scans into a DSM (a regular array of altitude values) we define a row-column grid in the ground plane and sort the scan points into the grid cells. The density of scan points is not uniform, which means there are grid cells with no scan point and others with multiple scan points. Because the percentage of cells without scan points and the resolution of the DSM depend on grid cell size, we must compromise by leaving few cells without a sample while maintaining the resolution at an acceptable level.

In our case, the scans have an accuracy of 30 centimeters in the horizontal and vertical directions and a raw spot spacing of 0.5 meters or less. Measuring both the first and the last pulses of the returning laser light, we select a square cell size of 0.5 × 0.5 meters, resulting in about half the cells being occupied. We create the DSM by assigning to each cell the highest z value among its member points, to preserve overhanging rooftops while suppressing the wall points. We fill the empty cells using nearest-neighbor interpolation to preserve sharp edges. We can interpret each grid cell as a vertex, where the x, y location is the cell center and the z coordinate is the altitude value, or as a pixel at x, y with a gray intensity proportional to z.
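The resampling step above can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name `build_dsm` and the handling of grid extents are assumptions.

```python
import numpy as np

def build_dsm(points, cell=0.5):
    """Rasterize unordered (x, y, z) scan points into a DSM grid.

    Each cell keeps the highest z among its member points (preserving
    overhanging rooftops while suppressing wall points); empty cells
    are then filled by nearest-neighbor interpolation so that sharp
    edges are preserved.
    """
    pts = np.asarray(points, dtype=float)
    x0, y0 = pts[:, 0].min(), pts[:, 1].min()
    cols = int(np.ceil((pts[:, 0].max() - x0) / cell)) + 1
    rows = int(np.ceil((pts[:, 1].max() - y0) / cell)) + 1
    dsm = np.full((rows, cols), -np.inf)
    r = ((pts[:, 1] - y0) / cell).astype(int)
    c = ((pts[:, 0] - x0) / cell).astype(int)
    np.maximum.at(dsm, (r, c), pts[:, 2])      # highest z per cell wins
    # nearest-neighbor fill of empty cells, using only original samples
    filled_r, filled_c = np.nonzero(np.isfinite(dsm))
    for er, ec in zip(*np.nonzero(~np.isfinite(dsm))):
        d2 = (filled_r - er) ** 2 + (filled_c - ec) ** 2
        k = d2.argmin()
        dsm[er, ec] = dsm[filled_r[k], filled_c[k]]
    return dsm
```

The max-z rule is what keeps roof overhangs intact: wall points falling into the same cell as a rooftop point are simply outvoted by the higher sample.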

Processing the DSM
The DSM contains the plain rooftops and terrain shape as well as objects such as cars and trees. Roofs, in particular, look bumpy because of a large number of smaller objects on them, such as ventilation ducts, antennas, and railings, which are impossible to reconstruct properly at the DSM's resolution. Furthermore, scan points below overhanging roofs cause ambiguous altitude values, resulting in jittery edges. To obtain more visually pleasing roof reconstructions, we apply several processing steps.

We first target flattening the bumpy rooftops. We apply a segmentation algorithm to all nonground pixels on the basis of the depth discontinuity between adjacent pixels. We replace small, isolated regions with ground-level altitude to remove objects such as cars or trees in the DSM. We subdivide the larger regions further into planar subregions by means of planar segmentation. Then we unite small regions and subregions with larger neighbors by setting their z values to the larger region's corresponding plane. This procedure helps remove undesired small objects from the roofs and prevents the separation of rooftops into too many cluttered regions. Figure 2a shows the original DSM and Figure 2b the resulting processed DSM.

    IEEE Computer Graphics and Applications 53

1 Data-flow diagram of our 3D city-modeling approach. Airborne modeling steps are highlighted in green, ground-based modeling steps in yellow, and model-fusion steps in white.

2 Processing steps for DSM: (a) DSM obtained from scan-point resampling; (b) DSM after flattening roofs; and (c) segments with Ransac lines in white.


The second processing step straightens the jittery edges. We resegment the DSM into regions, detect the boundary points of each region, and use Ransac 10 to find line segments that approximate the regions. For the consensus computation, we also consider boundary points of surrounding regions, to detect even short linear sides of regions and to align them consistently with surrounding buildings. Furthermore, we allocate bonus consensus if a detected line is parallel or perpendicular to a region's most dominant line. For each region, we obtain a set of boundary line segments that represent the most important edges. For all other boundary parts where we have not found a proper line approximation, we don't change the original DSM. Figure 2c shows the regions resulting from processing Figure 2b, superimposed with the corresponding Ransac lines. Compared with Figure 2b, most edges look straightened out.
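The core of the line-fitting step can be illustrated with a minimal Ransac sketch. The function `ransac_line` and its parameters are hypothetical, and the bonus consensus for parallel or perpendicular lines described above is omitted for brevity.

```python
import math
import random

def ransac_line(points, iters=200, tol=0.5, seed=0):
    """Fit the dominant line to 2D boundary points with Ransac.

    Repeatedly samples two points, forms the line n . p = d with unit
    normal n, and keeps the hypothesis supported by the most points
    within `tol` of the line.  Returns ((nx, ny), d) and the count.
    """
    rng = random.Random(seed)
    best, best_count = None, -1
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        dx, dy = x2 - x1, y2 - y1
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            continue                      # degenerate sample
        nx, ny = -dy / norm, dx / norm    # unit normal of the line
        d = nx * x1 + ny * y1
        count = sum(abs(nx * x + ny * y - d) <= tol for x, y in points)
        if count > best_count:
            best, best_count = ((nx, ny), d), count
    return best, best_count
```

In the article's variant, the consensus for a candidate line would additionally be boosted when its normal is parallel or perpendicular to the region's dominant direction, which is what aligns short sides with the surrounding buildings.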

Textured mesh generation
Airborne models are commonly generated from scans by detecting features such as planar surfaces in the DSM or matching a predefined set of possible rooftop and building shapes. 2 In other words, these models decompose the buildings found in the DSM into polygonal 3D primitives. While the advantage of these model-based approaches is their robust reconstruction of geometry in spite of erroneous scan points and low sample density, they are highly dependent on shape assumptions. In particular, the results are poor if many unconventional buildings are present or if buildings are surrounded by trees, conditions that exist on the UC Berkeley campus. Although the resulting models might appear clean and precise, the reconstructed buildings' geometry is not necessarily correct if the underlying shape assumptions are invalid.

Our application makes an accurate model of the building facades readily available from the ground-based acquisition. As such, we are primarily interested in adding the complementary roof and terrain geometry, so we apply a different strategy to create a model from the airborne view. We transform the cleaned-up DSM directly into a triangular mesh and reduce the number of triangles by simplification. The advantage of this method is that we can control the mesh generation process on a per-pixel level. We exploit this property in the model fusion procedure. Additionally, this method has low processing complexity and is robust. Because this approach requires no predefined building shapes, we can apply it to buildings with unknown shapes, even in the presence of trees. Admittedly, doing so typically comes at the expense of a larger number of triangles.

Because the DSM has a regular topology, we can directly transform it into a structured mesh by connecting each vertex with its neighboring ones. The DSM for a city is large, and the resulting mesh has two triangles per cell, yielding 8 million triangles per square kilometer for the 0.5 × 0.5 meter grid size. Because many vertices are coplanar or have low curvature, we can drastically reduce the number of triangles without significant loss of quality. We use the Qslim mesh simplification algorithm 11 to reduce the number of triangles. Empirically, we have found it possible to reduce the initial surface mesh to about 100,000 triangles per square kilometer at the highest level of detail without noticeable loss in quality.
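The direct triangulation of the regular grid can be sketched as follows; the function name `grid_to_triangles` is illustrative. For a 2,000 × 2,000-cell grid (one square kilometer at 0.5-meter spacing) this yields roughly the 8 million triangles quoted above.

```python
def grid_to_triangles(rows, cols):
    """Triangulate a regular DSM grid.  Vertices are indexed row-major
    (r * cols + c); every grid cell becomes two triangles, so an
    R x C grid yields 2 * (R - 1) * (C - 1) triangles."""
    tris = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            v00 = r * cols + c          # upper-left corner of the cell
            v01, v10 = v00 + 1, v00 + cols
            tris.append((v00, v10, v01))
            tris.append((v01, v10, v10 + 1))
    return tris
```

A simplifier such as Qslim would then collapse edges in the coplanar and low-curvature areas of this dense mesh.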

Using aerial images taken with an uncalibrated camera from an unknown pose, we texture-map the reduced mesh semi-automatically. We manually select a few correspondence points in both the aerial photo and the DSM, taking a few minutes per image. Then, we compute both internal and external camera parameters and texture-map the mesh. A location in the DSM corresponds to a 3D vertex in space, letting us project it into an aerial image if we know the camera parameters. We use an adaptation of Lowe's algorithm to minimize the difference between selected correspondence points and computed projections. After we determine the camera parameters, for each geometry triangle we identify the corresponding texture triangle in an image by projecting the corner vertices. Then, for each mesh triangle, we select the best image for texture mapping by taking into account resolution, normal vector orientation, and occlusions.

Ground-based modeling and model registration
In previous research, we developed a mobile ground-based data acquisition system consisting of two Sick LMS 2D laser scanners and a digital color camera with a wide-angle lens. 6 We perform the data acquisition in a fast drive-by rather than in a stop-and-go fashion. Doing so enables short acquisition times, limited only by traffic conditions. As shown in Figure 3, we mount our acquisition system on a rack on top of a truck, letting us obtain measurements not obstructed by objects such as pedestrians and cars.

Both 2D scanners face the same side of the street; one is mounted horizontally, the other vertically, as shown


    3 Acquisition vehicle.


in Figure 4. We mount the camera toward the scanners, with its line of sight parallel to the intersection between the orthogonal scanning planes. Hardware signals synchronize the laser scanners and the camera. In our measurement setup, we use the vertical scanner to scan building facade geometry as the vehicle moves, and hence it's crucial to accurately determine the vehicle location for each vertical scan.

We developed algorithms to estimate relative vehicle position changes from the horizontal scans, and we estimate the driven path as a concatenation of relative position changes. Because errors in the estimates accumulate, we apply a global correction. Rather than using a GPS sensor, which is not sufficiently reliable in urban canyons, we use an aerial photo as a 2D global reference in conjunction with MCL. We extend the application of MCL to a global map derived from the DSM to determine the vehicle's pose in nonplanar terrain and to register the ground-based facade models with respect to the DSM.

Edge map and DTM
The MCL approach requires us to create two additional maps: an edge map that contains the location of height discontinuities and a digital terrain model (DTM) that contains terrain altitude. We apply a Sobel edge detector to a gray-scale aerial image to find edges in the city for localizing the ground-based data acquisition vehicle. For the DSM, rather than using the Sobel edge detector, we define a discontinuity-detection filter that marks a pixel if at least one of its eight adjacent pixels is more than a preset threshold below it. We are dealing with 3D height maps rather than 2D images; hence, we only mark the outermost pixels of the taller objects, such as building tops, creating a sharper edge map than a Sobel filter. In fact, the resulting map is a global occupancy grid for walls. While for aerial photos, shadows of buildings or trees and perspective shift of building tops cause numerous false edges in the image, neither problem exists for the edge map from airborne laser scans.
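The discontinuity-detection filter reduces to a straightforward eight-neighbor test. The function `edge_map` and its threshold value are illustrative assumptions, not the authors' implementation.

```python
def edge_map(dsm, threshold=2.0):
    """Discontinuity-detection filter for a DSM (a 2D list of altitudes):
    mark a pixel if at least one of its eight neighbors lies more than
    `threshold` below it.  Unlike a Sobel filter on an image, this marks
    only the outermost pixels of taller objects such as building tops."""
    rows, cols = len(dsm), len(dsm[0])
    marked = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and dsm[r][c] - dsm[nr][nc] > threshold):
                        marked[r][c] = True
    return marked
```

Note the asymmetry: a pixel at the base of a wall is never marked, because its neighbors are above it, not below. This is what makes the result an occupancy grid of wall tops rather than a two-pixel-wide Sobel edge.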

The DSM contains building facade locations as height discontinuities and the street altitudes for the streets on which the vehicle is driven. As such, we can assign this altitude to the vehicle's z coordinate. It's not possible to use the z value of a DSM location directly, because the lidar captures cars and overhanging trees during airborne data acquisition, resulting in z values up to several meters above the actual street level for some locations. For a particular DSM location, we estimate street-level altitude by averaging the z coordinates of available ground pixels within a surrounding window, weighting them with an exponential function decreasing with distance. The result is a smooth, dense DTM as an estimate of the ground level near roads. Figures 5a and 5b show the edge map and DTM computed from the DSM shown in Figure 2b.
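A minimal sketch of the windowed, exponentially weighted ground-level estimate follows. The function name, window size, and decay constant are assumptions; the article does not give these parameters.

```python
import math

def street_level(dsm, is_ground, r, c, window=10, scale=5.0):
    """Estimate ground altitude at DSM cell (r, c) by averaging the z
    values of ground-classified pixels in a surrounding window, each
    weighted by exp(-distance / scale).  Returns None where no ground
    pixel is available (the white pixels in the DTM of Figure 5b)."""
    rows, cols = len(dsm), len(dsm[0])
    num = den = 0.0
    for rr in range(max(r - window, 0), min(r + window + 1, rows)):
        for cc in range(max(c - window, 0), min(c + window + 1, cols)):
            if is_ground[rr][cc]:
                w = math.exp(-math.hypot(rr - r, cc - c) / scale)
                num += w * dsm[rr][cc]
                den += w
    return num / den if den > 0.0 else None
```

Because only ground-classified pixels contribute, cars and overhanging trees captured in the DSM cannot pull the estimate above the true street level.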

Model registration with MCL
MCL is a particle-filtering-based implementation of probabilistic Markov localization, and was introduced by another research team for tracking the position of a vehicle in mobile robotics. 12 Given a series of relative motion estimates and corresponding horizontal laser scans, the MCL-based approach we have proposed elsewhere can determine the accurate position within a city. The principle of the correction is to adjust the initial vehicle motion estimates so that scan points from the ground-based data acquisition match the edges in the global edge map. The scan-to-scan matching can only estimate a 3-DOF relative motion, that is, a 2D translation and rotation in the scanner's coordinate system. If the vehicle is on a slope, the motion estimates are given in a plane at an angle with respect to the global x-y plane, and the displacement should in fact be corrected with the cosine of the slope angle.

4 Ground-based acquisition setup.

5 Map generation for MCL: (a) edge map and (b) DTM. For the white pixels, there is no ground-level estimate available.

However, because this effect is small (0.5 percent for a 10-degree slope), we can safely neglect it and use the relative scan-to-scan matching estimates as if the truck's coordinate system were parallel to the global coordinate system. Using MCL with the relative estimates from scan matching and the edge map from the DSM, we can arrive at a series of global pose probability density functions and correction vectors for x, y, and yaw motion. We can then apply these corrections to the initial path to obtain an accurate localization of the acquisition vehicle. Using the DTM, we obtain an estimate of two more DOF: for the first, the final z(i) coordinate of an intermediate pose P(i) in the path is set to the DTM level at the (x(i), y(i)) location. For the second, we compute the pitch angle representing the slope as

pitch(i) = arctan[ (z(i) - z(i-1)) / sqrt( (x(i) - x(i-1))^2 + (y(i) - y(i-1))^2 ) ]

That is, we compute the pitch by using the height difference and the traveled distance between successive positions. Because the resolution of the DSM is only 1 meter and the ground level is obtained via a smoothing process, the estimated pitch contains only the low-frequency components and not highly dynamic pitch changes, such as those caused by pavement holes and bumps. Nevertheless, the obtained pitch is an acceptable estimate, because the size of the truck makes it relatively stable along its long axis.
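The pitch computation reduces to a small helper; `pitch_angle` is an illustrative name.

```python
import math

def pitch_angle(p_prev, p_curr):
    """Pitch between successive path poses p = (x, y, z): arctangent of
    the height difference over the traveled horizontal distance.  This
    is only a low-frequency slope estimate, as discussed above."""
    dx, dy, dz = (a - b for a, b in zip(p_curr, p_prev))
    return math.atan2(dz, math.hypot(dx, dy))
```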

We do not estimate the last missing DOF, the roll angle, using airborne data. Rather, we assume buildings are generally built vertically, and we apply a histogram analysis to the angles between successive vertical scan points. If the average distribution peak is not centered at 90 degrees, we set the roll angle estimate to the difference between the histogram peak and 90 degrees.
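The histogram-based roll estimate can be sketched as follows, assuming the angles between successive vertical scan points have already been computed; the function name and bin width are illustrative assumptions.

```python
def roll_estimate(angles_deg, bin_width=0.5):
    """Histogram analysis of the angles (in degrees) between successive
    vertical scan points on facades: if the peak is not at 90 degrees,
    the offset is taken as the roll-angle estimate, on the assumption
    that walls are truly vertical."""
    bins = {}
    for a in angles_deg:
        key = round(a / bin_width) * bin_width
        bins[key] = bins.get(key, 0) + 1
    peak = max(bins, key=bins.get)
    return peak - 90.0
```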

At the end of these steps, we obtain 6-DOF estimates for the global pose and apply a framework of automated processing algorithms to remove foreground and reconstruct facade models. We divide the path into easy-to-handle segments that we process individually. Additional steps include generating a point cloud, classifying areas such as facade and foreground, removing foreground geometry, filling facade holes and windows, creating a surface mesh, and mapping the textures. As a result, we obtain texture-mapped facade models like the one shown in Figure 6. We can't texture-map the upper parts of tall buildings that fall outside the camera's field of view during data acquisition.

A path segment texture is typically several tens of megabytes, which exceeds the rendering capabilities of today's graphics cards. Therefore, we optimize the facade models for rendering by generating multiple levels of detail (LOD) so that only a small portion of the entire model is rendered at the highest LOD at any given time. We subdivide the facade meshes along vertical planes and generate lower LODs for each submesh, using the Qslim simplification algorithm for geometry and bicubic interpolation for texture reduction. We combine all submeshes in a scene graph that controls the switching of the LODs according to the viewer's position. This helps us render the large amounts of geometry and texture with standard tools such as VRML players.

Model merging
We automatically generate both the ground-based facade models and the aerial surface mesh from the DSM. Given the complexity of a city environment, it's inevitable that we capture some parts only partially or erroneously, which can result in substantial discrepancies between the two meshes. Our goal is a photorealistic virtual exploration of the city, so creating models with visually pleasing appearances is more important than CAD properties such as water-tightness. Common approaches for fusing meshes, such as sweeping and intersecting contained volume 5 or mesh zippering, 13 require a substantial overlap between the two meshes. This is not the case in our application, because the two views are complementary. Additionally, the two meshes have entirely different resolutions: the resolution of the facade models, at about 10 to 15 cm, is substantially higher than that of the airborne surface mesh. Furthermore, enabling interactive rendering requires that the two models fit together even when their parts have different levels of detail.

Due to its higher resolution, it's reasonable to give preference to the ground-based facades wherever available and to use the airborne mesh only for roofs and terrain shape. Rather than replacing triangles in the airborne mesh for which ground-based geometry is available, we consider the redundancy before the mesh generation step in the DSM. For all vertices of the ground-based facade models, we mark the corresponding cells in the DSM. This is possible because the ground-based models and the DSM have been registered through the localization techniques described earlier.

6 Ground-based facade models.

We further identify and mark those areas that our automated facade processing has classified as foreground details, such as trees and cars. These marks control the subsequent airborne mesh generation from the DSM. Specifically, during the generation of the airborne mesh, we replace the z value for the foreground with the ground-level estimate from the DTM, and we don't create triangles at ground-based facade positions. The first step is necessary to enforce consistency and to remove those foreground objects in the airborne mesh that have already been deleted in the facade models. Figure 7a shows the DSM with marked facade areas and foreground; Figure 7b shows the resulting airborne surface mesh with the corresponding facade triangles removed and the foreground areas leveled to DTM altitude.

The facade models that we will put in place of the removed ones don't perfectly match the airborne mesh, because of their different resolutions and capture viewpoints. Generally, this procedure makes the removed geometry slightly larger than the actual ground-based facade to be placed in the corresponding location. To resolve this discrepancy and to make mesh transitions less noticeable, we fill the gap with additional triangles that join the two meshes. Figure 8a shows the initial registered meshes. Our approach to creating such a blend mesh is to extrude the buildings along an axis perpendicular to the facades, as shown in Figure 8b. We then shift the loose end vertices' locations to connect to the closest airborne mesh surface, as shown in Figure 8c. This technique is similar to the way a plumb line can help close gaps between windows and roof tiles. We finally texture-map these blend triangles with the texture from the aerial photo. As they attach at one end to the ground-based model and at the other end to the airborne model, blend triangles reduce visible seams at model transitions.
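The final snapping of loose extrusion ends can be approximated by a nearest-vertex search. This is a simplification of the article's snap-to-surface step, with hypothetical names.

```python
def snap_to_mesh(loose_vertices, mesh_vertices):
    """Move each loose end vertex of the extruded blend triangles to the
    closest airborne-mesh vertex.  (The article snaps to the closest
    mesh *surface*; nearest-vertex is a coarser stand-in that still
    closes the gap between the two models.)"""
    def closest(v):
        return min(mesh_vertices,
                   key=lambda m: sum((a - b) ** 2 for a, b in zip(v, m)))
    return [closest(v) for v in loose_vertices]
```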

Results
We have applied our proposed algorithms to a data set covering downtown Berkeley. We acquired airborne laser scans through the company Airborne 1 Inc. We resampled the entire data set, consisting of 60 million scan points, to a 0.5 × 0.5 meter grid and applied the processing steps described previously to obtain a DSM, an edge map, and a DTM for the entire area. We selected feature points in 5-megapixel digital images taken from a helicopter. This registration process took about an hour for the 12 images we used for the downtown Berkeley area. Then, our system automatically triangulated the DSM, simplified the mesh, and texture-mapped the model.

Figure 9a (next page) shows the surface mesh obtained from directly triangulating the DSM. Figure 9b shows the triangulated DSM after the processing steps, and 9c shows the texture-mapped model. It's difficult to evaluate the accuracy of this airborne model because no ground truth with sufficient accuracy is


7 Removing facades from the airborne model: (a) DSM with facade areas marked in red and foreground in yellow, and (b) resulting mesh with corresponding facades and foreground objects removed. The arrows in (a) and (b) mark corresponding locations in the DSM and mesh.

8 Creation of a blend mesh based on a vertical cut through a building facade: (a) initial registered airborne and ground-based model; (b) facade of airborne model replaced and ground-based model extruded; and (c) blending the two meshes by adjusting loose ends of extrusions to the airborne mesh surface and mapping texture.


readily available, even at the city's planning department. However, we admittedly sacrificed accuracy for the sake of visual appearance of the texture-mapped model. Thus, our approach combines elements of model- and image-based rendering. While this is undesirable in some applications, we believe it's appropriate for interactive visualization applications.

We acquired the ground-based data during two measurement drives in Berkeley. The first drive took 37 minutes and was 10.2 kilometers long, starting from a location near the hills, going down Telegraph Avenue, and looping around the central downtown blocks. The second drive was 41 minutes and 14.1 km, starting from Cory Hall at UC Berkeley and looping around the remaining downtown blocks. We captured a total of 332,575 vertical and horizontal scans, consisting of 85 million scan points, along with 19,200 images.

In previous MCL experiments based on edge maps from aerial images with 30-centimeter resolution, we found an enormous localization uncertainty at some locations because of false edges and perspective shifts. In the past, we had to use 120,000 particles during MCL to approximate the spread-out probability distribution appropriately and track the vehicle reliably. For the edge map derived from airborne laser scans, we found that despite its lower resolution, the vehicle could be tracked with as few as 5,000 particles.

As Figure 10 shows, we applied the global correction first to the yaw angles, as shown in Figure 10a, then recomputed the path and applied the correction to the x and y coordinates, as shown in Figure 10b. As expected, the global correction substantially modifies the initial pose estimates, thus reducing errors in subsequent processing. Figure 10c plots the assigned z coordinate, clearly showing the slope from our starting position at a higher altitude near the Berkeley Hills down toward the San Francisco Bay, as well as the ups and downs on this slope while looping around the downtown blocks.

Figure 11a shows uncorrected paths one and two superimposed on the airborne DSM. Figure 11b shows the paths after global correction, and Figure 12 shows the ground-based horizontal scan points for the corrected paths. As the figures demonstrate, the path and horizontal scan points match the DSM closely after applying the global corrections.


9 Airborne model: (a) DSM directly triangulated, (b) triangulated after postprocessing, and (c) model texture-mapped.

10 Global correction for path one: (a) yaw angle difference between initial path and global estimates before and after correction; (b) differences of x and y coordinates before and after correction; and (c) assigned z coordinates.


Due to the MCL correction, all scans and images are geo-referenced, and we generate a facade model for the 12 street blocks of the downtown area using the processing steps described earlier. Figure 13 shows the resulting facades. The acquisition time for the 12 downtown Berkeley blocks was only 25 minutes; this is the portion of both paths that it took to drive the total of 8 kilometers around these 12 blocks under city traffic conditions.

Because the DSM acts as the global reference for MCL, we registered the DSM and facade models with each other, meaning we can apply the model-merging steps described previously. Figure 14a shows the resulting combined model for the looped downtown Berkeley blocks as viewed in a walk-through or drive-through, and Figure 14b shows a view from the rooftop of a downtown building. Because of the limited field of view of the ground-based camera, we texture-mapped the upper parts of the building facades with aerial imagery. The noticeable difference in resolution between the upper and lower parts of the texture on the building in Figure 14b emphasizes the necessity of ground-based facade models for walk-through applications. Figure 15 shows the same model in a view from the top as it appears in a fly-through. You can download the model for interactive visualization from www-video.eecs.berkeley.edu/~frueh/3d/.

11 Driven paths superimposed on top of the DSM (a) before correction and (b) after correction. The circles denote the starting position for paths one and two.

12 Horizontal scan points for corrected paths.

13 Facade model for the downtown Berkeley area.

14 Walk-through view of the model (a) as seen from the ground level and (b) as seen from the rooftop of a building.

Our approach to city modeling is automated as well as computationally quick. As Table 1 shows, the total time for the automated processing and model generation for the 12 downtown blocks is around five hours on a 2-GHz Pentium 4 PC. Because the complexity of all the developed algorithms is linear in area and path length, our method is scalable to large environments.

    Discussion and future work

    While we have shown that our approach results in visually acceptable models for downtown environments, one of its limitations is the way it handles foreground objects, such as cars and trees. In particular, we currently remove such objects to avoid the problem of reconstructing and rendering them. If too many trees are present, as is often the case in residential areas, our underlying assumptions about dominant building planes are often not met, and filling in facades is no longer reliable. Therefore, in residential areas, the resulting model contains some artifacts. We would like to include common city objects, such as cars, street lights, signs, and telephone lines, because they substantially contribute to a high level of photorealism. Our future work will address reconstructing and adding 3D and 4D foreground components by using multiple scanners at different oblique directions.
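    The dominant-building-plane assumption mentioned above is the kind of model typically recovered with RANSAC-style robust fitting (reference 10). As an illustrative sketch only, with hypothetical names and simplified to a 2D horizontal scan where a facade plane appears as a dominant line:

```python
import random
import math

def dominant_line(points, iters=200, tol=0.1):
    """RANSAC sketch (Fischler & Bolles): find the line supported by the
    largest subset of 2D scan points, e.g. the dominant facade plane
    seen in one horizontal laser scan row."""
    best_inliers = []
    for _ in range(iters):
        # Hypothesize a line from a minimal sample of two points.
        (x1, y1), (x2, y2) = random.sample(points, 2)
        # Normal form a*x + b*y = c with unit normal (a, b).
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        if norm == 0:
            continue
        a, b = a / norm, b / norm
        c = a * x1 + b * y1
        # Score the hypothesis by counting points within tol of the line.
        inliers = [p for p in points if abs(a * p[0] + b * p[1] - c) < tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers
```

When foreground clutter such as trees outnumbers facade points, the largest consensus set no longer corresponds to a building plane, which is exactly the failure mode described for residential areas.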

    Selecting correspondence points for registering the aerial imagery is the only manual step in our entire processing chain. We plan to automate this process as well, a goal we could achieve by using an accurate GPS unit or applying model-based vision methods, such as finding vanishing points or matching features in the DSM and images. Finally, the high level of detail of our method results in enormous amounts of data. For some applications, a lower resolution could be sufficient. Furthermore, the large data size, in particular the amount of high-resolution texture, makes rendering a challenging task. Future work must address data management issues related to rendering models that are many orders of magnitude larger than the memory of existing computer systems.

    Acknowledgment

    This work was sponsored by Army Research Office contract DAAD19-00-1-0352.

    References

    1. Z. Kim, A. Huertas, and R. Nevatia, "Automatic Description of Buildings with Complex Rooftops from Multiple Images," Proc. Computer Vision and Pattern Recognition, IEEE CS Press, vol. 2, 2001, pp. II-272-279.

    2. C. Brenner, N. Haala, and D. Fritsch, "Towards Fully Automated 3D City Model Generation," Proc. Workshop Automatic Extraction of Man-Made Objects from Aerial and Space Images III, Int'l Soc. Photogrammetry and Remote Sensing, 2001.

    3. A. Dick et al., "Combining Single View Recognition and Multiple View Stereo for Architectural Scenes," Proc. IEEE Int'l Conf. Computer Vision, IEEE CS Press, vol. 1, 2001, pp. 268-274.

    4. V. Sequeira and J.G.M. Goncalves, "3D Reality Modeling: Photo-Realistic 3D Models of Real World Scenes," Proc. 1st Int'l Symp. 3D Data Processing Visualization and Transmission, IEEE CS Press, 2002, pp. 776-783.

    5. I. Stamos and P.K. Allen, "Geometry and Texture Recovery of Scenes of Large Scale," Computer Vision and Image Understanding (CVIU), vol. 88, no. 2, Nov. 2002, pp. 94-118.

    6. C. Früh and A. Zakhor, "Fast 3D Model Generation in Urban Environments," Proc. IEEE Conf. Multisensor Fusion and Integration for Intelligent Systems, IEEE CS Press, 2001, pp. 165-170.

    3D Reconstruction and Visualization

    60 November/December 2003

    15 Bird's-eye view of the model.

    Table 1. Processing times for the downtown Berkeley blocks.

    Process                                                   Time (minutes)
    Vehicle localization and registration with DSM                 164
    Facade model generation and optimization for rendering         121
    DSM computation and projecting facade locations                  8
    Generating textured airborne mesh and blending                  26
    Total processing time                                          319


    7. C. Früh and A. Zakhor, "3D Model Generation of Cities Using Aerial Photographs and Ground Level Laser Scans," Proc. Computer Vision and Pattern Recognition, IEEE CS Press, vol. 2, 2001, pp. II-31-38.

    8. H. Zhao and R. Shibasaki, "Reconstructing a Textured CAD Model of an Urban Environment Using Vehicle-Borne Laser Range Scanners and Line Cameras," Machine Vision and Applications, vol. 14, no. 1, 2003, pp. 35-41.

    9. C. Früh and A. Zakhor, "Data Processing Algorithms for Generating Textured 3D Building Façade Meshes from Laser Scans and Camera Images," Proc. 3D Data Processing Visualization and Transmission, IEEE CS Press, 2002, pp. 834-847.

    10. M.A. Fischler and R.C. Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography," Comm. ACM, vol. 24, no. 6, 1981, pp. 381-395.

    11. M. Garland and P. Heckbert, "Surface Simplification Using Quadric Error Metrics," Proc. Siggraph 97, ACM Press, 1997, pp. 209-216.

    12. S. Thrun et al., "Robust Monte Carlo Localization for Mobile Robots," Artificial Intelligence, vol. 128, nos. 1-2, 2001, pp. 99-141.

    13. G. Turk and M. Levoy, "Zippered Polygon Meshes from Range Images," Proc. Siggraph 94, ACM Press, 1994, pp. 311-318.

    Christian Früh is a postdoctoral researcher at the Video and Image Processing Lab at the University of California, Berkeley. His research interests include mobile robots, laser scanning, and automated 3D modeling. Früh has a PhD in electrical engineering and information technology from the University of Karlsruhe, Germany.

    Avideh Zakhor is a professor in the department of electrical engineering and computer science at the University of California, Berkeley. Her research interests include image and video processing, compression, and communication. Zakhor received a PhD in electrical engineering from MIT.

    Readers may contact the authors at Univ. of Calif., Berkeley; 507 Cory Hall; Berkeley, CA 94720; {frueh,avz}@eecs.berkeley.edu.

