Processing Terrain Point Cloud Data

Transcription

1 SIAM J. IMAGING SCIENCES Vol. 6, No. 1, pp c 2013 Society for Industrial and Applied Mathematics Processing Terrain Point Cloud Data Ronald DeVore, Guergana Petrova, Matthew Hielsberg, Luke Owens, Billy Clack, and Alok Sood Abstract. Terrain point cloud data are typically acquired through some form of Light Detection And Ranging sensing. They form a rich resource that is important in a variety of applications including navigation, line of sight, and terrain visualization. Processing terrain data has not received the attention of other forms of surface reconstruction or of image processing. The goal of terrain data processing is to convert the point cloud into a succinct representation system that is amenable to the various application demands. The present paper presents a platform for terrain processing built on the following principles: (i) measuring distortion in the Hausdorff metric, which we argue is a good match for the application demands, (ii) a multiscale representation based on tree approximation using local polynomial fitting. The basic elements held in the nodes of the tree can be efficiently encoded, transmitted, visualized, and utilized for the various target applications. Several challenges emerge because of the variable resolution of the data, missing data, occlusions, and noise. Techniques for identifying and handling these challenges are developed. Key words. surface reconstruction, point clouds, Hausdorff metric, compression, adaptive splines AMS subject classifications. 65D17, 65D18, 41A15 DOI / Introduction. Terrain point clouds are now quite ubiquitous and are used in a variety of applications including autonomous navigation, change detection, and field of view calculations. The point clouds themselves are too cumbersome and large to be used for these purposes. They need to be converted to a simpler platform that is more efficient and still contains all of the features of the terrain, present in the point cloud, that are needed for these applications. A naive approach would be to take local averages of data heights to obtain pixel intensities (and therefore a pixelized image) and then employ the techniques of image processing to make a conversion into a wavelet or other multiscale representation. However, this approach is not successful for several reasons. Foremost among these is that terrains are not images. They have certain topology and geometry that must be extracted and maintained for successful Received by the editors November 21, 2011; accepted for publication (in revised form) August 20, 2012; published electronically January 10, This research was supported by the ARO/DoD contract W911NF ; the NSF grants DMS and DMS ; the Office of Naval Research contracts ONR-N , ONR-N , and ONR-N ; the AFOSR contract FA ; and the DARPA grant HR This publication is based in part on work supported by award KUS-C , made by King Abdullah University of Science and Technology (KAUST). Department of Mathematics, Texas A&M, College Station, TX tamu.edu). Institute of Scientific Computation, Texas A&M, College Station, TX Automated Trading Desk, Mount Pleasant, SC Department of Computer Science and Engineering, Texas A&M, College Station, TX edu, 1

2 2 DEVORE, PETROVA, HIELSBERG, OWENS, CLACK, AND SOOD applications. A second related point is that the usual least squares metrics used in image processing do not match the intended applications for terrain maps. For example, capturing long thin structures such as poles, towers, and wires is essential for navigation, but is not given priority in Peak Signal-to-Noise Ratio (PSNR) metrics employed for images. Another important point is that terrain point clouds usually have missing data, occlusions, and noise which do not appear in most other applications. While surface reconstruction from point clouds is now a dominant theme in computer graphics, very little of this work addresses terrain data per se. The notable exception is the paper [26], which proposes a Morse tree structure to represent terrain but falls short of providing an implemented processing platform. Of course, one could argue that one can simply apply one of the vast number of surface processing algorithms in computer graphics. However, these algorithms are typically built for high resolution data, which is not the case for general terrain data, which suffers from occlusions, missing data, noise, and variable resolution. The purpose of the present paper is to give a terrain point cloud processing algorithm. The algorithm outputs both a piecewise polynomial surface and the global Hausdorff distance between this surface and the (denoised) point cloud. It is based on the following basic principles: Distortion is measured in the Hausdorff metric, which we argue closely matches the intended applications. The decomposition is organized in a multiscale octree giving coarse to fine resolution. Each node of the tree corresponds to a dyadic cube and is ornated with a low degree polynomial fit to the point cloud on this cube and with other attributes, such as the local Hausdorff error of this fit. The tree is organized into subtrees each of which corresponds to a certain accuracy of resolution in the Hausdorff metric. The tree and nodal information can be efficiently encoded using predictive encoding. Upon receiving the tree and nodal information, the user can easily convert this information to a format that matches the intended application. Primitives such as normals, curvature, and other information can be extracted from the tree and nodal information. Multiscale decompositions and local polynomial fits are often used in surface fitting (see [8, 15, 16, 17, 19, 25, 27, 28]). Among the things that separate our work from others are the measure and guarantee of performance in the Hausdorff metric and the fact that our methods can be applied to nonhomogeneous and nondense data. A terrain point cloud D is a finite set of three-dimensional data points x := (x, y, z). Such point clouds are typically obtained from a sensor or from images using Structure From Motion calculations. In contrast to image processing or computer graphics, a canonical collection of point clouds that could be used to develop and test algorithms for terrain data is not available. We propose such a collection here, which can be used and added to by other researchers. These data sets as well as an implementation of the proposed algorithms can be downloaded from the web site hielsber/muri PointCloud/index.html (some data sets are not available to the general public because of priority restrictions). All terrain point clouds in the collection are obtained via Light Detection And Ranging (LiDAR) sensing (real data) or a simulation of LiDAR sensing (synthetic data). Some of the data sets are part of

3 PROCESSING TERRAIN POINT CLOUD DATA 3 Maple Mountain Building Light Pole Figure 1. Three examples of denoised LiDAR point clouds. the AFRL/MNG VEAA Data Set #1, obtained by taking a LiDAR scan of some real world topology. Others are derived using a synthetic LiDAR sensor which allows us to create data sets with representative features. Our data sets contain high-altitude terrain scan (Test 1: Maple Mountain), portion of building (Test 2: Eglin data), light pole (Test 3: Eglin data), and several complicated point clouds containing buildings, light poles, fences, and wires (Tests 4 6). The purpose of the first three sets is to test the algorithm on standard and simple mountain and urban (buildings and thin structures) landscape. The last three data sets test the ability of the algorithm to represent and reconstruct a complicated urban scene without special parameter tuning for thin structure detection. An illustration of the point clouds that we use in order to demonstrate the results of our algorithm is given in Figure 1. Details for each tested data set are provided in the description of our numerical results (see Tests 1 6). Note that our algorithms are developed for data sets which contain only the point cloud. In some settings one also has available the position and orientation of the sensor, which makes the task of processing the point cloud much simpler. This paper is organized as follows. In sections 2 6, we shall discuss the various tasks that have to be completed in our terrain processing engine. For each of these tasks, we describe the current algorithm which completes the task. After this, in section 7, we describe our terrain processing algorithm. In sections 8 9, we discuss how the terrain processing algorithm is used for some of the directed applications. 2. Dyadic cubes and octrees. Given a point cloud D, we process D by using a dilation (with each coordinate dilated by the same factor) and a shift of the data in D so that the output data D lies in the unit cube Ω = [0, 1] 3. There are many such transformations from which we choose the following one. After the transformation, the range of the new x i should be [1/2 δ 1, 1/2 +δ 1 ]with0 δ 1 1/2. Similarly, the range of the y i should be [1/2 δ 2, 1/2 +δ 2 ]with0 δ 2 1/2. For the new z i, we require that the range be of the form [2 κ 1 δ 3, 2 κ 1 + δ 3 ], where 2 κ 2 <δ 3 2 κ 1. Thus the z values will all lie in [0, 2 κ ]. Finally, we require one of the δ i, i =1, 2, 3, to equal 1/2. We denote by INIT the subroutine which takes as input D and has as output INIT(D) =(D,κ). A dyadic subcube Q of Ω of side length 2 j (volume 2 3j )isoftheform2 j ([k 1,k 1 +1) [k 2,k 2 +1) [k 3,k 3 + 1)) with the integers k 1,k 2,k 3 satisfying 0 k 1,k 2,k 3 < 2 j.weusethe

4 4 DEVORE, PETROVA, HIELSBERG, OWENS, CLACK, AND SOOD standard convention that the above intervals are closed on the right when the right endpoint is one. The children of Ω are the eight dyadic subcubes with side length 2 1. Each of these children has eight children itself and so on. Thus, the dyadic cubes organize themselves into an octree with root Ω. We denote the set of all dyadic subcubes of Ω by Q. Some of the cubes in Q will not contain data from D.WedenotebyQ the set of all cubes in Q that are occupied and, for each cube Q Q,wedenotebyD Q := D Q the set of all data points in Q. ThecubesQ Q determine a subtree of the full dyadic octree. A finite collection T Q is called an octree if whenever a cube is in T, then its parent and all of its siblings are in T. Our processing algorithm will input the data set D and a tolerance η and output a finite octree T = T η. 3. Distortion metrics. The development and assessment of terrain processing algorithms requires a metric which measures the distortion between two surfaces. In the case of image processing, this metric is usually chosen as a least squares metric and is tabulated in the PSNR. We argue in this section that least squares metrics are not appropriate for the intended applications of terrain maps and that a more suitable way to measure distortion is through the Hausdorff metric. Given two sets A, B in R 3, the one-sided Hausdorff distance from A to B is given by (3.1) δ(a, B) :=sup a A inf a b, b B where is the standard Euclidean distance in R 3. Obviously, this guarantees that for any a A, thereisb B whose Euclidean distance to a does not exceed δ(a, B). The Hausdorff distance between two sets A and B is then defined as (3.2) δ H (A, B) :=max{δ(a, B),δ(B,A)}. Notice that δ H is a metric on sets and in particular satisfies the triangle inequality (3.3) δ H (A, C) δ H (A, B)+δ H (B,C) for any sets A, B, C. A depiction of the Hausdorff distance between a point cloud D and a curve C in two dimensions is given in Figure 2, whereδ(d, C) is the one-sided distance from D to C and δ(c, D) is the one-sided distance from C to D. Given two finite sets A, B Ω, we denote by DIST(A, B) the subroutine that returns the one-sided Hausdorff distance from A to B, andbyhaus(a, B) the subroutine that computes max{dist(a, B), DIST(B,A)}, which is the Hasudorff distance between A and B. For our implementation of HAUS, we utilized the spatial searching features in the CGAL library [2] to reduce the computational complexity of this algorithm from quadratic to log-linear. We shall frequently need to compute the Hausdorff distance between a finite set A and a continuum surface S, such as a plane or quadric surface. There are fast algorithms for computing the Hausdorff distance between two surfaces [3, 6, 14, 29], most of which involve the discretization of the surfaces in question. Normally, as is traditional in numerical analysis, we do not indicate the fact that a subroutine does not provide exact computation. However, we will make an exception in the

5 PROCESSING TERRAIN POINT CLOUD DATA 5 Figure 2. Hausdorff distance from point cloud to curve and curve to point cloud. case of computing Hausdorff distances between a point set A and a surface S, since this subroutine plays a special role in our algorithm. With this in mind, we choose a numerical tolerance γ =2 m with m a positive integer. We consider the tiling Q m of Ω into dyadic cubes of side length γ and create the point set S γ, which consists of all points x such that x is the center of a cube Q Q m which contains points of the surface S; seefigure3. Then DIST γ (A, S) :=DIST(A, S γ ) takes as input the finite set A, thesurfaces, and the numerical tolerance γ and returns the distance between A and S γ. Since δ H (S, S γ ) 3 2 γ,this computation is an approximation to δ(a, S), which is accurate to tolerance γ. We can similarly compute DIST γ (S, A). The subroutine HAUS γ takes as input any pair of A, S and the tolerance γ and returns HAUS γ (A, S) :=max{dist γ (A, S), DIST γ (S, A)}. 3 2 Figure 3. Discretization of a two-dimensional curve for Hausdorff computation. To understand why the Hausdorff metric is appropriate for terrain applications, let us consider two such applications. First consider the problem of navigating an unmanned vehicle, for example, a Micro Air Vehicle. Sensors extract a point cloud, which describes surfaces that the vehicle must avoid (no fly zones). Suppose that we know that the true surface S has a Hausdorff distance ɛ from the point cloud; this is an assumption on the quality of the data which is necessary to proceed with any certainty. Suppose that we use the point cloud to find asurfaceŝ which is within η of the point cloud in the Hausdorff metric. Then, whenever the vehicle remains a distance greater than η + ɛ from the approximate surface, it is guaranteed

6 6 DEVORE, PETROVA, HIELSBERG, OWENS, CLACK, AND SOOD L 2 L Hausdorff Figure 4. Resulting line of sight between the true and approximate surfaces as observed from street level moving north along an urban canyon. Field of view metric comparison: false positives (yellow), false negatives (red), correctly classified (gray), unobserved regions (black); see [30]. to avoid the true surface. Moreover, no weaker metric (such as least squares) can guarantee such a performance. As a second example, consider observing a terrain surface S from an observation point x. The field of view describes what the observer can see from this vantage point taking occlusions into consideration. If we construct an approximation Ŝ to S from given point cloud data, then the field of view will not be accurate but will have false positives and false negatives. The quality of the approximate field of view will depend on the metric used to compute the approximation Ŝ. In Figure 4, we give a comparison of computing the field of view using three metrics with a comparable error. The left image uses the least squares metric, the center image uses a maximum z-value deviation, and the right uses the Hausdorff metric. The points colored yellow are false positives, and those colored red are false negatives. The gray colored points are correctly classified. One sees that when distortion is measured in the Hausdorff metric we obtain the largest agreement with the true field of view. 4. Multiscale decompositions. Our processing algorithm is based on a multiscale decomposition of the surface using low degree algebraic surfaces to locally approximate the given point cloud D. In the algorithms implemented in this paper, the algebraic surfaces are either planes (corresponding to linear polynomials) or certain quadric surfaces (corresponding to special choices of quadratic polynomials described below). In this section, we explain the multiscale structure which is based on dyadic cubes and also how we extract the polynomial fits. We wish to associate a local polynomial fit to the point cloud on the cube Q. Todothis, we need to assume that Q contains sufficiently many points from the point cloud. In our algorithms, we assign a tolerance K and require Q to have at least K points from D whenever we ask our algorithm to assign a polynomial fit. The value of K can be set by the user. Of course, it should be larger than the number of degrees of freedom in the local polynomial fits, but it also should be large enough to avoid fitting noise in the data. In our implementation, we set K = 10 for the examples in this paper. This value of K was motivated by considerations from learning theory; see [9, 10] Planar fits to the data. Suppose that D 0 is a subset of D. We look to fit the data D 0 by a plane. Any plane can be described as the zero set of a linear function L(x) =n x + c on R 3,wheren is a unit normal to the plane and c is a suitable constant. Also L(x) is the

7 PROCESSING TERRAIN POINT CLOUD DATA 7 distance of any given point x to this plane. While we have emphasized that we measure the distortion of our data fitting in the Hausdorff metric, it turns out that finding the best planar fit to the data D 0 in the Hausdorff metric is too computationally intensive. So we will take another approach to finding a linear fit by using Principal Component Analysis (PCA). We emphasize, however, that we continue to evaluate the performance of this fit in the Hausdorff metric. PCA finds a linear function L(x) =L D0 (x) =n D0 x + c D0 from the set L of all linear functions such that (4.1) L D0 := argmin L L x D 0 L(x) 2. Using Lagrange multipliers, one can solve this minimization problem by first finding the eigenvalues and eigenvectors of the covariance matrix for the components x, y, z of the data D 0 : (4.2) Σ D0 := cov(x, x) cov(x, y) cov(x, z) cov(x, y) cov(y,y) cov(y,z) cov(x, z) cov(y,z) cov(z,z) where cov(x, y) := (x,y,z) D 0 (x x D0,y ȳ D0 ), with x D0 the mean of the x components of the data D 0, and the sum is taken over all the data points in D 0. Then n D0 is a multiple of the eigenvector corresponding to the smallest eigenvalue of Σ D0,andc D0 is chosen so that x D 0 L D0 (x) = 0. Note that there is not necessarily a unique solution to (4.1) when the two smallest eigenvalues are equal. Since the matrix Σ D0 is 3 3 and positive semidefinite, this eigenvalue problem can be solved efficiently using any standard solver such as LAPACK, CGAL, or Geometric Tools [1, 2, 4]. The plane associated to L D0 is our default planar fit to the data D 0, and in most instances it performs satisfactorily. In summary, we define a subroutine PCA that takes as input the data set D 0, with the property #(D 0 ) K, and returns PCA(D 0 )=(λ 1,λ 2,λ 3 ; v 1,v 2,v 3 ), which are three eigenvalues (written in decreasing order) and corresponding eigenvectors of the matrix Σ D0. From the algorithm PCA, we define a new algorithm PLANE, which takes as input any axis-oriented three-dimensional rectangular box B (parallelpiped). It outputs the set Ŝ B := PLANE(B), which is the restriction to B of the plane obtained from PCA(D B ), where D B := D B Quadratic fits to the data. The quality of approximation to the data on a box B can generally be improved by replacing planes by algebraic surfaces that are zero sets of higher degree polynomials. We use quadratic polynomials in our processing algorithm described below. If one utilizes general quadratic functions in three variables, then the zero sets are quadric surfaces that may have branches. To avoid this, we limit the types of quadratic polynomials, and hence the quadric surfaces that can be used in our algorithms. To describe this limitation, we return to the coordinate system given by PCA(D B ). Given a box B, we use a change of coordinates that replaces the canonical x, y, z coordinates by the coordinates given by the basis of the three eigenvectors found by PCA written in order of increasing size of the eigenvalues (with ties handled arbitrarily). This maps the,

8 8 DEVORE, PETROVA, HIELSBERG, OWENS, CLACK, AND SOOD coordinates x =(x, y, z) to new coordinates u =(u, v, w). We denote by D B the transformed data points. We shall use quadratics P (u, v) P 2,whereP 2 is the set of all quadratics in u, v. We define (4.3) P B := argmin P P 2 (u,v,w) D B P (u, v) w 2, which is a least squares fit to the data on B. WedenotebyŜB the quadric surface, described as the set of all (u, v, w) B such that w = P B (u, v). The solution to the least squares problem (4.3) is easy to find from the Moore Penrose formula for least squares problems. We define a subroutine QUAD that takes as input a box B with #(D B ) K and returns the quadric surface ŜB. We want to emphasize once again that ŜB is not the best Hausdorff fit to the data from quadric surfaces of the above type. This would be too expensive to compute. However, we shall still measure the quality of fit of ŜB to our point cloud by using the Hausdorff metric. Figure 5. Comparison of planar and quadratic fits. Figure 5 shows the planar and quadratic fits to a portion of test data from Figure 1, representing part of the building s wall and nearby ground. Table 1 gives the one-sided Hausdorff distances between the point cloud and the fits. Clearly, for these data points a quadratic fit substantially outperforms the planar fit. Table 1 Comparison of one-sided Hausdorff distances for Figure 5 (values are given with respect to the original data units). PLANE QUAD δ(d Q, ŜQ) δ(ŝq,dq) General fitting. We found that in applications such as compression, the additional overhead required to encode the output of QUAD sometimes outweighed any benefit. It should be noted that while we found this to be true in several experiments, exhaustive testing has not been done. Our algorithms allow the user to specify the use of either PLANE or QUAD for any given experiment. We define the subroutine FIT as the generic delegate

9 PROCESSING TERRAIN POINT CLOUD DATA 9 for the PLANE and QUAD subroutines. It takes as input the box B and outputs ŜB := FIT(B). 5. Improving Hausdorff fits. The planar and quadric surfaces ŜQ onacubeq Q, described in the previous sections, are generated from least squares methods and may not be accurate in the Hausdorff metric. One way to remedy this situation is to subdivide Q into its children and repeat the fitting on each child. However, often the reason for the poor fit is simply because the planar/quadratic fit spreads over the entire cube, whereas the data is very localized on the cube. This manifests itself in having δ(d Q, ŜQ) small, but δ(ŝq,d Q ) large; see Figure 6. There are ways to remedy this situation by localizing the surface. In this section, we describe some methods for implementing this, as well as testing and applying such methods when δ(d Q, ŜQ) is smaller than our preassigned Hausdorff error threshold, but δ(ŝq,d Q )isnot Bounding boxes. Our first technique is to find one or two clusters of the data in Q, described by bounding boxes, in order to more accurately represent the underlying geometry. Certainly more than two clusters can exist in Q, and slight modifications in the proposed algorithm could be done to handle these cases. However, our experience shows that due to the nature of the terrain data and the high resolution of the underlying tree structure, these cases do not happen too often. This is the reason to limit our search to two such clusters. The first step of the bounding box method is to find the smallest axis-aligned box that contains D Q. The octree data structure keeps track of the minimum and maximum x, y, andz coordinates for D Q in each cube Q dynamically as the tree is being built. These minimum and maximum coordinates provide three intervals I x,i y,i z representing the smallest possible fit of an axis-aligned bounding box around the data. Thus, the single bounding box on Q is simply B Q := I x I y I z. We now compute δ(ŝb Q,D Q )andδ(d Q, ŜB Q ), where ŜB Q := FIT(B Q ). If both quantities are smaller than our preassigned Hausdorff error tolerance, we accept ŜB Q and do not further refine Q. Figure 6 shows a synthetic example (in two dimensions), where the algorithm terminates with an accurate Hausdorff fit using the surface ŜB Q in B Q ; therefore in this case, the use of bounding boxes eliminates the need to continue subdividing Q. Figure 6. An example of a single bounding box. We define the subroutine BOX γ that takes as input a dyadic cube Q and the numerical tolerance γ. It finds the bounding box B Q, the fit ŜB Q, and the Hausdorff distance ˆη Q := HAUS γ (ŜB Q,D Q ). Thus BOX γ (Q) =(B Q, ŜB Q, ˆη Q ).

10 10 DEVORE, PETROVA, HIELSBERG, OWENS, CLACK, AND SOOD Figure 7. An example of the sliding-partition method. If ˆη Q is bigger than the prescribed Hausdorff tolerance, then we attempt to cluster the data into two groups. There are several existing algorithms in the literature for clustering data sets that could be used for this purpose. We have implemented two methods: the sliding-partition method and the k-means clustering algorithm. The sliding-partition method described here does not operate on Q, but instead on the single bounding box B Q found in the previous step. The reason for doing this is that B Q is already the tightest axis-aligned fit around the points, thus eliminating external empty space. This algorithm requires a user-defined parameter N specifying the number of partitions to use. The sliding-partition method proceeds as follows. Let x i, i =1,...,M,withM =#(D Q ), be the projection of the points in D Q onto the x-axis written in increasing order x i x i+1, i =1,...,M 1. Let P ˆd := [x 1,x 1 + ˆd L] andp ˆd := (x 1 + ˆd L, x M ], ˆd =1,...,N,bethe interval choices along the axis, where L := (x M x 1 )/(N + 1). If we fix a value of ˆd, thend Q can be partitioned into two sets D 1 ( ˆd) andd 2 ( ˆd), where D 1 ( ˆd) contains all the data points x D Q whose projections are in P ˆd, andd 2 ( ˆd) contains all the data points whose projections are in P ˆd. We find the two bounding boxes B 1 ( ˆd), B 2 ( ˆd) ford 1 ( ˆd), D 2 ( ˆd), respectively, each by the single bounding box method BOX γ. We also compute the best linear (or quadratic) fit to the data on each of these bounding boxes using FIT. The Hausdorff errors on each box are also computed, and the maximum of these two errors is stored as the error for this partition. We then choose the minimum of these errors over all partitions and denote this. We do a similar calculation starting with the y (respectively, z) values of the data and thereby obtain error η (2) Q (respectively, η(3) Q ). We now choose the bounding boxes corresponding to the smallest of the three errors η (1) Q, η(2) Q, η(3) Q, which we denote by ˆη Q. We denote by BOX2 γ the sliding-partition method, which takes as input a dyadic cube Q and the numerical tolerance γ and outputs two boxes B (1),theirsurfacefitsŜB(1) = value by η (1) Q Q, B(2) Q = FIT(B (2) Q ), and the Hausdorff error ˆη Q, which is the maximum of the FIT(B (1) Q ), Ŝ (2) B Q HAUS γ error on each of these boxes. Thus, we have BOX2 γ (Q) =(B (1) Q,B(2) Q, ŜB (1), Q Q ŜB (2), ˆη Q ). Q An example of the sliding-partition method for a synthetic two-dimensional data set is given in Figure 7, where the result of BOX γ is shown on the left, an intermediate partition from the sliding-partition method is shown in the middle, and the result of BOX2 γ is shown on the right.

11 PROCESSING TERRAIN POINT CLOUD DATA 11 An alternative to the sliding-partition method is the k-means algorithm; see [22]. This is the most widely used unsupervised clustering algorithm, and it shows good results in our experiments with terrain data. However, it is more computationally intensive than the slidingpartition method described above. The ideal k-means clustering algorithm, when applied to three-dimensional point cloud data, groups points into k distinct clusters based on their Euclidean distances. Given a set of data points x := (x, y, z) and a cluster count k (we choose k = 2 in our case to split the points into two groups), the algorithm outputs a cluster assignment where each input point is assigned to one of the k clusters. The mean of each cluster is computed. To evaluate a given cluster assignment, the sum of the squares of distances from the points to their corresponding cluster mean is computed. One then determines the k clusters that minimize this sum over all possible assignments. The ideal k-means algorithm is too computationally intensive, so it is typically replaced by an iterative algorithm. A typical iterative algorithm begins by choosing k points P 1,...,P k at random. The clusters C (0) 1,...,C (0) k are determined by assigning a point from the data set to cluster C (0) j if its distance to P j is the smallest. Now the means M (1) j of all clusters C (0) j are computed, j =1,...,k. A new cluster assignment C (1) 1,...,C (1) k is determined as above, where a point from the data set is assigned to cluster C (1) (1) j when its distance to M j is smallest. This cluster assignment and mean computation are iterated until a user-defined number of iterations is reached. Our implementation uses the k-means++ algorithm [5], where, rather than simply picking the points P 1,...,P k at random, one uses a greedy method that guarantees their maximum separation. Our implementation of the above ideas is the subroutine 2MEANS γ (i.e., k =2). This algorithm takes as input a dyadic cube Q and the numerical tolerance γ and applies the 2- means++ algorithm to find two clusters, C 1 and C 2. We then define the smallest axis-aligned boxes B (1) Q, B(2) Q,theirsurfacefitsŜB(1) := FIT(B (1) Q ), ŜB (2) := FIT(B (2) Q ), and the Hausdorff Q Q error ˆη Q, which is the maximum of the result of HAUS γ on each of these boxes. Thus, 2MEANS γ (Q) =(B (1) Q,B(2) Q, ŜB (1), ŜB (2), ˆη Q ). We also define the subroutine CLUSTER γ Q Q as the generic delegate for the BOX2 γ and 2MEANS γ subroutines. 6. Preprocessing to remove noise and outliers. The LiDAR acquired terrain point cloud data are typically noisy. The type and amount of noise vary depending on the sensor and data collected. In this section, we discuss a few preprocessing algorithms we have implemented to identify and remove some of the present noise and outliers. Locally optimal projection. We have implemented the locally optimal projection method from [21]. We refer the reader to [21] for its description and motivation. We have applied this method locally on D Q. This method performs rather well on point clouds coming from a C 2 surface, but when this is not the case, e.g., for terrain surfaces, it leads to unwanted artifacts and missing detail. Outlier measure. We have also implemented the outlier measure method from [32], where a measure for determining whether or not a data point is noise is presented. This method assigns to each point P ascoreχ(p ). A simple thresholding of these scores produces the

12 12 DEVORE, PETROVA, HIELSBERG, OWENS, CLACK, AND SOOD denoised point cloud. The score χ for P is computed as χ(p )=ω 1 χ 1 (P )+ω 2 χ 2 (P )+ω 3 χ 3 (P ), where ω 1, ω 2, ω 3 are scalar weights defined by the user, and χ 1, χ 2 and χ 3 are determined as follows. First, to compute χ 1 (P ), a value of k is chosen by the user, then a least squares plane is fit through the point s k-nearest neighbors, and the distances of all points in that neighborhood to the plane are calculated. Then, χ 1 (P )=d P (d P + d) 1,whered P is the distance from P to the plane and d is the mean distance of all points in that neighborhood to the plane. Next, χ 2 is computed by χ 2 (P )=q P (q P +2r/ k) 1,whereq P is the distance from P to the center of the sphere, determining the k-neighborhood, and r is its radius. Last, χ 3 (P )=N P k 1,whereN P is the number of k-neighborhoods created from the k-neighbors of P that do not contain P. Our tests show that this method performs well at identifying noise along smooth surfaces as well as outliers. However, we have noticed that it may treat edges, corners, and boundaries between regions with different sampling densities as noise. One-sided Hausdorff outlier measure. To enhance denoising with respect to the Hausdorff method, we have developed our own denoising method called the one-sided Hausdorff outlier measure. It is based on the idea that outliers should be relatively far from any local fitting of the data. This method depends on a user-defined tolerance ζ and assigns to each point P ascoreξ(p ). As in the previous method, a simple thresholding of the calculated scores produces the denoised point cloud. To compute ξ(p ), we do the following with a user-defined value of k. ForeachpointR we consider the set N(R) ofitsk-nearest neighbors and we fit a least squares plane through this set. For every point P N(R), we consider a sphere with center P and radius ζ. If this sphere intersects the plane, we assign d R (P )=0. Otherwise,wesetd R (P ) to be the reciprocal of the number of data points in this sphere. Thus, we assign a number d R (P ) to each point P, viewed as a point from the k-nearest neighbors of R. In general, a point P belongs to several k-nearest neighbors and has associated to it several numbers d R (P ). Now, the score ξ(p )is the average of d R (P ) taken over all k-nearest neighborhoods N(R) thatcontainp. This algorithm can be viewed as an extension of the outlier measure method with weights (ω 1,ω 2,ω 3 )=(1, 0, 0). We have found in our tests that it performs similarly to the outlier measure method. However, it is more successful than the outlier measure method when processing corners or edges. It does not treat them as noise, especially when the user-defined tolerance ζ is bigger than the local sampling rate. 7. Processing algorithm. In this section, we shall describe our processing algorithm MAIN. The input to this algorithm consists of the output data set D of INIT, a desired target Hausdorff error η, a tree depth l, a numerical tolerance γ, such that γ 1 3 η, and a user choice of whether the algorithm is to use planar or quadric surfaces in the subroutine FIT and whether the algorithm is to use the sliding-partition method or k-means in the subroutine CLUSTER γ. The output is an octree T with depth at most l. Every node of the tree is adorned with either a local polynomial fit to the data or a flag that says no local fit is available. We define the surface S out to be the union of the local polynomial fits on the leaf nodes of T. S out is typically not the type of surface we see in terrain processing because it is fragmented (primarily due to the use of bounding boxes) and discontinuous and is not oriented with definable inside and outside. In section 9, we will explain how to obtain smooth oriented surfaces from the output of MAIN.

13 PROCESSING TERRAIN POINT CLOUD DATA 13 Each node of T that has a polynomial fit also contains an upper bound ˆη Q for the local Hausdorff error. The Hausdorff error ˆη Q will be less than our threshold η for each leaf node Q which contains a polynomial fit, except in the case where further subdivision of the node Q is artificially halted by the user s choice of l. Note that the global Hausdorff error η Sout between S out and D, computed with accuracy γ, is also an output of MAIN. Our processing algorithm is quite simple; it recursively applies a single subroutine called PROCQ γ. To describe this subroutine, let us first define η := η 2 to be our computational tolerance. The subroutine PROCQ γ takes as input a cube Q, the computational tolerance η, the numerical tolerance γ 1 3 η, and the specification of linear or quadric fit in FIT. It outputs the current best fit ŜD Q to the data D Q on Q and the Hausdorff error ˆη Q. Thus if Q is not flagged, then PROCQ γ (Q, η )=(ŜD Q, ˆη Q ), where ŜD Q is either ŜQ, Ŝ BQ or (ŜB (1), Q ŜB (2)). The subroutine is described as follows: Q if #(D Q ) <K then Flag Q and exit. end if Call FIT(Q) =ŜQ, andhaus γ (ŜQ,D Q )=ˆη Q,setŜD Q = ŜQ. if DIST γ (ŜQ,D Q ) >ηand DIST γ (D Q, ŜQ) η then Call BOX γ (Q) =(B Q, ŜB Q, ˆη Q ), set ŜD Q = ŜB Q. if ˆη Q >ηthen Call CLUSTER γ (Q) =(B (1) Q,B(2) Q, ŜB (1), Q end if end if if ˆη Q >ηthen Set ŜD Q = FIT(Q), and ˆη Q = HAUS γ (ŜQ,D Q ). end if ŜB (2), ˆη Q ), set ŜD Q =(B (1) Q,B(2) Q ). Q Notice that if the local fit does not provide Hausdorff error η, thenanattemptismadeto fit the data on Q through one or two bounding boxes. However, if these fail and we are not at finest level l, then we return to the entire cube Q in the further processing. That is, we never subdivide bounding boxes, only the entire cube Q. Given the above specification of PROCQ γ, the main processing algorithm MAIN(D,η, l, γ) =(T,η Sout ) maintains a set of cubes to be processed and a set of cubes that have been assigned to the octree. The method is described as follows: Initialize the cube processing list as {Ω} and the octree T as empty. for Each cube Q in the processing list do Call PROCQ γ (Q, η ). if Q is flagged by PROCQ γ then Add Q and the flag to T. else Add Q and its adornments ŜD Q and ˆη Q to T. if Q > 2 3l and ˆη Q >ηthen Add the children of Q to the end of the processing list.

14 14 DEVORE, PETROVA, HIELSBERG, OWENS, CLACK, AND SOOD end if end if Remove Q from the processing list. end for Assign η Sout = HAUS γ (S out,d ). We next make some observations on the output T of MAIN. We know that each nonflagged terminal node Q of T is adorned by a local polynomial surface produced by FIT whose Hausdorff distance to the data D Q in Q is at most ˆη Q. Generally, we have ˆη Q η. The only exception to this is if Q is at dyadic level l and the subdivision was stopped because of the user-imposed level restriction. In the case that no leaf cube containing points is flagged or artificially stopped by the condition on the maximal depth of T, then the surface S out will have Hausdorff distance η Sout to the point cloud, and η Sout η, which is the original goal of the algorithm. We have found that with properly denoised data it is almost always the case that η Sout is close to the target accuracy η. Indeed, a cube containing points is flagged because it has too few points, but it generally has a neighboring cube which is not flagged and therefore has a local polynomial fit on that neighbor. When this is not the case, one could say that the points in the flagged cube are outliers and most of them have been removed by the preprocessing algorithms from section Encoding. We anticipate that one of the main applications of our fitting algorithm will be compression and encoding of the point cloud. The output of MAIN is an octree whose nodes are ornated with coefficients of polynomials. This is a common setting in image and surface compression, and there are several approaches to converting such an ornated tree to a bit stream which the user can decode to find the tree and quantized coefficients [25, 27]. For our implementation, we have utilized the predictive encoder developed by the Rice group [31]. Note that here a simple switch of data structures is necessary, since the Rice encoder works on an octree whose nodes are ornated with a single polynomial surface. In our case, the terminal nodes may have bounding boxes and may have two polynomials. We therefore have to encode this additional structure. The compression figures given below include the extra bits needed to encode this extra structure but done currently in a rather naive way. With Rice, we are improving on this encoder to include burn-in and progressivity and will report on this in a forthcoming paper. For large terrain data sets the ranges in x and y are typically significantly larger than the range in the vertical direction z. This is the reason we have introduced the integer κ and the initialization step INIT. This step shifts the data in the z direction and computes the largest nonnegative integer κ such that D [0, 1] 2 [0, 2 κ ]. This fact was not utilized in MAIN but will be exploited in the encoding. Namely, if κ is large, then there are a lot of cubes in the tree at levels coarser than or equal to κ whichhavenodatabutwouldbeencodedifwe began the encoding at the root [0, 1] 3. For this reason, we will instead encode a forest starting with the occupied cubes at level κ. This will improve the encoding because we remove κ levels from the encoding of the single octree T, including the intermediate fits, and replace those by encoding κ and the root of each of the subtrees. This allows us to restrict the data to at most 2 2κ subtrees whose separate encoding results in higher compression rates. This step generally improves the compression rate over simply starting with Ω = [0, 1] 3 as the root of an octree.

15 PROCESSING TERRAIN POINT CLOUD DATA 15 The encoding of κ and the subtree roots is not free, and thus some overhead is required. To ensure that this overhead does produce lower overall compression than simply encoding the original octree, we impose a minimum value κ 0 and require κ>κ 0 before the subtrees are considered separately for encoding. In all our experiments, we have κ 0 =1. Next, we give some numerical results which tabulate the rate distortion performance of our algorithm MAIN when coupled with the Rice encoder. Note that one can utilize MAIN with different Hausdorff target tolerances η in order to create a progressive encoder analogous to wavelet-based image encoding; see, for example, [11, 18]. For our experiments, we first denoise all point cloud data sets, using the Outlier Measure algorithm described in section 6. Test 1. Our first test is the Maple Mountain data set [7], generated by high-altitude terrain scans. The original data contains 62,500 points as 16-bit unsigned integer height values, with a file size of 125,000 bytes. η =0.05 η =0.02 η =0.01 Figure 8. Maple Mountain reconstruction using PLANE for different values of η. η =0.05 η =0.02 η =0.01 Figure 9. Maple Mountain reconstruction using QUAD for different values of η. In Figures 8 and 9, we display the local plane and quadratic fits, respectively, that ornate the final leaves of the octree T for various values of the Hausdorff tolerance η.oneobserves the progressive structure of the octree, in that smaller η values correspond to more detail on the surface. In Table 2, we show various statistics for the corresponding octrees, such as tree depth, number of nodes, number of polynomial fits, boxes, and clusters.

16 16 DEVORE, PETROVA, HIELSBERG, OWENS, CLACK, AND SOOD Table 2 Tree statistics using CLUSTER γ = sliding-partition method, with parameters γ =2 10, N =10, K =10, and l =20. Maple η Tree #Nodes #Leaf #Basic Mountain depth in tree nodes fits #BOX #CLUSTER PLANE QUAD In Table 3, we present the Hausdorff error between the points in the nodes of the octree T and the corresponding fit in these nodes (Hausdorff error on fits), and the Hausdorff error between the whole point cloud D and the resulting fragmented surface S out (Hausdorff error global). Table 3 Maple Mountain. The Hausdorff errors and η are given with respect to the unit cube. PLANE QUAD η Hausdorff error Hausdorff error Hausdorff error Hausdorff error on fits global on fits global Notice that in this example, the global Hausdorff error is sometimes smaller than the local Hausdorff error. The reason for this is that the point cloud to surface distance, as well as the surface to point cloud distance, may improve globally by using the portions of the surface on neighboring cubes. Table 4 contains the compression results obtained from the application of our algorithm and the Rice encoder (with parameters Smoothness = s, WindowSize = 16, and T aperlevel = T reedepth). The computed Hausdorff error is the global error between the point cloud and the decoded planar fits. The compression ratios in the last two columns are the results reported by the Rice encoder without and with the additional bits needed for the encoding of the data structure switch, respectively. Table 4 Compression results, Maple Mountain. η s Encoded size Hausdorff Comp. ratio Comp. ratio (bytes) error (Rice) (Rice and extra) Figure 10 shows a comparison between the surface S out, representing the Maple Mountain and its decoded counterpart, after the Rice encoder compression. Clearly, both Table 4 and Figure 10 demonstrate the high compression rates without sacrificing the quality of the surface.

17 PROCESSING TERRAIN POINT CLOUD DATA 17 Original surface Decoded surface Figure 10. Maple Mountain, η =0.01 η =0.06 η =0.04 η =0.03 Figure 11. Building reconstruction using PLANE for different values of η. Test 2. Our next data set (see Figures 11 12) is a portion of a building taken from the Eglin data (AFRL/MNG VEAA Data Set #1). It contains typical features present in real world urban terrain, such as windows, doors, corners, thin structures, and bushes. Resolving these structures is a challenge for any reconstruction algorithm. Note that this is a true threedimensional point cloud data, where each point coordinate is represented by a 32-bit floating point value. The denoised data contains 22,058 points and has a file size of 264,696 bytes. Tables 5 and 6 describe the statistics and results for the Building data set. Note that, as shown in Table 6, the local Hausdorff error for quadratics could be larger than the error for planes due to the fact that these fits are obtained via L 2 and not Hausdorff optimization. Compression results for this example, as well as for those that follow, are given in Table 15. Test 3. Next, we process another portion of the Eglin data (see Figures 13 14): a single light pole and a portion of a vehicle (in the upper right corner). This point cloud is a computational challenge since it contains a thin structure with no discernible interior. The denoised data consists of 5,901 points and has a file size of 70,812 bytes.

18 18 DEVORE, PETROVA, HIELSBERG, OWENS, CLACK, AND SOOD η =0.06 η =0.04 η =0.03 Figure 12. Building reconstruction using QUAD for different values of η. Table 5 Tree statistics using CLUSTER = sliding-partition method, with parameters γ =2 10, N =10, K =10, and l =20. Building η Tree #Nodes #Leaf #Basic depth in tree nodes fits #BOX #CLUSTER PLANE QUAD Table 6 Building. The Hausdorff errors and η are given with respect to the unit cube. PLANE QUAD η Hausdorff error Hausdorff error Hausdorff error Hausdorff error on fits global on fits global Tables 7 and 8 show the octree statistics and Hausdorff errors for the Light Pole data. Notice the good approximation of the original point cloud using a succinct representation. Test 4. We now consider a more complicated data set (see Figures 15 17), obtained using simulated LiDAR [7, 13]. It was created from a simulated low-altitude flight through a CAD representation of an actual Military Operations in Urban Terrain (MOUT) site. We present three pairs of images showing the point cloud and reconstructed surface from different vantage points. We have selected these views to emphasize the various urban terrain structures that are present in the data. The denoised data contains 632,448 points and has a file size of 7,589,376 bytes. Tables 9 and 10 show the octree statistics and Hausdorff errors for the MOUT data. Let us note that, in this example and some of the ones that follow, the global Hausdorff distance is larger than the local distance. The reason for this is the appearance of flagged cubes which

20 20 DEVORE, PETROVA, HIELSBERG, OWENS, CLACK, AND SOOD Point Cloud Reconstruction using PLANE, η =0.002 Figure 15. MOUT, View 1. Point Cloud Reconstruction using PLANE, η =0.002 Figure 16. MOUT, View 2. contain points but do not have a local surface fit. The points in these flagged cubes have to be included when computing the global distance but, of course, were not included in the computation of the local distance. Test 5. We now consider a larger portion of the Eglin data set containing both thin structures and buildings; see Figure 18. The purpose of this test is to demonstrate that no special tuning of the algorithm is needed to handle both thin structures and buildings. The denoised data contains 382,143 points and has a file size of 4,585,716 bytes. Tables 11 and 12 show the octree statistics and Hausdorff errors for the Eglin1 data. Test 6. We now consider a second portion of the Eglin data set consisting of several portions of multiple buildings; see Figure 19. The denoised data contains 192,021 points and has a file

Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,

MA 323 Geometric Modelling Course Notes: Day 02 Model Construction Problem David L. Finn November 30th, 2004 In the next few days, we will introduce some of the basic problems in geometric modelling, and

Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit

Chapter 5 Clustering & Visualization Clustering in high-dimensional databases is an important problem and there are a number of different clustering paradigms which are applicable to high-dimensional data.

SCOPE This course is divided into two semesters of study (A & B) comprised of five units each. Each unit teaches concepts and strategies recommended for intermediate algebra students. The first half of

.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

Surface Reconstruction from a Point Cloud with Normals Landon Boyd and Massih Khorvash Department of Computer Science University of British Columbia,2366 Main Mall Vancouver, BC, V6T1Z4, Canada {blandon,khorvash}@cs.ubc.ca

The Calculus of Functions of Several Variables Section. Introduction to R n Calculus is the study of functional relationships and how related quantities change with each other. In your first exposure to

Development and Evaluation of Point Cloud Compression for the Institute for Media Technology, TUM, Germany May 12, 2011 Motivation Point Cloud Stream Compression Network Point Cloud Stream Decompression

International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347 5161 2015 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Research Article Rahul

Academic Content Standards Grade Eight Ohio Pre-Algebra 2008 STANDARDS Number, Number Sense and Operations Standard Number and Number Systems 1. Use scientific notation to express large numbers and small

Terms and Definitions Absolute Value the magnitude of a number, or the distance from 0 on a real number line Additive Property of Area the process of finding an the area of a shape by totaling the areas

VISUAL ALGEBRA FOR COLLEGE STUDENTS Laurie J. Burton Western Oregon University VISUAL ALGEBRA FOR COLLEGE STUDENTS TABLE OF CONTENTS Welcome and Introduction 1 Chapter 1: INTEGERS AND INTEGER OPERATIONS

Medical Information Management & Mining You Chen Jan,15, 2013 You.chen@vanderbilt.edu 1 Trees Building Materials Trees cannot be used to build a house directly. How can we transform trees to building materials?

Math Review for the Quantitative Reasoning Measure of the GRE revised General Test www.ets.org Overview This Math Review will familiarize you with the mathematical skills and concepts that are important

DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin

A Short Introduction to Computer Graphics Frédo Durand MIT Laboratory for Computer Science 1 Introduction Chapter I: Basics Although computer graphics is a vast field that encompasses almost any graphical

Infinite Algebra 1 Kuta Software LLC Common Core Alignment Software version 2.05 Last revised July 2015 Infinite Algebra 1 supports the teaching of the Common Core State Standards listed below. High School

11 Vectors and the Geometry of Space 11.1 Vectors in the Plane Copyright Cengage Learning. All rights reserved. Copyright Cengage Learning. All rights reserved. 2 Objectives! Write the component form of

Chapter 1 DEGREE OF A CURVE Road Map The idea of degree is a fundamental concept, which will take us several chapters to explore in depth. We begin by explaining what an algebraic curve is, and offer two

This document is designed to help North Carolina educators teach the Common Core (Standard Course of Study). NCDPI staff are continually updating and improving these tools to better serve teachers. Algebra

CHAPTER 4 CURVES 4.1 Introduction In order to understand the significance of curves, we should look into the types of model representations that are used in geometric modeling. Curves play a very significant

Interpolating and Approximating Implicit Surfaces from Polygon Soup 1. Briefly summarize the paper s contributions. Does it address a new problem? Does it present a new approach? Does it show new types

Big Data: Rethinking Text Visualization Dr. Anton Heijs anton.heijs@treparel.com Treparel April 8, 2013 Abstract In this white paper we discuss text visualization approaches and how these are important

Chapter 8 Information, Entropy, and Coding 8. The Need for Data Compression To motivate the material in this chapter, we first consider various data sources and some estimates for the amount of data associated

CHAPTER 2 The Basics of FEA Procedure 2.1 Introduction This chapter discusses the spring element, especially for the purpose of introducing various concepts involved in use of the FEA technique. A spring

An Optical Sudoku Solver Martin Byröd February 12, 07 Abstract In this report, a vision-based sudoku solver is described. The solver is capable of solving a sudoku directly from a photograph taken with

2.3 Scheduling jobs on identical parallel machines There are jobs to be processed, and there are identical machines (running in parallel) to which each job may be assigned Each job = 1,,, must be processed

Face Recognition using Principle Component Analysis Kyungnam Kim Department of Computer Science University of Maryland, College Park MD 20742, USA Summary This is the summary of the basic idea about PCA

Numerical Analysis Lecture Notes Peter J. Olver 5. Inner Products and Norms The norm of a vector is a measure of its size. Besides the familiar Euclidean norm based on the dot product, there are a number

Mesh Discretization Error and Criteria for Accuracy of Finite Element Solutions Chandresh Shah Cummins, Inc. Abstract Any finite element analysis performed by an engineer is subject to several types of

7 Getting Started 7.1 Construction of a Simple Diagram Scicos contains a graphical editor that can be used to construct block diagram models of dynamical systems. The blocks can come from various palettes

Understanding Basic Calculus S.K. Chung Dedicated to all the people who have helped me in my life. i Preface This book is a revised and expanded version of the lecture notes for Basic Calculus and other

Chapter 3 Data Storage Objectives After studying this chapter, students should be able to: List five different data types used in a computer. Describe how integers are stored in a computer. Describe how

Approximation Algorithms Chapter Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of