What's the best way to store 3D point cloud data, optimising for the time it takes to find all the points in a sphere of 3D space, and also for the time it takes to insert new data points into the data set?

Background:

I have a cloud of 3D point data (with each point also including some further attached data), and need to be able to query it to quickly find all the points within a user-specified sphere of 3D space, as well as to quickly add points to the data set.

My current implementation uses a naive linear array of data points, but unsurprisingly, that's not scaling well as my data set grows.

My priority is speed, both for finding points in the cloud and for inserting new points into the cloud (I'd really like to avoid needing to rebalance trees after every insertion, if I can). I never delete points from the data set, so speed for point removal isn't important at all. Memory usage also isn't important -- I'm quite happy to throw memory at this problem, if it'll speed things up!

To clarify, are the points themselves the data? As in, (3,5,7) might exist, but (3,4,8) does not? Are you just trying to find what points exist within the sphere? Or does every point exist and have some data associated with it and you need to figure out which ones to query? I don't think I'll have a good answer for you, but that question will probably make a difference as to the data structure and algorithm.
– kylben, Oct 8 '11 at 7:16

That's correct. The points are the data -- (3,5,7) might exist in the point cloud, while (3,4,8) might not. (There's other data associated with each point as well. That's not really relevant to the question, except that it's the reason I need to be able to retrieve all points within range, rather than just the closest.)
– Trevor Powell, Oct 8 '11 at 9:22

3 Answers

Look into binary space partitioning (BSP) trees. Bonus: many databases offer some form of spatial indexing, which means queries like 'find all points within distance D of a point' (equivalent to a sphere) can be very fast. In other words, there's probably no need to implement this yourself.
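As one possible illustration of the BSP idea, here is a minimal sketch of a k-d tree, which is a BSP tree restricted to axis-aligned splitting planes. This is a swapped-in, simplified variant rather than a general BSP implementation, and the names `KDTree`, `insert` and `query_sphere` are made up for the example. Inserts never rebalance, so the tree can degrade if points arrive in sorted order.

```python
class KDNode:
    __slots__ = ("point", "payload", "left", "right")

    def __init__(self, point, payload):
        self.point = point        # (x, y, z)
        self.payload = payload    # any extra data attached to the point
        self.left = None          # points with a smaller coordinate on this node's axis
        self.right = None         # points with an equal or larger coordinate


class KDTree:
    """Hypothetical unbalanced k-d tree: cheap inserts, pruned sphere queries."""

    def __init__(self):
        self.root = None

    def insert(self, point, payload=None):
        new = KDNode(point, payload)
        if self.root is None:
            self.root = new
            return
        node, depth = self.root, 0
        while True:
            axis = depth % 3  # cycle through x, y, z as we descend
            side = "left" if point[axis] < node.point[axis] else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, new)  # attach as a leaf; no rebalancing
                return
            node, depth = child, depth + 1

    def query_sphere(self, centre, radius):
        found, stack = [], [(self.root, 0)]
        while stack:
            node, depth = stack.pop()
            if node is None:
                continue
            if sum((node.point[a] - centre[a]) ** 2 for a in range(3)) <= radius * radius:
                found.append((node.point, node.payload))
            axis = depth % 3
            delta = centre[axis] - node.point[axis]
            if delta - radius < 0:    # sphere reaches below the splitting plane
                stack.append((node.left, depth + 1))
            if delta + radius >= 0:   # sphere reaches the plane or above it
                stack.append((node.right, depth + 1))
        return found
```

Each query skips any subtree whose half-space the sphere cannot reach, which is where the saving over a linear scan comes from.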

An octree is a tree data structure in which each internal node has
exactly eight children. Octrees are most often used to partition a
three dimensional space by recursively subdividing it into eight
octants. Octrees are the three-dimensional analog of quadtrees. The
name is formed from oct + tree, and normally written "octree", not
"octtree". Octrees are often used in 3D graphics and 3D game engines.
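To make the octree description concrete for the sphere-query use case, here is a minimal sketch of a point octree. It assumes every point lies inside the root cube, and the names (`Octree`, `capacity`, `query_sphere`) are illustrative only. A leaf splits into eight octants when it overflows, so an insert only touches one path through the tree and no rebalancing is needed.

```python
class Octree:
    """Hypothetical point octree over an axis-aligned cube."""

    def __init__(self, center, half_size, capacity=8, min_half_size=1e-9):
        self.center = center                # (x, y, z) of the cube's centre
        self.half_size = half_size          # half the cube's edge length
        self.capacity = capacity            # max points in a leaf before it splits
        self.min_half_size = min_half_size  # stop splitting below this size
        self.points = []                    # (position, payload) pairs, leaves only
        self.children = None                # list of 8 child nodes once split

    def _child_index(self, p):
        cx, cy, cz = self.center
        # One bit per axis: bit 0 for x, bit 1 for y, bit 2 for z.
        return (p[0] >= cx) | ((p[1] >= cy) << 1) | ((p[2] >= cz) << 2)

    def insert(self, p, payload=None):
        node = self
        while node.children is not None:    # walk down to the leaf containing p
            node = node.children[node._child_index(p)]
        node.points.append((p, payload))
        if len(node.points) > node.capacity and node.half_size > node.min_half_size:
            node._split()

    def _split(self):
        h = self.half_size / 2
        cx, cy, cz = self.center
        self.children = [
            Octree((cx + (h if i & 1 else -h),
                    cy + (h if i & 2 else -h),
                    cz + (h if i & 4 else -h)),
                   h, self.capacity, self.min_half_size)
            for i in range(8)
        ]
        pts, self.points = self.points, []
        for p, payload in pts:              # push existing points down one level
            self.children[self._child_index(p)].insert(p, payload)

    def query_sphere(self, c, r):
        found, stack = [], [self]
        while stack:
            node = stack.pop()
            # Squared distance from the sphere centre to this node's cube.
            d2 = 0.0
            for a in range(3):
                d = abs(c[a] - node.center[a]) - node.half_size
                if d > 0:
                    d2 += d * d
            if d2 > r * r:
                continue                    # the cube cannot contain any hit
            if node.children is not None:
                stack.extend(node.children)
            else:
                found.extend((p, data) for p, data in node.points
                             if sum((p[a] - c[a]) ** 2 for a in range(3)) <= r * r)
        return found
```

Empty octants cost only an empty leaf node, so strongly clustered data does not allocate storage for the empty regions in between.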

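Computer languages translate multi-dimensional arrays to single-dimensional arrays so that elements can be addressed efficiently. For example (assuming row-major order and, in this small example, an array whose middle dimension has size 1 and whose last dimension has size 3; the base address is not important in your case, so assume it is zero):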

A(0,0,0) has address base+(single dim index=0)

A(0,0,1) has address base+(single dim index=1)

A(0,0,2) has address base+(single dim index=2)

A(1,0,0) has address base+(single dim index=3)
...
The single dim index value can be calculated from the array x, y, z values above using a formula given in references below.

So, given values x, y, z, you can calculate f(x,y,z) = single dim index.

You could use this method to store the data in a dictionary structure (or similar) where the key would be f(x,y,z).

Also, given f(x,y,z) as a value (say 3), you could calculate x,y,z to be 1,0,0.
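A rough sketch of this mapping and its inverse, assuming row-major order and a grid of known size `(nx, ny, nz)`. The function and class names here are made up for illustration. With `ny = 1` and `nz = 3` the numbers match the example above: (1,0,0) maps to 3 and back. The `SparseGrid` class is one possible way to use f(x,y,z) as a dictionary key: it assumes non-negative coordinates inside the grid, buckets points into cubic cells, and only visits cells that overlap the query sphere's bounding box.

```python
from collections import defaultdict


def flatten(x, y, z, ny, nz):
    """Row-major single-dimension index of cell (x, y, z)."""
    return (x * ny + y) * nz + z


def unflatten(index, ny, nz):
    """Inverse of flatten: recover (x, y, z) from the single-dimension index."""
    x, rem = divmod(index, ny * nz)
    y, z = divmod(rem, nz)
    return x, y, z


class SparseGrid:
    """Hypothetical grid of cubic cells stored in a dict keyed by flatten()."""

    def __init__(self, cell_size, dims):
        self.cell_size = cell_size
        self.dims = dims                  # (nx, ny, nz) cells, starting at the origin
        self.cells = defaultdict(list)    # only non-empty cells take up memory

    def _cell(self, p):
        return tuple(int(p[a] // self.cell_size) for a in range(3))

    def insert(self, p, payload=None):
        # Assumes p lies inside the grid; out-of-range points would collide.
        x, y, z = self._cell(p)
        _, ny, nz = self.dims
        self.cells[flatten(x, y, z, ny, nz)].append((p, payload))

    def query_sphere(self, c, r):
        nx, ny, nz = self.dims
        lo = self._cell([c[a] - r for a in range(3)])
        hi = self._cell([c[a] + r for a in range(3)])
        found = []
        for x in range(max(lo[0], 0), min(hi[0], nx - 1) + 1):
            for y in range(max(lo[1], 0), min(hi[1], ny - 1) + 1):
                for z in range(max(lo[2], 0), min(hi[2], nz - 1) + 1):
                    key = flatten(x, y, z, ny, nz)
                    for p, payload in self.cells.get(key, ()):
                        if sum((p[a] - c[a]) ** 2 for a in range(3)) <= r * r:
                            found.append((p, payload))
        return found
```

Insertion is a single dictionary append, so it stays cheap as the data set grows; query cost depends on how many cells the sphere's bounding box covers, so the cell size should be chosen relative to the typical query radius.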

The technique is simpler than devising a hashing function, requires no search, and preserves order, so it may help you find other nearby elements fast. I am not quite clear on your definition of a sphere: do you mean a geometric sphere?

My first attempt at optimising this spatial lookup was dividing space into a 3D grid of boxes, and storing each point in just one box. I stored these 3D boxes in a linear array in exactly the manner you mention here, but since my data was clustered very strongly, that approach ended up using a silly amount of memory in storing hundreds of thousands of empty boxes. Moving to an octree structure which could dynamically expand as my data set grew turned out to be a better trade-off of memory vs. performance, in my specific case.
– Trevor Powell, Dec 9 '11 at 5:44

@Trevor Powell, thank you very much for your feedback. I am glad you got it solved eventually!
– Emmad Kareem, Dec 9 '11 at 8:48