If you were to attempt to archive your scan project(s) for future generations, what would be important? What format would you archive your data in? What products would you archive?

About a year ago, I authored the Guides to Good Practice for Laser Scanning http://guides.archaeologydataservice.ac.uk/g2gp/LaserScan_Toc (predominantly for Archaeology/Heritage applications) as a first attempt to answer some of these questions. These guides are based firmly on the previous specifications set forth by English Heritage and are meant to encompass long and short range scanning applications. You will see the suggestions for metadata are quite broad because 1) the document should ultimately be divided into short range (object scanning) and mid/long range (building/environment scanning) and 2) until major software vendors provide the ability to export reports that summarize major processing operations/parameters, the amount of information that we can provide pertaining to data processing and the creation of derived products is quite limited.

With these issues set aside, I have been assigned to "eat my own dog food" and deposit several large scan projects into the national archive, tDAR http://www.tdar.org/.

The REASON why I am writing today is to discuss file formats. I will be working predominantly with spherical-based scanners (the Z+F Imager 5006i and the Optech ILRIS 3D). Initially, I advised depositing these types of datasets in the original native format as well as in ASCII TXT format. However, as with all archives, space is an issue and, more importantly, costs $$$. So what should I do?

By depositing spherical datasets (I'm talking only about original scan files at this point) as ASCII TXT files, are we losing too much information? Is there an alternative solution that would ensure the longevity of these datasets? I am aware that there is a strong push for the E57 format (I'm also aware of LAS) - but there is no guarantee that these formats will be alive in 100 years.

I know there is no easy answer to this question, but I am hoping to at least open a discussion to see what others' thoughts are. These are issues that I feel we as a community need to address/discuss.

angeliapayne wrote: I am aware that there is a strong push for the E57 format (I'm also aware of LAS) - but there is no guarantee that these formats will be alive in 100 years.

My first choice would probably be E57, as this is exactly the type of use case it was intended for. Both E57 and LAS are documented and have open-source implementations, so you can be as confident as reasonably possible that the files will still be readable on whatever computers we are using 100 years from now. I would tend to stay away from LAS because of its inability to retain the spherical data structure in a standard way. Although somewhat unproven, in my opinion E57 is the best candidate we have for a useful vendor-neutral file format, and getting it supported by these types of archives is a good way to ensure its adoption.

I'm not sure about Z+F but, unfortunately, going from Optech to E57 could present some practical challenges, as I haven't seen out-of-the-box support from them yet. It's certainly doable but could require some effort and intermediate formats.
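
On the point about retaining the spherical data structure, here is a minimal sketch of what gets lost in a flat point list. It assumes a scanner that samples on a regular grid of azimuth and elevation angles; the angle convention, the tiny grid, and the function names are all assumptions for illustration and are not taken from the E57 or LAS specifications.

```python
import math

def spherical_to_cartesian(rng, azimuth, elevation):
    """Convert range/azimuth/elevation (radians) to x, y, z."""
    horizontal = rng * math.cos(elevation)
    return (horizontal * math.cos(azimuth),
            horizontal * math.sin(azimuth),
            rng * math.sin(elevation))

def cartesian_to_spherical(x, y, z):
    """Recover range/azimuth/elevation (radians) from x, y, z."""
    rng = math.sqrt(x * x + y * y + z * z)
    return rng, math.atan2(y, x), math.asin(z / rng)

# Hypothetical tiny scan: 2 rows (elevation) by 3 columns (azimuth), range 5 m.
grid = {}
for row, elevation in enumerate([math.radians(-10), math.radians(10)]):
    for col, azimuth in enumerate([math.radians(a) for a in (0, 45, 90)]):
        grid[(row, col)] = spherical_to_cartesian(5.0, azimuth, elevation)

# A flat XYZ list keeps the coordinates but drops the (row, col) addresses.
flat_xyz = list(grid.values())

# Angles and range can be recomputed from XYZ for any single point.
rng, az, el = cartesian_to_spherical(*grid[(1, 1)])
print(rng, math.degrees(az), math.degrees(el))
```

The angles can be recomputed from XYZ, but which points were neighbours in the scan grid, and which grid cells had no return at all, can only be recovered if the grid address travels with each point.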

Thank you both for your replies. With a bit more research, we are leaning towards archiving data in the E57 format. I think it is the best fit for this situation (and certainly much better than ASCII), and I agree that a project such as this could help promote the format even more. I have spoken with PolyWorks technical support (PolyWorks is used to process the Optech data), and support for the E57 format is on the future release wishlist. While PolyWorks IMAlign does offer a PTX export (which can be converted to E57 in other applications such as Cyclone), it appears that the PTX export only exports the interpolated Optech data, which in our case is not really useful. Therefore, the only option at this point is to parse to a PTX file directly from the Optech parsing program (and then convert to E57 using Cyclone or possibly FME). This will give us an original, unaltered copy of an Optech scan that should preserve the spherical coordinates.

While this process is a headache now, hopefully more software support for open formats in the future will make this ALL a lot easier! Thanks for everyone's input. If you have any additional thoughts on the matter, please continue to post them, as this topic will continue to evolve.
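
For anyone wanting to sanity-check a PTX file before converting it, here is a minimal reader sketch. It assumes the PTX layout as it is commonly described: a 10-line header (column count, row count, scanner position, three scanner axes, a 4x4 matrix) followed by one "x y z intensity" line per grid cell in column-major order. The layout, the point order, and the function names should all be treated as assumptions to verify against your own files; optional RGB fields are ignored.

```python
import io

def read_ptx_header(stream):
    """Read the 10 header lines of one PTX scan block from a text stream."""
    cols = int(stream.readline())
    rows = int(stream.readline())
    position = [float(v) for v in stream.readline().split()]
    axes = [[float(v) for v in stream.readline().split()] for _ in range(3)]
    matrix = [[float(v) for v in stream.readline().split()] for _ in range(4)]
    return {"cols": cols, "rows": rows, "position": position,
            "axes": axes, "matrix": matrix}

def read_ptx_points(stream, cols, rows):
    """Yield (col, row, x, y, z, intensity); column-major order is assumed."""
    for index in range(cols * rows):
        fields = stream.readline().split()
        x, y, z, intensity = (float(v) for v in fields[:4])
        yield index // rows, index % rows, x, y, z, intensity

# Hypothetical 2 x 2 scan with an identity matrix, for demonstration only.
sample = io.StringIO(
    "2\n2\n"
    "0 0 0\n"
    "1 0 0\n0 1 0\n0 0 1\n"
    "1 0 0 0\n0 1 0 0\n0 0 1 0\n0 0 0 1\n"
    "1.0 0.0 0.0 0.5\n"
    "1.0 0.0 0.1 0.6\n"
    "0.9 0.1 0.0 0.4\n"
    "0 0 0 0.5\n"  # all-zero XYZ, assumed here to mark a cell with no return
)
header = read_ptx_header(sample)
points = list(read_ptx_points(sample, header["cols"], header["rows"]))
print(header["cols"], header["rows"], len(points))
```

Because every grid cell gets a line, including cells with no return, the column and row of each point can be rebuilt from its position in the file, which is what keeps the scan structure intact.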

angeliapayne wrote: I have spoken with PolyWorks technical support (used to process the Optech data) and the support of the e57 format is on the future release wishlist.

Yeah, it's too bad this didn't make it into version 12.

angeliapayne wrote: Therefore, the only option at this point is to parse to a PTX file directly from the Optech parsing program (and then convert to e57 using Cyclone or possibly FME). This will give us an original, unaltered copy of an Optech scan that should preserve the spherical coordinates.

Of course, this wouldn't include the transformation matrices you've calculated in PolyWorks during the alignment, unless you want to add them to the PTX files in a separate step.

Another possibility to think about might be writing your own pf2e57 converter. PolyWorks comes with a library for reading their pf files, although the format (described in their documentation) is so simple you probably don't even need it. Combine that with the writing functions of libe57 and it might not be too bad of a job. I've been thinking about doing this myself but so far haven't had the need, or time, to pursue it.

Another possibility to think about might be writing your own pf2e57 converter. PolyWorks comes with a library for reading their pf files, although the format (described in their documentation) is so simple you probably don't even need it. Combine that with the writing functions of libe57 and it might not be too bad of a job. I've been thinking about doing this myself but so far haven't had the need, or time, to pursue it.

But once PW imports and grids the pf file, is it still a pf file? In other words, if I convert pf2e57, wouldn't I want to do that BEFORE I import the pf file into IMAlign (and therefore before any scan alignment)?

Of course, this wouldn't include the transformation matrices you've calculated in PolyWorks during the alignment, unless you want to add them to the PTX files in a separate step.

I was thinking I might export the transformation matrices from the alignment and provide them with (not necessarily apply them to) each scan. With Optech data in particular, where you often obtain a lot of extraneous data in the distance that you edit and delete, it is difficult to say whether you should archive the original, unedited, unregistered scans (straight from Parser) or the cleaned, registered version of the scan (out of IMAlign). I guess if we are trying to ARCHIVE ORIGINAL SCANS, it should be the former...

Ugh, so many theoretical issues to consider. I do appreciate your dialog on this though.
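
One way to provide a matrix with a scan without applying it is a small sidecar record stored next to the untouched scan file. This is only a sketch of the idea: the JSON layout, the field names, the file name, and the matrix convention (translation in the last column) are all hypothetical and not part of any archive or vendor specification.

```python
import json

def make_sidecar(scan_filename, matrix, applied=False):
    """Describe a registration matrix that accompanies, but does not alter, a scan."""
    if len(matrix) != 4 or any(len(row) != 4 for row in matrix):
        raise ValueError("expected a 4x4 matrix")
    return {"scan": scan_filename, "matrix_applied": applied, "matrix": matrix}

# Hypothetical alignment result: a pure translation of (2.5, -1.0, 0.25).
alignment = [
    [1.0, 0.0, 0.0, 2.5],
    [0.0, 1.0, 0.0, -1.0],
    [0.0, 0.0, 1.0, 0.25],
    [0.0, 0.0, 0.0, 1.0],
]

record = make_sidecar("scan_001.pf", alignment)  # hypothetical file name
text = json.dumps(record, indent=2)              # what would be written to disk
restored = json.loads(text)                      # what a future reader would load
print(restored["scan"], restored["matrix_applied"])
```

Recording whether the matrix has already been applied lets the original, unregistered scan be archived as is while still documenting the registration result.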

angeliapayne wrote: But once PW imports and grids the Pf file, is it still a Pf file? In other words, if I convert pf2e57, wouldn't I want to do that BEFORE I import the Pf to IMAlign (and therefore before any scan alignment)?

pf files can contain either interpolated grids, which is what you work with in IMAlign, or raw scan points, which is what the Optech parser generates and what you work with if you import those pf files directly into IMSurvey or IMInspect. Appendix E in the PolyWorks Reference manual completely describes the format.

Our workflow typically involves importing the original pf files into IMAlign, doing the registration, and exporting the alignment matrices. The original pf files are also imported into a separate IMSurvey project where any editing occurs. The alignment matrices are applied to the scans in IMSurvey to transfer the registration results from IMAlign. Therefore, I would probably write a plugin for IMSurvey that exports the pf files (edited or raw) with associated transformation matrices to an E57 file.
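
The step of applying an exported alignment matrix to raw scan points can be sketched in a few lines. This is a generic illustration, not PolyWorks code: it assumes a 4x4 homogeneous matrix with the translation in the last column, and the matrix values and points are made up. The layout PolyWorks actually exports would need to be checked before reusing anything like this.

```python
def apply_transform(matrix, point):
    """Apply a 4x4 homogeneous matrix (translation in last column) to (x, y, z)."""
    x, y, z = point
    return tuple(
        matrix[i][0] * x + matrix[i][1] * y + matrix[i][2] * z + matrix[i][3]
        for i in range(3)
    )

# Hypothetical alignment: rotate 90 degrees about Z, then shift 10 units along X.
alignment = [
    [0.0, -1.0, 0.0, 10.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

raw_points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.5)]  # scanner-frame coordinates
registered = [apply_transform(alignment, p) for p in raw_points]
print(registered)
```

Keeping the raw points and the matrix separate, and transforming only on demand, means the same raw scan can be re-registered later without touching the archived data.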