ASPRS LAS is probably the most commonly used LiDAR format, and PDAL’s support
of LAS is important for many users of the library. This tutorial describes and
demonstrates some of the capabilities the drivers provide, points out items to
be aware of when using the drivers, and hopefully provides some examples you
can use to get what you need out of the LAS drivers.

There are five LAS versions – 1.0 to 1.4. Each iteration added some
complexity to the format in terms of capabilities it supports, possible data
types it stores, and metadata. Users of LAS must balance the features they need
with the use of the data by downstream applications. While LAS support in some
form is quite widespread throughout the industry, most applications do not
support every feature of each version. PDAL works to provide many of
these features, but it is also incomplete. Specifically, PDAL doesn’t support
point formats that store waveform data.

We can use the minor_version option of writers.las to set the
version PDAL should output. The following example will write a 1.1 version LAS
file. Depending on the features you need, this may or may not be what you want.
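
A minimal sketch of such a pipeline, built in Python and serialized to JSON. The file names `input.las` and `output.las` are placeholders, not files that ship with PDAL.

```python
import json

# Placeholder file names; substitute your own data.
pipeline = {
    "pipeline": [
        "input.las",
        {
            "type": "writers.las",
            "minor_version": 1,  # write a LAS 1.1 file
            "filename": "output.las",
        },
    ]
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```

The printed JSON can be saved to a file and run with `pdal pipeline`.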

LAS 1.0 to 1.3 used GeoTIFF keys for storing coordinate system information,
while LAS 1.4 uses Well Known Text (WKT). GeoTIFF is well supported by most
software that reads LAS, but it cannot express some coordinate system
specifics. WKT is more expressive and supports
more coordinate systems than GeoTIFF, but vendor-specific variants and later
versions (WKT 2) may not be handled well.

PDAL's writers.las allows you to override or assign the coordinate
system to an explicit value if you need to. The coordinate system defined by
a file is often incorrect or missing, and you can set it with PDAL.

The following example sets the a_srs option of the writers.las to
EPSG:4326.
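
A sketch of that assignment, again generated from Python. The file names are placeholders; only the `a_srs` option matters here.

```python
import json

# Placeholder file names; a_srs assigns (does not reproject) the coordinate system.
pipeline = {
    "pipeline": [
        "input.las",
        {
            "type": "writers.las",
            "a_srs": "EPSG:4326",
            "filename": "output.las",
        },
    ]
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```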

Remember to set the offset_x, offset_y, scale_x, and
scale_y values to something appropriate if you are storing decimal
degree data in LAS files. The special value auto can be used for the
offset values, but you should set an explicit value for the scale values
to avoid storing more precision than the data actually has, which also
hurts compression with LASzip.
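
As an illustration, the writer options for decimal degree data might look like the sketch below. The scale of 0.0000001 degrees (roughly a centimeter at the equator) and the Z scale of 0.01 are assumed values chosen for the example, not recommendations for your data.

```python
import json

# Assumed scale values for illustration; pick ones that match your data's precision.
writer = {
    "type": "writers.las",
    "a_srs": "EPSG:4326",
    "scale_x": 0.0000001,  # decimal degrees
    "scale_y": 0.0000001,  # decimal degrees
    "scale_z": 0.01,
    "offset_x": "auto",
    "offset_y": "auto",
    "offset_z": "auto",
    "filename": "output.las",  # placeholder
}

pipeline = {"pipeline": ["input.las", writer]}
pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```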

Vertical coordinate control is important in LiDAR, and PDAL supports
assignment and reprojection/transform of vertical coordinates using Proj.4
and GDAL. The coordinate system description magic happens in GDAL, and you
assign a compound coordinate system (both vertical and horizontal
definitions) using the following syntax:

EPSG:4326+3855

This assignment specifies the typical EPSG:4326 horizontal coordinate system plus a vertical one
(EPSG:3855) that represents EGM08. In Well Known Text, this coordinate system is described by:

As in Assignment Example, it is common to need to reassign the coordinate
system. The following example sets both the horizontal and vertical
coordinate systems for a file: UTM Zone 15N NAD83 for the horizontal and
NAVD88 for the vertical.
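
A sketch of this compound assignment. It assumes EPSG:26915 (NAD83 / UTM zone 15N) and EPSG:5703 (NAVD88 height) are the codes you want; the file names are placeholders.

```python
import json

# Assumed EPSG codes: 26915 = NAD83 / UTM zone 15N, 5703 = NAVD88 height.
pipeline = {
    "pipeline": [
        "input.las",
        {
            "type": "writers.las",
            "a_srs": "EPSG:26915+5703",  # horizontal+vertical compound syntax
            "filename": "output.las",
        },
    ]
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```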

Any coordinate system description format supported by GDAL’s
SetFromUserInput method can be used to assign or set the coordinate system
in PDAL. This includes WKT, Proj.4 definitions, and OGC URNs. It is your
responsibility, however, to escape or massage any input data so that it is
valid JSON.

When you are transforming coordinates, you might need to set the
scale_x, scale_y, offset_x, and offset_y values to
something reasonable for your output coordinate system.
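
For example, a reprojection from a projected system to decimal degrees might be sketched as follows. The source and target EPSG codes, the scale values, and the file names are all assumptions made for illustration.

```python
import json

# Assumed coordinate systems and scales; adjust to your data.
pipeline = {
    "pipeline": [
        "input.las",  # placeholder
        {
            "type": "filters.reprojection",
            "in_srs": "EPSG:26915",  # only needed to supply or override the input SRS
            "out_srs": "EPSG:4326",
        },
        {
            "type": "writers.las",
            "scale_x": 0.0000001,  # output is in decimal degrees
            "scale_y": 0.0000001,
            "scale_z": 0.01,
            "offset_x": "auto",
            "offset_y": "auto",
            "offset_z": "auto",
            "filename": "output.las",  # placeholder
        },
    ]
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```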

Note

If the input data doesn’t specify a projection, you must specify the
in_srs option of filters.reprojection. in_srs can also
be used to override an existing spatial reference attached to the input
point set.

As each revision of LAS was released, more point formats were added. A Point
Format is the fixed set of Dimensions that a LAS file stores for each point in
the file. For any given point format, the size and composition of dimensions is
consistent across versions, but users should be aware of some minor
interpretation changes based on LAS file version. For example, a classification
value of 11 in version 1.4 indicates “Road Surface”, while that value is
reserved in version 1.1.

Because LAS files have no built-in compression, it’s important to use the
point format with the fewest fields that still holds the desired
data. For example, point format 10 uses 47 more bytes per point than point
format zero (67 bytes versus 20).
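
The cost of a point format is easy to estimate with a little arithmetic. The sketch below uses per-point record lengths transcribed from the LAS 1.4 specification (check them against the specification before relying on them); the helper name `extra_bytes_total` is hypothetical.

```python
# Base record length in bytes for each LAS point format (0 to 10),
# transcribed from the LAS 1.4 specification.
POINT_RECORD_LENGTH = {
    0: 20, 1: 28, 2: 26, 3: 34, 4: 57, 5: 63,
    6: 30, 7: 36, 8: 38, 9: 59, 10: 67,
}

def extra_bytes_total(fmt_a, fmt_b, num_points):
    """Uncompressed size difference in bytes between two point formats."""
    per_point = POINT_RECORD_LENGTH[fmt_a] - POINT_RECORD_LENGTH[fmt_b]
    return per_point * num_points

per_point = POINT_RECORD_LENGTH[10] - POINT_RECORD_LENGTH[0]
print("format 10 vs 0, bytes per point:", per_point)
print("for 100 million points:", extra_bytes_total(10, 0, 100_000_000), "bytes")
```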

If one wanted to remove the Red/Green/Blue fields from a LAS file
(one using point format 2), one could simply set the dataformat_id
option to 0. The forward option can also be set to carry forward
all possible header values from the source file to the new, smaller file.
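
A sketch of that conversion, with placeholder file names:

```python
import json

pipeline = {
    "pipeline": [
        "input.las",  # placeholder; assumed to use point format 2
        {
            "type": "writers.las",
            "dataformat_id": 0,  # point format 0 has no Red/Green/Blue fields
            "forward": "all",    # carry header values over from the source
            "filename": "output.las",  # placeholder
        },
    ]
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```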

A LAS Point Format ID defines the fixed set of Dimensions a file must
store, but software is allowed to store extra data beyond that fixed set.
This feature of the format was regularized in LAS 1.4 as something called
“extra bytes” or “extra dims”, but previous versions can
also store these extra per-point attributes.
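
One way to write such attributes is the extra_dims option of writers.las. The sketch below assumes the special value all, which asks the writer to store every dimension that is not part of the chosen point format; file names are placeholders.

```python
import json

pipeline = {
    "pipeline": [
        "input.las",  # placeholder
        {
            "type": "writers.las",
            "minor_version": 4,
            "extra_dims": "all",  # write all non-standard dimensions as extra bytes
            "filename": "output.las",  # placeholder
        },
    ]
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```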

Readers of the ASPRS LAS Specification will see there are many fields that
software is required to write, with their content mandated by various options
and configurations in the format. PDAL does not assume responsibility for
writing these fields and coercing meaning from the content to fit the
specification. It is the PDAL user’s responsibility to do so. Fields where
this might matter include:

The forward option of writers.las is the easiest way to get most of
what you might want in terms of header settings copied from an input to an
output file upon processing. Imagine the scenario of zeroing out the
classification values for a LAS file in preparation for using
filters.pmf to reassign them. In this scenario, we’d like to keep
all of the other LAS header information, such as Variable Length Records,
extent information, and format settings.
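
This scenario might be sketched as below. It assumes the value option of filters.assign (older PDAL releases used an assignment option with a different syntax instead), and the file names are placeholders.

```python
import json

pipeline = {
    "pipeline": [
        "input.las",  # placeholder
        {
            "type": "filters.assign",
            # Assumed syntax for recent PDAL releases.
            "value": "Classification = 0",
        },
        {
            "type": "writers.las",
            "forward": "all",  # keep header values and VLRs from the input
            "filename": "output.las",  # placeholder
        },
    ]
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```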

If multiple input LAS files are being written to an output file, the
forward option can only preserve values when they are the same in
all input files. If the values differ, a default will be used (as it
would if the forward option weren’t supplied). You can also set
explicit option values for the output, and these override any forwarded
data.

LAS stores coordinates as 32-bit integers. It is the user’s responsibility to
ensure that the coordinate domain required by the data in the file fits within
the 32-bit integer domain. Most coordinate values have digits to the right
of the decimal point that must be preserved for sufficient accuracy.
The scale factor, together with the offset, allows software to interpret the
stored integers as floating point values when reading the file.
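
The relationship is coordinate = stored integer × scale + offset. The following sketch shows the round trip and the range check; the function names are hypothetical and are not part of PDAL.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def to_las_int(coord, scale, offset):
    """Convert a floating point coordinate to the integer LAS would store."""
    stored = round((coord - offset) / scale)
    if not INT32_MIN <= stored <= INT32_MAX:
        raise OverflowError(
            f"{coord} does not fit in 32 bits with scale={scale}, offset={offset}"
        )
    return stored

def from_las_int(stored, scale, offset):
    """Recover the floating point coordinate from a stored integer."""
    return stored * scale + offset

x = 635619.85  # an example UTM easting in meters
stored = to_las_int(x, scale=0.01, offset=0.0)
restored = from_las_int(stored, scale=0.01, offset=0.0)
print(stored, restored)

# A scale that is too fine for the coordinate range overflows the integer domain.
try:
    to_las_int(x, scale=1e-7, offset=0.0)
    overflowed = False
except OverflowError:
    overflowed = True
print("overflow with scale 1e-7:", overflowed)
```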

When writing data to LAS, choosing an appropriate scale factor should take
into account not just the maximum precision that can be accommodated by the
format, but the actual precision of the data. Using a precision greater than
the resolution of the data collection can mislead users as to the actual
measurement precision of the data. In addition, it can lead to larger files
when writing compressed data with LASzip.

Users can allow PDAL to select scale and offset values for data with the auto
option. This can have some detrimental effects on downstream processing.
auto for scale values will use the entire 32-bit integer domain.
This maximizes the precision available to
store the data, but it reduces LASzip storage
efficiency. auto for offset calculation is just fine, however. When given
the option, choose to store ASPRS LAS data with an explicit scale for the X,
Y, and Z dimensions that represents actual expected data precision, not
artificial storage precision or maximal storage precision.
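
In pipeline terms, that advice amounts to something like the sketch below. The 0.01 scale assumes data in meters with roughly centimeter precision; the file names are placeholders.

```python
import json

# Explicit scale reflecting assumed centimeter precision; offsets left to PDAL.
writer = {"type": "writers.las", "filename": "output.las"}
for axis in ("x", "y", "z"):
    writer[f"scale_{axis}"] = 0.01
    writer[f"offset_{axis}"] = "auto"

pipeline = {"pipeline": ["input.las", writer]}
pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```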

LASzip is an open source, lossless compression technique for ASPRS LAS data.
It is supported by two different software libraries, and it can be used in both
the C/C++ and the JavaScript execution environments. LAZ support is provided
by both readers.las and writers.las. It can be enabled by
setting the compression option to laszip.
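
A sketch of a pipeline that writes compressed output. The file names are placeholders.

```python
import json

pipeline = {
    "pipeline": [
        "input.las",  # placeholder
        {
            "type": "writers.las",
            "compression": "laszip",
            "filename": "output.laz",  # placeholder
        },
    ]
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```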

Variable Length Records, or VLRs, are blobs of binary data that the LAS format
supports to allow applications to store their own data. Coordinate system
information is one type of data stored in VLRs, and many different LAS-using
applications store data and metadata with this format capability. PDAL allows
users to access VLR information, forward it along to newly written files, and
create VLRs that store processing history information.

Variable Length Records (VLRs) are how applications can insert their own data
into LAS files. Common VLR data include:

Coordinate system

Metadata

Processing history

Indexing

Note

There are VLRs that are defined by the specification, and they
have the VLR user_id of LASF_Spec or LASF_Projection.
LASF_Spec VLRs provide a description of the data beyond that
available in the header. LASF_Projection VLRs store the spatial
coordinate system of the data.

For LAS 1.0-1.3, the VLR length could be no larger than
65535 bytes. For EVLRs, stored at the end of the file in LAS 1.4, this limit
was increased to 4 GB.

You can add your own VLRs to files to store processing information or
whatever you want by providing a JSON block via the vlrs option of
writers.las that defines the user_id and data items for the VLR. The data
option must be a base64-encoded string. The data will be converted to binary
information and stored in the VLR when the file is written.
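
A sketch of building such a block from Python, where the base64 step is easy to get right. The user_id, record_id, description, and payload are made-up values for illustration.

```python
import base64
import json

payload = "this is some text"  # made-up VLR content
encoded = base64.b64encode(payload.encode("utf-8")).decode("ascii")

pipeline = {
    "pipeline": [
        "input.las",  # placeholder
        {
            "type": "writers.las",
            "vlrs": [
                {
                    "description": "A description under 32 bytes",
                    "record_id": 42,          # made-up value
                    "user_id": "my_user_id",  # made-up value
                    "data": encoded,
                }
            ],
            "filename": "output.las",  # placeholder
        },
    ]
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```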

The writers.las driver supports an option, pdal_metadata, that writes
two PDAL VLRs to LAS files. The first is the equivalent of info’s
--metadata output. The second is a dump of the --pipeline serialization
option that describes all stages and options of the pipeline that created
the file. These two VLRs may be useful in tracking down the processing
history of data. They allow you to determine which version of PDAL wrote a
file and what filter options were set when it was written, and they give you
the ability to store metadata and other
information via pipeline user_data from your own applications.
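
Enabling the option might look like this sketch (placeholder file names):

```python
import json

pipeline = {
    "pipeline": [
        "input.las",  # placeholder
        {
            "type": "writers.las",
            "pdal_metadata": True,  # serialized as JSON true
            "filename": "output.las",  # placeholder
        },
    ]
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```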