Global Options

-v

-verbose

Specifies that increased logging should be enabled.

-v x

-verbose x

x is a positive integer specifying the amount of increased logging; 0 is equivalent to the -v option alone.

-q

-quiet

Specifies that reduced logging should be enabled.

-q x

-quiet x

x is a positive integer specifying the amount of reduced logging; 0 is equivalent to the -q option alone.

-p <plugin_class>

-plugin <plugin_class>

Allows an external plugin to be loaded. <plugin_class> is the name of a class implementing the com.bretth.osmosis.core.plugin.PluginLoader interface. This option may be specified multiple times to load multiple plugins.

Default Arguments

Some tasks can accept un-named or "default" arguments. In the task descriptions, such an argument name is followed by "(default)".

For example, the --read-xml task has a file argument which may be unnamed. The following two command lines are equivalent.

osmosis --read-xml file=myfile.osm --write-null

osmosis --read-xml myfile.osm --write-null

Built-In Tasks

All tasks default to 0.6 versions from release 0.31 onwards.

0.6 tasks were first introduced in release 0.30. They can be explicitly specified by adding a "-0.6" suffix.
0.5 tasks can be specified by adding a "-0.5" suffix.
0.4 tasks were dropped as of version 0.22.

API Database Tasks

These tasks are to be used with the schema that backs the OSM API. They support the 0.6 database only, and work with both the PostgreSQL and MySQL variants. PostgreSQL is highly recommended because it receives better testing.
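For example, the current contents of an API database might be dumped to an XML file using the --read-apidb-current task described below; the connection argument names shown here (host, database, user, password) are assumed, and the values are placeholders for your own connection details:

osmosis --read-apidb-current host=localhost database=openstreetmap user=osm password=secret --write-xml file=dump.osm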

--read-apidb (--rd)

Reads the contents of an API database as of a specific point in time.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

readAllUsers

If set to yes, the user public edit flag will be ignored and user information will be attached to every entity.

yes, no

no

snapshotInstant

Defines the point in time for which to produce a data snapshot.

format is "yyyy-MM-dd_HH:mm:ss"

(now)

--read-apidb-current (--rdcur)

Reads the current contents of an API database. Note that this task cannot be used as a starting point for replication because it does not produce a consistent snapshot.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

readAllUsers

If set to yes, the user public edit flag will be ignored and user information will be attached to every entity.

yes, no

no

--write-apidb (--wd)

Writes the contents of an entity stream to an API database.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

lockTables

If yes is specified, tables will be locked during the import. This provides measurable performance improvements but prevents concurrent queries.

yes, no

yes

populateCurrentTables

If yes is specified, the current tables will be populated after the initial history table population. If only history tables are required, setting this option to no reduces the import time by approximately 80%.

yes, no

yes

--read-apidb-change (--rdc)

Reads the changes that occurred within an API database during a specific time interval.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

readAllUsers

If set to yes, the user public edit flag will be ignored and user information will be attached to every entity.

yes, no

no

intervalBegin

Defines the beginning of the interval for which to produce a change set.

format is "yyyy-MM-dd_HH:mm:ss"

(1970)

intervalEnd

Defines the end of the interval for which to produce a change set.

format is "yyyy-MM-dd_HH:mm:ss"

(now)

readFullHistory

0.6 only. If set to yes, complete history for the specified time interval is produced instead of a single change per entity modified in that interval. This is not useful for standard changesets; it is useful if a database replica with full history is being produced. Change files produced using this option will likely not be processable by most tools supporting the *.osc file format.

yes, no

no

--write-apidb-change (--wdc)

Writes the contents of a change stream to an API database.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

populateCurrentTables

If yes is specified, the current tables will be populated after the initial history table population. This is useful if only history tables were populated during import.

yes, no

yes

--truncate-apidb (--td)

Truncates (empties) all tables in an API database.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

MySQL Tasks

The MySQL tasks are to be used with the MySQL schema that backs the OSM API. Note that there are no 0.6 versions of these tasks; they have been replaced by the "apidb" tasks.

--read-mysql (--rm)

Reads the contents of a MySQL database as of a specific point in time.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

readAllUsers

If set to yes, the user public edit flag will be ignored and user information will be attached to every entity.

yes, no

no

snapshotInstant

Defines the point in time for which to produce a data snapshot.

format is "yyyy-MM-dd_HH:mm:ss"

(now)

--read-mysql-current (--rmcur)

Reads the current contents of a MySQL database. Note that this task cannot be used as a starting point for replication because it does not produce a consistent snapshot.

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

readAllUsers

If set to yes, the user public edit flag will be ignored and user information will be attached to every entity.

yes, no

no

--write-mysql (--wm)

Writes the contents of an entity stream to a MySQL database.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

lockTables

If yes is specified, tables will be locked during the import. This provides measurable performance improvements but prevents concurrent queries.

yes, no

yes

populateCurrentTables

If yes is specified, the current tables will be populated after the initial history table population. If only history tables are required, setting this option to no reduces the import time by approximately 80%.

yes, no

yes

--read-mysql-change (--rmc)

Reads the changes that occurred within a MySQL database during a specific time interval.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

readAllUsers

If set to yes, the user public edit flag will be ignored and user information will be attached to every entity.

yes, no

no

intervalBegin

Defines the beginning of the interval for which to produce a change set.

format is "yyyy-MM-dd_HH:mm:ss"

(1970)

intervalEnd

Defines the end of the interval for which to produce a change set.

format is "yyyy-MM-dd_HH:mm:ss"

(now)

readFullHistory

0.6 only. If set to yes, complete history for the specified time interval is produced instead of a single change per entity modified in that interval. This is not useful for standard changesets; it is useful if a database replica with full history is being produced. Change files produced using this option will likely not be processable by most tools supporting the *.osc file format.

yes, no

no

--write-mysql-change (--wmc)

Writes the contents of a change stream to a MySQL database.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

populateCurrentTables

If yes is specified, the current tables will be populated after the initial history table population. This is useful if only history tables were populated during import.

yes, no

yes

--truncate-mysql (--tm)

Truncates (empties) all tables in a MySQL database.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

XML Tasks

The XML tasks are used to read and write "osm" data files and "osc" change files.
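For example, the following command line converts a bzip2-compressed file into a gzip-compressed copy; the compression methods are detected automatically from the file suffixes, and the file names are illustrative:

osmosis --read-xml file=planet.osm.bz2 --write-xml file=planet.osm.gz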

--read-xml (--rx)

Reads the current contents of an OSM XML file.

Pipe

Description

outPipe.0

Produces an entity stream.

Option

Description

Valid Values

Default Value

file (default)

The name of the osm file to be read, "-" means STDIN.

dump.osm

enableDateParsing

If set to yes, the dates in the osm xml file will be parsed; otherwise all dates will be set to a single time approximately equal to application startup. Setting this to no is only useful if the input file doesn't contain timestamps. Disabling date parsing used to improve performance, but date parsing now incurs only a low overhead.

yes, no

yes

compressionMethod

Specifies the compression method that has been used to compress the file. In most cases this isn't required because the compression method will be automatically determined from the file name (*.gz=gzip, *.bz2=bzip2).

none, gzip, bzip2

none

--fast-read-xml (no short option available)

0.6 only. As per the --read-xml task, but using a StAX XML parser instead of SAX for improved performance. This has undergone solid testing and should be reliable, but not all XML processing tasks have been re-written to use the new implementation, so it is not yet the default.

--write-xml (--wx)

Writes data to an OSM XML file.

Pipe

Description

inPipe.0

Consumes an entity stream.

Option

Description

Valid Values

Default Value

file (default)

The name of the osm file to be written, "-" means STDOUT.

dump.osm

compressionMethod

Specifies the compression method that has been used to compress the file. In most cases this isn't required because the compression method will be automatically determined from the file name (*.gz=gzip, *.bz2=bzip2).

none, gzip, bzip2

none

--read-xml-change (--rxc)

Reads the contents of an OSM XML change file.

Pipe

Description

outPipe.0

Produces a change stream.

Option

Description

Valid Values

Default Value

file (default)

The name of the osm change file to be read, "-" means STDIN.

change.osc

enableDateParsing

If set to yes, the dates in the osm xml file will be parsed; otherwise all dates will be set to a single time approximately equal to application startup. Setting this to no is only useful if the input file doesn't contain timestamps. Disabling date parsing used to improve performance, but date parsing now incurs only a low overhead.

yes, no

yes

compressionMethod

Specifies the compression method that has been used to compress the file. In most cases this isn't required because the compression method will be automatically determined from the file name (*.gz=gzip, *.bz2=bzip2).

none, gzip, bzip2

none

--write-xml-change (--wxc)

Writes changes to an OSM XML change file.

Pipe

Description

inPipe.0

Consumes a change stream.

Option

Description

Valid Values

Default Value

file (default)

The name of the osm change file to be written, "-" means STDOUT.

change.osc

compressionMethod

Specifies the compression method that has been used to compress the file. In most cases this isn't required because the compression method will be automatically determined from the file name (*.gz=gzip, *.bz2=bzip2).

none, gzip, bzip2

none

Area Filtering Tasks

These tasks can be used to retrieve data by filtering based on the location of interest.

--bounding-box (--bb)

Extracts data within a bounding box defined by lat/lon coordinates.

completeWays

Include all available nodes for ways which have at least one node in the bounding box.

yes, no

no

completeRelations

Include all available relations which are members of relations which have at least one member in the bounding box.

yes, no

no

idTrackerType

Specifies the memory mechanism for tracking selected ids. BitSet is more efficient for very large bounding boxes (where node count is greater than 1/32 of maximum node id), IdList will be more efficient for all smaller bounding boxes. Dynamic breaks the overall id range into small segments and chooses the most efficient of IdList or BitSet for that interval.

BitSet, IdList, Dynamic

Dynamic

clipIncompleteEntities

0.6 only. Specifies what the behaviour should be when entities are encountered that have missing relationships with other entities. For example, ways with missing nodes, and relations with missing members. This occurs most often at the boundaries of selection areas, but may also occur due to referential integrity issues in the database or inconsistencies in the planet file snapshot creation. If set to true the entities are modified to remove the missing references, otherwise they're left intact.

true, false

false

If both lat/lon and slippy map coordinates are used, the lat/lon coordinates are overridden by the slippy map coordinates.

The clipIncompleteEntities option was introduced in version 0.30 and changes the default behaviour. It now defaults to false, whereas the previous behaviour was equivalent to clipIncompleteEntities=true.
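For example, assuming a planet extract in planet.osm and that the bounding box task is invoked as --bounding-box with left, right, top and bottom arguments, an extract including complete ways might look like the following; the coordinates and file names are illustrative:

osmosis --read-xml file=planet.osm --bounding-box left=-0.51 right=0.33 top=51.72 bottom=51.27 completeWays=yes --write-xml file=extract.osm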

--bounding-polygon (--bp)

Extracts data within a polygon defined by a series of lat/lon coordinates loaded from a polygon file.

The format of the polygon file is described at the MapRoom website, with two exceptions:

A special extension has been added to this task to support negative polygons; these are defined by prefixing the name of a polygon header within the file with a "!" character.

The first coordinate pair in the polygon definition is not, as defined on the MapRoom site, the polygon centroid; it is the first polygon point. The centroid coordinates are not required by Osmosis and are not expected, but they won't break anything if present; they are simply counted as part of the polygon outline.
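As an illustration, a minimal polygon file with one positive section and one negative section might look like the following; the coordinates are arbitrary, each section is terminated by an END line, and a final END closes the file:

example_area
1
    7.200    53.580
    7.200    53.300
    7.800    53.300
    7.800    53.580
    7.200    53.580
END
!2
    7.400    53.450
    7.400    53.400
    7.500    53.400
    7.500    53.450
    7.400    53.450
END
END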

completeWays

Include all available nodes for ways which have at least one node in the bounding polygon.

yes, no

no

completeRelations

Include all available relations which are members of relations which have at least one member in the bounding polygon.

yes, no

no

idTrackerType

Specifies the memory mechanism for tracking selected ids. BitSet is more efficient for very large bounding boxes (where node count is greater than 1/32 of maximum node id), IdList will be more efficient for all smaller bounding boxes. Dynamic breaks the overall id range into small segments and chooses the most efficient of IdList or BitSet for that interval.

BitSet, IdList, Dynamic

Dynamic

clipIncompleteEntities

0.6 only. Specifies what the behaviour should be when entities are encountered that have missing relationships with other entities. For example, ways with missing nodes, and relations with missing members. This occurs most often at the boundaries of selection areas, but may also occur due to referential integrity issues in the database or inconsistencies in the planet file snapshot creation. If set to true the entities are modified to remove the missing references, otherwise they're left intact.

true, false

false

The clipIncompleteEntities option was introduced in version 0.30 and changes the default behaviour. It now defaults to false, whereas the previous behaviour was equivalent to clipIncompleteEntities=true.

Changeset Derivation and Merging

These tasks provide the glue between osm and osc files by allowing changes to be derived from and merged into osm files.

--derive-change (--dc)

Compares two data sources and produces a changeset of the differences.

Note that this task requires both input streams to be sorted first by type then by id.
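For example, the differences between two planet files might be derived as follows; the file names are illustrative, and --sort tasks can be inserted after each read if the inputs are not already in type-then-id order:

osmosis --read-xml file=planet-new.osm --read-xml file=planet-old.osm --derive-change --write-xml-change file=changes.osc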

Pipe

Description

inPipe.0

Consumes an entity stream.

inPipe.1

Consumes an entity stream.

outPipe.0

Produces a change stream.

Option

Description

Valid Values

Default Value

no arguments

--apply-change (--ac)

Applies a change stream to a data stream.

Note that this task requires both input streams to be sorted first by type then by id.
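For example, a change file might be applied to a planet file as follows; Osmosis connects the entity and change streams to the task's inputs by stream type, and the file names are illustrative:

osmosis --read-xml file=planet-old.osm --read-xml-change file=changes.osc --apply-change --write-xml file=planet-new.osm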

Pipe

Description

inPipe.0

Consumes an entity stream.

inPipe.1

Consumes a change stream.

outPipe.0

Produces an entity stream.

Option

Description

Valid Values

Default Value

no arguments

Pipeline Control

These tasks allow the pipeline structure to be manipulated. These tasks do not perform any manipulation of the data flowing through the pipeline.

--write-null (--wn)

Discards all input data. This is useful for osmosis performance testing and for testing the integrity of input files.

Pipe

Description

inPipe.0

Consumes an entity stream.

Option

Description

Valid Values

Default Value

no arguments

--write-null-change (--wnc)

Discards all input change data. This is useful for osmosis performance testing and for testing the integrity of input files.

Pipe

Description

inPipe.0

Consumes a change stream.

Option

Description

Valid Values

Default Value

no arguments

--buffer (--b)

Allows the pipeline processing to be split across multiple threads. The thread for the input task will post data into a buffer of fixed capacity and block when the buffer fills. This task creates a new thread that reads from the buffer and blocks if no data is available. This is useful if multiple CPUs are available and multiple tasks consume significant CPU.
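For example, a buffer might be inserted between a read and a write so that parsing and writing run in separate threads; the file names and capacity are illustrative:

osmosis --read-xml file=planet.osm --buffer bufferCapacity=1000 --write-xml file=copy.osm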

Pipe

Description

inPipe.0

Consumes an entity stream.

outPipe.0

Produces an entity stream.

Option

Description

Valid Values

Default Value

bufferCapacity (default)

The size of the storage buffer. This is defined in terms of the number of entity objects to be stored. An entity corresponds to an OSM type such as a node.

100

--buffer-change (--bc)

As per --buffer but for a change stream.

Pipe

Description

inPipe.0

Consumes a change stream.

outPipe.0

Produces a change stream.

Option

Description

Valid Values

Default Value

bufferCapacity (default)

The size of the storage buffer. This is defined in terms of the number of change objects to be stored. A change object consists of a single entity with an associated action.

100

--log-progress (--lp)

Logs progress information using JDK logging at INFO level at regular intervals. This can be inserted into the pipeline to allow the progress of long-running tasks to be tracked.

Pipe

Description

inPipe.0

Consumes an entity stream.

outPipe.0

Produces an entity stream.

Option

Description

Valid Values

Default Value

interval

The time interval between updates in seconds.

5

--log-progress-change (--lpc)

Logs progress of a change stream using JDK logging at INFO level at regular intervals. This can be inserted into the pipeline to allow the progress of long-running tasks to be tracked.

Pipe

Description

inPipe.0

Consumes a change stream.

outPipe.0

Produces a change stream.

Option

Description

Valid Values

Default Value

interval

The time interval between updates in seconds.

5

--tee (--t)

Receives a single stream of data and sends it to multiple destinations. This is useful if you wish to read a single source of data and apply multiple operations on it.
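For example, a single input might be written to two output files at once; the file names are illustrative:

osmosis --read-xml file=planet.osm --tee outputCount=2 --write-xml file=copy1.osm --write-xml file=copy2.osm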

Pipe

Description

inPipe.0

Consumes an entity stream.

outPipe.0

Produces an entity stream.

...

outPipe.n-1 (where n is the number of outputs specified)

Produces an entity stream.

Option

Description

Valid Values

Default Value

outputCount (default)

The number of destinations to write this data to.

2

--tee-change (--tc)

Receives a single stream of change data and sends it to multiple destinations. This is useful if you wish to read a single source of change data and apply multiple operations on it.

Pipe

Description

inPipe.0

Consumes a change stream.

outPipe.0

Produces a change stream.

...

outPipe.n-1 (where n is the number of outputs specified)

Produces a change stream.

Option

Description

Valid Values

Default Value

outputCount (default)

The number of destinations to write this data to.

2

Set Manipulation Tasks

These tasks allow bulk operations to be performed which operate on a combination of data streams allowing them to be combined or re-arranged in some way.

--sort (--s)

Sorts all data in an entity stream according to a specified ordering. This uses a file-based merge sort keeping memory usage to a minimum and allowing arbitrarily large data sets to be sorted.

Pipe

Description

inPipe.0

Consumes an entity stream.

outPipe.0

Produces an entity stream.

Option

Description

Valid Values

Default Value

type (default)

The ordering to apply to the data.

TypeThenId - This specifies to sort by the entity type (eg. nodes before ways), then by the entity id. This is the ordering a planet file contains.

TypeThenId

--sort-change (--sc)

Sorts all data in a change stream according to a specified ordering. This uses a file-based merge sort keeping memory usage to a minimum and allowing arbitrarily large data sets to be sorted.

Pipe

Description

inPipe.0

Consumes a change stream.

outPipe.0

Produces a change stream.

Option

Description

Valid Values

Default Value

type (default)

The ordering to apply to the data.

streamable - This specifies to sort by the entity type (eg. nodes before ways), then by the entity id. This allows a change to be applied to an xml file.

seekable - This sorts data so that it can be applied to a database without violating referential integrity.

streamable

--merge (--m)

Merges the contents of two data sources together.

Note that this task requires both input streams to be sorted first by type then by id.

Pipe

Description

inPipe.0

Consumes an entity stream.

inPipe.1

Consumes an entity stream.

outPipe.0

Produces an entity stream.

Option

Description

Valid Values

Default Value

conflictResolutionMethod

The method to use for resolving conflicts between data from the two sources.

version - 0.6 only. Choose the entity with the highest version, falling back to the second input source if both versions are identical.

timestamp - Choose the entity with the newest timestamp.

lastSource - Choose the entity from the second input source.

timestamp (0.5), version (0.6)

--merge-change (--mc)

Merges the contents of two changesets together.

Note that this task requires both input streams to be sorted first by type then by id.

Pipe

Description

inPipe.0

Consumes a change stream.

inPipe.1

Consumes a change stream.

outPipe.0

Produces a change stream.

Option

Description

Valid Values

Default Value

conflictResolutionMethod

The method to use for resolving conflicts between data from the two sources.

version - 0.6 only. Choose the entity with the highest version, falling back to the second input source if both versions are identical.

timestamp - Choose the entity with the newest timestamp.

lastSource - Choose the entity from the second input source.

timestamp (0.5), version (0.6)

--append-change (--apc)

Combines multiple change streams into a single change stream. The data from each input is consumed in sequence so that the result is a concatenation of data from each source. The output stream will be unsorted and may need to be fed through a --sort-change task.

This task is intended for use with full history change files. If delta change files are being used (ie. only one change per entity per file), then the --merge-change task may be more appropriate.
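For example, several sequential change files might be concatenated, sorted and collapsed into a single delta change file; the file names are illustrative:

osmosis --read-xml-change file=day1.osc --read-xml-change file=day2.osc --read-xml-change file=day3.osc --append-change sourceCount=3 --sort-change --simplify-change --write-xml-change file=combined.osc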

Pipe

Description

inPipe.0

Consumes a change stream.

...

inPipe.n-1

Consumes a change stream.

outPipe.0

Produces a change stream.

Option

Description

Valid Values

Default Value

sourceCount

The number of change streams to be appended.

A positive integer.

2

--simplify-change (--simc)

Collapses a "full-history" change stream into a "delta" change stream. The result of this operation is a change stream guaranteed to contain a maximum of one change per entity.

For example, if an entity is created and modified in a single change file, this task will modify it to be a single create operation with the data of the modify operation.

Pipe

Description

inPipe.0

Consumes a change stream.

outPipe.0

Produces a change stream.

Option

Description

Valid Values

Default Value

N/A

Data Manipulation Tasks

These tasks allow the entities being passed through the pipeline to be manipulated.

--node-key (--nk)

Given a list of "key" tags, this filter passes on only those nodes that have at least one of those tags set.

Note that this filter only operates on nodes. All ways and relations are filtered out.

This filter will only be available with version >= 0.30 (or svn).

Pipe

Description

inPipe.0

Consumes an entity stream.

outPipe.0

Produces an entity stream.

Option

Description

Valid Values

Default Value

keyList

Comma-separated list of desired keys

N/A

--node-key-value (--nkv)

Given a list of "key.value" tags, this filter passes on only those nodes that have at least one of those tags set.

Note that this filter only operates on nodes. All ways and relations are filtered out.

This filter will only be available with version >= 0.30 (or svn).

Pipe

Description

inPipe.0

Consumes an entity stream.

outPipe.0

Produces an entity stream.

Option

Description

Valid Values

Default Value

keyValueList

Comma-separated list of desired key.value combinations

N/A

--way-key (--wk)

Given a list of "key" tags, this filter passes on only those ways that have at least one of those tags set.

Note that this filter only operates on ways. All nodes and relations are passed on unmodified.

This filter is currently only available in svn.

Pipe

Description

inPipe.0

Consumes an entity stream.

outPipe.0

Produces an entity stream.

Option

Description

Valid Values

Default Value

keyList

Comma-separated list of desired keys

N/A

--way-key-value (--wkv)

Given a list of "key.value" tags, this filter passes on only those ways that have at least one of those tags set.

Note that this filter only operates on ways. All nodes and relations are passed on unmodified.

All keyword arguments are interpreted as tag parameters. They should be in the form "key=value" and specify which tags are matched by this filter. Each tag-filter task operates only on the specified entity type (passing other entity types through without touching them), and within that type accepts or rejects entities according to its tag parameters. If no tag parameters are specified, the filter matches all tags. Multiple values can be specified for one key, in a comma-separated list. A tag value list of * (a single asterisk) matches any value.

The separator character, equality character, and wildcard character ( , = * respectively) can be included in keys or values using the following escape sequences:

Escape sequence

Replaced with

%a

*

%c

,

%e

=

%s

space

%%

literal '%' symbol

In practice, there are only limited circumstances in which these characters absolutely must be escaped.

A typical filtering pipeline keeps only ways with a highway tag of any value, then throws away those ways whose highway value is motorway or motorway_link. All relations are discarded, then all nodes which are not used by the remaining ways are discarded. The remaining entities are written out as XML.
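A hedged sketch of such a pipeline, assuming a tag-filter task is available as --tf with accept/reject arguments behaving as described above; the file names are illustrative:

osmosis --read-xml file=input.osm --tf accept-ways highway=* --tf reject-ways highway=motorway,motorway_link --tf reject-relations --used-node --write-xml file=output.osm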

You may need to work on two separate entity streams and merge them after filtering. If both inputs for the merge come from the same thread (e.g. using the tee task followed by the merge task), Osmosis will deadlock and the operation will never finish. One solution is to read the input data with two separate read tasks, one for each branch.
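One possible shape for such a pipeline, reading the same input file once per branch rather than using a tee; the file name and key lists are illustrative:

osmosis --read-xml file=input.osm --way-key keyList=railway --read-xml file=input.osm --way-key keyList=highway --merge --write-xml file=output.osm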

--used-node (--un)

Restricts output of nodes to those that are used in ways.

Pipe

Description

inPipe.0

Consumes an entity stream.

outPipe.0

Produces an entity stream.

Option

Description

Valid Values

Default Value

idTrackerType

Specifies the memory mechanism for tracking selected ids. BitSet is more efficient for very large bounding boxes (where node count is greater than 1/32 of maximum node id), IdList will be more efficient for all smaller bounding boxes.

BitSet, IdList

IdList

--migrate

Changes an API 0.5 entity stream into an API 0.6 entity stream.

Pipe

Description

inPipe.0

Consumes a version 0.5 entity stream.

outPipe.0

Produces a version 0.6 entity stream.

--migrate-change

Changes an API 0.5 change stream into an API 0.6 change stream.

Pipe

Description

inPipe.0

Consumes a version 0.5 change stream.

outPipe.0

Produces a version 0.6 change stream.

PostGIS Tasks

Osmosis provides a PostGIS schema for storing a snapshot of OSM data. All geo-spatial aspects of the data are stored using PostGIS geometry data types. Node locations are always stored as a point. Ways are related to nodes as in the normal API schema, however they may optionally have bounding box and/or full linestring columns added as well allowing a full set of geo-spatial operations to be performed on them.

--write-pgsql (--wp)

Populates an empty PostGIS database with a "simple" schema. A schema creation script is available in the osmosis script directory.

The 0.5 schema is a one-size-fits-all solution. The 0.6 schema has a number of optional columns and tables that can be installed with additional schema creation scripts. This task queries the schema to automatically detect which of those features are installed.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

enableBboxBuilder

0.6 only. If yes is specified, the task will build the bbox geometry column using a java-based solution instead of running a post-import query. Using this option provides significant performance improvements compared to the default query approach. Note that the task will fail if this option is enabled and the bbox column doesn't exist.

yes, no

no

enableLinestringBuilder

As per the enableBboxBuilder option but for the linestring geometry column.

yes, no

no

nodeLocationStoreType

0.6 only. This option only takes effect if at least one of the enableBboxBuilder and enableLinestringBuilder options are enabled. Both geometry builder implementations require knowledge of all node locations. This option specifies how those nodes are temporarily stored. If you have large amounts of memory (at least 6GB of system memory, a 64-bit JVM and at least 4GB of JVM RAM specified with the -Xmx option) you may use the "InMemory" option. Otherwise you must choose between the "TempFile" option which is much slower but still faster than relying on the default database geometry building implementation, or the "CompactTempFile" option which is more efficient for smaller datasets.

"InMemory", "TempFile", "CompactTempFile"

"CompactTempFile"
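As a sketch, an import into the simple schema might look like the following; the connection arguments (host, database, user, password) are the usual osmosis database options, and the file name is illustrative:

```shell
# Import an OSM extract into an empty PostGIS "simple" schema,
# building the bbox geometry column in Java rather than via a
# post-import query.
osmosis --read-xml file=extract.osm \
        --write-pgsql host=localhost database=osm user=osm password=osm \
                      enableBboxBuilder=yes nodeLocationStoreType=CompactTempFile
```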

--fast-write-pgsql (--fwp)

Populates an empty PostGIS database with a "simple" schema. This achieves the same result as the standard --write-pgsql task but uses the recent COPY support added to Java PostgreSQL JDBC drivers to improve import speed.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

nodeLocationStoreType

0.6 only. This option only takes effect if at least one of the bbox or linestring columns exists on the ways table. Both geometry builder implementations require knowledge of all node locations. This option specifies how those nodes are temporarily stored. If you have large amounts of memory (at least 6GB of system memory, a 64-bit JVM and at least 4GB of JVM RAM specified with the -Xmx option) you may use the "InMemory" option. Otherwise you must choose between the "TempFile" option which is much slower but still faster than relying on the default database geometry building implementation, or the "CompactTempFile" option which is more efficient for smaller datasets.

"InMemory", "TempFile", "CompactTempFile"

"CompactTempFile"

--write-pgsql-dump (--wpd)

Writes a set of data files suitable for loading a PostGIS database with a "simple" schema using COPY statements. A schema creation script is available in the osmosis script directory. A load script is also available which will invoke the COPY statements and update all indexes and special index support columns appropriately. This task should be used for large imports (such as the planet file), since it is much faster than --write-pgsql.

Pipe

Description

inPipe.0

Consumes an entity stream.

Option

Description

Valid Values

Default Value

directory

The name of the directory to write the data files into.

pgimport

enableBboxBuilder

0.6 only. If yes is specified, the task will build the bbox geometry column using a Java-based solution instead of running a post-import query. Using this option provides significant performance improvements compared to the query approach.

yes, no

no

enableLinestringBuilder

As per the enableBboxBuilder option but for the linestring geometry column.

yes, no

no

nodeLocationStoreType

0.6 only. This option only takes effect if at least one of the enableBboxBuilder and enableLinestringBuilder options are enabled. Both geometry builder implementations require knowledge of all node locations. This option specifies how those nodes are temporarily stored. If you have large amounts of memory (at least 6GB of system memory, a 64-bit JVM and at least 4GB of JVM RAM specified with the -Xmx option) you may use the "InMemory" option. Otherwise you must choose between the "TempFile" option which is much slower but still faster than relying on the default database geometry building implementation, or the "CompactTempFile" option which is more efficient for smaller datasets.

"InMemory", "TempFile", "CompactTempFile"

"CompactTempFile"
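As a sketch, a planet-scale import using the dump approach might look like this; the file and directory names are illustrative, and the load script mentioned above is then run against the resulting directory:

```shell
# Write COPY-ready data files for a later bulk load; TempFile keeps
# memory usage modest at planet scale.
osmosis --read-xml file=planet.osm \
        --write-pgsql-dump directory=pgimport \
                           enableBboxBuilder=yes enableLinestringBuilder=yes \
                           nodeLocationStoreType=TempFile
```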

API Tasks

These tasks provide the ability to interact directly with the OSM API. This is the API that is used directly by editors such as JOSM.

--read-api (--ra)

Retrieves the contents of a bounding box from the API. This is subject to the bounding box size limitations imposed by the API.
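A minimal sketch, assuming the task takes the same left/right/top/bottom arguments as the other bounding-box tasks; the coordinates and file name are illustrative:

```shell
# Download a small area from the API and save it as an XML file.
osmosis --read-api left=-0.51 right=-0.49 top=51.51 bottom=51.49 \
        --write-xml file=api-extract.osm
```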

Dataset Tasks

Dataset tasks are those that act on the generic dataset interface exposed by several data stores, for example the #PostGIS Tasks. These tasks allow data queries and data manipulation to be performed in a storage-method-agnostic manner.

--dataset-bounding-box (--dbb)

Extracts data within a specific bounding box defined by lat/lon coordinates. This differs from the --bounding-box task in that it operates on a dataset instead of an entity stream; in other words, it uses the features of the underlying database to perform a spatial query instead of examining all nodes in a complete stream.

This implementation will never clip ways at box boundaries, and depending on the underlying implementation may detect ways crossing a box without having any nodes within that box.

Pipe

Description

inPipe.0

Consumes a dataset.

outPipe.0

Produces an entity stream.

Option

Description

Valid Values

Default Value

left

The longitude of the left edge of the box.

-180 to 180

-180

right

The longitude of the right edge of the box.

-180 to 180

180

top

The latitude of the top edge of the box.

-90 to 90

90

bottom

The latitude of the bottom edge of the box.

-90 to 90

-90

completeWays

Include all nodes for all included ways.

yes, no

no
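For example, a spatial extract from a PostGIS dataset might be sketched as follows; --read-pgsql is assumed to provide the dataset, and the connection arguments and coordinates are illustrative:

```shell
# Use the database's spatial capabilities to extract a bounding box,
# pulling in all nodes of partially included ways.
osmosis --read-pgsql host=localhost database=osm user=osm password=osm \
        --dataset-bounding-box left=-0.51 right=-0.49 top=51.51 bottom=51.49 \
                               completeWays=yes \
        --write-xml file=extract.osm
```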

--dataset-dump (--dd)

Converts an entire dataset to an entity stream.

Pipe

Description

inPipe.0

Consumes a dataset.

outPipe.0

Produces an entity stream.

Option

Description

Valid Values

Default Value

no arguments

Reporting Tasks

These tasks provide summaries of data processed by the pipeline.

--report-entity (--re)

Produces a summary report of each entity type and the users that last modified them.

Pipe

Description

inPipe.0

Consumes an entity stream.

Option

Description

Valid Values

Default Value

file (default)

The file to write the report to.

entity-report.txt
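A minimal sketch (the input file name is illustrative):

```shell
# Summarise entity types and the users that last modified them.
osmosis --read-xml file=extract.osm --report-entity file=entity-report.txt
```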

--report-integrity (--ri)

Produces a list of the referential integrity issues in the data source.

Pipe

Description

inPipe.0

Consumes an entity stream.

Option

Description

Valid Values

Default Value

file (default)

The file to write the report to.

integrity-report.txt

Replication Tasks

These tasks are used for replicating changes between data stores. They typically work with change streams and can therefore be coupled with other change stream tasks depending on the job to be performed.

There are two major types of change files:

Delta - Contain minimal changes to update a dataset. This implies a maximum of 1 change per entity.

Full-History - Contain the full set of historical changes. This implies that there may be multiple changes per entity.

All tasks support the "delta" style of change files. Some tasks do not support "full-history" change files.

--read-change-interval (--rci)

Retrieves a set of change files from a server, merges them into a single stream, and tracks the current timestamp.

The changes produced by this task are typically delta changes.

Pipe

Description

outPipe.0

Produces a change stream.

Option

Description

Valid Values

Default Value

workingDirectory (default)

The directory containing the state and config files.

(current directory)
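As a sketch, the merged changes can be written out as a change file; --write-xml-change is assumed as the change file writer, and the names are illustrative:

```shell
# Download all outstanding change files and write them as one .osc file.
osmosis --read-change-interval workingDirectory=changes \
        --write-xml-change file=changes.osc
```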

--read-change-interval-init (--rcii)

Initialises a working directory with the files necessary for use by the --read-change-interval task. This task must be run once to create the directory structure; the configuration file must then be edited manually to contain the required settings.

Pipe

Description

n/a

Option

Description

Valid Values

Default Value

workingDirectory (default)

The directory to populate with state and config files.

(current directory)

initialDate

The timestamp to begin replication from. Only changesets containing data after this timestamp will be downloaded. Note that unlike most tasks accepting dates, this date is specified in UTC.

format is "yyyy-MM-dd_HH:mm:ss"

N/A
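For example (the directory name and UTC date are illustrative):

```shell
# Create the working directory structure, replicating from the given
# UTC timestamp onwards once configuration.txt has been edited.
osmosis --read-change-interval-init workingDirectory=changes \
                                    initialDate=2009-01-01_00:00:00
```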

--read-replication-interval (--rri)

Retrieves a set of replication files from a server, combines them into a single stream, sorts the result, and tracks the current timestamp. Available since osmosis 0.32.

The changes produced by this task are typically full-history changes.

Pipe

Description

outPipe.0

Produces a change stream.

Option

Description

Valid Values

Default Value

workingDirectory (default)

The directory containing the state and config files.

(current directory)
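A sketch of a typical invocation; --simplify-change (assumed here) collapses the full-history stream to a maximum of one change per entity, which consumers expecting delta changes require:

```shell
# Fetch replication files and write the combined result as a single
# delta change file.
osmosis --read-replication-interval workingDirectory=replication \
        --simplify-change \
        --write-xml-change file=changes.osc
```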

--read-replication-interval-init (--rrii)

Initialises a working directory with the files necessary for use by the --read-replication-interval task. This task must be run once to create the directory structure; the configuration file must then be edited manually to contain the required settings.

Pipe

Description

n/a

Option

Description

Valid Values

Default Value

workingDirectory (default)

The directory to populate with state and config files.

(current directory)

Note:
This will create a configuration.txt and a download.lock file in the <workingDirectory>. You then need to manually edit configuration.txt and set the url to the minute or hourly replication source (e.g. baseUrl=http://planet.openstreetmap.org/minute-replicate for the web, or baseUrl=file:///your/replicate-folder for a local filesystem).

If no state.txt file exists, the first invocation will download the latest state file. If you wish to start from a known point, download the state file corresponding to your desired start date from http://planet.openstreetmap.org/minute-replicate and place it in your <workingDirectory> under the name state.txt. You can use the replicate-sequences tool to find a matching file. Take one at least an hour earlier than your start date to avoid missing changes.
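Editing the generated configuration.txt can be scripted. The sketch below assumes a configuration.txt containing a baseUrl= line as created by the init task; the maxInterval key is shown only as filler:

```shell
# Point an initialised replication working directory at the minute
# replication source by rewriting the baseUrl line in place.
mkdir -p replication
cat > replication/configuration.txt <<'EOF'
baseUrl=http://example.invalid/set-me
maxInterval=3600
EOF

sed -i 's|^baseUrl=.*|baseUrl=http://planet.openstreetmap.org/minute-replicate|' \
    replication/configuration.txt

grep '^baseUrl=' replication/configuration.txt
```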

--merge-replication-files (--mrf)

Retrieves a set of replication files from a server, combines them into larger time intervals, sorts the result, and tracks the current timestamp.

The changes produced by this task are full-history changes.

Pipe

Description

N/A

Option

Description

Valid Values

Default Value

workingDirectory (default)

The directory containing the state and config files.

(current directory)

--merge-replication-files-init (--mrfi)

Initialises a working directory with the files necessary for use by the --merge-replication-files task. This task must be run once to create the directory structure; the configuration file must then be edited manually to contain the required settings.

Pipe

Description

n/a

Option

Description

Valid Values

Default Value

workingDirectory (default)

The directory to populate with state and config files.

(current directory)

Note:
This will create a configuration.txt and a download.lock file in the <workingDirectory>. You then need to manually edit configuration.txt and set the url to the minute or hourly replication source (e.g. baseUrl=http://planet.openstreetmap.org/minute-replicate for the web, or baseUrl=file:///your/replicate-folder for a local filesystem). You will also need to specify the time interval to group changes by in the same file.

If no state.txt file exists, the first invocation will download the latest state file. If you wish to start from a known point, download the state file corresponding to your desired start date from http://planet.openstreetmap.org/minute-replicate and place it in your <workingDirectory> under the name state.txt. You can use the replicate-sequences tool to find a matching file. Take one at least an hour earlier than your start date to avoid missing changes.

--replicate-apidb (--repa)

This task provides replication files for consumers to download. It is primarily run against the production API database with the results made available on the planet server.

The first time this task runs it will initialise the working directory with the current state of the database server. Every subsequent invocation will read all changes since the last invocation and write the results to the output. All changes will be sorted by type, then id, then version.

validateSchemaVersion

If yes is specified, the task will validate the current schema version before accessing the database.

yes, no

yes

allowIncorrectSchemaVersion

If validateSchemaVersion is yes, this option controls the result of a schema version check failure. If this option is yes, a warning is displayed and execution continues. If this option is no, an error is displayed and the program aborts.

yes, no

yes

readAllUsers

If set to yes, the user public edit flag will be ignored and user information will be attached to every entity.

yes, no

no

directory

The working directory.

replicate

--read-replication-lag (--rrl)

This task takes the state.txt in a replication working directory and compares its timestamp (the timestamp of the last chunk of data that osmosis downloaded) with the timestamp of the server's state.txt (the timestamp of the last chunk of data that the server has produced). It then calculates the difference and prints it to stdout. To capture only the printed value, it is often useful to redirect the log messages on stderr to /dev/null.
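For example, to capture only the lag value (the directory name is illustrative):

```shell
# Print only the replication lag, discarding log output.
osmosis -q --read-replication-lag workingDirectory=replication 2>/dev/null
```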

--induce-ways-for-turnrestrictions (-iwtt)

Converts each intersection node carrying turn-restrictions into an equivalent number of oneway streets that can only be travelled as allowed by the turn-restriction. This is meant as a preprocessing step for routers that cannot deal with restrictions/costs on graph nodes.

--simplify

The simplify plugin filters the data to drop some elements. Currently it performs one extremely crude form of simplification: it drops all nodes apart from the start and end nodes of every way.