To run the script you need Python with the OGR/GDAL Python bindings.
On Windows, I suggest installing them with the OsGeo4W setup:
- run the setup and choose Advanced Install
- step forward until you reach the Select Packages form
- from Commandline_Utilities select and install: gdal, python (meta package), shapelib
- from Libs select and install: gdal-python
- click next and finish installation

Open the OsGeo4W shell, change to the directory where you downloaded shapemerger.py and type

python shapemerger.py

If everything is OK, it will print the command usage.
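
If it stops with an import error instead, a quick check like the one below may help to tell whether the bindings are visible to the Python you are running. It is only a sketch: the function name gdal_bindings_version is made up for this example and is not part of shapemerger.py.

```python
def gdal_bindings_version():
    """Return the GDAL release name, or None if the bindings cannot be imported."""
    try:
        # Both modules ship with the GDAL Python bindings.
        from osgeo import gdal, ogr  # noqa: F401
    except ImportError:
        return None
    return gdal.VersionInfo("RELEASE_NAME")


version = gdal_bindings_version()
if version is None:
    print("GDAL/OGR Python bindings not found")
else:
    print("GDAL/OGR Python bindings found, version " + version)
```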

Examples:

python shapemerger.py -o test.shp *.shp
- merges all the shapefiles in the current directory into test.shp
- the reference geometry type is taken from the first file
- the output table uses the union of the dbf schemas (default)

python shapemerger.py -o test.shp *.shp -i -r -x *_railways.shp
- merges all the shapefiles in the current directory and its subdirectories into test.shp
- excludes all the shapefiles named ..._railways.shp
- the output table uses the intersection of the dbf schemas
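
To make the difference between the two schema modes concrete, here is a small illustration on plain lists of field names. It is not the code used inside shapemerger.py; the function merge_field_names and the sample field names are invented for this example.

```python
def merge_field_names(schemas, intersection=False):
    """Combine lists of dbf field names, keeping first-seen order.

    Union keeps every field found in any input; intersection keeps
    only the fields that every input has.
    """
    ordered = []
    for schema in schemas:
        for name in schema:
            if name not in ordered:
                ordered.append(name)
    if intersection:
        return [name for name in ordered if all(name in s for s in schemas)]
    return ordered


# Hypothetical schemas of two input shapefiles.
roads = ["NAME", "TYPE", "LANES"]
rivers = ["NAME", "TYPE", "DEPTH"]

print(merge_field_names([roads, rivers]))
# ['NAME', 'TYPE', 'LANES', 'DEPTH']
print(merge_field_names([roads, rivers], intersection=True))
# ['NAME', 'TYPE']
```

With the union, a feature coming from one file simply has empty values in the columns that exist only in the other files.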

22 comments:

Hi Toni, thank you for sharing your script. I'm looking for something that can intersect 2 shapefiles. It seems like your script might be the solution. It is very fast, but somehow the output with the "-i" option is identical to the output without it. In other words, choosing intersection or union gives the same output. Any idea why that might be?

When I opened the original data with the shapefile module before using your script, I didn't get any error. I also don't have any problem opening the merged file with QGIS. Do you have any idea what could be going wrong?

Thank you for your answer. I have corrected the shapefile module with some additional if-clause checks and now it works. I think the problem was in the original data (I didn't check all 100 input files for merging): there were values '********' and '**' instead of integers, and the script stopped when it hit them.

Ok, don't worry about the prj file. Of course you're merging files with the same crs, so all you have to do is copy one of your existing prj files to a file whose basename is identical to the generated shapefile, e.g.:

cp myfirst.prj myunion.prj

(in order to have myunion.shp/shx/dbf/prj all together)
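
On a system without cp, the same copy can be sketched in Python. The helper name copy_prj is invented for this example, and it assumes, as above, that all the inputs share the same crs.

```python
import shutil
from pathlib import Path


def copy_prj(source_shp, merged_shp):
    """Copy the .prj of an input shapefile next to the merged shapefile.

    Returns the path of the new .prj, or None if the source has no .prj.
    """
    source_prj = Path(source_shp).with_suffix(".prj")
    target_prj = Path(merged_shp).with_suffix(".prj")
    if not source_prj.exists():
        return None
    shutil.copyfile(source_prj, target_prj)
    return target_prj


# Same file names as in the cp example; does nothing if myfirst.prj is missing.
copy_prj("myfirst.shp", "myunion.shp")
```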

The real issue is the encoding. Try the following (untested): before launching, set this env var:

export SHAPE_ENCODING=UTF-8

(should set the parser encoding)

Then open the script and replace line 370 with this:

pOgrMergeShpLayer = pOgrMergeShpDatasource.CreateLayer( strMergeName, geom_type = intGeometryTypeRef, options=["ENCODING=UTF-8"] )

(should set the writer encoding)

After these two changes UTF-8 encoding was preserved, and the merge.shp even got a .cpg file :)

You can see the full script here: http://pastebin.com/8jJTq9Hk

If you want to improve the script, I think it would be a good idea to make it "encoding aware". The script could check for a .cpg file in the input, and write the output in UTF-8 by default (which could be overridden with an argument).
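
A rough sketch of the first half of that idea, detecting the .cpg file, could look like this. The function read_cpg_encoding is hypothetical and is not in the current script; it assumes the .cpg sits next to the .shp with a lowercase extension and contains just the encoding name.

```python
from pathlib import Path


def read_cpg_encoding(shp_path):
    """Return the encoding named in the .cpg next to a shapefile, or None."""
    cpg_path = Path(shp_path).with_suffix(".cpg")
    try:
        text = cpg_path.read_text(encoding="ascii", errors="ignore").strip()
    except OSError:
        # No .cpg file (or it cannot be read): the encoding is unknown.
        return None
    return text or None
```

The script could then fall back to its current behaviour when None comes back, and keep UTF-8 as the output default unless an argument says otherwise.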

I have made a very simple fix that adds a projection to the merged shapefile. I simply declared that the shapefile uses WGS84 as its crs. No reprojection is done automatically, so all the input must also be in WGS84.

The script can be seen here: http://pastebin.com/eu0qZdvH

Changes:

L9
from osgeo import osr, ogr, gdal

L353-358
# define a spatial reference for the merge shapefile
outSpatialReference = osr.SpatialReference()
outSpatialReference.ImportFromEPSG(4326)
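
For readers who do not want to open the pastebin, here is a standalone sketch of how such a spatial reference is typically handed to the layer when it is created. It is not the modified script: the function create_wgs84_layer, the point geometry type and the deriving of the layer name from the file name are assumptions made for this example.

```python
import os


def create_wgs84_layer(shp_path):
    """Create an empty point shapefile tagged as WGS84 and return its .prj path."""
    # Imported here so the sketch can be loaded even where the bindings are missing.
    from osgeo import ogr, osr

    srs = osr.SpatialReference()
    srs.ImportFromEPSG(4326)

    base = os.path.splitext(shp_path)[0]
    driver = ogr.GetDriverByName("ESRI Shapefile")
    datasource = driver.CreateDataSource(shp_path)
    # Passing srs only labels the output; no coordinates are reprojected.
    layer = datasource.CreateLayer(os.path.basename(base), srs=srs,
                                   geom_type=ogr.wkbPoint)
    # Drop the references so everything is flushed to disk.
    layer = None
    datasource = None
    return base + ".prj"
```

Calling create_wgs84_layer("merge.shp") should leave a merge.prj next to the new shapefile.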

The script works fast, but it does not give results similar to the ArcGIS Union tool. Every feature in the output shapefile should have attribute information from each input layer. But the output from your script gives attribute information for only one input layer, while the other columns are left empty for a given feature.

Please suggest any solution which could be an alternate to ArcGIS union tool.

The ArcGIS Union tool does two additional jobs, listed below. Please suggest how we could implement the same through OGR/GDAL.

1. Cracks and clusters the features. Cracking inserts vertices at the intersection of feature edges; clustering snaps together vertices that are within the x,y tolerance.
2. Discovers geometric relationships (overlap) between features from all feature classes.

About

Hi, I am Tony. I have been working in GIS for several years. This place is a collection of ideas I came up with which I reckon good enough to share. I hope you find them of some help with your GIS teasers.