Preparation

Software requirements

Python v3

PostgreSQL v9.x

PostGIS v2.x

To repeat the steps below, you first need to ask De Lijn for access to their data. The login for their FTP server goes into credentials.txt: username on the first line, password on the second line. The script expects this file in the directory the script is run from.
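As an illustration of the expected layout of credentials.txt, here is a minimal sketch of reading it. The helper name read_credentials and the placeholder values are assumptions made for this example, not part of the original script.

```python
import os
import tempfile


def read_credentials(path="credentials.txt"):
    """Return (username, password) taken from the first two lines of `path`."""
    with open(path, encoding="utf-8") as handle:
        lines = handle.read().splitlines()
    if len(lines) < 2:
        raise ValueError("expected the username on line 1 and the password on line 2")
    # strip() removes stray spaces and leftover carriage returns
    return lines[0].strip(), lines[1].strip()


# Demonstration with a throwaway file holding placeholder values
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "credentials.txt")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write("example_user\nexample_password\n")
    username, password = read_credentials(demo_path)
```

Stripping the line endings matters because a trailing newline in the username or password would otherwise be sent to the FTP server as part of the login.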

Creation of the database and adding data to it

The data comes as a zip file which is updated regularly. The files it contains are encoded in latin1 and need to be converted to UTF-8.

It is, of course, possible to use Filezilla, WinSCP or even wget to do this, but I wrote a Python script which automates downloading (after checking it is necessary), unzipping and recoding:

#!/bin/python
# -*- coding: utf-8 -*-
import os, sys, re, zipfile, ftplib
import argparse

# Zip files on the server are named after their date, e.g. 2014-01-31.zip
zipre = re.compile(r'\d\d\d\d-\d\d-\d\d\.zip')

parser = argparse.ArgumentParser(description='Fetch data from FTP server of De Lijn, unzip it and recode to UTF-8')
parser.add_argument('--skipdownload', '-d', action='store_true',
                    help="Don't contact the FTP server, work with the most recent local file")
parser.add_argument('--dontcallsuccessor', '-s', action='store_true',
                    help="don't call NewDBfromCSV.py when done")
args = parser.parse_args()

""" Fetch the latest zip file from the ftp site of De Lijn """

class Callback(object):
    '''This prints a nice progress status on the command line'''
    def __init__(self, totalsize, fp):
        self.totalsize = totalsize
        self.fp = fp
        self.received = 0

    def __call__(self, data):
        self.fp.write(data)
        self.received += len(data)
        print('\r%i%% complete' % (100.0 * self.received / self.totalsize), end='\r')

if not(args.skipdownload):
    print('Reading credentials from "credentials.txt"')
    with open("credentials.txt") as credentials:
        # strip the line endings, otherwise they end up in the login
        username, password = [line.strip() for line in credentials.readlines()[:2]]
        #print (username, password)
    print("Opening connection to FTP site of De Lijn")
    ftp = ftplib.FTP(host='poseidon.delijn.be', user=username, passwd=password)
    print("CD to current")
    ftp.cwd('current')
    print("Get name of file")
    fn = ftp.nlst()[0]
    size = ftp.size(fn)
    if not(fn in os.listdir()):
        # Only download if a newer file is available
        print(fn + " found, downloading latest version of De Lijn data")
        with open(fn, 'wb') as fh:
            w = Callback(size, fh)
            #ftp.set_pasv(0)
            ftp.retrbinary('RETR %s' % fn, w, 32768)
        ftp.quit()
    else:
        print('Latest version already present, nothing to do')
        sys.exit()

""" Unzip the latest file we have available in the current directory """
files = os.listdir()
zipfn = ''
for file in files:
    if re.match(zipre, file):
        # the names start with an ISO date, so the largest name is the newest file
        if file > zipfn: zipfn = file
zfile = zipfile.ZipFile(zipfn)
print(); print(); print("Found " + zipfn)
for name in zfile.namelist():
    """Recode csv-file with textual content to UTF-8 """
    (dirname, filename) = os.path.split(name)
    print("Decompressing " + filename)
    fd = open(name, "wb")
    # drop carriage returns and double quotes while recoding
    fd.write(zfile.read(name).decode('latin-1').replace('\r', '').replace('"', '').encode('utf-8'))
    fd.close()

if not(args.dontcallsuccessor):
    import NewDBfromCSV
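The recoding step of the script above can be tried in isolation. The sketch below applies the same decode, clean-up and encode chain to a made-up sample line; the function name recode_latin1_to_utf8 and the sample content are assumptions for illustration only.

```python
def recode_latin1_to_utf8(raw_bytes):
    """Decode latin-1 bytes, drop carriage returns and double quotes, return UTF-8 bytes."""
    text = raw_bytes.decode("latin-1")
    text = text.replace("\r", "").replace('"', "")
    return text.encode("utf-8")


# Made-up sample line, encoded the way the source files are said to be
sample = '1;"Gemeenteplein";"Liège"\r\n'.encode("latin-1")
recoded = recode_latin1_to_utf8(sample)
print(recoded.decode("utf-8"))
```

In latin-1 the character è is the single byte 0xE8, while in UTF-8 it becomes the two bytes 0xC3 0xA8, which is why the files cannot simply be loaded as UTF-8 without this step.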

Populate database

Now that we have unpacked the zip file to several csv files, it's time to put them into a PostGIS database.
I created a stored procedure to take care of the conversion between Lambert72 and WGS84.
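The stored procedure itself is not reproduced here. As a rough sketch of the kind of expression such a conversion could rely on, the snippet below builds a PostGIS call that reprojects a point from Lambert72 (EPSG:31370) to WGS84 (EPSG:4326) using ST_MakePoint, ST_SetSRID and ST_Transform. The table name stops and the column names halte_x and halte_y are hypothetical, and this is not necessarily how the original procedure was written.

```python
LAMBERT72_SRID = 31370  # EPSG code of Belgian Lambert 72
WGS84_SRID = 4326       # EPSG code of WGS84 longitude/latitude


def lambert72_to_wgs84_sql(x_column, y_column):
    """Build a PostGIS expression that reprojects a Lambert72 point to WGS84."""
    template = "ST_Transform(ST_SetSRID(ST_MakePoint({x}, {y}), {src}), {dst})"
    return template.format(x=x_column, y=y_column,
                           src=LAMBERT72_SRID, dst=WGS84_SRID)


# Hypothetical table and column names, for illustration only
expression = lambert72_to_wgs84_sql("halte_x", "halte_y")
query = "SELECT stop_id, ST_AsText({}) AS geom FROM stops;".format(expression)
print(query)
```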

If you put the following in 'NewDBfromCSV.py', it will get started at the end of the previous script automatically if needed.

Download relevant data from OpenStreetMap with the Overpass API

Download all bus stops and route relations in Flanders. Recurse up and down to fetch all related stops, route relations and ways for the itineraries. Be aware that this is a hefty query even for the Overpass API. It returns 90MB of data and, when run at a busy time, it occasionally fails.
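The exact query is not shown on this page. The following is a sketch of what a query of this kind might look like in Overpass QL, wrapped in Python so that the request body can be prepared without contacting the server. The area name "Vlaanderen", the tag filters and the timeout value are assumptions and will probably need adjusting.

```python
import urllib.parse

OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_query(area_name="Vlaanderen", timeout_seconds=900):
    """Return an Overpass QL query for bus stops and bus routes in an area."""
    return """[out:xml][timeout:{timeout}];
area["name"="{area}"]->.region;
(
  node["highway"="bus_stop"](area.region);
  relation["type"="route"]["route"="bus"](area.region);
);
(._; <<;);
(._; >>;);
out meta;
""".format(timeout=timeout_seconds, area=area_name)


query = build_query()
post_body = urllib.parse.urlencode({"data": query}).encode("utf-8")

# Sending it is left out on purpose, because the query is heavy:
# import urllib.request
# xml_bytes = urllib.request.urlopen(OVERPASS_URL, data=post_body).read()
```

The two union statements add the parents (recurse up) and then all members (recurse down) to the initial selection. "out meta" keeps the version information that JOSM needs in order to edit the objects.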

We start with an XML file which can be read by JOSM. The resulting file should not be uploaded directly to the server.
Each and every stop needs to be vetted and double checked and dragged to a suitable position before uploading.

The script was extended to create a report on stops for which the name or the route_ref differs between what was calculated and what is on OpenStreetMap.
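The reporting code is not included here. A minimal sketch of such a comparison, assuming both sides are available as dictionaries keyed on a stop reference, could look as follows. The function name, the data layout and the sample values are all made up for illustration.

```python
def report_differences(calculated, in_osm, keys=("name", "route_ref")):
    """List (stop_ref, key, calculated_value, osm_value) for every mismatch."""
    report = []
    for stop_ref in sorted(calculated):
        osm_tags = in_osm.get(stop_ref)
        if osm_tags is None:
            continue  # stop is not on OpenStreetMap yet, nothing to compare
        for key in keys:
            expected = calculated[stop_ref].get(key)
            actual = osm_tags.get(key)
            if expected != actual:
                report.append((stop_ref, key, expected, actual))
    return report


# Made-up sample data
calculated = {
    "101": {"name": "Station", "route_ref": "1;2;3"},
    "102": {"name": "Markt", "route_ref": "2"},
}
in_osm = {
    "101": {"name": "Station", "route_ref": "1;2"},
    "102": {"name": "Markt", "route_ref": "2"},
}
differences = report_differences(calculated, in_osm)
print(differences)
```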

Work with the data

Creation of a route relation containing all the stops in the correct order

Adding stops is all very well, but they're only a building block of the routes those buses follow. Previously, creating those routes was very time-consuming. Once all the stops and the way they relate to each other are in a database, it becomes possible to extract them in sequence.
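To illustrate the idea of extracting stops in sequence, here is a small sketch that groups rows by route and orders them by a sequence number. The row layout (route id, sequence number, stop id) is an assumption; the actual database schema is not described on this page.

```python
from collections import defaultdict


def stops_in_sequence(rows):
    """Group (route_id, sequence_number, stop_id) rows into ordered stop lists."""
    per_route = defaultdict(list)
    for route_id, sequence_number, stop_id in rows:
        per_route[route_id].append((sequence_number, stop_id))
    return {
        route_id: [stop_id for _, stop_id in sorted(entries)]
        for route_id, entries in per_route.items()
    }


# Made-up rows, deliberately out of order
rows = [
    ("route 2", 2, "stop C"),
    ("route 2", 1, "stop A"),
    ("route 5", 1, "stop B"),
    ("route 2", 3, "stop D"),
]
ordered = stops_in_sequence(rows)
print(ordered["route 2"])
```

Each resulting list can then serve as the ordered member list of a route relation.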

Adding the ways nearest to the stops in the above route relations automatically

Having correct sequences of stops is a tremendous help, but having all the ways next to those stops is even better.

To run the following script, you need to add the scripting plugin to JOSM and install Jython.

#!/bin/jython
'''
FindWaysBelongingToRoutesStartingFromStops.jy
- Given a list of stops, find all ways belonging to the route

This code is released under the GNU General
Public License v2 or later.

The GPL v3 is accessible here:
http://www.gnu.org/licenses/gpl.html

The GPL v2 is accessible here:
http://www.gnu.org/licenses/old-licenses/gpl-2.0.html

It comes with no warranty whatsoever.
'''
from javax.swing import JOptionPane
from org.openstreetmap.josm import Main
import org.openstreetmap.josm.command as Command
import org.openstreetmap.josm.data.osm.Node as Node
import org.openstreetmap.josm.data.osm.Way as Way
import org.openstreetmap.josm.data.osm.Relation as Relation
import org.openstreetmap.josm.data.Bounds as Bounds
import org.openstreetmap.josm.data.osm.visitor.BoundingXYVisitor as BoundingXYVisitor
import org.openstreetmap.josm.data.osm.TagCollection as TagCollection
import org.openstreetmap.josm.data.osm.DataSet as DataSet
import org.openstreetmap.josm.data.osm.RelationMember as RelationMember
import org.openstreetmap.josm.gui.dialogs.relation.DownloadRelationMemberTask as DownloadRelationMemberTask
import org.openstreetmap.josm.actions.DownloadReferrersAction as DownloadReferrersAction
import re, time
import codecs

dummyRelation = Relation(); dummyWay = Way()

sideEffects = {
    'addWayToRoute': None,
    'createStopAreaRelations': None,
    }

logVerbosity = 50
'''
10: only report problems that require attention
20: report on collection
30: report on network nodes
40: report on which routes are being checked
50: report everything
'''

def getMapView():
    if Main.main and Main.main.map:
        return Main.main.map.mapView
    else:
        return None

def findConnectingWay(way1, way2):
    if way1.get('junction') == 'roundabout':
        endnodesway1 = way1.getNodes()
    else:
        endnodesway1 = [way1.getNode(0), way1.getNode(len(way1.getNodes())-1)]
    if way2.get('junction') == 'roundabout':
        endnodesway2 = way2.getNodes()
    else:
        endnodesway2 = [way2.getNode(0), way2.getNode(len(way2.getNodes())-1)]
    for endnode in endnodesway1:
        #print dir(endnode)
        parentways = endnode.getReferrers()
        for parentway in parentways:
            if parentway.getType() == dummyWay.getType():
                if parentway.get('junction') == 'roundabout':
                    endnodeInParentWays = parentway.getNodes()
                else:
                    endnodeInParentWays = [parentway.getNode(0), parentway.getNode(len(parentway.getNodes())-1)]
                for endnodeInParentWay in endnodeInParentWays:
                    if endnodeInParentWay in endnodesway2:
                        return parentway
    return None

def checkPTroute(route, aDownloadWasNeeded):
    if aDownloadWasNeeded:
        return None, False, ''
    print
    waymemberslist = []
    modified = False
    #print dir(mv)
    for member in route.getMembers():
        """Algorithm:
        Is the node a member of a public_transport=stop_area?
            Grab way from stop_area
        Search near to the node for "highway -highway=bus_stop inview type:way -closed"
        If one more than one highways are found:
            Is one of them member of another route=bus relation?
        Also search for "public_transport=stop_position type:node"
        If found: use the parent way
        """
        if member.isNode():
            #print dir(mv)
            node = member.getNode()
            print node.get('name')
            found = False
            for parentRelationOfNode in node.getReferrers():
                if found: break
                if parentRelationOfNode.getType() == dummyRelation.getType():
                    if parentRelationOfNode.get('type') and parentRelationOfNode.get('type') in ('public_transport'):
                        if parentRelationOfNode.get('public_transport') in ('stop_area', 'stop_position'):
                            for member in parentRelationOfNode.getMembers():
                                # now we are sure it's the correct kind of relation, drill down to find parent way of stop_position node
                                if found: break
                                if member.isNode():
                                    stopPositionNodeCandidate = member.getNode()
                                    if stopPositionNodeCandidate.get('public_transport') in ['stop_position']:
                                        for parentWayCandidate in stopPositionNodeCandidate.getReferrers():
                                            if parentWayCandidate.getType() == dummyWay.getType():
                                                print 'connected through stop_area: '
                                                print parentWayCandidate.getKeys()
                                                waymemberslist.append(parentWayCandidate); found = True; break
            if not(found):
                # We couldn't determine the way by means of the stop_area relation
                bboxCalculator = BoundingXYVisitor()
                bboxCalculator.computeBoundingBox([node])
                bboxCalculator.enlargeBoundingBox()
                if bboxCalculator.getBounds():
                    mv.recalculateCenterScale(bboxCalculator)
                #mv.zoomTo(node.getEastNorth())
                ignorelist = [node]
                stopPosition = Node()
                for i in range(1, 20):
                    candidates = mv.getAllNearest(mv.getPoint(node), ignorelist, Way().wayPredicate)
                    if candidates:
                        print len(candidates)
                        #print candidates
                    nodecandidates = mv.getAllNearest(mv.getPoint(node), [], Node().nodePredicate)
                    for candidate in nodecandidates:
                        # is there a stop_position node in the candidates?
                        if candidate.get('public_transport') in ['stop_position']:
                            stopPosition = candidate
                            ignorelist.append(candidate)
                            break
                    for candidate in candidates:
                        if candidate.get('highway') in ['primary', 'secondary', 'tertiary', 'unclassified', 'residential', 'service', 'living_street', 'trunk']:
                            #print candidate
                            #print candidate.getNode(0)
                            #print stopPosition
                            if not(member == route.getMember(0)) and candidate.getNode(0) == stopPosition:
                                continue # there is probably a better candidate which has this way as its end node, instead of as the starting node
                            else:
                                waymemberslist.append(candidate)
                                print 'using '
                                print candidate.getKeys()
                                found = True; break
                        else:
                            ignorelist.append(candidate)
                            print 'ignoring '
                            print candidate.getKeys()
                    if found: break
                    bboxCalculator.enlargeBoundingBox() # zoom out a bit and try again
                    if bboxCalculator.getBounds():
                        mv.recalculateCenterScale(bboxCalculator)
            if not(found):
                print 'Found no suitable candidate way for this stop'
            else:
                # We found a way and added it to the relation, but is this way connected to the previous way we found?
                if len(waymemberslist) > 2:
                    notConnected = True
                    if waymemberslist[-1].get('junction') == 'roundabout':
                        endnodeslatest = waymemberslist[-1].getNodes()
                    else:
                        endnodeslatest = [waymemberslist[-1].getNode(0), waymemberslist[-1].getNode(len(waymemberslist[-1].getNodes())-1)]
                    if waymemberslist[-2].get('junction') == 'roundabout':
                        endnodesprevious = waymemberslist[-2].getNodes()
                    else:
                        endnodesprevious = [waymemberslist[-2].getNode(len(waymemberslist[-2].getNodes())-1), waymemberslist[-2].getNode(0)]
                    for endnodelatest in endnodeslatest:
                        if endnodelatest in endnodesprevious:
                            notConnected = False; break
                    if notConnected:
                        connectingWay = findConnectingWay(waymemberslist[-2], waymemberslist[-1])
                        if connectingWay:
                            waymemberslist.insert(-1, connectingWay)
                        elif False:
                            # Let's look for a relation containing both ways in the proper order
                            for parentrelation in waymemberslist[-2].getReferrers():
                                notThereYet = True
                                #print parentrelation
                                if parentrelation.getType() == dummyRelation.getType():
                                    memberwaysOfParentRelation = []
                                    for member in parentrelation.getMembers():
                                        if member.isWay():
                                            memberwaysOfParentRelation.append(member.getWay())
                                    #print membersOfParentRelation
                                    #print waymemberslist[-1] in memberwaysOfParentRelation
                                    if waymemberslist[-1] in memberwaysOfParentRelation and memberwaysOfParentRelation.index(waymemberslist[-1]) > memberwaysOfParentRelation.index(waymemberslist[-2]):
                                        notThereYet = True
                                        for way in memberwaysOfParentRelation:
                                            #print way
                                            #print waymemberslist[-1]
                                            #print waymemberslist[-2]
                                            if notThereYet:
                                                if way == waymemberslist[-2]:
                                                    notThereYet = False; continue
                                            else:
                                                if way == waymemberslist[-1]:
                                                    break
                                                else:
                                                    waymemberslist.insert(-2, way)
                                if not(notThereYet): break
                    else:
                        print 'ALREADY CONNECTED TO PREVIOUS WAY !!!!!!!!!!'
            #mv.zoomPrevious()
            #mv.repaint()
            #time.sleep(1)
            #print
            #print 'node:', node
            #print 'candidates:', candidates
            #print 'waymemberslist:', waymemberslist
    #waymemberslist.extend(nodememberslist)
    i = 0; newRelation = Relation(route); commandsList = []; previousway = None
    for way in waymemberslist:
        newMember = RelationMember(str(i+1), way)
        if not way == previousway: #not(newMember in newRelation.getMembers()):
            newRelation.addMember(i, newMember)
            i += 1; previousway = way
            modified = True
    #print dir(node)
    #bboxCalculator = BoundingXYVisitor()
    #bboxCalculator.computeBoundingBox([node])
    #print bboxCalculator
    #bboxCalculator.enlargeBoundingBox()
    #if bboxCalculator.getBounds():
    #    mv.recalculateCenterScale(bboxCalculator)
    #mv.zoomTo(node.getEastNorth())
    #candidates = mv.getNearestNodes(mv.getPoint(node),node.nodePredicate)
    if modified:
        commandsList.append(Command.ChangeCommand(route, newRelation))
        Main.main.undoRedo.add(Command.SequenceCommand("Adding ways directly adjacent to stop nodes", commandsList))
        commandsList = []
        modified = False

aDownloadWasNeeded = False
'''
Since Downloading referrers or missing members happens asynchronously in a separate worker thread
the script can run in three modes

1. No downloads allowed/offline run; output mentions that data was incomplete in its reports.
2. Download run; When incomplete items are encountered, they are scheduled to be downloaded. From then on, no more quality checks are performed on the data.
   All hierarchies are still checked, looking for more incomplete data for which more downloads need to be scheduled.
3. Normal run; All data is available and proper reporting can be performed.
'''
dummy_way = Way()
dummy_relation = Relation()

mv = getMapView()
if mv and mv.editLayer and mv.editLayer.data:
    selectedRelations = mv.editLayer.data.getSelectedRelations()

    if not(selectedRelations):
        JOptionPane.showMessageDialog(Main.parent, "Please select a route relation")
    else:
        for relation in selectedRelations:
            if logVerbosity > 49: print relation
            if relation.hasIncompleteMembers():
                if 'downloadIncompleteMembers' in sideEffects:
                    aDownloadWasNeeded = True
                    print 'Downloading referrers for ', str(relation.get('name')), ' ', str(relation.get('note'))
                    DownloadRelationMemberTask.run(DownloadRelationMemberTask(relation, relation.getIncompleteMembers(), mv.editLayer))
                    continue
                else:
                    JOptionPane.showMessageDialog(Main.parent, 'Please download all incomplete member of the relations first')
                    exit()
            relationType = relation.get('type')
            if relationType == 'route':
                checkPTroute(relation, aDownloadWasNeeded)
        if aDownloadWasNeeded:
            JOptionPane.showMessageDialog(Main.parent, 'There was incomplete data and downloading mode was initiated,\nNo further quality checks were performed.\nPlease run the script again when all downloads have completed')
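The connectivity test at the heart of the script can be illustrated outside JOSM. The sketch below is plain Python rather than Jython and models a way as a dictionary holding a list of node ids, which is an assumption made for this example; the JOSM classes are not used. Like the script, it treats every node of a roundabout as a possible connection point and only the first and last node of any other way.

```python
def end_nodes(way):
    """Nodes at which another way may connect to this one."""
    nodes = way["nodes"]
    if way.get("junction") == "roundabout":
        return set(nodes)  # a roundabout can be entered or left at any node
    return {nodes[0], nodes[-1]}


def are_connected(way_a, way_b):
    """True when the two ways share a connection node."""
    return bool(end_nodes(way_a) & end_nodes(way_b))


def find_connecting_way(way_a, way_b, all_ways):
    """Return a way that links way_a to way_b, or None if there is none."""
    for candidate in all_ways:
        if candidate is way_a or candidate is way_b:
            continue
        ends = end_nodes(candidate)
        if ends & end_nodes(way_a) and ends & end_nodes(way_b):
            return candidate
    return None


# Made-up ways: first and last are only linked through the roundabout
first = {"name": "first", "nodes": [1, 2, 3]}
roundabout = {"name": "roundabout", "junction": "roundabout", "nodes": [3, 4, 5, 6, 3]}
last = {"name": "last", "nodes": [5, 7, 8]}
ways = [first, roundabout, last]

link = find_connecting_way(first, last, ways)
print(are_connected(first, last), link["name"])
```

When two consecutive stops are on ways that do not touch, inserting such a linking way between them closes the gap, provided a single way is enough to bridge it.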