I presented some research I’ve carried out at CASA, at the Cycle City conference in Leeds last week. The research shows how the numbers of bikeshare bikes and docking stations have varied between 2010 and 2014, for 46 systems across the world (not all systems have numbers for the whole period of study). The numbers are from the database which backs my live global map.

The work has been written up into a CASA Working Paper (#196). The appendix includes the numbers of bikes and docking stations, for the 46 systems, across eight periods of collection in six-monthly intervals from October 2010. You can view the paper as a PDF by following the link above.

This Place is a visualisation of 2011 Census data for England and Wales, for your local area.

I’ve been meaning to adapt Michal Migurski’s This Tract for the 2011 UK Census, ever since I saw it a couple of years ago showing the 2000 US Census. The clear, clean styling – simply a map of the local area, and a nice table of pie charts – was a world away from the choropleth maps I’ve produced previously. The most striking feature is what’s not there – when you are looking at a particular area, the surrounding areas are blanked out – they don’t distract.

Following the release of fine-grained 2011 Census data at the end of last month, at least for England and Wales, I’ve spent some time getting the data into the equivalent format and also customising the website with UK-specific metrics. The end result is not architected in quite such an elegant way as Michal’s – his version uses geographical information direct from the “official” Census site, courtesy of their web services, and predefined static datafiles, whereas mine makes numerous queries to a local database – so his would scale better, although mine is backed by a decent academic server.

I’ve used different colour ramps for each of the metrics – for ethnicity I used a rainbow-based colour ramp. The attempt is that the “colourfulness” of the wheel shows the ethnic diversity of an area. A fully diverse area will have significant proportions of every colour, creating a “wheel” of colour.

One undocumented feature – you can input the MSOA code (found at the bottom of the page) into the search box, or the URL, to create a weblink specifically for that area. At the moment, my smallest unit geography is MSOA – the size is about right, but the boundaries of MSOA can be very arbitrary. If the data is released at ward level I may well switch to that.

It’s not just Wandsworth and Fulham that will be getting Barclays Cycle Hire in the next year or so when Phase 3 goes live – Hackney and Islington will be getting a few too. The iconic “Boris Bikes” will be heading up Mare Street towards central Hackney – although not quite getting there – plus there’ll be various new docking stations in Haggerston, just north of the Regent’s Canal. There will also be a docking station on Islington Green, and a few around the Canal Museum on Caledonian Road. In all, if planning permission is forthcoming, there will be up to 15 new docking stations, all north of the Regent’s Canal. It’s a modest increase – 3% – but the communities affected will doubtless enjoy the new facility. It’s still a long way south from myself though!

It’s great to see that the system is continuing to expand in all directions – but now the central London demand is being sated, it would be nice if Transport for London relaxed their requirement for docking stations to be within 300m of each other. The most successful bike share systems generally have a dense core and a well spaced out periphery, which accommodates commuters, tourists and locals equally well. I would much rather have the system properly penetrating Zones 2 and 3, even if there’s a 1km gap between each docking station. Then it becomes more useful for the utility users who, unlike the commuters (going from stations to skyscrapers) and tourists (concentrating on the big parks and markets), act as useful re-distributors in their own right by the nature of their diverse journey directions.

Thanks to Loving Dalston for spotting a planning application for the docking station by London Fields. I had a quick trawl through the Hackney and Islington council planning websites to spot the others.

About this time last year, I created a “Map of the Geodemographics of Great Britain” which included the Output Area classifications (OAC) for GB, based on the 2001 Census, and also included the Index of Multiple Deprivation (IMD) for England, published in 2010. At the time, there was no up-to-date equivalent to the IMD for Scotland. However the 2012 SIMD (Scottish IMD) has recently been published, and I’ve applied the resulting datasets to my map, using the same technique of filling in just the buildings, rather than all the land, in the appropriate colour (a red-yellow-green Colorbrewer ramp from most to least deprived).

The SIMD and IMD are calculated in a similar way – by looking at measurements of poverty for each area across several categories (e.g. education, crime, income) – however the details of the way the measures are taken are slightly different between the two countries. Additionally each index is based on the range of deprivation found in that country. This means that the indices should not be directly compared across the two countries, i.e. a dark green area in Scotland only has the same relative level of deprivation as similarly coloured areas in Scotland, not in England. Accordingly, the website does not show the two IMD maps at the same time – there is a toggle at the bottom to switch between the two (and to the OAC). As an example – just because Edinburgh is largely green does not mean that it has the same level of affluence/deprivation, in absolute terms, as a similarly-coloured city in England.

Nonetheless, comparisons within Scotland are perfectly valid, and the differences between the cities are striking – most notably Edinburgh vs Glasgow. See the whole map here.

As always with classifications, remember that they represent an average throughout the geographical area concerned – in Scotland this area is known as a Data Zone, which is similar to an English Output Area (as an aside, the SIMD is more fine-grained than the IMD – the latter uses a more aggregated measure). This means that the colour covering a house is not a measure for that house, simply that that house is within an area where the average SIMD is that value. Also, non-residential buildings get coloured, as the dataset I’m using for the buildings (Ordnance Survey Vector Map District, via the OS Open Data releases) does not distinguish building types. The SIMD of buildings that have no occupants is meaningless, and they are not included in the underlying calculation.

Google has replaced their normal logo with a special “Doodle” for today, celebrating the 150th Anniversary of the London Underground. The graphic is a stylised version of the iconic tube map, with the lines warped to form the letters (and colours) of the normal Google logo, but retaining much of the topological structure of the London tube network.

I was prompted by the excellent Twitter Tongues map, where geolocated tweets in London (including mine, and those from hundreds of thousands of others) were mined by Ed Manley over the summer, and then mapped by James Cheshire, to see where I had left my own Twitter footprint.

Many people would probably be quite alarmed to learn that the data, on the exact locations they have tweeted at – if they’ve allowed geolocation – is freely accessible to anyone, not just themselves, through the Twitter API.

It’s a bit of a faff to get the data – Twitter is starting to roll out a “download my Tweets” option which may make the first few steps here easier – but here’s how I did it.

I used the user_timeline call on the Twitter API, repeatedly, to pull in my last 3200 tweets (the maximum) in batches (“pages”) of 200. The current Twitter API (1.1) requires OAuth authentication – not of the person whose tweets you are mining, but simply yourself, so that rate limits can be correctly applied. Registering a dummy application on Twitter gives access to OAuth credentials, and then using the OAuth tool generates a CURL string that can then be run – the result is put in a file ( > pageX.json), and I do this 16 times to get all 3200 tweets, using the count, page and include_rts parameters. For this particular case, I’m interested in the locations of my own account but – to stress again – you can do this for anyone else’s account, unless their account is protected and you are not a follower.
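As a rough sketch of the paging described above, the snippet below builds the 16 request URLs rather than fetching them. The count, page and include_rts parameters are the ones I mention; the screen_name parameter and the "example_user" value are assumptions for illustration, and each URL would still need to be OAuth-signed (for example via the CURL string from the OAuth tool) before Twitter would accept it.

```python
from urllib.parse import urlencode

# Endpoint for the v1.1 user_timeline call.
TIMELINE_URL = "https://api.twitter.com/1.1/statuses/user_timeline.json"


def timeline_page_urls(screen_name, total=3200, per_page=200):
    """Return one unsigned request URL per page needed to cover `total` tweets."""
    pages = -(-total // per_page)  # ceiling division: 3200 / 200 gives 16 pages
    urls = []
    for page in range(1, pages + 1):
        params = {
            "screen_name": screen_name,  # assumed way of naming the account
            "count": per_page,
            "page": page,
            "include_rts": "true",
        }
        urls.append(TIMELINE_URL + "?" + urlencode(params))
    return urls


if __name__ == "__main__":
    # "example_user" is a placeholder, not a real account I queried.
    for url in timeline_page_urls("example_user"):
        print(url)
```

Each printed URL corresponds to one pageX.json file in the process above.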

The output is a set of JSON files. Lacking a JSON parser, or indeed the skill, I had to do a bit of manual text processing. Those with a flexible JSON parser can therefore skip a few steps. I then merged together the files (cat *.json > combined.txt), and in a text editor, put a line break between each },{"crea and replaced ," with ,^" with the caret being an otherwise unused character.

I opened up the file as a text file (not CSV!) in Excel and did a text-to-column on the caret. I then extracted three columns – the date/time, tweet text, and the first coordinates column that occurred. These were the 1st (A), 4th (D) and 28th (AB) columns. I did further find/replace and text-to-columns to remove the keys and quotes, and split the coordinates column into two columns – lat and long.

I removed all the rows that didn’t have a lat/long location. Out of 3186 tweets (14 fewer than 3200 due to deleted tweets), 268 had a location and were kept. I also added a header row.
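For those who do have a JSON parser to hand, the manual text-editor and Excel steps above could be replaced with something like the following. This is only a sketch of an alternative, not what I actually did: it reads the `coordinates` field of each tweet (a GeoJSON point, so longitude comes first), which may not be the same column I picked out by hand, and the output filename is made up.

```python
import csv
import glob
import json


def geotagged_rows(tweets):
    """Yield (created_at, text, lat, lon) for tweets that carry a point location."""
    for tweet in tweets:
        point = tweet.get("coordinates")
        if not point:
            continue  # no location attached to this tweet, so drop the row
        lon, lat = point["coordinates"]  # GeoJSON order is longitude, latitude
        yield tweet["created_at"], tweet["text"], lat, lon


def write_csv(pattern, out_path):
    """Read every saved page matching `pattern` and write the located tweets."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["created_at", "text", "lat", "lon"])  # header row
        for path in sorted(glob.glob(pattern)):
            with open(path, encoding="utf-8") as page:
                writer.writerows(geotagged_rows(json.load(page)))


if __name__ == "__main__":
    # Assumes the pages were saved as page1.json ... page16.json.
    write_csv("page*.json", "geotagged_tweets.csv")
```

The resulting CSV has the date/time, tweet text, and separate lat and long columns, ready for importing.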

I created a new Google Fusion Table on the Google Drive website, importing in the Excel file from the above step, and assigning the latter two columns to be a two-column location field.

I marked the table as public (viewable with a link). This is necessary as Google doesn’t allow the creation of a map from a private file, except through a paid (business) account. The flip side of course is this gives Google themselves the right of access to the file contents, although I can’t imagine they are particularly interested in this one.

Finally, I added a tab to the Google Fusion Table which was a map tab, and then zoomed in and around and took the screenshots below. The map is zoomable and the points clickable as normal. It should be possible to colour-code the dots by year, if the categories are set up and the relevant part of the datetime field is reformatted in the Excel step above.

The whole process, including some trial-and-error, took a little over an hour – not so bad.

In the images above and below, you can see the results – 268 geolocated tweets over the course of two and a half years from my account – many of them precisely and accurately located.

My rush-hour commute through Paris proved to be slightly more traumatic than planned (I wonder if Parisian visitors find London Underground stations as confusing as I find those on the Paris metro?) but I arrived at the École des Ponts ParisTech in time to hear the workshop organiser introducing the sessions. First up was Pierre Borgnat talking about network analysis of Lyon’s system. I had seen a paper by him on Lyon before, and the popularity and density of Lyon’s system has allowed for a rich and interesting dataset for mining and community detection. The community detection has been done using both spatial and temporal variables. Pierre’s thorough and technical treatment of the data was backed up with some excellent mapping of the data, which you can see above and below.

Next up was Jon Froehlich. Jon’s talk was underpinned by a discussion of the different data sources and types available in the field. He focussed on temporal cluster analysis of the Barcelona bicycle sharing system (below) – a particularly interesting city for me as, along with London and Zurich, it is a case study for the EU project I have recently started working on, EUNOIA. Barcelona’s bicycle sharing system is not unlike London’s, in terms of its size, shape and usage characteristics – although the general downward slope of the city causes headaches for its operator. Jon gets bonus points for including not only a quote from this blog on his presentation, but Martin’s beautiful routed bike-flow animation for London, and Dr Jo Wood’s more recent bi-directional flow animation, again of London.

Etienne Côme, from the hosting school, was next on, with an analysis of the biggest system (outside of China) of all – the Vélib in Paris. The Vélib is perhaps the holy grail of academic research in the field as its size, and Paris’s multiple commercial and residential zones, means that community and network analysis is likely to be eye-opening. Similar to Pierre, Etienne outlined eight detected communities, by looking at temporal variations in the origin-matrix between the 1200-odd stations on the Vélib network.

After lunch, Vincent Aguilera was first on, with a switch away from bicycle sharing systems but showing some techniques that have potential for the field – Vincent looked at using mobile phone network data to detect station dwell times and true journey durations on a section of the RER metro in Paris. He compared this data with Twitter messages with appropriate hashtags (below), and the real-time running supplied by the operator on its website. The availability and structure of the cell-towers on the network allowed a direct comparison to be made – indeed, such data may actually be of better quality than that currently at the operator’s disposal, allowing more fine-tuned operation and monitoring.

Neal Lathia was next with a look at London’s system – specifically the effects caused by the addition of casual (i.e. non-key, non-member) availability in December 2010. The additional option did see some changes in the usage of certain docking stations. The comparison was done by clustering the network’s docking stations by time, before and after the transition, and then seeing which stations changed cluster. One of the main areas of change was in the very heart of London, around the Trafalgar Square area, suggesting a slight shift away from the (still dominating) railway station-based usage patterns.

Fabio Pinelli’s talk was wide-ranging – it included system design, routing for Dublin’s (over)used system, and a look at the reliability of the Vélib fleet.

Finally, Francis Papon from the hosting school took a step back from the modern electronically managed bicycle sharing systems and mobile/social data sources, and looked at change in uses of urban cycling more generally. His dataset stretched over a hundred years, rather than the typically five-year maximum historical range that bicycle sharing systems have. A key trend is that in the largest French cities studied, including Paris, there is a recent (post-2000) renaissance in urban cycling usage, but this is not matched in many of the country’s smaller cities.

The workshop concluded with a general discussion of the research field to date and its direction. What was particularly interesting was that several bike sharing operators were in attendance. They were fully engaged with the academic research being carried out, asking questions but also revealing some nuggets of information about how the systems are rebalanced, relative costs of operations and why they thought some systems were more successful than others.

Hopefully there will be more such workshops in the future in Europe – with UCL CASA, Cambridge, City University London and LSHTM all involved in the field, maybe there should be one taking place in London next year?

Here is a webpage that uses my own CityDashboard API*, to build a Periodic-Table inspired “data artwork” of live London information, as a series of coloured square panels on a website. The squares update regularly with fresh information, and throb red (or blue) if there are particularly extreme values present.

As an artwork, it’s deliberately not 100% clear what it shows. A key on the bottom right will help a bit, but a degree of guesswork will be needed for some of the panels. With a bit of thought, almost all of the panels should be decipherable.

It’s a super-simple webpage. I’m using CSS3 for the animations – no JavaScript used. The page is customised to be most relevant to the CASA office here in central London – the weather station, bike share stands, air quality monitor and variable message road sign have been chosen accordingly. A more sophisticated version – which doesn’t currently exist but would be simple to do – would use a combination of the location information in the CityDashboard feeds, and the HTML5 geolocation functionality of many browsers, to show a version more relevant to where in London the viewer is.

As the page is so simple, it displays well on mobile browsers – on my iPhone, the webpage shows four panels on each row. On larger displays, it will rearrange appropriately. See the acknowledgements link on the page to see where the data’s coming from – the same sources as CityDashboard, including TfL, DEFRA, Yahoo! Finance and Mappiness, as well as CASA’s own sensors.

I created the piece for the ODI’s recent Data as Art installation competition – I didn’t win, but decided to do it anyway.

I’ve made some minor alterations to the CSV API for CityDashboard. The main changes are in the metadata rows (the top two) rather than the subsequent rows. Specifically, the top metadata row has now split out the description, source and source URL – which were previously rather messily combined into a bit of HTML – into three text fields; and the second metadata row now uses properly formatted names for value titles, i.e. including spaces, and units, for example “broken_pc” now becomes “% docks/bikes broken”.
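To illustrate the two-metadata-rows-then-data layout described above, here is a minimal sketch of how a client might read such a feed. The sample text is entirely made up for illustration and is not real CityDashboard output; the exact columns of the real feeds are not reproduced here, and `parse_feed` is a hypothetical helper name.

```python
import csv
import io


def parse_feed(csv_text):
    """Split a feed into (metadata fields, value titles, data rows as dicts)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    metadata, titles, data = rows[0], rows[1], rows[2:]
    # Pair each data row with the human-readable titles from the second row.
    records = [dict(zip(titles, row)) for row in data]
    return metadata, titles, records


# Hypothetical sample: description, source and source URL in the first row,
# properly formatted value titles in the second, then the data rows.
SAMPLE = (
    "Bike share status,Example Source,http://example.com/\n"
    "Station,% docks/bikes broken\n"
    "Example Street,4\n"
)

if __name__ == "__main__":
    metadata, titles, records = parse_feed(SAMPLE)
    print(metadata)
    print(records)
```

Because the titles now include spaces and units, a display client can use them directly as labels without keeping its own lookup table.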

The reason for these changes is to accommodate a new and exciting use of the API here at CASA – our lab hardware specialist has recently been hard at work building an “iPad wall” and one of the visualisations in it is of CityDashboard data. Here’s what the uncompleted – but operational – iPad wall looks like (source):

It’s a physical CityDashboard!

I also took the opportunity to fix a few bugs and typos – mainly just cosmetic, but including a pretty silly one for the Mappiness-sourced data that was over-reporting the true value by a large and variable amount. Entirely my fault. That will serve me right for doing a coding change during a colleague’s Ph.D viva drinks reception! I also handle temporarily unavailable source feeds a little better – they’ll now appear unavailable for one complete update cycle but it means the source server doesn’t get repeatedly hammered until it comes back up again.

In six weeks’ time, London will have a second orbital railway. The Circle Line has been running for just over 100 years, and on 9 December will be joined by the latest addition to Transport for London (TfL)’s Overground network – a link between Clapham Junction in the south-west and Surrey Quays in the south-east. This means that the West London Line, North London Line, East London Line and South London Line will all be linked up (you won’t be able to travel 360 degrees on one train though – you’ll need to change at both Highbury & Islington and Clapham Junction, and often Willesden Junction, to complete a circuit). Should you travel around the complete loop, you’ll pass through areas as varied as Imperial Wharf, Dalston Junction, Whitechapel and Peckham Rye.

Anyway this was a tenuous excuse for me to produce a diagram – above – of London’s TfL-owned network – the Underground, the Overground, the DLR, Tramlink and the Cable Car. Click the graphic for a larger version. My starting principles for the diagram were concentric circles for the orbital sections of the Circle Line and the Overground network, and straight lines for the Central and Piccadilly Lines, with the latter two converging in the centre of the circles. I then squeezed everything else in. I realised that the Northern Line’s Bank branch passed the Circle Line three times so was going to need something special, so I added a sine wave for this section, and extended this north and south as much as possible.

The River Thames is on there – because any tube diagram doesn’t look correct without the river – and the diagram is topologically accurate – everything connects correctly, and features are in an approximately correct geographical position relative to their neighbours, but not to the diagram overall. Only stations that are designated intersections, or have connections with National Rail stations, are shown. I haven’t labelled anything. It’s art.

I pulled together this interactive map of Proposed Constituency Boundary Changes in England, after the information was released by the Boundary Commission for England last week. My colleague James Cheshire highlighted that this kind of map could be illuminating, particularly as the official maps are simple greyscale PDFs of each new constituency boundary, without the old boundaries or adjoining constituencies for context, and with one document per constituency!

Click the image above to go to the interactive map, then use the slider to fade between the current and proposed boundaries. The new boundaries have been put together to have roughly the same populations in each one (72000-80000 people), and also the total number of constituencies has been dropped by around 5-10%. They are just proposed ones, and are themselves revised from an earlier version.

There are some interesting patterns – many urban areas, such as London, have undergone very significant redrawings, while many rural areas – historically with higher constituency populations – remain untouched. For example, Tottenham loses its identity as a single constituency, the southern half being assimilated into Stamford Hill and the northern half into Edmonton. Slough has a big bite taken out of its SW corner, the people here potentially being represented by a Windsor MP in the future. Much of north Yorkshire is unchanged however.

We didn’t use vector-based boundaries here, even though this would have made it more interactive, because of the size of the boundary files – simplifying them to reduce the size would have been tricky (as it would have made unmoved boundaries move slightly) and the necessary simplification might have distorted the boundaries too much.

As with all my more recent web visualisations, social media (Twitter and Facebook) buttons are included, and geolocation is used to default the view to the user’s location, if they are in England.

On a technical note, this is my first pure HTML5 map. It also takes advantage of simpler ways of setting up maps in the latest release of OpenLayers, 2.12. It means it is out-of-the-box compatible with mobile browsers, and the HTML, JavaScript (including a JQueryUI slider) and CSS adds up to less than 200 lines of code – the only other code used being a couple of Mapnik XML stylesheets for rendering the two maps themselves.

I was at the V&A earlier today to see Prism, a new installation by digital artist Keiichi Matsuda which is part of the London Design Festival.

Prism uses data from UCL CASA’s CityDashboard and other London open data sources, to visualise London in a novel way. The exhibit, which consists of triangular sails joined together in an irregular pattern, and lit from within, slowly pulses and evolves as the data that the patterns and colours are showing changes. The visualisations are derived from fast-changing weather, travel and other London data sources. There is no key at all so you have to use your imagination to hypothesise what each panel is showing – although a couple have TfL roundels and bike share bikes on them, hinting at their purpose. Prism’s shape and positioning makes it look slightly organic, as it appears to be about to burst through the floor and into the gallery space below.

Seeing Prism is a bit of a mission – it requires first going to the sixth floor of the V&A – not immediately obvious to find – then signing a disclaimer, ascending – in small groups of just 6 – a tiny spiral staircase. You then move across a narrow ledge, before finally you enter the darkened room. Prism is suspended in the middle, allowing a 360-degree inspection, and also a glimpse of the galleries beneath. Another spiral staircase, in one corner, then allows visitors to get a different, surprise view.

If you want to see Prism you need to book a timed ticket (free) in advance, and be aware it’s only on for the next 10 days. If you don’t manage to get a ticket, you can still see a glimpse of the base of Prism, as it is suspended over one of the galleries on the sixth floor of the museum.

The tour showed some of the treasures of the Map Room, including the world’s first printed colour map, proofs of the world’s largest atlas, and a fragile nested set of globes; followed by a walk through the huge, industrial map storage facility in the bottom basement underneath the British Library (the Northern Line could be heard rumbling above!) and a quick look in the Map Reading Room. Some of the older maps of (real) places look like they are straight out of a fantasy novel – presumably the latter being heavily influenced by the former. A good example is above.

Thanks to the SoC for organising and the Curator of Antiquities for showing us around.

I presented on the Mapping London blog, at the Society of Cartographers’ 48th Annual Conference which was at UCL this year, showing a general outline of the blog and some maps featured on it, plus some work done by James and me. My presentation is here (6MB PDF). Note that the attribution for the many maps featured on the presentation is at the end.

After Google abruptly turned off their XML weather feed this week, I’ve switched to using Yahoo! Weather (an RSS feed) for the CityDashboard weather forecast module. Yahoo uses WOEIDs rather than city names, which takes a bit longer to configure but is unambiguous – Google just used the city name, so required careful specification to get Birmingham (UK) weather rather than Birmingham (Alabama, US) weather, for example. Google’s feed was undocumented (so, strictly, private) but was widely used on other websites.

I’m using the weather icons (which link to the codes supplied by Yahoo) from the WeatherIcon project.

Many of the bike share operators whose systems I’ve mapped have accounts on Twitter – but do they use them to reply to customers, notify of system changes, or just tweet promotional measures? Have they built up an appropriately large set of followers? Do they tweet often? An active Twitter account is good customer service, one that replies to queries is great customer service! (N.B. Google has translated the Velib conversation above from French.)

There are 24 operators for which I was able to find a relevant Twitter account. The following table shows how they use it. This does of course leave several hundred other operators (many very small) for whom I could not find an account.

^ = Account also handles smaller bike share systems in other cities.
^^ = Account also handles other public transport in the city.

Operators get a star for being on Twitter, another for having more followers than bikes on the street, another for replying directly to at least some user queries on Twitter, another for tweeting at least some system issues and other “bad news”, and another for having made at least a couple of tweets in the last 48 hours.
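The star scheme above is just a count of five yes/no tests, which can be written down as a short sketch. The field names are my own for illustration, the sample numbers are invented rather than taken from the table, and “at least a couple of tweets” is read here as two or more.

```python
from dataclasses import dataclass


@dataclass
class OperatorAccount:
    """Observations about one operator; all field names are illustrative."""
    has_account: bool
    followers: int
    bikes: int
    replies_to_users: bool
    tweets_bad_news: bool
    tweets_last_48h: int


def stars(op):
    """Return the number of stars (0 to 5) an operator earns."""
    if not op.has_account:
        return 0  # no account means none of the other tests can apply
    criteria = [
        True,                      # being on Twitter at all
        op.followers > op.bikes,   # more followers than bikes on the street
        op.replies_to_users,       # replies directly to some user queries
        op.tweets_bad_news,        # tweets system issues and other bad news
        op.tweets_last_48h >= 2,   # at least a couple of recent tweets
    ]
    return sum(criteria)


if __name__ == "__main__":
    # Invented numbers, purely to show the function being called.
    example = OperatorAccount(True, 9000, 8000, True, False, 3)
    print(stars(example))
```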

Large (500+ bike) systems with no active official Twitter account that appear on bikes.oobrien.com: Brisbane, Luxembourg City, Lyon, Milan, Nice, Saragossa, Valencia and Vienna. Not including Chinese or South Korean systems as Twitter appears to not be widely adopted in these countries, at least in terms of official transport accounts. Metrics were measured on 21 August 2012.

I was in Vancouver at the end of June for the Velo-City conference – which is the cycling industry’s conference on bike sharing and urban cycling.

The lead sponsors were PBSC who are behind the technology for many of the larger North American systems (Montreal, Minneapolis, Washington DC) so there was a strong bike sharing theme through the conference, and they had a prominent stand with bikes in the various scheme liveries. The stand also had a couple of design updates from the ones you see in London and elsewhere – a “totem-pole” for capturing sunlight to provide power, and a slot that takes credit cards as well as the existing key-fobs. There is no indication that these updates will be making it to London anytime soon though. B-Cycle, who supply and run many of the smaller systems in the US (e.g. Denver, San Antonio) also had a stand with their own Trek-built bikes, which have a distinctly different look.

I presented on my Bike Sharing Map showing the detail for various cities around the world. It was the middle part of a 90 minute presentation at three geographical scales – the first segment given by Russell Meddin on his global map of bike shares, and the last segment being given by Andrea Beatty on detailed information available for a single city through mobile apps.

I also sat in on several other presentations – some of the most interesting being given by far-eastern presenters, particularly the Chinese. This is because China has 7 of the largest 10 bike sharing systems in the whole world, but getting information on them can be difficult, so it was interesting to find out the information from people on the ground.

One of the most interesting talks focused on the modal shift in Chinese and Western cities – many of the former are shifting from bicycle to car, while the rise of the bus in Western cities was illustrated by the contrast between Thatcher’s 1986 “A man who, beyond the age of 26, finds himself on a bus can count himself a failure” and the appearance of a red London bus in Beijing during the 2008 Olympics closing ceremony! In China, lanes that were once dedicated for bicycles have been turned over to extra space for cars.

My personal highlight was being able to borrow one of the PBSC bikes and take it for a spin around Stanley Park – a lovely circuit and on a very pleasant afternoon. Apparently Vancouver does not have very many rain free days in the summer, but it was warm and sunny throughout my stay.

Vancouver is itself getting a bike sharing system, probably next summer. Vancouver’s existing cycle infrastructure is brilliant – properly segregated cycle lanes, with planters and cycle parking to separate the lanes from the cars. The operator will have its work cut out for the scheme to be a success though – helmets are required by law in Vancouver. There was some talk of a system where every docking station comes with a helmet vending machine, and on return the helmet gets safety-checked and automatically cleaned ready for the next user.

Thanks to Russell and Paul for letting me crash in the apartment they rented, B-Cycle for covering my conference fee, and CASA for flying me there.

There’s a new, temporary panel on the London CityDashboard which shows Twitter activity at the London 2012 venues. The panel is using data from new Twitter collector tools in the Big Data Toolkit, which is being developed by my colleague Steven James Gray as part of his PhD.

For each venue, the collectors count the number of Tweets in the last hour that have latitude/longitude information stamped on them, that are located within an area radiating around the centre of each stadium or arena. Unfortunately this excludes the majority of relevant tweets, as most mobile Twitter applications don’t include this information by default – stadium designs can also interfere with the accuracy of the GPS on mobile phones – when I was in the Velodrome for a test event, my iPhone was convinced I was in, ironically, Beijing, and nothing could be done to convince it otherwise.
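The counting described above can be sketched roughly as follows. This is not the Big Data Toolkit code, just a minimal illustration assuming each tweet has already been reduced to a (time, latitude, longitude) tuple; the great-circle (haversine) distance test, the radius and the centre coordinates used in the usage example are all my own assumptions.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_recent_nearby(tweets, centre, radius_km, now):
    """Count tweets from the last hour that lie within radius_km of centre."""
    cutoff = now - timedelta(hours=1)
    total = 0
    for when, lat, lon in tweets:
        if lat is None or lon is None:
            continue  # most tweets carry no coordinates and are skipped
        if when < cutoff:
            continue  # older than one hour
        if haversine_km(centre[0], centre[1], lat, lon) <= radius_km:
            total += 1
    return total


if __name__ == "__main__":
    now = datetime(2012, 8, 1, 12, 0)
    centre = (51.5386, -0.0166)  # illustrative point, not a surveyed venue centre
    tweets = [
        (now - timedelta(minutes=10), 51.5390, -0.0160),
        (now - timedelta(hours=2), 51.5390, -0.0160),
        (now - timedelta(minutes=5), None, None),
    ]
    print(count_recent_nearby(tweets, centre, 0.5, now))
```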

Nonetheless, the tweets that the collectors do manage to capture still give an indication of how lively and busy each venue is. A collector covering the whole Olympic Park is also included – this includes the venues within the park and also the various promenades and green areas. Most people, before or after visiting the venue they have tickets for, are remaining in the wider park.

On the way we discovered an obscure Twitter bug: including a search radius that spreads across the Prime Meridian (0 degrees longitude) causes an error to appear from Twitter – fixing the centre of the search point on the Meridian itself works around this bug. Until we spotted the bug, the Greenwich Park collector was always reporting zero, as the Meridian line goes through the park.
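The workaround can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, the metres-per-degree figure is an approximation, and the coordinates in the usage lines are rough values chosen for the example rather than the collectors' real settings.

```python
import math

M_PER_DEG_LAT = 111_195  # approximate metres per degree of latitude


def straddles_prime_meridian(lat, lon, radius_m):
    """True if a circle of radius_m around (lat, lon) crosses 0 degrees
    longitude without being centred exactly on it."""
    half_width_deg = radius_m / (M_PER_DEG_LAT * math.cos(math.radians(lat)))
    return lon != 0 and abs(lon) < half_width_deg


def safe_search_centre(lat, lon, radius_m):
    """Snap the search centre onto the Meridian when the circle would
    otherwise spread across it; leave other centres untouched."""
    if straddles_prime_meridian(lat, lon, radius_m):
        return lat, 0.0
    return lat, lon


# Approximate, illustrative coordinates.
print(safe_search_centre(51.4769, 0.0005, 500))   # close to the Meridian
print(safe_search_centre(51.5265, -0.0780, 500))  # several km to the west
```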

After the Olympics, we hope to reuse the collectors to give an indication of Twitter activity in certain key London hotspots, such as Shoreditch and Covent Garden. Potentially, we would be able to include a similar panel for the other seven UK cities on CityDashboard.

Yes! It is possible! There may not be any Barclays Cycle Hire docking stations in the Olympic Park itself, possibly due to “Barclays” not being the official financial services provider of the Olympics but more likely because of the logistics of rebalancing flows to/from major events and the safety aspects of a crowded space, but that doesn’t mean you cannot “Boris Bike” to near the edge of the park. Even better, you get to use one of the two quieter entrances to the park, avoiding the huge queues and crowd mechanics of the approach from Stratford through Westfield.

The above map is adapted from my live docking station map and shows the nearest docking stations to these two park entrances. Cycle to these docking stations, leave the bike at one of them, and then follow the arrows to walk the final kilometre or so.

Victoria Gate (west entrance). The docking stations on Old Ford Road and Roman Road are not far away, and these generally have plenty of spaces during the day, filling up in the evening as commuters return home – so if you are journeying to them to visit the park, you have a good chance of finding a free space, and similarly there should be bikes for you to hire on your return in the evening.

Greenway Gate (south entrance). This is the route for people walking from West Ham station – but this is a long walk, and you might as well walk from the nearest Barclays Cycle Hire docking stations which are about the same distance away – on Bow Road and Bromley High Street. However you do have to cross the notoriously unpleasant Bow Roundabout, which has no pedestrian crossings, to be able to pass along Stratford High Street. Also, these docking stations have generally been full during the day in recent days, suggesting some people are already using this route.

Both entrances are likely to be quick ways into and out from the park. If you have your own bike, there is a large secure cycle park in Victoria Park, from where you can walk to Victoria Gate.

There are several Olympic venues in Central London, which can therefore also be approached by Barclays Cycle Hire bikes, but be warned TfL is removing the docking stations that are very near, or inside, the venues themselves. A full list is here.

Background map based on OpenStreetMap data and designed by The Guardian.

This is my Twitter social graph. Click on the graphic to see a larger version.

Key

The font sizes for the names correspond to the number of followers, while the colour ramp (light grey to yellow to blue) is proportional to the number of listings per follower. That is, someone who has a small number of followers, but has been listed by many of those people (and others) will appear bright blue. This is designed to be a very simple measure of value and influence – you can have a small number of followers, but if many of those have considered you to be an authority in a subject (and are themselves switched on enough to know about Twitter listing) then you can be considered to be a more influential Twitterer. I bet you most of the “celebrity” accounts will therefore score poorly here, while experts will be picked out. Bad luck BTTowerLondon.

How this Compares to other Social Graphs

To make the graph, I have taken the subset of people who both follow me and whom I follow back. I’ve then looked at connections between these people. Doing this in Twitter is a similar idea to what has been done in Facebook and LinkedIn before except that:

The groups that appear will be quite different to what appear in Facebook. Facebook is a social network for friends, whereas Twitter is more of a social network for interests.

Twitter’s connections are asymmetric (you can follow people who won’t follow you back, and vice versa) which means you have to think about exactly what you are mapping.

It’s much more of a fiddle in Twitter because you have to query each person’s connections separately.

Twitter’s rate limits (for unauthenticated connections) are aggressive – a maximum of 150 requests an hour from a single IP. Luckily I have access to nine Linux machines which run my Python scripts nicely.

The lack of the equivalent of Facebook “apps” that do this kind of visualisation automatically means you have to do it yourself. I produced the visualisation in Gephi, which is powerful but tricky to get to grips with.

There is one great thing though:

You can build up these kinds of visualisations for anyone, not just yourself, as the raw information is accessible to anyone.
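The selection of nodes and edges described above can be sketched in Python as follows. This is an illustration rather than the scripts actually used: the `follows` dictionary stands in for data fetched from Twitter, the account names are invented, and the time estimate assumes (hypothetically) one request per account.

```python
import math


def mutual_follow_graph(me, follows):
    """Build the graph of mutual followers.

    follows maps each account to the set of accounts it follows.
    Nodes are accounts that follow `me` and that `me` follows back.
    Edges are directed follow links between those nodes, since
    Twitter connections are asymmetric.
    """
    i_follow = follows.get(me, set())
    nodes = {a for a in i_follow if me in follows.get(a, set())}
    edges = {(a, b)
             for a in nodes
             for b in follows.get(a, set())
             if b in nodes}
    return nodes, edges


def hours_needed(n_accounts, machines, limit_per_hour=150):
    """Hours to query n_accounts at 150 requests an hour per IP,
    assuming one request per account (a simplification)."""
    return math.ceil(n_accounts / (limit_per_hour * machines))


# Invented sample data.
follows = {
    "me": {"alice", "bob", "carol", "official"},
    "alice": {"me", "bob"},
    "bob": {"me"},
    "carol": {"me", "alice"},
    "official": set(),  # an account that does not follow back
}
nodes, edges = mutual_follow_graph("me", follows)
print(sorted(nodes))
print(sorted(edges))
print(hours_needed(3000, machines=9))
```

The resulting node and edge lists could then be written out in a format that Gephi can import.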

Community Classification

My Twitter network is more homogeneous than I thought – a big blob of tech/geo, with the orienteers forming the main breakaway group, and some slender strands of friends on either side. Networks of friends which don’t share any connections with the other groups will not be connected at all and will float away.

Below is a hand-done, rough community classification. Again, please click for a larger, more readable version. If I pulled in more of the metadata (profile and qualitative/quantitative) from Twitter for each person, then this could probably be done automatically – enough people in the CASA cluster, for instance, will mention CASA on their profiles, for it to be detectable, showing such people as CASA-linked even if they don’t say so themselves.

A – The Neogeo (Geography+Technology) community

B – OpenStreetMappers in London and elsewhere

C – The Open Data movement

D – Data visualisation and data journalism

E – UCL CASA, UCL Geography and associates

F – London general

G – East London

H – Running

I – Orienteering

J – Non-techy friends

K – Techy friends

L – An unlinked group of non-techy friends. There are a couple of other such groups.

M – People unconnected to themselves and the others

N – Bike share operators

The last group is small – I follow a lot more of them, but generally these “official” accounts don’t follow back.

[Updated] I’ll be presenting at Velo-City in Vancouver later this week. Velo-City is the “world’s premier cycling planning conference”. It is likely to have a significant bike-sharing flavour – the lead sponsor being PBSC which designed the 6000-odd “Boris Bikes” (aka Barclays Cycle Hire bikes) that are a distinctive sight in central London, as well as equivalent systems in Montreal, Washington DC, Minneapolis, Boston and (shortly) New York City – known generically as Bixi bikes. Vancouver does not have a bike-sharing system of its own, but PBSC have imported a whole load of their Montreal bikes for delegates to borrow for the week, although a recent collar-bone break means I unfortunately won’t be taking up the offer. I did however spot a PBSC/Bixi bike “in the wild” in Vancouver’s beautiful Stanley Park – see above.

I’ll be talking about some new insights into bike-sharing cities worldwide that have been revealed by my Bike Share Map, as part of a three-part presentation on looking at bike-sharing cities at different scales – my co-presenters being the author of the Bike Sharing World Map, and the software developer behind the B-Cycle bike sharing systems.

My presentation is on Wednesday morning (Pacific time) and I’ll write/tweet about it on the day, wifi-access permitting.

To prepare for the presentation, I’ve added a few new cities to the Bike Share Map: Suzhou, Zhongshan, Wujiang, Shaoxing and Heihe in China; and Kanazawa in Japan. One early insight coming from these new maps could be that the Chinese really do work hard (if you excuse the gross overgeneralisation) – typically 11 hours between morning and evening commuter peaks, and seven days a week!

Heihe is shown below – it’s right on the Russian border, opposite a much larger Russian city – hence the Cyrillic (although no bridges across the river near there!)

Note that, in the maps of the Chinese systems, the docking station locations are slightly misaligned with the background maps because of location obfuscation carried out by that country – I’m using OpenLayers rather than the Chinese-based map service that corrects for the errors. The resulting offset is typically only 1-400m though so you can still get a good idea of the shape and size of each system.

Here is the API documentation for CityDashboard. It’s really not a very advanced API, and it’s not delivered in a “proper” format (e.g. XML or JSON), instead it’s available as a number of CSV/TXT-formatted files. It ain’t pretty but it works!

I’ve put together this documentation as a number of people have asked. However, it should still be considered to be a “private” API, so could change or break at any time, for one of three likely reasons:

I make a change to the API structure. If it’s a big change, I will attempt to preserve the old structure, possibly by using an extra parameter (e.g. &v=2) to indicate the new one.

Our server goes down. Certainly not inconceivable!

One of the upstream data providers changes their feed format in such a way that it causes the CityDashboard data to freeze up. Again, quite likely, particularly as generally I don’t have a formal agreement with the upstream organisations.

The CSV format will be most useful, as the HTML and “blob” formats are specifically designed for the CityDashboard website. However, many of the modules don’t (yet) have a CSV format feed available – a blank page will instead be returned.

The first line in each CSV file contains a number in the second field. This is the number of seconds between each update, i.e. if this is 5, then the file won’t update more than once every 5 seconds.

Modules which have a CSV feed for them, have an “m” included in the sixth field in the appropriate row in the london.txt file (typical values, d, db, dbm etc)
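To show how a client might read these two pieces of information, here is a minimal Python sketch. The sample file contents are invented, and the comma delimiter and the module name sitting in the first field of london.txt are assumptions; only the "second field of the first line" and "m in the sixth field" rules come from the description above.

```python
import csv
import io


def update_interval_seconds(csv_text):
    """Return the update interval held in the second field of line one."""
    first_row = next(csv.reader(io.StringIO(csv_text)))
    return int(first_row[1])


def modules_with_csv_feed(module_list_text):
    """Return the first field of every row whose sixth field contains 'm'."""
    names = []
    for row in csv.reader(io.StringIO(module_list_text)):
        if len(row) >= 6 and "m" in row[5]:
            names.append(row[0])  # assumption: first field names the module
    return names


# Invented sample contents, shaped to match the rules described above.
SAMPLE_FEED = "Timestamp,5\nname,value\ntemp,12\n"
SAMPLE_MODULES = "weather,a,b,c,d,dbm\ntube,a,b,c,d,db\n"

print(update_interval_seconds(SAMPLE_FEED))
print(modules_with_csv_feed(SAMPLE_MODULES))
```

A polite client would wait at least the returned number of seconds before requesting the same file again.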

By the way, the module list will most likely be changing very soon to add a couple of important fields that I overlooked – first of all, the source URL will be in a field of its own, and secondly I will add in a proper attribution statement for each source.

This produces the minimum bikes number for each day, which is great, but the timestamp included is just the first one of each day (in fact it could be a randomly chosen timestamp from within the day, but MySQL’s internal logic happens to pick the first one out). This is because the time(timestamp) is not part of the “group by” (aggregate) clause, and all fields in a query must be included in the group by unless they are part of the aggregate. I don’t want to aggregate the time(timestamp) though – I want the value associated with the minimum bikes, rather than the maximum, minimum or average (etc) value.

It’s the second solution from the above link. There is one problem, where if there are multiple rows in a day that share the same min(bikes) value, they each appear. Using distinct won’t get rid of these, because the time(timestamp) does vary. The fix is to use an additional wrapper (tables co3) to eliminate these duplicate rows:
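A minimal sketch of this pattern is shown here in Python, using the built-in sqlite3 module so that it runs on its own. It is an illustration under assumptions rather than the original MySQL query: the `bikes_log` table, its columns, the `ties` alias and the sample rows are all invented, and ties are resolved by keeping the earliest time in each day.

```python
import sqlite3

# Inner join: find each day's minimum, then pull back the rows that hit it.
# Outer wrapper ("ties"): collapse rows sharing the same daily minimum.
QUERY = """
SELECT day, MIN(t) AS first_time, bikes
FROM (
    SELECT date(b.timestamp) AS day,
           time(b.timestamp) AS t,
           b.bikes AS bikes
    FROM bikes_log AS b
    JOIN (
        SELECT date(timestamp) AS day, MIN(bikes) AS min_bikes
        FROM bikes_log
        GROUP BY date(timestamp)
    ) AS m
      ON date(b.timestamp) = m.day AND b.bikes = m.min_bikes
) AS ties
GROUP BY day, bikes
ORDER BY day
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bikes_log (timestamp TEXT, bikes INTEGER)")
conn.executemany(
    "INSERT INTO bikes_log VALUES (?, ?)",
    [
        ("2012-05-01 08:00:00", 12),
        ("2012-05-01 09:00:00", 3),   # daily minimum
        ("2012-05-01 12:00:00", 7),
        ("2012-05-01 17:00:00", 3),   # same minimum again later in the day
        ("2012-05-02 08:30:00", 5),   # daily minimum
        ("2012-05-02 18:00:00", 9),
    ],
)
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each day appears once, with the time at which the minimum was first reached, even though 1 May hit its minimum twice.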

New York City last week released a preliminary map showing the proposed sites for the launch of its bike sharing scheme, now named Citi Bike (with Citigroup being the lead sponsor along with Mastercard).

Citigroup’s sponsorship is crucial for the scheme, which has promised no public subsidy on at least operating costs, and is a rather convenient sponsor in terms of its name. In several other cities around the world, their bike share schemes are known as City Bikes, such as Stockholm and Vienna, so Citi Bike has a good chance of becoming the “on the street” name for the scheme, unlike the unwieldy “Barclays Cycle Hire” name we have here in London – most people here know them as the snappier, if politically incorrect “Boris Bikes”.

NYC’s scheme is clearly influenced by London’s – it’s of a similar size, it has a big sponsor from financial services and a mayor fully behind it, and a Boris Bike from London even appears on the front cover of the NYC DOT presentation to communities. The technology used is the same and the bikes are also the same design.

The stand sizes and descriptions are also from the official map, and I’ve simulated the empty/full status of each stand, based on the distance from Wall Street and random perturbation. This results in just under 7000 bikes, based on a roughly 1:1 empty to occupied stand ratio, which is fairly standard around the world.

New York vs London

| | New York | London |
|---|---|---|
| System Name | Citi Bikes | Barclays Cycle Hire |
| Bicycle design | Devinci/PBSC | Devinci/PBSC |
| Operator | Alta Bicycle Share | Serco |
| Lead sponsor deal | $41m over 5 years | £25m ($40m) over 5 years |
| Bikes (at launch) | 7000* | 4200 |
| Docks (at launch) | 13639 | 7685 |
| Stations (at launch) | 413 | 345 |
| Largest station size | 128 | 126 |
| Average station size | 33 | 19 |
| Ratio bike:docks | 1:1.95 | 1:1.83 |
| System footprint (at launch) | 53 km² | 42 km² |
| Annual membership | $95 | £45 ($72) |
| 24 hour membership | $4 | £1 ($1.60) |
| Max free journey time (24h member) | 30 minutes | 30 minutes |
| Max free journey time (annual member) | 45 minutes | 30 minutes |
| Single metro journey (smartcard) | $2.25 | £2** ($3.20) |
| Single metro journey (cash) | $2.50 | £4.30 ($6.90) |

* Announced figure. Actual figure may be less due to bikes in maintenance and temporary storage. London’s equivalent figure was 6000 bikes.

** Zone 1 only. Cost higher if travelling to Zone 2 (which has bike share bikes in it). Cost lower if only in Zone 2.

What stands out for me, when comparing New York’s and London’s bike share schemes, which are roughly similar in terms of number of bikes and stands, is that NY’s footprint is similar in size to London’s at launch, but with many more bikes, and the scheme is accordingly more dense. Certainly, New York will have the critical mass of stand locations to allow the scheme to work efficiently – you’ll never have to travel very far, if your destination stand is full, to find another one.

The other thing that strikes me is that all the stands are quite big – very few of them have fewer than 20 docks. The biggest, on Pershing Square (by Grand Central Station), has 128 docks – this is ever so slightly larger than our own “superdock” at Waterloo Station and presumably designed with a similar purpose of satisfying the commuter “tide”. The other big commuter station in NYC, Penn Station, has three large docks surrounding it. The coverage is also fairly uniform; my only surprise is that there are only two docks in Battery City, which is surely full of people likely to use the scheme – or perhaps they just walk to work? Also there are none in Central Park – although perhaps these will be included in the Upper East/Upper West areas for next year’s expansion?

One big difference is the fee structure – at $10 a day but only $95 a year, this suggests that tourists and public-transport-based commuters are the target users, rather than local residents and errand users. This is a pity – the latter group tends to have more heterogeneous usage flows and helps “mix” the scheme up and redistribute it organically, requiring less redistribution of bikes by the operator’s trucks.

$10 is four times the cost of a New York subway trip ($2.50 with Metrocard), so you would need to make at least four journeys a day to save money. In London, our tube in Zone 1 is £2 per journey (with Oystercard), against £1 a day on the Boris Bikes, so I often end up using the latter simply on cost, even for one journey. The extra cost for journeys over 30 minutes is also significantly more – $4 compared with £1 here. Subscribers get 45 minutes free rather than 30 minutes. This gives those commuters a chance to travel further in the busy rush hour – although surely this increases the redistribution challenge even further.

NYC’s CitiBikes are thinking big, and the design of the scheme suggests that it is expected to be wildly successful at launch. Hopefully this will prove to be the case!

I was at the third WhereCampEU “unconference” which took place in Amsterdam over the last weekend of April, following previous editions in London and Berlin which I was also at. The meeting was an ideal opportunity for me to feature CityDashboard which I unveiled at the CASA Smart Cities conference a week before, and to show a couple of the items that were popular at the exhibition that accompanied Smart Cities – namely the London Data Table and PigeonSim.

Amsterdam proved to be a challenging city (financially) to visit for the conference, as it was the weekend before Queen’s Day – which is essentially a massive party throughout central Amsterdam, resulting in expensive transport to get there and all the central hotels being booked up or extremely pricy. So it was that I ended up on the outskirts of the city, overlooking a motorway, although this did mean I got to use the very fast and efficient metro service into town each day. Pre-conference drinks were held upstairs in De Waag, the oldest non-religious building in Amsterdam and a fantastically atmospheric venue. The conference venue was a short walk from here.

To get to Amsterdam I took the Eurostar to Brussels, spent an hour and a half cycling around the city on one of the Villo bike-share bikes, and then got another high-speed train to Amsterdam. A nice way to see the countryside, but it did take six hours in total. My return was a 40-minute flight.

Unconferences have no set speaker schedule, but instead participants put a post-it note with their talk title on a grid of times and rooms, and everyone looks at the grid to determine what to go to next. The plan had been to present early on the Saturday and then just relax and enjoy the rest of the meeting, but the Saturday grid was very quickly full, and it wasn’t until Sunday lunchtime that I was able to squeeze in my talk. Although 26 minutes of my 30 minute slot was spent on CityDashboard, most of the tweeted photos were of PigeonSim (that I squeezed in the last four minutes) and my attempts at demonstrating the flying gestures…

There was as usual a wide range of geo and tech talks, one of the most unusual being a psychogeography session with Tim Waters – this unexpectedly involved a practical where we went out in groups and followed and observed pedestrians going about their business (an initial “meta” idea to follow the followers having been vetoed by Tim). I also enjoyed Jeremy Morley’s update on the OSM-GB project at Nottingham to quantify the quality of OpenStreetMap in the UK, and Peter Miller’s peek at a 2.5D rendering of OSM data. Peter also showed behind the scenes of ITO Map’s map layer scripts. These produce simple overlays highlighting particular OpenStreetMap content, and were the inspiration for similar functionality I incorporated into GEMMA. Finally, a short Geo-yoga (mimicking the shapes of countries) session was certainly an eye-opener. Parallel sessions meant I missed some more interesting talks, including one from Google on why Google can work with OSM.

Thanks to all the organisers for putting on another excellent, and free, WhereCampEU!

I’m using a manually created colour ramp instead of a “standard” (i.e. ColorBrewer) diverging or sequential ramp, to emphasise the outliers (the big, expensive Band H houses and the small, cheap Band A ones) and try and reduce the “patchwork quilt” effect that you get when looking at such a map (which has nearly 170000 areas.) Another way to minimise this effect would have been to use larger geographies (LSOAs and MSOAs) at the smaller scales.

The map shows a swathe of light blue Band A housing across the north of England, and in Birmingham. In London, generally this doesn’t happen, and indeed a band of very large, expensive houses protrudes from the affluent commuter belt right into the centre of London, from the south-west and north.

The map was created using UCL CASA’s MapTube, with a CSV file, descriptor file and stylesheet being the inputs. Welsh council tax bands use a different scale so are not included here. The Scotland/N.I. data is not available through the ONS website.

A gotcha when producing this map is that the file uses the new (2011) identifiers for OAs. Thankfully I found a file that maps the old to the new ones, although it took a bit of sleuthing to find it on the ONS website.

The London Data Table was one of my personal favourites from the exhibition accompanying the CASA “Smart Cities” conference which took place at the University of London last Friday. The concept was thought up by Steven Gray and it consists of a wooden table, cut by programmable lathe into the outline of London. A special “short throw” projector with a fish-eye lens was purchased. It was mounted vertically on a converted basketball hoop stand, pointing downwards and outwards, allowing the content to be approached and examined without the projector getting in the way. Steven has blogged about the construction process.

I created a generic dark grey background map (from Ordnance Survey OpenData) with a blue River Thames as the main identifying feature. This was used by several authors, including myself, to create either Processing “sketches” in Java, or pre-recorded videos, which were displayed on the table during the exhibition. A simple Javascript script running on Node.JS was written to automatically cycle through the animations.

By ensuring that the background map and accompanying sketches/videos were “pixel perfect”, we were able to take advantage of having control of every individual pixel, producing the quite pleasing pixellated effect as seen in the below closeup of one of the sketches (a photo taken of a part of the table) – it is showing a bike share station animation that I created, based on the same data that powers the equivalent website.

The photo above shows the table running another Processing sketch, showing point information from CityDashboard and similar to the map view on the website, except that points are randomly and automatically selected to be displayed, as people stand beside and watch the table.

The most interesting sketch presented on the table (and shown on the right – photo by Helen) was built by Steven Gray and connected to an airplane sensor box that picked up near-real-time broadcasts of location, speed and aircraft ID of planes flying over London. The sketch stored recently received information, and so was able to project little images of planes, orientated correctly and with trails showing their recent path. Attached to each plane image was a readout of height and speed, and most innovatively of all, a QR code was programmatically generated and rendered behind each plane, allowing smartphone users to scan it. QR codes are normally encoded URLs, and these ones were set to point to a flight information website, with the aircraft’s details preloaded, showing a photo, and the origin and destination at a glance.

The QR codes were able to be made very small – using a single projector pixel per QR code pixel and little error correction. Various smoothing and blurring digital effects were switched off, and a digital connection between computer and projector was used, to allow the sharpest possible representation. As a result, my iPhone was able to tell me more about the planes I was seeing fly, in near real time, around the table.

Here are the colour ramps I am using for numeric measures in the recently launched CityDashboard (which by the way now has a new URL – http://citydashboard.org/):

The colours have been designed to be clearly distinguishable from the white text that is on top of them.

Here is the PHP code that I’m using to choose the appropriate colour for each measure, and which I also used to produce the above ramps – the reverse colour and bad value handling is only implemented where I currently needed, ideally these would be implemented for all the ramps:
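The general idea of picking a colour from a ramp can be sketched as below. This is a Python illustration rather than the PHP used on the site: the function name, the linear interpolation between stops, the grey "bad value" colour and the three sample stops are all assumptions, not the actual CityDashboard ramps.

```python
import math


def ramp_colour(value, lo, hi, stops, reverse=False, bad_colour="#808080"):
    """Return a hex colour for value on a ramp running from lo to hi.

    stops is a list of at least two (r, g, b) tuples spread evenly along
    the ramp. Values outside lo..hi are clamped to the ends. Missing or
    non-numeric values get bad_colour.
    """
    try:
        v = float(value)
    except (TypeError, ValueError):
        return bad_colour
    if math.isnan(v) or hi == lo:
        return bad_colour

    t = min(1.0, max(0.0, (v - lo) / (hi - lo)))
    if reverse:  # for measures where a high number is the "good" end
        t = 1.0 - t

    pos = t * (len(stops) - 1)
    i = min(int(pos), len(stops) - 2)  # index of the lower stop
    frac = pos - i
    a, b = stops[i], stops[i + 1]
    rgb = [round(a[k] + (b[k] - a[k]) * frac) for k in range(3)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical three-stop ramp: blue, then green, then red.
STOPS = [(0, 0, 255), (0, 128, 0), (255, 0, 0)]
print(ramp_colour(0, 0, 10, STOPS))
print(ramp_colour(5, 0, 10, STOPS))
print(ramp_colour(10, 0, 10, STOPS, reverse=True))
print(ramp_colour(None, 0, 10, STOPS))
```

Keeping every stop fairly dark or saturated is what would preserve contrast with white text placed on top.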

CityDashboard is the main project that I have been working on for the last few months. It aims to summarise quantitative data (both officially provided and crowd-sourced) for the major UK cities, in a single screen. Point data is also shown in an alternate map view.

It was launched at the CASA Smart Cities conference last Friday, for eight cities – London, Cardiff, Edinburgh, Glasgow, Manchester, Leeds, Birmingham and Newcastle. London has the most dashboard “modules” at present, with a number of London-specific modules from Transport for London, the Port of London Authority, and CASA’s own sensors. Other cities have several more generic modules (such as weather and Twitter trends) and more city-specific modules will be added to these in due course. I am also looking at improving the overall look and feel of the website, possibly by using the BBC Glow API that was suggested to me at the conference (but just now took me half an hour to find on the web!)

CityDashboard features specially curated Twitter lists. For each city, there is a general news list, featuring tweets from local newspapers, local correspondents for the BBC and other TV and radio channels, tourist organisations and the official accounts for the relevant local authorities. There is also a universities list, with the official Twitter accounts for the main universities in each city, as well as their student unions. It is hoped that this latter list will detail the latest university research outputs coming out of that city. The account that manages the lists is CityDB and the lists take the form of, for example, http://twitter.com/citydb/london and http://twitter.com/citydb/london-uni. Anyone can subscribe to these lists; you don’t have to only view them through CityDashboard.

There will be an exhibition at the conference, some people in the department have been building some very cool things which will be unveiled there. Unfortunately I’m not allowed to talk about the very coolest one of all, but I have been allowed to post the above graphic which has got something to do with it…

(If you want to have a guess at what it is, leave a comment!)

[Update 3/4 - Tickets are sold out, however I think an extra batch will be available soon.]
[Update 13/4 - A few more tickets now available.]

As planned, Tower Hamlets (east London) and Shepherd’s Bush (west London) saw a big expansion of bike share docking stations, overnight last Wednesday night. There’s also been some incremental additions to the existing zone, and a build-out of Camden Town in the days leading to the “big bang” expansion.

So where are the new docks? As Diamond Geezer has noted, all four compass points have received new bike share docking stations recently.

The map below shows (in colour) the new docking stations – those that were installed since 1 January 2012, and are currently operational. The old ones are in grey.

The numbers:

| Area | New Docking Stations |
|---|---|
| East | 91 |
| West | 9 |
| North | 5 |
| South | 6 |
| Central | 27 |
| TOTAL | 138 |

| | Docking Stations | Stands |
|---|---|---|
| Old (2011-) | 410 | 3937 |
| New (2012+) | 138 | 10071 |
| TOTAL | 548 | 14008 |

There are a few more “ghost” docking stations that appear on the map. These are old docking stations that have been decommissioned, or new ones that were recently in testing and so appeared on the official map. TfL have promised that an additional 10-15 docking stations will go in in early April.

The Barclays Cycle Hire bikesharing system (map) in London is due for a major expansion on 8 March. Overnight on the 7th, operators will be working flat out to add 1900 new bikes into 3400 new stands, clustered in the 150 new docking stations that have been tested over the last few weeks, across the East End of London (Tower Hamlets, Shoreditch/Hackney Road and Canary Wharf). Also going live the same day will be a much smaller expansion west to the area round the Westfield shopping centre in Shepherd’s Bush in West London. Another small expansion around Camden Town has just been completed, adding several new stands to the northern tip of the system, including handily around Camden Town tube station and Camden Road train station, allowing commuters from the north and the Overground network (like me!) to avoid the expensive Zone 1 fares and Boris Bike the last few kilometres to work.

The expansion will move London up the league table of bike share cities from 7th to 5th – in a top 20 dominated by China. It will remain the second largest system outside of China, after Paris, although New York’s planned system will be even larger:

I’ve recently been extracting some river geometries for major cities around the world. The data needs to be a list of latitude/longitude coordinates, representing the nodes on the shape for the river concerned.

1. Extract the data from OpenStreetMap. Use the Export function, and draw out the area concerned with a bounding box. Choose OpenStreetMap XML as the format. I originally tried SVG, but this presents you with screen coordinates instead of latitude/longitude pairs.

2. Open the resulting file in Quantum GIS (QGIS). I used QGIS 1.9. You need the OpenStreetMap plugin installed; this will allow the OSM file that was created in Step 1 to be read straight in (in fact you could download the file directly from the OSM servers, if you wanted to).

3. Select the feature you are interested in. My river (actually a waterbank polygon) is a “hairy feature” as it extends well beyond the extent of the data that was downloaded. Make sure you are selecting it (feature turns yellow) rather than highlighting it for feature information (feature turns red). Otherwise, the subsequent file is, rather unhelpfully, blank.

4. Do Layer > Save Selection as Vector File. Choose “KML” as the format. You probably don’t need to change the coordinate reference system (CRS) as the data will already be in WGS 84, and this (“normal” GPS-style latitude/longitude) is the CRS you want.

5. Edit the resulting file, removing the XML tags, and header/footer, and replace spaces with return characters, to leave a long list of latitude/longitudes, ready for importing into your visualisation code.
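The manual editing in step 5 could also be scripted. The following Python sketch is one way to do it, under assumptions: it reads every `coordinates` element in the KML, relies on KML storing each point as longitude,latitude with an optional altitude, and uses an invented two-point sample in place of a real export.

```python
import xml.etree.ElementTree as ET


def kml_to_latlon(kml_text):
    """Return a list of (lat, lon) tuples from all <coordinates> elements."""
    root = ET.fromstring(kml_text)
    points = []
    for elem in root.iter():
        # Compare the local tag name so any KML namespace is accepted.
        if elem.tag.rsplit("}", 1)[-1] != "coordinates" or not elem.text:
            continue
        for token in elem.text.split():
            lon, lat = token.split(",")[:2]  # KML order is lon,lat[,alt]
            points.append((float(lat), float(lon)))
    return points


# Invented sample standing in for the file saved from QGIS.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2"><Document><Placemark>
<LineString><coordinates>-0.1276,51.5072 -0.1200,51.5080,0</coordinates></LineString>
</Placemark></Document></kml>"""

for lat, lon in kml_to_latlon(SAMPLE_KML):
    print(lat, lon)
```

The output is one latitude/longitude pair per line, which matches the list that step 5 produces by hand.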

Rank Clocks are a type of visualisation invented by Prof Michael Batty here at UCL CASA. They are time-based line charts, wrapped around a clockface – with the start date at the top, wrapping around clockwise to the end date. The lines on the clock show the change in ranking of the items being visualised. By effectively wrapping a line chart around itself, certain patterns, that would be otherwise hard to spot, become clearer.

Starting from Prof Batty’s Rank Clocks application (written in VB), I created a web version that has a subset of the application’s features, but also includes a map, allowing both temporal rank changes, and location, to be shown. A future enhancement would also be to show the change in location with time as well (an example would be how football clubs have moved around in London over the years and how their relative rank in the leagues has also varied) but for now each item in the dataset has just a single point location that remains constant with time.

The “classic” Rank Clock is of New York skyscrapers – looking at the clock allows bursts of skyscraper development to be easily spotted, and as New Yorkers have been building skyscrapers for over a hundred years, and have many of them, it is a rich dataset. I have curated a London equivalent from various sources including Wikipedia. It includes the many residential towerblocks of the 1960s/1970s, many now knocked down, but is not quite the same as New York’s.

The website is written in Javascript, using OpenLayers both for the map (with OpenStreetMap background) and for the rank clock itself. For the rank clock, I am doing some basic trigonometry to calculate the coordinates needed to show the lines and converting from polar coordinates to “native” screen coordinates. This is a novel but not particularly efficient use of OpenLayers, but I used it as I am quite familiar with using OpenLayers, particularly for showing lines as vectors, rather than using a Javascript vector-based charting API which would be the more obvious choice.
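The polar-to-screen conversion can be sketched as follows. This is a Python illustration of the trigonometry rather than the site's Javascript: the function name, the linear mapping of rank to radius and the sample numbers are assumptions. It places the start date at the top, runs clockwise, and puts the top-ranked items nearest the centre.

```python
import math


def rank_clock_point(t, rank, t_start, t_end, max_rank, centre, radius):
    """Convert a (time, rank) pair to screen (x, y) coordinates.

    Time maps to an angle: t_start is at the top, increasing clockwise.
    Rank maps to distance from the centre: rank 1 sits close to the
    middle, max_rank sits on the rim.
    """
    frac = (t - t_start) / (t_end - t_start)
    angle = 2 * math.pi * frac
    r = radius * rank / max_rank
    x = centre[0] + r * math.sin(angle)
    y = centre[1] - r * math.cos(angle)  # screen y grows downwards
    return x, y


# Invented example: a clock covering 2000 to 2010 with ten ranks.
centre, radius = (200, 200), 100
print(rank_clock_point(2000, 10, 2000, 2010, 10, centre, radius))    # top
print(rank_clock_point(2002.5, 10, 2000, 2010, 10, centre, radius))  # right
print(rank_clock_point(2005, 1, 2000, 2010, 10, centre, radius))     # near centre
```

Joining the points for one item across successive dates gives its line on the clock.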

My interpretation of the Rank Clock concept has plenty of flaws – in particular, data can often be easily obscured, and spotting patterns in noisy (frequently changing rank) data is difficult. It’s difficult even to select lines (to see their caption) if other lines are nearby and overlaying them. Nonetheless, it can provide an unusual way of looking at some interesting datasets.

For one of the datasets in the sample website (US baby names) I have repurposed the map to effectively show a 2D graph indicating beginning and ending (in time) positions of the names – so here OpenLayers is being used to show two “maps” – but neither is actually a map.

I’ve also linked into the Google Earth browser plugin (installation may be required), replacing each dot on the OpenStreetMap map with a column of varying height (and colour) based on the initial rank, with an extent appropriate to the data set. Google Earth can be refreshed by supplying new KML information – and it turns out that OpenLayers has a rather nice KML conversion and export feature for any geometry in it, which allows Google Earth to be driven in this way. This is done when clicking on a Rank Clock line, allowing the equivalent feature in Google Earth to be redrawn with a thicker border. Unfortunately events cannot be captured from Google Earth and passed back into the OpenLayers map, so clicking on a pillar in the former will not highlight the corresponding Rank Clock line in the latter. Still, it’s a nice way of linking spatiotemporal information and then visualising it in 3D.

I carried this work out quite a while ago, but haven’t mentioned it until now, as it’s not complete. There are only a limited number of datasets available, and plenty more features could be added – and the navigation and interaction improved significantly. Please bear this in mind when viewing the live site.

There are a few “toy” features already though – you can invert the rank clock (normally the top-ranked items are in the middle of the circle and so are hard to see), change the metric the colour is showing, and filter and relayer.

The three rank clocks shown here are showing: TOP – Changes in population of the London Boroughs of Newham and Tower Hamlets, and the City of London, over 150 years. The City of London line spirals outwards, showing its drop in population (and so rank). Tower Hamlets also shows a big drop in rank during WWII, but has started to increase again recently. Westminster’s population rank has steadily increased, until WWII – but again its rank has also more recently increased. MIDDLE – Tall buildings in London, coloured by the year they were built. The oldest (red) buildings have been selected and are shown in Google Earth, showing that such buildings were entirely in the centre and west of London. BOTTOM – US company revenue. The San-Francisco-headquartered companies are selected on the map and correspondingly highlighted on the rank clock, showing that only one was founded before the 1970s – IBM – and a general spiralling inwards as Silicon Valley grows.

I’ve created a new visualisation, a dasymetric map of housing demographics which you can see here, which attempts to improve on the common thematic (a.k.a. choropleth) maps – a traditional example is shown below – where areas across the country are colour-coded according to some attribute. My visualisation clips the colour-coding to the building outlines in each area, leaving open ground, parks etc uncoloured.

The Traditional Approach

The shortcoming of choropleth maps is that each area is coloured uniformly. If the attribute being measured is a property of the houses in that area, such as much of the census data, then choropleth maps not only colour the houses in each area, but also the parks, rivers and mountains that might also be contained within the area, even though the data being displayed arguably only applies to the houses. This means that geodemographic classification results that predominate in rural areas tend to overwhelm a map at smaller scales – as can be seen in the map on the right – where the green represents a countryside geodemographic.

An alternative to choropleth maps is to use cartograms. These distort the area, elastically, to tessellating hexagonal groups or to circles (Dorling cartograms), typically to match population rather than geographic extent, so that the colours are represented more fairly. However, cartograms are very difficult for most people to interpret and relate to familiar physical features. They can look very “alien”. One further alternative is dot distribution maps – these assign dots of colour randomly within each area. This correctly reduces the colour density in sparsely populated areas, but distributes the dots evenly across empty parks and rows of houses, if both are in a single area, and implies single points of population.

Clipping the Choropleth Maps

My visualisation attempts to be the best of both worlds, by retaining the familiar geographic shape of the UK and its towns and cities, but not swamping the map with colours in all areas, and indeed ensuring that unpopulated areas have no colour. This is possible because Ordnance Survey Open Data includes Vector Map District. The second release of this dataset improved the quality of building outlines considerably, allowing distinct rows of buildings on streets to be seen and even individual detached houses. Unfortunately building classifications are not included, so the process necessarily colours all buildings, rather than just the residential ones that formed part of the census data. This is why, for example, the Millennium Dome in Greenwich appears, even though no one (hopefully!) lives there.

The major shortcoming of doing this is that it falsely implies a higher level of precision within each Output Area, by often showing and colouring individual buildings, whereas the colour represents an average of the properties in the area concerned, rather than telling you something about that particular building itself. That is, the technique shows no new or more detailed data than can be seen in the traditional choropleth maps, but tends to mislead the viewer into thinking otherwise. This is balanced by making the map seem more realistic, by not uniformly covering everything in the area with a giant blob of a single colour.

The map can be considered to be a dasymetric map, albeit one where the spatial qualifier, population density, is one of two values – high (in a building) or zero (not in a building).
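The two-value idea can be sketched on a toy grid. This is a simplified raster illustration only: the real map clips vector building outlines, and the function name, the grid layout and the colour names here are all assumptions made for the example.

```python
def clip_choropleth(area_ids, building_mask, area_colours, blank=None):
    """Colour only the cells that fall inside a building.

    area_ids:      grid of area identifiers (one per cell)
    building_mask: grid of booleans, True where a cell is inside a building
    area_colours:  mapping from area identifier to that area's single colour
    """
    clipped = []
    for id_row, mask_row in zip(area_ids, building_mask):
        clipped.append([
            # Density is "high" in a building (use the area colour)
            # or zero elsewhere (leave the cell blank).
            area_colours[area] if in_building else blank
            for area, in_building in zip(id_row, mask_row)
        ])
    return clipped

# Hypothetical 2 x 3 grid: area "A" has three building cells, area "B" has none.
area_ids = [
    ["A", "A", "B"],
    ["A", "A", "B"],
]
building_mask = [
    [True, False, False],
    [True, True, False],
]
area_colours = {"A": "red", "B": "green"}

result = clip_choropleth(area_ids, building_mask, area_colours)
```

Area "B" has no buildings, so its colour never appears, which is the behaviour wanted for unpopulated countryside. Every coloured cell in "A" still carries the same area-wide value, which is the implied-precision caveat described above.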

Booth’s Poverty Map

An inspiration for this kind of map is the Charles Booth Poverty Map of 1898-9, although my example is considerably less sophisticated. For this map, Booth (and his assistants) visited every house, to determine the demographic of the house, and then painstakingly coloured in the houses, along the streets. His map therefore did not suffer from the falsely implied accuracy – his map really was as accurate as it looks. The Museum of London, incidentally, has a “walk in” Booth poverty map, which I featured on the Mapping London blog last year.

The photo above compares Booth’s map (from a photo of the map in the Museum exhibition, including a friend’s hand) with my map, for the Hackney area in London.

OAC, IMD and London

My main geodemographic map shows the OAC (Output Area Classification), which was created by Dan Vickers in Sheffield in 2005, and is based on data from the 2001 census. The areas used are Output Areas. There are around 210,000 of them in the UK, each one with a population of roughly 250 people in 2001.

The OAC map is not particularly illuminating for London – the capital is considerably more ethnically diverse than most other parts of the country, but because the clustering process used to create OAC is run across the whole country uniformly, only one Supergroup appears to show such ethnically diverse areas – “7″ (Multicultural), rather than showing the variety within this group that extends across the capital. With this in mind I have created an alternative map, which colours the housing according to the IMD (Index of Multiple Deprivation) rankings. This covers England only, and the data is only available at larger spatial units, called LSOAs (Lower Super Output Areas), but is more up-to-date, being from 2010, and shows considerably more variety across London. Use the link at the bottom of the visualisation to switch between the two.

You can view the map here. It uses geolocation to attempt to zoom to your local area, if you allow it to – it will probably ask you to allow this when you visit the site.

Transport for London (TfL) take their colours extremely seriously – the London Underground, in particular, uses colour extensively to brand each line, and the maps and liveries are very well known.

The organisation has a colour guide to ensure that, when referencing the tube lines, the correct colour is used. Somewhat surprisingly, the guide includes hexadecimal (i.e. web) colours for only a “safe” palette – i.e. colours which would definitely work in very old web browsers. They don’t list the “true” hexadecimal for the colours, even though, confusingly, the colour shown is the true one. I couldn’t find anywhere on the web that did this either, all in one place, so here below is a summary. I’ve also included the safe colours so you can see the difference – but don’t use these unless you have to.
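To show how far a “safe” colour can drift from a true one, here is a minimal sketch that snaps a hexadecimal colour to the nearest web-safe value, where each channel is restricted to 00, 33, 66, 99, CC or FF. The function name is my own, the input colour is an arbitrary illustrative value rather than one taken from the TfL guide, and the guide's own safe values may have been chosen by a different method.

```python
def to_web_safe(hex_colour):
    """Snap a #RRGGBB colour to the nearest web-safe colour.

    Web-safe channels are multiples of 0x33 (51), so each channel
    is rounded to the nearest multiple of 51.
    """
    value = hex_colour.lstrip("#")
    # Split into red, green and blue channels (0-255 each).
    channels = [int(value[i:i + 2], 16) for i in (0, 2, 4)]
    # Round each channel to the nearest of 0, 51, 102, 153, 204, 255.
    safe = [51 * round(c / 51) for c in channels]
    return "#" + "".join(f"{c:02X}" for c in safe)

# Arbitrary example colour, not an official TfL value.
print(to_web_safe("#1C3F94"))  # prints #333399
```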

The Mappiness project is run by one of CASA’s technology superstars Dr George MacKerron – it was his Ph.D project at LSE. The project, which is still going, aims to quantify happiness based on environmental factors, such as location, views and sound, as well as who people are with and what they are doing. Data is collected by volunteers downloading an iPhone app, which then pings them at random moments twice a day between 8am and 11pm (configurable) to ask them the questions and collect the data. Volunteer incentive is driven by having access to a personal webpage which contains all their collected data, visualised in a wealth of attractive graphs and maps.

I’ve been using the app since late October. It has been steadily pinging me twice a day since then, and most of the time I hear the familiar ‘ding ding’ and get around to recording the information. With around 160 responses, some interesting insights are now appearing, some(!) of which are non-personal enough to share here. The map above shows the locations where I was pinged, for the London area – yellow stars indicate where a photo was taken.

Here’s one, based on the general environment:

Perhaps more interesting is that I spend much less time outdoors than I thought. The app (by default) only asks for a picture if you are outdoors, so by counting the number of pictures that appear on my personal webpage – just 14 out of 161 – this in theory means that I spend only 8-9% of my waking life outside. This percentage will hopefully grow as summer approaches and things start to warm up again.

Because I don’t get to choose when to post the images, the photos are a good snapshot of my “everyday” outdoor view, rather than a nice or interesting place that I would specifically stop to photograph. Here’s a couple of my most recent ones:

One of Dr MacKerron’s current projects involves using Microsoft Kinect sensors for visualisation – this is my very tenuous link to allow me to post the image below, which is a 3D grid “photograph” of me at my desk, constructed from Kinect data.

Mappiness managed to choose to ping me this morning precisely at the moment that my bike chain snapped, on the way to work. Needless to say, a low score for happiness was recorded.

Capital Bikeshare, the bike sharing system for Washington DC and Arlington, recently released the data on their first 1.3 million journeys. Boston’s Hubway bike sharing system also released journey data for around 5000 journeys across an October weekend, as part of a visualisation competition. Both these data releases sit alongside London’s Barclays Cycle Hire scheme, which also released data on around 3.2 million journeys made during the first part of last year.

Taking all these data sets together, I’ve used Routino and OpenStreetMap data to suggest likely routes taken for each recorded journey. This same set of data was used for Martin Zaltz Austwick’s excellent animation of bikes going around London streets. I’ve then built another set of data, a node/edge list, showing how many bike sharing bikes have probably travelled along each section of road. Finally, I’ve used node/edge visualiser Gephi and its Geo Layout plugin to visualise the sets of edges. The resulting maps are presented below without embellishment, contextual information, scale or legend (for which I apologise – unfortunately this isn’t my current primary work focus so my time on it is restricted).
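The edge-counting step can be sketched as below. This is a minimal illustration, not the script actually used: it assumes each routed journey is already a list of node IDs, treats a road section as undirected (so travel in either direction counts towards the same edge), and writes a Source/Target/Weight table of the kind Gephi can import as an edge list. The function names and the sample routes are made up for the example.

```python
from collections import Counter
import csv
import io

def count_edge_usage(routes):
    """Count how many routed journeys pass along each section of road."""
    usage = Counter()
    for route in routes:
        # Pair each node with the next one along the route.
        for a, b in zip(route, route[1:]):
            # Assumption: direction is ignored, so (a, b) and (b, a) share a key.
            edge = (a, b) if a <= b else (b, a)
            usage[edge] += 1
    return usage

def edges_to_csv(usage):
    """Render the counts as a CSV edge list with Source, Target and Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), weight in sorted(usage.items()):
        writer.writerow([a, b, weight])
    return buffer.getvalue()

# Three hypothetical journeys, expressed as sequences of node IDs.
routes = [[1, 2, 3], [3, 2, 4], [1, 2, 3]]
usage = count_edge_usage(routes)
edge_csv = edges_to_csv(usage)
```

The weight column is what would drive line thickness and colour in the visualisation. The nodes also need their coordinates attached separately so that they can be laid out geographically.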

For the two American schemes featured here, I have set the Routino profiler to not use trunk roads. Unlike most UK trunk roads, American trunk roads (“freeways”?) appear to be almost as big as our motorways, and I expect you wouldn’t find bikes on them. Unfortunately there are some gaps in the Washington DC data, which does show some cycle-lane bridges alongside such freeways, but these aren’t always connected to roads at either end or to other parts of the cycle network, so my router doesn’t discover them. This means that only a few crossings between Virginia and Washington DC are shown, whereas more direct ones are actually likely to be in use too. The profile also over-rewards cycleways – yes, these are popular, but probably not quite as popular as the distinctive one in the centre of Washington DC (15th Street North West), showing up as a very fat red line, suggests. Reports of other errors are welcome in the comments on this post. I may optimise the profiler (or even edit OpenStreetMap a bit, if appropriate) and have another shot.

…it sounds like one heck of a lot of running. But Murray Strain, one of Scotland’s top terrain runners, is counting on it for his basic training. He’s logging the whole venture, which is based on his trusty Edinburgh A-Z. If two adjacent streets with very similar names are nonetheless separated in the A-Z index by one on the far side of the city, it means a couple of legs right across the city.

Since he started the exercise last year Murray’s got through all the As, and is currently midway through the Bs. I’ve produced a couple of GEMMA maps, one showing the A-Bs (above, As are red and Bs are orange) and one showing the A-Gs (below, in rainbow order). That’s a lot of streets. N.B. The maps in fact show all linear features in the area in OpenStreetMap, so the odd named cycleway and waterway has crept in there too. But ~95% of the coloured lines will be streets that Murray will run.

In order to produce the map, I’ve added a new feature to GEMMA – it now allows you to specify only one desired geometry type, i.e. points, lines OR polygons, when adding an OpenStreetMap layer to your map. Previously, you got all three types, although you could reduce each to a dot if desired. This example also highlights the need for legends on the PDF maps that GEMMA produces – a larger coding change, so one that would make it into a future version 2 of GEMMA.