This is returning results with too much emphasis on distance and not enough on how long ago the activity at that distance was. So I tried a few things:

Restrict to activity only in the past 4 hours instead of a week (not bad! helps quite a bit.)

Restrict to activity within 1000 miles (maybe restricts too much? This left me with no results for Missouri, where I write this, but there have been no quakes near here recently.)
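To make the two restrictions concrete, here’s a rough pure-Python sketch of the same idea. The real app does this through the GeoDjango ORM; the function names, the dict layout for a quake, and the haversine distance used here are all assumptions for illustration only.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, good enough for a sketch


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def recent_nearby(quakes, lat, lon, now,
                  max_age=timedelta(hours=4), max_miles=1000):
    """Keep quakes that are both recent and close, nearest first.

    `quakes` is a hypothetical list of dicts with "time", "lat", "lon".
    """
    hits = []
    for quake in quakes:
        if now - quake["time"] > max_age:
            continue  # too old to explain what the user just felt
        distance = haversine_miles(lat, lon, quake["lat"], quake["lon"])
        if distance <= max_miles:
            hits.append((distance, quake))
    hits.sort(key=lambda pair: pair[0])
    return [quake for _, quake in hits]
```

An empty list coming back is the case where the app would answer “Nope.”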

Between them I got a pretty usable app. If it can’t find anything, “Nope.” is returned. For now, if anything was found it returns a list. If we wanted we could hook up GeoIP as a fallback, but since I’m playgrounding this on ep.io I don’t have the C extensions needed to use the module and have to call it good with just HTML5.

With that we’ve built a small app with a data source, views, templates, and a “real” purpose all the way through. I hope it’s been helpful for several of you. I apologize that some things wound up basically being repeated from earlier topics, but hopefully after covering them in depth back then it was easier to follow along as something was built from them.

In the spirit of the holiday (in the US we’re celebrating Thanksgiving) I thought I’d offer my appreciation and thanks to the projects I use and lean on to make my applications work. This isn’t a canonical list; I’m sure I’ve forgotten libraries here and there. And regardless of whether I use your code right now or not, thanks for releasing it where I can someday if it does something I need. I contribute everything I can when I can: everybody else has contributed so much that makes me more productive that I want to follow suit.

If you blog, I’d ask you to consider something along these lines of your own. As software authors there are lots of instances where our code is doing things we have no idea about until we’re told.

We’ve got our quakes app installed (update it if you’re following along at home, I committed a bunch of code changes for this entry) and our initial data imported.

However, we need this data to be updated somewhat regularly for the service to be useful. Sean shipped django-quakes with Celery support, which I commented out of the requirements file because I’m running this app on the free tier of the neat ep.io auto-scaling webapp service and don’t have Celery access. If you run Celery, just re-enable it. If you don’t use Celery or even know what it is (check it out!), you can use cron to schedule it just as well.

To avoid polling too often and bugging the USGS I’d set the interval at hourly or more under normal circumstances. We only care about quakes when our users do. I’d run it at least daily though so you keep history.

Since we care when our users do and earthquakes can’t be convinced to ONLY hit at certain minutes after the hour, what do we do? How about we check the latest quake we have and see if it’s beyond a certain age to force a refresh?

This is more of that “please don’t do this” code – yes, it works, but it’s bad. If you’re like me and want the site to auto-retrieve at a faster schedule when users are using it, you might be tempted to do this:
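In rough outline, the tempting version amounts to something like the sketch below. The names (`naive_refresh`, `fetch`, `MAX_AGE`) and the 15-minute threshold are made up for illustration, and plain Python stands in for the Django view.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(minutes=15)  # assumed threshold, tune to taste


def naive_refresh(latest_quake_time, now, fetch):
    """Poll inline whenever the newest stored quake looks stale.

    `fetch` is a hypothetical callable that re-imports the USGS feed.
    Returns True if a poll was triggered.
    """
    if latest_quake_time is None or now - latest_quake_time > MAX_AGE:
        # Every concurrent request that reaches this line polls again.
        fetch()
        return True
    return False
```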

This is bad. Why? Our webapp isn’t a single-file line. When there’s an earthquake, people start going nuts. The resulting traffic will be somewhat of a flood, so imagine a bunch of these views all running at once. USGS will not be happy with us, and we’re doing all kinds of extra processing that just gets thrown out. (NOTE: If you’re trying this in the default dev server, it won’t work – it’s single-threaded so both will never be going at once.)

So what’s our next thought? Lock files (virtually, in cache instead of filesystem)!
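A minimal sketch of the cache-as-lock idea, leaning on the fact that Django’s `cache.add()` only stores a key if it isn’t already there and tells you whether it did. `DictCache` is a throwaway stand-in so the example runs without Django; `poll_with_lock`, the key name, and the timeout are assumptions.

```python
class DictCache:
    """Tiny stand-in for Django's cache; only add() and delete().

    Unlike a real cache backend, this one ignores the timeout.
    """

    def __init__(self):
        self._data = {}

    def add(self, key, value, timeout=None):
        if key in self._data:
            return False  # someone else already holds the key
        self._data[key] = value
        return True

    def delete(self, key):
        self._data.pop(key, None)


def poll_with_lock(cache, fetch, lock_key="quake-poll-lock", timeout=300):
    """Run fetch() only if nobody else holds the lock.

    Returns True if this caller did the poll, False if it was skipped.
    """
    if not cache.add(lock_key, "locked", timeout):
        return False
    try:
        fetch()
    finally:
        cache.delete(lock_key)  # release even if the poll blew up
    return True
```

With a real cache backend the timeout doubles as a safety net: if a process dies mid-poll, the lock expires on its own.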

We’re still not quite there. After all, at the time of this writing there hasn’t been an earthquake in 45 minutes – so every access kicks off a poll. Let’s add caching to store the last time a poll was done.
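The extra check is small. Here’s a sketch of just the decision, assuming the last poll time is kept somewhere like the cache (Django’s `cache.get()` / `cache.set()` would do); `should_poll` and the five-minute interval are hypothetical.

```python
from datetime import datetime, timedelta

MIN_POLL_INTERVAL = timedelta(minutes=5)  # assumed, pick your own


def should_poll(last_poll_time, now, min_interval=MIN_POLL_INTERVAL):
    """True if we have never polled, or the last poll is old enough.

    This looks at when we last asked USGS, not at the age of the newest
    quake, so a quiet stretch doesn't turn every page view into a poll.
    """
    return last_poll_time is None or now - last_poll_time >= min_interval
```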

But what about people who hit the site while a poll is in progress? How about a time.sleep(0.5) loop?

Did you just throw your mouse? I dodged it. Yeah, that’s a bad idea. After all, it does tie up our Python processes at the moment when we’ve got hordes of eager, shaken (pun only slightly intended) users ringing the doorbell.

So what’s a programmer to do? Let’s set a context flag in the view and use it to trigger something in the template about “we just heard, we’re checking into it!” alongside what we already have.

We’ve discussed all sorts of things this month, now it’s time to start putting them together more completely than snippet-for-this, snippet-for-that. I’m not going to inline command + output on everything but I am going to spell out all of the steps – if you have trouble, let me know in the comments.

To accomplish this we’re going to build a simple web service I’ve snagged “isthatanearthquake.com” for.

First order of business is creating the project.

Because virtualenv and virtualenvwrapper rock (`pip install virtualenv virtualenvwrapper`), I started with `mkvirtualenv isthatanearthquake`. My global virtualenv `postactivate` script includes the line `cd $VIRTUAL_ENV`, which drops me in the env whenever I activate it. Next I created a `src` directory, created and cd’ed into `isthatanearthquake-git`, created `templates` and created `requirements.txt` with these contents:

Create the Django project with `python $VIRTUAL_ENV/lib/python2.?/site-packages/django/bin/django-admin.py startproject isthatanearthquake` (note you will need to specify your python version in this command)

`add2virtualenv isthatanearthquake` will make the new project show up on your PYTHONPATH.

Open isthatanearthquake/settings.py and `import os` at the top. Somewhere in the file (I tend to put it nearer the bottom) define:

PROJECT_ROOT = os.path.abspath(os.path.dirname(__file__))

Set up a database: DATABASES[‘default’][‘ENGINE’] should be ‘django.contrib.gis.db.backends.postgis’ and [‘NAME’] should be whatever you want to call the new database.
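For reference, the relevant chunk of settings.py would look roughly like this. The database name is just a placeholder, and depending on your PostgreSQL setup you may also need USER, PASSWORD, HOST and PORT entries.

```python
# Sketch of the DATABASES setting for a PostGIS-backed project.
DATABASES = {
    "default": {
        # GeoDjango's PostGIS backend, as described above
        "ENGINE": "django.contrib.gis.db.backends.postgis",
        # Placeholder: whatever you want to call the new database
        "NAME": "isthatanearthquake",
    }
}
```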

Set your template dirs to use the templates folder created just outside the project from earlier:

TEMPLATE_DIRS = (
    os.path.join(PROJECT_ROOT, "../templates"),
)

Add ‘django.contrib.gis’ and ‘quakes’ to INSTALLED_APPS. Uncomment ‘django.contrib.admin’ (and ‘django.contrib.admindocs’, though that one isn’t strictly required; it’s just my habit). We’re done with settings.py for now.

Uncomment the line in urls.py including admin.site.urls. If you uncommented django.contrib.admindocs in INSTALLED_APPS, uncomment its include statement as well.

Create the database you configured with `createdb -T template_postgis <dbname>`

`python manage.py syncdb` will create the necessary tables and prompt you to create a superuser of your choice, and `python manage.py load_quakes` will grab the last seven days of earthquakes from USGS.

`python manage.py runserver` will pop up the development server, and now with a web browser we can hit http://localhost:8000/admin/, login and choose “Quakes” to see a list of what came in.

One of the hardest parts of doing geo projects is getting the data you need to do it in the first place. In the US at least there are mountains of data at the federal level, some at the state level and who knows what at the local level. There isn’t a single place I can go to for anything outside of census-type stuff. Which school for a given grade level would a child at this point attend? Maybe it’s published, maybe it’s not.

Most cities or counties will have a GIS department. Some of them will be great, and helpful towards your goal as a developer. Others won’t. Our world of tools and technologies is leapfrogging traditional methods.

So here’s some of my favorite sources of different kinds of data, worth looking at for your next project.

TIGER/LINE: Gigabytes of shapefiles of things the US Federal government collects. I tend to look here first if I need basic location data.

SimpleGeo Places DB: SimpleGeo put a dataset of 21 million places (12ish million in the US) into the public domain last summer, and links to the file here. I have played with the data some and it’s pretty clean but often categorized inconsistently (but hey, it’s free for all to use). There are discussions online about various ways of getting it imported. Shapefile conversion didn’t work for me, and neither did wrapping it in a feature collection in GeoJSON: it’s just too big, nearly 8GB of JSON. The method I got to actually work was taking it line by line, deserializing and then processing. I split it into a TON of files containing 20,000 places each and ran several processes to get it imported. It’s a BIG database, and pretty slow in PostGIS, so be warned. I have no plans to put anything in production from it so speed isn’t that big of an issue. As a side note, people importing this dataset are probably SimpleGeo’s best sales tactic toward paying for the SaaS version.

Flickr Shapefiles: Also public domain, potentially useful if you need to bring photos into the mix.

Timezones: Full shapefile of timezones of the world. Useful for auto-detecting your users’ time by their location and auto-shifting times to them. Public domain.

Free IP/Location database: Similar to GeoIP but community built. Likely not as comprehensive; it had no idea where I was, for instance.

data.gov: It’s hard to find what you want and sometimes hard to figure out how to use the format it’s in, but data.gov is as close to a one-stop-shop as you may come.

data.nasa.gov: Datasets published by NASA. This one’s still a work in progress, but has a better search than data.gov and includes most NASA data listed on data.gov.
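The line-by-line approach mentioned for the SimpleGeo dump can be sketched like this. It assumes the file holds one JSON object per line; `iter_places` and `chunked` are names I made up, and the actual save-to-database step is left out.

```python
import json
from itertools import islice


def iter_places(lines):
    """Deserialize one JSON object per non-empty line, lazily."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def chunked(iterable, size=20000):
    """Yield lists of up to `size` items so no batch gets too big."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Since an open file iterates by line, something like `for batch in chunked(iter_places(open(path)))` walks the whole dump without ever holding more than one batch in memory, and each batch can be written out or handed to a separate worker process.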

I stopped short of displaying an equivalent overlay in Google Maps v3 yesterday, but wanted to circle back and show that it’s doable as well. It’s far more verbose than the magic-y “pass it a polygon and it does the rest”, but you have more control too.

View, pretty much just like the one we used for points only with a model that has a polygon:

But we want our apps to be more interactive – so let’s quickly add popup windows when something is clicked on. The blog isn’t going to handle a diff very well, and that’s the best way to show it, so check out this gist. It’s not very difficult at all. Now we get popups when somebody clicks a marker, without writing a single line of view code. I tend to build methods on my models for map_display_html() or something similar. If I’m really returning HTML I’ll put it in a template and use the template language to make it work, if it’s just a name and hyperlink or something similar I just leave it in the model.
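For the simple name-and-hyperlink case, the method can be as small as the sketch below. A plain class stands in for the Django model here, and the `Campground` fields and markup are assumptions, not the code from the gist.

```python
from html import escape


class Campground:
    """Plain-Python stand-in for a model with a popup helper."""

    def __init__(self, name, url):
        self.name = name
        self.url = url

    def map_display_html(self):
        """Return the small HTML snippet shown in the marker popup."""
        return '<a href="%s">%s</a>' % (
            escape(self.url, quote=True),  # safe inside the attribute
            escape(self.name),             # safe as link text
        )
```

Escaping matters because this string ends up inside generated JavaScript and HTML; anything fancier than a link is easier to maintain in a template.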

For this example I’ve got a view that will return the objects to be displayed in `object_list` in the context – same dataset as last time, only instead of building things inside the view we’ll do it in our Django template.

This doesn’t do anything fancy – just the simple behavior we got from using GeoDjango’s map generator – but since we generated the JavaScript ourselves, we know what things will be named and can add on our own JavaScript much easier – and this works without modification. No API key creating, switching based on what site we’re on, it just works.

Since we generated normal JavaScript we also have the benefit of being able to use tutorials and examples found all over the internet to fine tune things. Want an onClick event that opens an infoWindow? It’s quite easy. Want to do zooming differently, say to a range from the search point instead of making every point visible? You can do that too. [you will need another geometry passed into the context for the search location in order to do this]

So far we’ve done all kinds of querying and information lookups – but never left a Python console. I suppose it’s time we fall into the web mapping world and look at our options.

On the left of the easy-to-hard scale is Google Maps, API v2. This is the deprecated version, but is the only frontend generation available to us out of the box. It’s not mentioned in the documentation, but is well represented in the class’ docstring. (django.contrib.gis.maps.google)

Then instantiate the map with a markers kwarg (you can include key=’api_key_here’ in the class instantiation or it will fall back to settings.GOOGLE_MAPS_API_KEY), pass it into the template and watch in amazement as a map materializes before your eyes!

The resulting HTML, even after running simplify() on the geometry, is 2.4MB so I won’t display it here – but here’s a link. Warning: displaying polygons gets big. I’m asking for trouble here, after all – every single point that makes up the boundary of four states is going to be a bunch.

django.contrib.gis.maps.google is great for quick maps that don’t need a lot of detail or fancy overlays. But it leaves out an easy way to do click events on markers and labeling, and API keys are inconvenient.

If you do need onClick events, infoWindows from markers or any of that stuff, there are hacks – but they involve looping over the items in the template as well and generating JavaScript that infers what this map generation code will name things. Best to bite the bullet and build API v3 code in a template on your own (coming soon!)

First, a public service announcement: tomorrow, Nov. 16th, is “GIS Day” which means universities all over will be holding day lectures, in most cases for free. Many of them will be over our heads since we’re but neogeographers, but inspiration can come from anywhere – and it’s a good opportunity to meet others interested in this stuff. I’ll be at the one held at the University of Kansas here in Lawrence – why not look it up and see if your nearby university is having a program? My apologies on the short notice for this though, hopefully you can still make something happen as a professional development day.

I’ve shown more than a few ways we can use the ORM with GeoDjango to do real geographic searches of various types. But what if your limitations aren’t so generous and geospatial libraries and databases are off limits?

What if there were a way we could FAKE it?

There’s the incredibly nasty way which I won’t even put in code. Seriously, don’t do this. For completeness I’ll put it in semi-pseudocode that at least conveys the basic idea.

Create a set, loop over all objects in the database and use our
multi-tool GeoPy's vincentydistance (correct capitalization) method
to calculate the distance to each of them. Store the distance and a
record identifier in a dictionary and append it as a new "row" to our
set. Sort the set, and WOW, we have a really slow, potentially
disastrous in memory usage, but pure Python and minimal prerequisite
implementation of geographic searching.

Still listening? Please don’t do this.

Somebody smarter at math than me had this problem and thought about a way to solve it. It’s called Geohash, and essentially its goal is to define geography with varying degrees of accuracy by varying the number of characters. So 9yum8 will match 9yum8yef3vds6 (Lawrence, KS), and we can store the hash in a CharField and query it with the __startswith operator.
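To show why prefixes work, here’s a minimal pure-Python sketch of the standard Geohash encoding. It is not the python-geohash implementation, just the textbook algorithm, and `geohash_encode` is a name I picked.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=12):
    """Encode a lat/lon pair as a Geohash string of `precision` chars."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

Each extra character just halves the box a few more times, so a shorter hash of the same point is always a prefix of the longer one. That is exactly the property `__startswith` exploits.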

It’s not perfect, though – items JUST across the line, which do overlap but are not centered inside the hash we search for, won’t be returned. To get them we need to calculate the neighbors of our hash. (In the case of python-geohash, a method called “expand” will return the neighbors plus our origin box.)

There’s a good library called python-geohash (use that name to pip install it, “geohash” on pip is not the right thing) that will do both of these calculations for us – pure python with C for speed if that works on your system.

So what we’ve done is create a Q object you can use on a queryset to fake a geographic search around the general area. This isn’t a very wide area, though. Trial and error will help. Passing a “precision=” kwarg into .encode() requests less accuracy (and a wider search area), like this:

According to Wikipedia’s explanation of the algorithm (see under “Worked Example”) accuracy ranges from +/- 2500km at 1 character to +/- 0.019km at 8 characters. A length of 5 is +/- 2.4km, close to what we were using for circle radius generation in previous work. But drop to 4 and you’re grabbing +/- 20km worth of records.
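Those figures fall out of the bit layout: each character carries 5 bits, split between longitude and latitude with longitude going first. A rough sketch, assuming about 111.32 km per degree (fine for latitude, and for longitude only at the equator); `geohash_error_km` is a hypothetical helper.

```python
import math

KM_PER_DEGREE = 111.32  # assumption: equatorial approximation


def geohash_error_km(length):
    """Approximate +/- error in km for a Geohash of `length` characters."""
    total_bits = 5 * length
    lon_bits = math.ceil(total_bits / 2)  # longitude gets the odd bit
    lat_bits = total_bits // 2
    lon_error_deg = 180.0 / (2 ** lon_bits)  # half the cell width
    lat_error_deg = 90.0 / (2 ** lat_bits)   # half the cell height
    return max(lon_error_deg, lat_error_deg) * KM_PER_DEGREE
```

Away from the equator the east-west error shrinks with the cosine of the latitude, so treat these as worst-case numbers when picking a length by trial and error.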

It’s far from perfect, but when there’s nothing available but plain text search, it’s better than nothing!

As we saw last week GeoIP can be pretty inaccurate for mobile users – the exact audience we may be trying hardest to serve with a geographically aware website. But the W3C saw, or was made to see, the writing on the wall and built a set of standard APIs into HTML5 for just this case and most modern browsers have picked it up.

The API is pretty marvelously simple. This implementation changes the URL to return latitude and longitude when they are available, which we can use in our Django view. Plus, the same code works on mobile devices (at least the iOS ones I carry) with no changes.

So let’s dive right in, and make the campgrounds dataset grab the nearest results to the user.

Other than that we check to see if the querystring has location info available – if not we request it from the browser and register a callback to bring the user back to the page with the right querystring args. The second function passed into getCurrentPosition() is the error-handling callback. In this case I’ve just set it to alert in the various cases for simplicity’s sake.
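On the Django side, the check boils down to pulling two floats out of the querystring and making sure they’re sane. Here’s a standalone sketch of that step; the parameter names `lat` and `lon` and the function `parse_location` are assumptions, and in a real view you’d read from `request.GET` instead of parsing the string yourself.

```python
from urllib.parse import parse_qs


def parse_location(query_string):
    """Return (lat, lon) from a querystring, or None if unusable."""
    params = parse_qs(query_string)
    try:
        lat = float(params["lat"][0])
        lon = float(params["lon"][0])
    except (KeyError, ValueError):
        return None  # missing or non-numeric: ask the browser instead
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None  # out of range (this also rejects NaN)
    return lat, lon
```

A `None` result is the cue for the template to fall through to the getCurrentPosition() request.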