Local search with PostGIS

Given a set of elements in a database, each with a position and a name, find the N elements closest to a given point whose name matches some pattern.

So imagine you have an OpenStreetMap database and want to find the 300 banks and bars closest to
Madrid city center (a pretty interesting combination, I'd say; in Spain the ATMs are inside the banks).

To test it I loaded the Madrid OSM extract into a Postgres database: I downloaded the data from the Geofabrik site and imported it with the osm2pgsql tool, pretty straightforward.

Then I created some indices for the way and amenity columns (using the full text search machinery for the latter):

-- gist index for the geometry, full text search index for the text
create index on planet_osm_point using gist(way);
create index on planet_osm_point using gin(to_tsvector('spanish', amenity));

First try: use Nearest Neighbour search

Since PostGIS 2.0 we have Nearest Neighbour search (thanks to CartoDB, which funded it), which allows
using the geospatial index to sort results (read this blog post on the Boundless blog). So the first try was to order the results from the text filter by the distance operator <->.

select way, amenity from planet_osm_point
where
  to_tsvector('spanish', amenity) @@ to_tsquery('spanish', 'ba:*')
order by way <-> 'SRID=900913;POINT(-412661.352370664 4926477.3516323)'::geometry limit 301

This takes around 48ms (everything cached). Looking at the explain analyze output out of curiosity, I
realized the spatial index wasn't being used.

I don't fully understand how the Postgres planner works, but it looks like it might be doing a bitmap
AND over both indices. Increasing the limit doesn't change anything, although I thought it could change
the selectivity. So I tried a search by distance:
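The distance query isn't shown here; a minimal sketch of what it could look like, assuming the same center point, a starting radius of 300 meters (in the units of SRID 900913), and ST_DWithin to let the planner use the gist index:

```sql
-- hypothetical reconstruction of the search by distance: filter by
-- radius first (index-assisted via ST_DWithin), then sort by distance
select way, amenity from planet_osm_point
where
  to_tsvector('spanish', amenity) @@ to_tsquery('spanish', 'ba:*')
  and ST_DWithin(way,
      'SRID=900913;POINT(-412661.352370664 4926477.3516323)'::geometry,
      300)
order by way <-> 'SRID=900913;POINT(-412661.352370664 4926477.3516323)'::geometry
limit 301;
```

If fewer than 301 rows come back, the radius has to be increased and the query retried, which is where the loop count below comes from.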

The time for this query is around 48ms, so no improvement at all. But that may depend on the number
of iterations needed to fetch all the results: starting with 300 meters takes 4 loops,
since the last bar is 2393 meters away from the city center.

If it starts the iteration with a bigger radius, like 1500 meters (2 iterations), the query time is 25ms. If it only needs
to do the first iteration, it's 18ms, which is much better.
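The expanding-radius iteration can be sketched as a plpgsql function; this is my own reconstruction (the function name, parameters, and the doubling strategy are assumptions, not the original implementation):

```sql
-- hypothetical sketch: widen the radius until enough matches exist,
-- then return the closest ones sorted by the <-> operator
create or replace function closest_amenities(
    center geometry, pattern text, wanted int, start_radius float)
returns table(way geometry, amenity text) as $$
declare
  radius float := start_radius;
  found int := 0;
begin
  loop
    -- count matches inside the current radius
    select count(*) into found
    from planet_osm_point p
    where to_tsvector('spanish', p.amenity) @@ to_tsquery('spanish', pattern)
      and ST_DWithin(p.way, center, radius);
    exit when found >= wanted;
    radius := radius * 2;  -- not enough results: double the radius and retry
  end loop;
  return query
    select p.way, p.amenity
    from planet_osm_point p
    where to_tsvector('spanish', p.amenity) @@ to_tsquery('spanish', pattern)
      and ST_DWithin(p.way, center, radius)
    order by p.way <-> center
    limit wanted;
end;
$$ language plpgsql stable;
```

With a starting radius of 300 meters and the last bar at 2393 meters, doubling gives 300, 600, 1200, 2400: the 4 loops mentioned above.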

So how do we know a good value to start with? Hard to say without some density stats... luckily Postgres keeps pretty good statistics about indices, and there are good ways to access them: EXPLAIN and _postgis_selectivity.
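A sketch of how the stats could drive the choice of radius, assuming the internal PostGIS estimator _postgis_selectivity (which returns the estimated fraction of rows whose bounding box overlaps a query geometry, based on the stats gathered by ANALYZE); the 1000-meter box below is an arbitrary example:

```sql
-- estimated fraction of points within a 1 km box around the center,
-- according to the planner stats (no table scan involved)
select _postgis_selectivity(
  'planet_osm_point'::regclass,
  'way',
  ST_Expand('SRID=900913;POINT(-412661.352370664 4926477.3516323)'::geometry,
            1000));
```

Multiplying that fraction by the table's row count (reltuples in pg_catalog.pg_class) gives a rough hit count for a given radius, so one can grow the box until the estimate exceeds the 301 rows needed and use that as the starting radius.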

So using a radius of 4200 meters the query takes ~18ms, as it's doing a single iteration. It's
important to note that calculating the stats is almost free: it takes less than 1ms (except the total count,
which could be precalculated).

Another nice thing about the recursive query is that it can be paginated: you can fetch 100 results, take
the distance of the last one, and next time start iterating from that distance.
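A sketch of what such a page fetch could look like, assuming :last_distance is the distance of the final row of the previous page (a parameter name I made up), and assuming no two rows sit at exactly the same distance:

```sql
-- hypothetical next-page query: skip everything at or inside the
-- distance already returned, then take the next 100 closest rows
select way, amenity
from planet_osm_point
where
  to_tsvector('spanish', amenity) @@ to_tsquery('spanish', 'ba:*')
  and ST_Distance(way,
      'SRID=900913;POINT(-412661.352370664 4926477.3516323)'::geometry)
      > :last_distance
order by way <-> 'SRID=900913;POINT(-412661.352370664 4926477.3516323)'::geometry
limit 100;
```

Ties at exactly :last_distance would be skipped by the strict comparison, so a real implementation would need a tiebreaker (e.g. the row id) to be fully correct.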

Maybe this method works well for this case but not for others; depending on how Postgres uses the indices, the times may vary a lot. But this is the best approach I found. Any ideas?