Help Optimizing within Group Aggregate Query

12-21-2007, 02:58 AM

I have a database that has city/state data. I am making a webpage for each unique city/state combination. There are collisions in the unique names and when that occurs I want to choose the largest city with that name by default.

I have the following query I wrote but there are some penalty hits associated with it, including:

2x Using where
1x Using temporary
1x Using filesort

SELECT a.id, a.name, a.population
FROM geo_cities a
JOIN
(
SELECT name, MAX(population) AS population
FROM geo_cities
WHERE state = [numeric value]
GROUP BY name
) AS b ON a.name = b.name AND a.population = b.population

I was curious if anyone has any suggestions on how to mitigate these performance issues? Thanks!

What indexes do you have. You definintely need a non-unique index on name, and possibly a two column (name,population) may help.

How volatile is your data? Maybe you should just run the query once to populate another table if your city/population data rarely changes, and reexecute it after an update, rather than computing this on the fly.