
The query I am trying to write should retrieve, for every customer at a particular store whose most recent order was in the last 6 months, the customer's name, the date of their last order, when that order was delivered, and their total number of orders. Here is what I'm currently using.
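
A sketch of the query, with table and column names assumed (customers, purchases, and deliveries; the real schema may differ):

    -- Sketch only: table/column names and store_id = 123 are assumptions.
    SELECT c.name,
           MAX(p.purchase_date)          AS last_order_date,
           MAX(d.delivered_date)         AS last_order_delivered,
           COUNT(DISTINCT p.purchase_id) AS total_orders
    FROM customers  c
    JOIN purchases  p ON p.customer_id = c.customer_id
    JOIN deliveries d ON d.purchase_id = p.purchase_id
    WHERE c.store_id = 123
    GROUP BY c.customer_id, c.name
    HAVING MAX(p.purchase_date) >= NOW() - INTERVAL 6 MONTH;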

However, this is pretty slow, taking 1-2 seconds each time, and I can only assume the time will grow as the amount of data increases.

In an attempt to profile the query I have used EXPLAIN, which seems to indicate it is scanning around 20k rows in the deliveries table. What should my next steps be to optimize this query? How can I reduce the execution time?

I'm not 100% sure I understand it all, but from a business standpoint this is a query that by all means shouldn't be run very often. Do you really need all this data, grouped that way, regularly?
– 0xCAFEBABE, Jun 25 '13 at 7:45

1 Answer

If you want to speed up this query, first of all make sure you have indexes on the columns that define the join conditions, that are used as filters (WHERE), or that appear in GROUP BY or ORDER BY. In general, a good sign that an index is missing is when EXPLAIN shows NULL in the key and key_len columns and ALL in the type column (not always, though; for instance, if you select all rows from a table, or the table is small enough, a full scan is faster than an index seek plus lookup). Then you may want to tweak some indexes to make them covering indexes for this query.
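
For instance, assuming the column names from the sketch in the question, indexes along these lines would cover the join and filter columns (adjust to the real schema and to what EXPLAIN reports):

    -- Illustrative indexes; the composite form lets MySQL satisfy both the
    -- join and the aggregated date column from the index alone (covering).
    ALTER TABLE purchases  ADD INDEX ix_purchases_customer  (customer_id, purchase_date);
    ALTER TABLE deliveries ADD INDEX ix_deliveries_purchase (purchase_id, delivered_date);

    -- Re-run EXPLAIN afterwards: for the deliveries row, key should name an
    -- index and type should improve from ALL to something like ref.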

Side note: this looks to me like a BI query, which normally is not executed against an OLTP database. When dealing with big data, it makes sense to build a few cubes from the operational database's data and query those instead of the original tables. The nature of OLTP implies a high level of normalization and optimization for INSERT/UPDATE/DELETE, not for SELECTs (SELECTs are still possible, but the queries can be very long, unclear, and quite slow).
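
As a sketch of that idea, a pre-aggregated summary table (a poor man's materialized view, since MySQL has no built-in ones) can stand in for a cube; table and column names here are assumptions:

    -- Rebuilt on a schedule (e.g. nightly) from the operational tables.
    CREATE TABLE customer_order_summary AS
    SELECT c.customer_id,
           c.store_id,
           c.name,
           MAX(p.purchase_date) AS last_order_date,
           COUNT(*)             AS total_orders
    FROM customers c
    JOIN purchases p ON p.customer_id = c.customer_id
    GROUP BY c.customer_id, c.store_id, c.name;

    -- BI queries then hit the small summary table, not the OLTP tables:
    SELECT name, last_order_date, total_orders
    FROM customer_order_summary
    WHERE store_id = 123
      AND last_order_date >= NOW() - INTERVAL 6 MONTH;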

Hi, thanks a lot for your answer. Everything seems OK on the key front. I've been reading up on what you mentioned in your side note, and as you say, I'm trying to do BI work on an OLTP database. My problem really is that I'd like to provide fast BI services to a large number of users. How can I go about creating a setup more suited to this?
– AaronDS, Jun 25 '13 at 14:04


@AaronDS: You can try the following. As far as I understand, out of the 5 tables, purchases and deliveries are the two with the most data. It may make sense to partition them (say, range partition by purchase_id, which probably corresponds roughly to the date of purchase; datetime/timestamp columns are also good candidates for a partition key). Another option is to periodically move data from those tables into archive tables. Then you can build a cube, or just mimic a "materialized view": you will have something like [Your original query] UNION ALL SELECT * FROM archive_data...
– a1ex07, Jun 25 '13 at 14:37
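
A minimal sketch of the range-partitioning idea, assuming purchase_id is an integer primary key (MySQL requires every unique key on the table to include the partitioning column):

    -- Boundary values are placeholders; pick them from the real data.
    ALTER TABLE purchases
    PARTITION BY RANGE (purchase_id) (
        PARTITION p0   VALUES LESS THAN (1000000),
        PARTITION p1   VALUES LESS THAN (2000000),
        PARTITION pmax VALUES LESS THAN MAXVALUE
    );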