I have an authentication system for my programs which logs every authentication attempt. Over the months, the logs table has grown to 4,064,450 records. The problem is that when I execute the query below to get the latest 13 records, it takes about 10 seconds.

"SELECT * FROM logs_table ORDER BY time DESC LIMIT 0, 13"

time = the time of the attempt.

Is there a way to optimize the query?

Thank you.

Mittineague
—
2011-03-19T00:54:48Z —
#2

I'll say that going through 4+ million records to only use 13 is very inefficient!

Do you need to retain them all? If not, I would periodically DELETE them, maybe after saving a backup just in case you want them later.

Otherwise I would move the older ones to some sort of "archive" table that doesn't need to be searched so often.
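
A minimal sketch of that archive idea, assuming an identically structured table called logs_archive and a 90-day cutoff (both the name and the cutoff are only examples):

CREATE TABLE logs_archive LIKE logs_table;

-- move everything older than 90 days out of the live table
INSERT INTO logs_archive
  SELECT * FROM logs_table WHERE time < NOW() - INTERVAL 90 DAY;
DELETE FROM logs_table WHERE time < NOW() - INTERVAL 90 DAY;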

IMHO it isn't the query that needs optimizing but rather the database architecture.

Cool7
—
2011-03-19T01:08:48Z —
#3

That's the problem: we do search the logs to get information about a user when needed, and in a lot of cases.

Could you please give me more info about optimizing the database architecture?

Thanks.

system
—
2011-03-19T01:35:57Z —
#4

Something like this should speed up your query significantly (with tweaking, it should run about 100x faster):
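
For example, using the time column from the original query and an arbitrary 3-day window:

SELECT *
  FROM logs_table
 WHERE time > NOW() - INTERVAL 3 DAY
 ORDER BY time DESC
 LIMIT 0, 13;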

How many entries per day? If 3 days returns too many logs, reduce it... or adjust higher if you want more padding.

Also, if you don't have an index on the "time" column, try:

ALTER TABLE `logs_table` ADD INDEX(`time`);

Cheers!

Mittineague
—
2011-03-19T01:42:51Z —
#5

If you need to keep and use the data that does limit your options.

Do you really need every field returned - the "*"? Even if you do, AFAIK explicitly listing the fields allows the database to use indexes and should run faster.

For the frequent query, maybe you could make a "recent" table that points to rows in the "logs" table by id. Then save the heavy grunt work for the occasions when you really need it.
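
One possible shape for that, assuming the logs table has an id primary key and columns such as user_id and success (those names are only guesses for illustration):

CREATE TABLE recent_logs (log_id INT UNSIGNED NOT NULL PRIMARY KEY);

-- repopulated periodically (e.g. from a cron job) with the newest ids
TRUNCATE TABLE recent_logs;
INSERT INTO recent_logs (log_id)
  SELECT id FROM logs_table ORDER BY time DESC LIMIT 1000;

-- the frequent query then joins the small table and lists only the columns it needs
SELECT l.id, l.user_id, l.time, l.success
  FROM recent_logs r
  INNER JOIN logs_table l ON l.id = r.log_id
 ORDER BY l.time DESC
 LIMIT 0, 13;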

system
—
2011-03-19T02:15:30Z —
#6

Mittineague said:

AFAIK explicitly listing the fields allows the database to use indexes and should run faster.

That's not the culprit here... he's combining an ORDER BY and a LIMIT... that will necessarily compare the time field on every result... even when indexed, it's doing a multi-million-record sort prior to the LIMIT.

To increase performance, he has to do a "pre-sort pseudo-limit" by date. This will allow him to cut out a significant percentage of his results on an indexed column prior to the ORDER BY/LIMIT.

It'll significantly increase his performance (assuming the time column is indexed).
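
One way to sanity-check that on real data, assuming the 3-day pre-filter sketched above and the index on time, is to look at the EXPLAIN output for the query:

EXPLAIN
SELECT *
  FROM logs_table
 WHERE time > NOW() - INTERVAL 3 DAY
 ORDER BY time DESC
 LIMIT 0, 13;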

SpacePhoenix
—
2011-03-19T07:17:12Z —
#7

Cool7 said:

I have an authentication system for my programs which logs every authentication attempt. Over the months, the logs table has grown to 4,064,450 records. The problem is that when I execute the query below to get the latest 13 records, it takes about 10 seconds.

"SELECT * FROM logs_table ORDER BY time DESC LIMIT 0, 13"

time = the time of the attempt.

Is there a way to optimize the query?

Thank you.

What server-side language are you using? You could try dropping the LIMIT clause and instead, in the server-side language, run the loop that fetches each row of the result set 13 times (or however many rows are required), then break out of the loop.

system
—
2011-03-19T13:52:23Z —
#8

SpacePhoenix said:

What server-side language are you using? You could try dropping the LIMIT clause and instead, in the server-side language, run the loop that fetches each row of the result set 13 times (or however many rows are required), then break out of the loop.

Yikes!!! That will retrieve 4 MILLION records!!! I don't even KNOW how a server will handle that... even with an iterative model, there has to be some point in the DB connection that will process all 4 million records!

system
—
2011-03-19T14:23:08Z —
#9

Ok, I decided to run a test on this, because I'm interested to see the results...