Huge Table

I am working on a project in which an admin uploads 4,000-5,000 records per day into a table through CSV files (about 500 records per upload, on average). The database is MySQL.

Users on the main website can search and fetch data from this huge table.

The table now holds nearly half a million records, and the search and query functionality on the main website has become very slow. I have applied a few indexes to this table and to all the tables related to it through foreign keys, but the searches are still slow.

I could add some more indexes to this table, but I am afraid that would slow down the inserts/updates (via CSV upload), since, as you know, indexes slow down inserts and updates.
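One idea I've been considering is to load each CSV in a single transaction with LOAD DATA INFILE instead of row-by-row INSERTs, so the indexes get updated in bulk. A rough sketch (the table name, columns, and file path here are made up, not my real schema):

-- Load one CSV file in a single transaction so index maintenance
-- happens in bulk rather than once per row.
-- Table, columns, and path are hypothetical.
START TRANSACTION;

LOAD DATA LOCAL INFILE '/tmp/upload.csv'
INTO TABLE records
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(name, category_id, created_at);

COMMIT;

Would something along those lines be enough, or do I need to rethink the indexes themselves?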

Do you guys have any ideas on how to manage a database with such a huge table, optimizing both processes (inserts/updates and searches)?

I don't think half a million records is much; in fact, the site in my signature has a table with 2.6 million records (IP address mapping to location ID). Are you sure you have created all the indexes that could be used to speed up your queries?
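A quick way to check is to run EXPLAIN against your slow search query and look at the "key" column; if it shows NULL, or "rows" is close to the full table size, the query is scanning rather than using an index. Something like this (table and column names are just placeholders for your actual query):

-- Check whether the search query actually uses an index.
-- Names are hypothetical; substitute your real query.
EXPLAIN
SELECT *
FROM records
WHERE category_id = 42
  AND name LIKE 'foo%';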

Hey Viv5,
I’m building a large-scale DB system in MySQL, and I’ve tested a table with 100 million records (4 GB in size). A test SELECT that fetches 700,000 rows takes 8 seconds, and the total number of records doesn’t make it any slower.

It’s all about indexes and ordering. Put an index on every key field that you use in your queries, and if you have an ORDER BY “something” in a SELECT query, try to leave it off.
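If you can’t drop the ORDER BY, another option is a composite index whose column order matches the WHERE clause and then the sort column, so MySQL can read the rows already in order instead of doing a filesort. A sketch, again with hypothetical names:

-- Equality column first, then the sort column, so the query
-- below can be answered in index order without a filesort.
-- Table and column names are hypothetical.
CREATE INDEX idx_records_category_created
ON records (category_id, created_at);

SELECT *
FROM records
WHERE category_id = 42
ORDER BY created_at DESC
LIMIT 50;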

The size of your dataset is very small; simply ensure you index for your queries correctly and that your queries themselves are correct. I've had 1,800 queries a second on some very large databases and tables I've worked with. Bear in mind that one bad query can grind your database server to a halt.

An example of a bad query from just yesterday: a colleague of mine missed out a condition when joining a users table to another table, which caused a temporary table. It was generating temp files of 40 GB each and consuming all the disk space, totalling around 200+ GB.
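To illustrate with made-up tables, the mistake was roughly this: without the join condition, every user gets paired with every row of the other table (a cross join), and that blown-up result is what gets spooled to disk.

-- Bad: no join condition, so users x orders becomes a cross join
-- and MySQL may spool a huge temporary result to disk.
-- Table and column names are hypothetical.
SELECT u.name, o.total
FROM users u
JOIN orders o;

-- Fixed: the join condition restricts each user to their own rows.
SELECT u.name, o.total
FROM users u
JOIN orders o ON o.user_id = u.id;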

Guys, thanks for the information. I am glad to learn that even tables with 100 million records can be optimized for both INSERTs and SELECTs. I will need to index my table sensibly. Thanks again.