I couldn't get it to work, but I would like to explore this option, because the CSV file being built with this process could be over 50 MB (400,000 rows). So I want to be wise in the way I do this.

While I see the LIMIT 400000 clause, I don't know how many records are really going to appear in the result set. If it's a small number, performance will not be an issue. If it's a larger number, the latter method (the one that goes from the database to CSV directly) is going to be the best way to get reasonable speed.

Your bigger problem with the 400,000 scenario is more likely going to lie with memory. When you query 400k records, all of those records are pulled into PHP's memory space, so they're sitting there as you iterate through the record set. It might be 50 megs of raw data, but with all the memory structures around it, you're likely looking at considerably more memory than that JUST to hold the MySQL result.
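If you want to see this for yourself, here's a minimal sketch (the credentials and table name are placeholders). With the mysqlnd driver, the buffered result set counts against PHP's own memory usage:

<?php
// Hypothetical credentials/table -- adjust to your setup.
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');

$before = memory_get_peak_usage();
$result = $mysqli->query("SELECT * FROM Verizon_test"); // buffered by default

printf("Peak memory grew by about %.1f MB just holding the result set\n",
    (memory_get_peak_usage() - $before) / 1048576);
$result->free();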

If you're trying to keep peak memory usage down, use smaller groups, like 10,000 records at a time. That way, you're only holding 10k records in memory at any given time. You may need to make more queries in order to fetch all the data, but the increase in overhead between 1 query and 40 queries is fairly minimal with the amount of data you're processing.
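As a minimal sketch of the 10k-at-a-time idea (credentials, table, and the WHERE clause are placeholders), a LIMIT/OFFSET loop would look like this. One caveat: large OFFSETs get progressively slower, which is why the id-based approach later in this thread is preferable for really big tables.

<?php
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');
$chunkSize = 10000;
$offset = 0;
$out = fopen('export.csv', 'w');

do {
    $result = $mysqli->query(
        "SELECT * FROM Verizon_test WHERE status = 'active'
         ORDER BY id LIMIT $chunkSize OFFSET $offset"
    );
    $fetched = 0;
    while ($row = $result->fetch_assoc()) {
        fputcsv($out, $row); // only the current chunk is ever in memory
        $fetched++;
    }
    $result->free();
    $offset += $chunkSize;
} while ($fetched === $chunkSize); // a short chunk means we're done

fclose($out);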

@gr8gonzo: Good point. 1,000 CSV records == 152 KB, so it follows that 400,000 records == 59 MB. But... (and this is a big but) PHP array and object structures can take as much as 20X the storage required by the raw string data, which would imply a memory requirement of roughly 1.2 GB. Don't use PDO!

As far as the insight the two of you have offered thus far, let me explain it back to you so I can be certain I'm understanding WHY things need to be done a certain way and not just THAT it has to be done that way.

However properly a table has been indexed, and however dramatically that improves the speed of a given query, you still have to take into consideration the amount of memory you're using as the number of rows you're iterating through increases.

With my situation, it seems like what I need to do to best serve my user is determine how many rows I'm looking at. Grab the count first and then, based on the number of rows, route the process so that I'm either exporting the record set directly into a CSV file, or doing a series of queries that insert records into a temporary table 10K records at a time.
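In outline, that count-first routing might look like this (the threshold, credentials, and helper names are all hypothetical; the two helpers stand in for whichever export methods you settle on):

<?php
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');

$count = (int) $mysqli->query(
    "SELECT COUNT(*) FROM Verizon_test WHERE status = 'active'"
)->fetch_row()[0];

if ($count < 50000) {
    // Small result set: one query, write the CSV in a single pass.
    export_all_at_once($mysqli);
} else {
    // Large result set: series of queries, 10K records at a time.
    export_in_chunks($mysqli, 10000);
}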

If that resonates with y'all, assuming I've got a large recordset - over 300K - how do I grab those records 10K at a time without grabbing the same ones over and over again?

So, if my query is, hypothetically "select * from Verizon_test where ... LIMIT 10000" - I do that once knowing there's 300K results. How do I structure my next query so that I'm grabbing the next 10K?

You would use a WHERE clause and an ORDER BY clause. You might be able to ORDER BY the AUTO_INCREMENT key. The first query might say something like WHERE id BETWEEN 1 AND 10000; the second query might say WHERE id BETWEEN 10001 AND 20000, etc.
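A minimal sketch of that BETWEEN approach, assuming an AUTO_INCREMENT column named id (credentials and table are placeholders). It works best when the ids are reasonably dense; sparse ranges just return fewer rows, which is harmless:

<?php
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');
$maxId = (int) $mysqli->query("SELECT MAX(id) FROM Verizon_test")->fetch_row()[0];

for ($start = 1; $start <= $maxId; $start += 10000) {
    $end = $start + 9999;
    $result = $mysqli->query(
        "SELECT * FROM Verizon_test WHERE id BETWEEN $start AND $end ORDER BY id"
    );
    while ($row = $result->fetch_assoc()) {
        // append $row to the CSV here
    }
    $result->free();
}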

Ray, your last suggestion makes total sense, but assuming that the records in my record set don't start with id #1 and proceed sequentially, then what I'm looking for would look something like this, correct:

query_1 = "select * from Verizon_test where mystuff limit 10000 order by id ASC" (grab the id of the last record)

query_2 "select * from Verizon_test where mystuff and id>query_1['id'] LIMIT 10000 order by id ASC

