I'm struggling with a philosophical question on database programming in PHP. In particular, I'm trying to decide when it's best to read in an entire table into an object, vs. querying MySQL directly whenever I need data.

Is there ever a situation where you'd want to just read in the entire database into an object? Where do you draw the line?

For example, if I had a table full of names and phone numbers, and I need to get the phone number for one individual, that's a simple one-time mysql query. Reading in an entire table into an associative array just to get one phone number sounds ridiculous... But:

(1) what if I need to get the names and phone numbers of 50 individuals? 100? 1000?

(2) When is it more efficient (if ever) to read in the entire table into an object? Is performing 1000 mysql queries on 1000 names always going to be more efficient than reading in the entire table?

(2a) Obviously it would depend on the total number of records in the table. Would it be better to do 1000 queries for 1000 phone numbers, or read in a table of 2000 total records from a MySQL into an associative array? What if it was 5000 total records, and I needed 1000? What if it was 10k? Etc. etc.

(3) What if I need to do something a little more complex, like return all phone numbers in a certain area code? Obviously in that case I could use a regexp SQL query, but I'm sure I could come up with a more complex case where a simple query doesn't give me exactly what I want.

I guess what I'm getting at is, as a developer, you have several knobs you can turn to optimize your application. Obviously you want to think about the data you're using and optimize the database model to match the types of data requests you'll be doing. But sometimes you get into a mutually exclusive case where you're forced to pick optimizing your data model for one scenario, at the expense of another, competing scenario.

2 Answers
2

Databases are designed to be efficient at locating and returning exactly the data that you need to work with for a particular operation.

Transferring data over a network connection is orders of magnitude slower than processing it on the machine where it resides. Use databases for what they're good at... holding lots of information and allowing application code to query and work with exactly the subset of that data it needs to at a given point in time.

If you find that you need to frequently access the same data over and over, caching it at the application layer or in a dedicated caching solution like memcached does make sense, but I cannot imagine a scenario where it makes sense just to read in a whole table because my application logic needs to process a subset of the rows and/or columns in the table.

That makes sense, I appreciate the response. I said the same thing above, but basically I'm just wondering where (if ever) a strict database implementation breaks down, and you might be tempted to just read raw data into memory and build a complex object that gives you what you need. I'm probably far away from needing anything that complex, but still, it's something that's been on my mind lately.
–
dustin999Jul 19 '11 at 19:11

(3) but I'm sure I could come up with a more complex case where a simple query doesn't give me exactly what I want.

This is usually an indication that your database hasn't been properly normalized and/or has design flaws.

(2) When is it more efficient (if ever) to read in the entire table into an object? Is performing 1000 mysql queries on 1000 names always

Neither is a good choice. SQL is intended for set-based operations. You really need to use the system correctly for it to work well, but to do this you have to have properly designed your database. The best thing would be to write one query that returns exactly the records you want, no more and no less.

what if I need to get the names and phone numbers of 50 individuals

Maybe use something like select * where ID in (1,2,3,...,50), if you have a larger number of users, maybe create a temporary table with the list of users you want, and join on that. With a properly designed database there is usually a good way to retrieve a set of data with a single query.

"This is usually an indication that your database hasn't been properly normalized and/or has design flaws." -- well, or the requirement is for a regular expression and your DBMS doesn't support that. That said, I'd still not read the ENTIRE table in.
–
Nikki9696Jul 19 '11 at 16:54

In the type code I work on, I have never seen a case where using a regular expression on a database seemed like good idea, but I suppose someone could have a reason for that.
–
ZoredacheJul 19 '11 at 16:58

Ugh, yeah I was trying to come up with an example that might require several queries, and of course, the comma-delimited list of ID's would be much better. Oh well, perhaps I'm complicating things in my mind. I'm just curious to the average developer, how married you are to your database structure and if there are scenarios where you'd want to create the object rather than query the database because of the complexity of the data model or the complexity of the types of queries you'd be performing.
–
dustin999Jul 19 '11 at 19:09