hotsnoj's learning experience

Second to actually programming a website for me in terms of not being easy is laying out what the database will look like. You need to decide what each field's type will be and what type will be the best for a given variable. Not to mention what you're going to put in the database, whether or not some piece of data is best left to do in the program, and in the case of a Friends feature, how you're going to store friend lists!

Like my programming, I like to get the most out of the least effort. This means I use a relational database type setup. I'm sure it's not a true relational database that I'm using, in fact I think this is a "keyed index" database, but the idea is that I have tables for user log in data, friends, profile data, photo's, etc. While this adds to the complexity of the query statement, it makes everything ultimately smaller and makes changes in one place available in all others. In other words, unless we need to start planning on more servers to handle the load of a bazillion users, no waiting 24 hours for photos to change on friend lists.

Anyway, here are some example tables (in MS Excel format). Notice we're not storing anything related to passwords, just some information about who this user is. Why? Because why open up the possibility for others to get the password hashes of users? Well that's mostly because I like to use the '*' in my queries and then push all that returned data into the template engine.

You may ask yourself why have general profile data and profile settings as two different tables? Because in the future I'm thinking I may add profile setting profiles! Much like what Windows has for hardware. Same machine, same users, just different driver setups (as far as I understand). That and when you get insanely huge tables it's hard for me to keep track of what's where.

Friends:

"userid","friendid"

This is the actual table from my myspace clone database. These are the only two fields and it is the part of this project I'm most proud about.

Lets say User A and User B are friends. Generally you'd have one entries for each stating they're both friends. Why do that? I guess one reason would be easy testing to see if User B has approved to User A to being their friend. But I digress. Why not just make a query that checks to see if User A is either in the userid or friendid cell? One row, and we show a relationship between two users. Brilliant.

Anyway, what ties almost every table together is 'userid'.

So, to sum it up. Plan out your database before you start programming. That's not to say that once you start the programming that you cannot make changes, hardly. You'll be making changes through out the process because either you'll find the database layout won't work the way you thought, or you'll think of a better way to do things.

What have you learned from me today? Hopefully that you need to put a lot of thought into how your database is going to be structured so that not only is it easy to get data out and in, but also so that it is easily expanded for new features on your site. Or you could have learned nothing and I'm just rambling.

But anyway, next time we'll have a look at cookies, session objects and keeping real passwords out of the wrong hands.

Replies To: hotsnoj's learning experience

Re: hotsnoj's learning experience

Posted 16 February 2006 - 05:19 PM

Just a couple of thoughts in no particular order:
- fname, lname might not be enough in every scenario, as some of us have middle names, too - and in some areas a bunch of names (ok, its not really a problem, as it depends on the user what to stuff into each of these fields...)
- I am not sure how myspace works, but a similar thing here only dispays somebody as a friend, if he actually confirmed that he is your friend. In the present form of your table, I think people can add as many friends as they wish, so I think you either need a status field indicating that the connection is confirmed, or have to make 2 queries to check every connection ( but it would mean a serious performance hit )
- I don't know if a timezone "database" is available for cities, but if it isn't, maybe a timezone field would also make sense
- and some additional field ideas (however these can be added anytime if the engine is designed with care): schools, hobby, languages spoken, pets, etc. maybe all of these could go into some extended profiel info

And please, store everything (where it makes sense) in utf-8!!!!!!!!!!!!!!!

Re: hotsnoj's learning experience

Posted 16 February 2006 - 06:07 PM

1lacca, on 16 Feb, 2006 - 07:16 PM, said:

Just a couple of thoughts in no particular order:
(1)- fname, lname might not be enough in every scenario, as some of us have middle names, too - and in some areas a bunch of names (ok, its not really a problem, as it depends on the user what to stuff into each of these fields...)
(2)- I am not sure how myspace works, but a similar thing here only dispays somebody as a friend, if he actually confirmed that he is your friend. In the present form of your table, I think people can add as many friends as they wish, so I think you either need a status field indicating that the connection is confirmed, or have to make 2 queries to check every connection ( but it would mean a serious performance hit )
(3)- I don't know if a timezone "database" is available for cities, but if it isn't, maybe a timezone field would also make sense
(4)- and some additional field ideas (however these can be added anytime if the engine is designed with care): schools, hobby, languages spoken, pets, etc. maybe all of these could go into some extended profiel info

And please, store everything (where it makes sense) in utf-8!!!!!!!!!!!!!!!

All very good thoughts 1lacca! But I'd like to respond with my reasoning behind things.

Thought 1: While there are people out there that have more then 2 or 3 names. There really isn't a need for more then 2 names. Hardly anyone I know uses their full name or even their middle name. In fact it often seems that one's middle name around here is a source of either embarassment or just like to keep secret for some reason. Anyway, I thought only first and last names would be more than appropriate.

Thought 2: After posting this I figured that adding one more field to the friends table would be easier then doing my first idea of having a friends queue table that held all friend requests. You see the friends (relation) table wouldn't have been updated until the friendship had been verified by both parties. So two queries would not have been needed anyway. But like I said, I dropped the idea because I decided to be lazy and just decided to make a quick check of the third field.

Saying that, there are methods of checking data against other data (whether it exists or not) from another table in one query. (Which will be discussed in later sessions.)

Thought 3: Very good idea. I probably would have never thought of that!

Re: hotsnoj's learning experience

Posted 17 February 2006 - 02:04 AM

1) Yes, I know what you are getting here, but I would further argue One, who doesn't want to use this feature just wouldn't provide such a name (making it an optional field), however if you do a search on Imdb for Sung, I would estimate that around 5 percent of the names consist of three parts.