The SitePoint Forums have moved.

You can now find them here.
This forum is now closed to new posts, but you can browse existing content.
You can find out more information about the move and how to open a new account (if necessary) here.
If you get stuck you can get support by emailing forums@sitepoint.com

If this is your first visit, be sure to
check out the FAQ by clicking the
link above. You may have to register
before you can post: click the register link above to proceed. To start viewing messages,
select the forum that you want to visit from the selection below.

Hybrid View

Data structure problem

Hi all

I'm building a site at the moment for an award ceremony, so a fundamental part of that is who won what award. To complicate things, it's for a film award so an award can be won by a film, a person, more than one person, a person for a particular film or more than one person for a particular film.

Being a sensible developer I of course normalized my data so I have a film table and a people table. Without complicating things too much I also have a table called award_ceremony which is made up of id, award_id and ceremony_id fields, so that each award in each year has a unique id.

I then created a winner table consisting of id, award_ceremony_id, film_id.....

and this is where I get stuck

Assuming that each award can be won by more than one person, do I need to have a winner_people linking table with a winner_id and a people_id? I just seem to be ending up with too many tables!

So, assuming that I do, I then get stuck on the query to do the select. This is what I have so far

Code:

SELECT award, title FROM winner
LEFT JOIN film ON film_id=film.id
LEFT JOIN award_ceremony ON award_ceremony_id=award_ceremony.id
LEFT JOIN award ON award_id=award.id
WHERE ceremony_id=[ceremony id here]

Which nicely gets the names of awards and winning films for a particular year.

So to incorporate the winning people ids, I change it (unconvincingly) to

Code:

SELECT award, title, people.id FROM winner
LEFT JOIN film ON film_id=film.id
LEFT JOIN winner_people ON winner_id=winner.id
LEFT JOIN award_ceremony ON award_ceremony_id=award_ceremony.id
LEFT JOIN award ON award_id=award.id
WHERE ceremony_id=[ceremony id here]

Ermm, but hang on, there might be more than one person. How does that work?

I'm building a site at the moment for an award ceremony, so a fundamental part of that is who won what award. To complicate things, it's for a film award so an award can be [won] by a film, a person, more than one person, a person for a particular film or more than one person for a particular film.

you have three main entities: award, film, person

there will be at least one three-way relationship table, due to "an award [won] by ... a person... for a particular film"

Without complicating things too much I also have a table called award_ceremony which is made up of id, award_id and ceremony_id fields, so that each award in each year has a unique id.

let's leave that and come back to it later -- design for one set of awards, get that straightened out, then it will be easier to modify the design for multiple sets of awards (as this involves primary/foreign keys)

those are foreign keys, and it is perfectly okay for a foreign key to be null, it means that the relationship is optional, i.e. that particular row does not relate to a parent (e.g. for an award won by a person, the film is a non-existent, non-sensical attribute)

because two of the keys can be null, the combo of the three keys cannot be declared a primary key, but that's okay because you only need a primary key if the table acts as parent table in a parent-child relationship, and this one doesn't

however it is imperative that you have an index on the keys, for join performance, and multiple indexes may be needed in a large database

now let's make this AWARDWON a child in another relationship, as a child to the CEREMONY, which is for a specific year

thus each instance like "25 117 1212" has to relate to which ceremony/year it belongs to

this means add a foreign key to AWARDWON which links to the id of the CEREMONY row for the year

if it were me doing it (see note), i would use the year number (e.g. 1995) as the primary key of the CEREMONY table

note: some people think that a pk should never change, as this has ripple effects on foreign keys that refer to it (the "on update cascade" so-called problem), but me, i don't think they would in this example -- change, i mean