We are starting to build up a database with PostGIS. The database is supposed to be for a team of about 5-8 researches who work frequently with geodata and statistics.

Has anybody experience with naming conventions when setting up a database?

some important things I already figured out are:

use only lowercase

use_underscores not spaces

don't use special characters like ä,é, etc

only use one language (might seem trivial but we are international)

name tables and columns always in singular

find a standardized way to name the objects in the database i.e. topic_year_source_format

Especially the last point is tricky. Storing my own data i recognized that sometimes you'll get enormous names. So would it be bether to store this information in a metadata that is easy accessible instead of making these huge names that can be quite annoying...
Can anybody give me some advice? thx!

1 Answer
1

It sounds like you have the technical conventions worked out. I don't think the question you're asking has a correct answer, but I will tell you what I've come up with for use in my organization.

I prefer to organize data by groups because, as we all know, sometimes the metadata just doesn't get filled out. I've found building some of the most basic metadata into the naming convention to be very beneficial.

To start, I created a spreadsheet listing the major categories of data that my organization handles and gave each of these a unique two letter code. The spreadsheet also has a description of the category and examples of features which may be found within each category. This spreadsheet is available to everyone in my organization and I include it alongside exported data.

I start out each name with the two letter code followed by an underscore. You could of course expand on this idea and build in the name of data creators as well. Try to keep the names short and document your methods.
Here are some examples of the categories I use:

This is a valid method, but I really don't like abbreviations. This is of course a matter of personal taste, but specially if you are on a international team, these abbreviations may confuse everyone and one will always need a data dictionary whenever he needs to use the database. PostgreSQL allows, if I'm not mistaken, 64 letter object names. Make good use of that space and create the most descriptive names you can find, in a language everyone can understand.
–
GeorgeMar 5 '13 at 10:42

I really like the idea of categorizing the data and I will discuss this with my collegues. Still i'm not sure about naming the data inside the db. Your arguments make totally sense that for usability it will be bether to give clear names inside the db. But I fear that the metadata document may be less used like this. I thought that naming the data with abstract numbers would drive users to refer to the metadata document and with this contribute more to it in a way that people fill out more metadata information since they have to refer to it on a daily base and the document is already open...
–
DspanesMar 6 '13 at 12:44

@Dspanes, that's an interesting argument. Like I said, there's no right answer. In general I'm not sure I like the idea of making the names intentionally confusing so as to make users rely on the metadata...it is an interesting idea though.
–
PaulMar 7 '13 at 0:38

@Paul Yeah it seems kind of mean I know ;) But from what i've expirienced so far, people only use what is usefull for them. The more usefull the more they use it and the more they use the better the metadata might get...The thing is that we don't have a person to take care of metadata so we need a participative approach where everybody contributes. the metadata document could maybe bring benefits too e.g. you could have better search and filter functions that allow to find more adequate data...but without a doubt I'm also thinking in alternative approaches to foster participation...
–
DspanesMar 7 '13 at 10:07