Thursday, September 18, 2008

Bad Data Architecture

What is data architecture? Simply, it's data design. Just as you would design anything else in the world, you can design data. And whenever something can be designed, people like to add fancy words, like architecture, to make the activity sound important and impressive. In this case, designing and architecting data are basically the same thing.

With any design, there are pros and cons, and it is easier to spot bad design than good design. In most cases you don't notice good design. It works for you or the consumer, and it is neither a distraction nor confusing. You don't have to think about the implementation to understand why it was done a certain way. If it feels natural, the design is likely good: it makes sense and allows you to carry on with your normal business. Basically, good design does not get in the way. It doesn't require rules, definitions and instructions.

So what is bad data architecture?

You'll know it when you see it. It will feel awkward. You'll have to read documentation. You'll have to search on Google for answers. When table names don't imply the storage of specific content, the architecture is bad. When field names are generic, the architecture is bad. When the contents of fields (the values) are codes standing in for more clearly defined words, the architecture is bad.

The table name should define the purpose of the container. The field names should spell out the attributes of that container. You should know what goes in the table and how to populate the fields from their names alone, or at least have a general idea of how to do this. If not, the architecture is bad.
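To make the difference concrete, here is a rough sketch using Python's built-in sqlite3 module. Every table name, field name and value in it (tbl1, field1, customer_order, the status code 3 and so on) is made up for illustration and is not taken from any real system. It stores the same fact twice, once in a table you would need documentation to read and once in a table that explains itself.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical "bad" design: generic table and field names, coded value.
conn.execute(
    "CREATE TABLE tbl1 ("
    "id INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT, status INTEGER)"
)
conn.execute(
    "INSERT INTO tbl1 (field1, field2, status) VALUES (?, ?, ?)",
    ("Ann Lee", "2008-09-18", 3),  # what does 3 mean? Only the software knows.
)

# Hypothetical "good" design: the names say what belongs where,
# and the value is a word rather than a code.
conn.execute(
    "CREATE TABLE customer_order ("
    "customer_order_id INTEGER PRIMARY KEY, customer_name TEXT, "
    "order_date TEXT, order_status TEXT)"
)
conn.execute(
    "INSERT INTO customer_order (customer_name, order_date, order_status) "
    "VALUES (?, ?, ?)",
    ("Ann Lee", "2008-09-18", "shipped"),
)


def column_names(connection, table_name):
    """Return the column names of a table, in declaration order."""
    # PRAGMA table_info yields one row per column; index 1 holds the name.
    rows = connection.execute(f"PRAGMA table_info({table_name})").fetchall()
    return [row[1] for row in rows]


bad_columns = column_names(conn, "tbl1")
good_columns = column_names(conn, "customer_order")
bad_row = conn.execute("SELECT field1, field2, status FROM tbl1").fetchone()
good_row = conn.execute(
    "SELECT customer_name, order_date, order_status FROM customer_order"
).fetchone()

print(bad_columns, bad_row)
print(good_columns, good_row)
```

Both tables hold the same order, but only the second one could be handed to a stranger, or loaded into a different piece of software, without a lookup sheet for the names and the status code.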

Why is this architecture bad? Because data gets old. We keep it. We hang on to it. Software does not get old, or at least not as old as data. Software is updated, upgraded, decommissioned, uninstalled, or swapped for something else. Data moves between software. You need to understand data beyond and outside the context of the software currently using it.