Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

I'm building an inventory database to store enterprise hardware information. The devices the database keeps track of range from workstations, laptops, switches, routers, mobile phones, etc. I'm using device serial numbers as the primary key. The problem I'm having is that the other attributes for these devices vary and I don't want to have fields in the inventory table that are unrelated to other devices. Below is a link to an ERD of part of the database (some FK relations are not shown). I'm trying to set it up, for example, so a device with a workstation device type can't be put into the phones table. This seems to require the use of a lot of triggers to validate the device type or class, and new tables anytime a different device with different attributes will be tracked; not to mention all of the one-to-one relationships which will make joins a nightmare (there are more one-to-one relationships not shown).

I looked into setting up attribute tables that can be mapped to serial numbers, but that would allow attributes that do not apply to a device type to be assigned to a device, e.g., someone could assign a phone number attribute to a workstation if they wanted. I found an explanation on this site that gave the following structure:

This structure would work great if the attributes were all applicable to the items I am storing. For example if the database was storing only mobile phones, the attributes could be things like touchscreen, trackpad, keyboard, 4G, 3G...whatever. In that case, they all apply to phones. My database would have attributes like hostname, circuitType, phoneNumber, which only apply to specific types of devices.

I want to set it up so only the attributes that apply to a given device type can be assigned to a device of that type. Any suggestions on how to setup this database? I'm not sure if this is a proper use of one-to-one relationships, or if there is a better way to do this. Thank you in advance for taking the time to look into this.

Here are some of the other threads I read. They gave me some good insight, but I don't think they really apply:

3 Answers
3

How about looking into the supertype/subtype pattern? Common columns go in a parent table. Each distinct type has its own table with the ID of the parent as its own PK and it contains unique columns not common to all subtypes. You can include a type column in both parent and children tables to ensure each device can't be more than one subtype. Make an FK between the children and the parent on (ItemID, ItemTypeID). You can use FKs to either the supertype or subtype tables to maintain the desired integrity elsewhere. For example, if the ItemID of any type is allowed, create the FK to the parent table. If only SubItemType1 can be referenced, create the FK to that table. I would leave the TypeID out of referencing tables.

Naming

When it comes to naming, you have two choices as I see it (since the third choice of just "ID" is in my mind a strong anti-pattern). Either call the subtype key ItemID like it is in the parent table, or call it the subtype name such as DoohickeyID. After some thought and some experience with this, I advocate calling it DoohickeyID. The reason for this is that even though there could be confusion about the subtype table really in disguise containing Items (rather than Doohickeys), that is a small negative compared to when you create an FK to the Doohickey table and the column names don't match!

To EAV or not to EAV - My experience with an EAV database

If EAV is what you truly have to do, then it's what you have to do. But what if it weren't what you had to do?

I built an EAV database that is in use in a business. Thank God, the set of data is small (though there are dozens of item types) so the performance is not bad. But it would be bad if the database had more than a few thousand items in it! Additionally, the tables are so HARD to query. This experience has led me to really desire to avoid EAV databases in the future if at all possible.

Now, in my database I created a stored procedure that automatically builds PIVOTed views for each and every subtype that exists. I can just query from AutoDoohickey. My metadata about the subtypes has a "ShortName" column containing an object-safe name suitable for use in view names. I even made the views updateable! Unfortunately, you cannot update them on a join, but you CAN insert to them an already-existing row, which will be converted to an UPDATE. Unfortunately, you cannot update only a few columns, because there is no way to indicate to the VIEW which columns you want to update with the INSERT-to-UPDATE conversion process: a NULL value looks like "update this column to NULL" even if you wanted to indicate "Don't update this column at all."

Despite all this decoration to make the EAV database easier to use, I still don't use these views in most normal querying because it is SLOW. Query conditions are not predicate pushed all the way back to the Value table, so it has to build an intermediate result set of all the items of that view's type before filtering. Ouch. So I have many, many queries with many, many joins, each one going out to get a different value and so on. They perform relatively well, but ouch! Here's an example. The SP that creates this (and its update trigger) is one giant beast, and I'm proud of it, but it is not something you want to ever try to maintain.

Here's another type of automatically-generated view created by another stored procedure from special metadata to help find relationships between items that can have multiple paths between them (Specifically: Module->Server, Module->Cluster->Server, Module->DBMS->Server, Module->DBMS->Cluster->Server):

If you MUST have some of the dynamic aspects of an EAV database, you could consider creating the metadata as if you had such a database, but instead actually using the supertype/subtype design pattern. Yes, you would have to create new tables, and add and remove and modify columns. But with the proper pre-processing (like I did with my EAV database's Auto views) you could have real table-like objects to work with. Only, they wouldn't be as gnarly as mine and the query optimizer could predicate push down to base tables (read: perform well with them). There would just be a one join between the supertype table and the subtype table. Your application could be set to read the metadata to discover what it is supposed to do (or it can use the auto-generated views in some cases). This protects your application code from having to be touched extensively just to add or modify things.

Or, if you had a multi-level set of subtypes, just a few joins. By multi-level I mean when some subtypes share common columns, but not all, you could have a subtype table for those that is itself a supertype of a few other tables. For example, if you are storing information about Servers, Routers, and Printers, an intermediate subtype of "IP Device" could make sense.

I will give the caveat that I haven't yet made such a hybrid supertype/subtype EAV-metatable-decorated database like I'm suggesting here yet to try out in the real world. But the problems I've experienced with EAV are not small, and doing something is probably an absolute must if your database is going to be large and you want good performance without some crazy expensive gigantic hardware.

In my opinion, the time spent automating the use/creation/modification of real subtype tables would ultimately be best. Focusing on flexibility driven by data makes the EAV sound so attractive (and believe me I love how when someone asks me for a new attribute on an element type I can add it in about 18 seconds and they can immediately start entering data on the web site). But flexibility can be accomplished in more than one way! Pre-processing is another way to do it. It's such a powerful method that so few people use, giving the benefits of being totally data-driven but the performance of being hard-coded.

(Note: Yes those views really are formatted like that and the PIVOT ones really do have update triggers. :) If someone is really that interested in the awful painful details of the long and complicated UPDATE trigger, let me know and I'll post a sample for you.)

And One More Idea

Put all your data in one table. Give columns generic names and then reuse/abuse them for multiple purposes. Create views over these to give them sensible names. Add columns when a suitable-data-type unused column is not available, and update your views. Despite my length going on about subtype/supertype, this may be the best way.

I thought of this design where each subtype table had the PK from the parent and the uncommon fields. I thought I could put the type field in the parent and each subtype table and then put a CHECK constraint on them. I decided to avoid this design because it would require a new table anytime a new type of device needs to be tracked, and many one-to-one relationships. It seemed messy and inflexible. I appreciate your input though.
–
TheSecretSquadMar 17 '12 at 23:20

I built an EAV database that is in use in a business. Thank God, the set of data is small (though there are dozens of item types) so the performance is not bad. But it would be if the database had more than a few thousand items in it. This experience has led me to really desire to avoid EAV databases in the future if at all possible, because they are so HARD to query.
–
ErikEMar 18 '12 at 1:17

After examining the the EAV pattern I realized that the values for the attributes are forced to share a data type (all string in this case). Also, querying the EAV setup will be a chore. Supertype/subtype is looking better. My question now, is certain tables allow only specific device types. Do I validate this by putting a device class ID (phone, computer, router) in every table and put a check constraint on that field, or do I exclude that field from the subtype tables and use a trigger on each of them? Please see ERD3 for reference.
–
TheSecretSquadMar 18 '12 at 16:43

1

For querying EAV data, it's not uncommon to build a datamart of relational tables for the data you want to query and then populate them using some script. The queries will run faster, but only against the data that you put in the datamart, and the setup requires a fair bit of planning.
–
FrustratedWithFormsDesignerMar 19 '12 at 15:01

In your case the best approach is a variation on the Entity-Attribute-Value (EAV) model. There are lots of people who shy away from EAV because it is unhelpful in some ways and misused a lot of the time. However, EAV is a solution that works well for your specific requirements.

The variation that you want to include for your situation is to abstract the attributes one level away from your entities (i.e. your inventory items). Essentially you want to define device types which have a list of attributes. Then you define device instances which have values for each of the attributes which devices of that type are supposed to have.

Here is an ERD sketch:

DEVICE_ATTRIBUTE contains the values for each type of generic attribute. DEVICE_TYPE defines the list of generic attributes which apply to a given type of device (these are the TYPICAL_DEVICE_ATTRIBUTEs.

This lets you control which attributes need to be filled out for a device while letting devices of different type have different lists of attributes. It also makes it easy for you to compare across devices by lining their attributes up against one-another.

This looks similar to what ssmusoke recommended. I changed my ERD using his recommendation and it looks like it matches yours. Feel free to check out the new RD at http://www.dividegraphics.com/ERD2.jpg and provide any feedback.
–
TheSecretSquadMar 18 '12 at 0:14

@reallythecrash - You are correct, I am suggesting the same basic approach as ssmusoke, I just took a different tack on my answer in the hopes of making it easier to understand both the structure of the model and also the rationale for using EAV which a lot of people (unfairly) denounce as being an anti-pattern.
–
Joel BrownMar 18 '12 at 2:01

After some research I see why people might consider EAV an anti-pattern. I think it is simple to store data using EAV, but especially complex for querying and maintaining data types. I think it is a pattern with a narrow purpose, and should be used by experienced developers who can implement it properly, i.e., not me. I will probably opt for the supertype/subtype paradigm.
–
TheSecretSquadMar 19 '12 at 5:04

@JoelBrown - what software did you use to sketch up that diagram?, looks cool.
–
VidarMay 1 '12 at 17:06

@Vidar - I used Visio with ERD smartshapes that I created to use the James Martin visual conventions and drawn with a custom line pattern that is sketchy. I find this is a good tool to use for quick/draft data models. When the diagram is too formal it can lead some people to think it's finished, so something sketchy helps to prevent people from jumping to conclusions about how firm/finished a data model is.
–
Joel BrownMay 1 '12 at 18:16

Thank you for your input. This is inline with what I'm looking for, I just couldn't figure out how to do it. I changed my ERD to reflect your specs. It seems like it requires more work to enter all of the allowable attributes for each device type, but it also looks like it offers maximum flexibility. I'm going to make a small prototype to see if it works the way I think it will. Thanks again. I uploaded an ERD with the changes if you want to take a look and let me know if I'm on the right track. http://www.dividegraphics.com/ERD2.jpg
–
TheSecretSquadMar 17 '12 at 23:47

EAV will offer lots of flexibility, but you also have a lot more metadata hanging around to keep it working.
–
FrustratedWithFormsDesignerMar 19 '12 at 14:59

@FrustratedWithFormsDesigner seems inevitable when the system stores a wide range of items, phones, switches, pcs, laptops, etc... Better more metadata than more tables I would say
–
ssmusokeMar 19 '12 at 16:12

1

@ssmusoke: Agreed, but I wanted to emphasize that point because I've seen people fail to realize the importance of metadata, and then their EAV implementation becomes a nightmare.
–
FrustratedWithFormsDesignerMar 19 '12 at 16:13