Flexible Hierarchy

An organization has a sales department with four levels, as shown in the logical data model in Figure 1.

Figure 1: Sales Department Logical Data Model

Zones are divided into Regions, Regions into Districts and Districts into Territories. Assume there will always be these four levels and that levels cannot be skipped (e.g., Territory 123 cannot report directly to Region 456). This logical data model is implemented as the physical data model in Figure 2.

Figure 2: Sales Department Physical Data Model

Sales Hierarchy represents a specific combination of Zone, Region and District, which is then divided into Territories.

The Challenge

What are the pros and cons with using this physical structure?

The Response

A hierarchy's rigidity can at times require additional effort to load, modify and retrieve data. Therefore, a logical hierarchy structure is often implemented in a more flattened format such as the physical structure in this challenge, which carries both pros and cons.

Pros

Flexible. This model offers two different levels of flexibility: data and structure. Data flexibility means that this model can accommodate frequent changes to Zone, Region or District relationship values in the hierarchy. One set of rows in Sales Hierarchy is changed instead of potentially many relationships crossing tables in the strict hierarchy. For example, if District D1 used to report to Region R1 and R1 used to report to Zone Z1 and now D1 reports to Region R2, which now reports to Zone Z2, this becomes an update to one set of rows in Sales Hierarchy instead of impacting two tables within the hierarchy. Structure flexibility means that Sales Hierarchy can accommodate when levels in the hierarchy are changed. For example, instead of having Districts roll up to Regions, Districts now roll directly up to Zones and Zones roll up to Regions.

Fast. This physical structure provides quicker data retrieval for certain queries. Corine Jansonius, data architect, notes, "There are quicker joins to Zone and Region because there's no need to join through District first."

Cons

Data quality concerns. There is less enforcement of hierarchy rules in the database, which can lead to data quality issues. Because the database is not enforcing that a District roll up to a single Region, it is possible (unless enforced through code) for a District to roll directly up to a Zone or even for a District to roll up to two Regions. Brenda Graham, data modeler, adds, "It's likely the Sales Hierarchy would have to be manually maintained, which involves identifying/maintaining all valid combinations. There isn't anything in the structure to ensure levels in the hierarchy aren't skipped." The database does still enforce the rule, however, that a Territory must be in a specific Zone, Region and District.

Lack of clarity. A model is a communication tool, and this physical structure does not communicate the strict four-level hierarchy. Art Trifonov, Oracle consultant, states, "It lacks clarity, as looking at the model does not tell you what the hierarchical relationship between Zone, District and Region is." Norman Daoust, data architect, adds, "Sales Hierarchy is a term foreign to the business and with no real-world counterpart." Ben Ettlinger, lead data administrator, summarizes, "This does not represent reality or the business. A programmer or DBA who only looks at the physical model will not get a sense of the business from the model."

Inflexible. Whenever logical structures are flattened, it often becomes more difficult to fit new concepts within such a structure, therefore reducing flexibility. Dave Hay, industry expert, brought up this key point, "It adds constraints that will make it difficult to insert 'Administrative area' between Region and Zone."