Featured Database Articles

ANSI SQL Can Combine Advantages and Principles of Relational and Hierarchical Data Processing

By Michael M. David

Michael M. David describes how relational and hierarchical data processing can be seamlessly combined in a way that supports and preserves both hierarchical and relational processing advantages while avoiding their disadvantages.

Relational and hierarchical data processing are both
controlled by very principled structured data processing operations. However,
they are very different in operation and for this reason, they have different advantages
and disadvantages in data processing characteristics. These characteristics are
listed and described below. This article will describe how these two different
principled data processing disciplines can be seamlessly combined in a way that
supports and preserves both hierarchical and relational processing advantages while
avoiding their disadvantages.

Relational and Hierarchical Processing Advantages and Disadvantages

Relational and hierarchical processing is very different so
there is no surprise that what is good for one is often bad for the other. This
will be reflected directly below where these characteristics are listed and
explained. These good and bad characteristics reflect what is still assumed today
such as relational processing cannot support hierarchical processing, which is
not an accurate statement today and will be covered later in this article.

Relational Advantages

Data Independence
means that data can be dynamically related through data relationships so that
the structures are not fixed and can be created dynamically on the fly. This
was the reason that relational databases supplanted hierarchical databases. This
data independence also allows Many-to-Many Data Relationships to be performed
easily today. This is a capability not supported in physical hierarchical databases.
Flat relational data with its Normalized Tables in relational databases and
its data independence capability does prevent data duplication in stored data.

Relational Disadvantages

Explosive data replications during processing

Joins are expensive

Only flat structures are available

While relational database systems can store data without
duplication as described in the paragraph above, relational join processing
with its explosive data replication can occur during its Cartesian product processing.
This is also a significant reason relational joins can be very expensive and
time consuming to process compared to other non relational methods. Flat
structures can also be very limiting on their own because of their lack of
structure and semantics.

Hierarchical Advantages

Useful hierarchical structure with inherent semantics

Easy and powerful semantic queries

Data value naturally grows nonlinearly

Hierarchical structures get their inherent powerful
processing capability from the semantics contained in their data structure enabling
more powerful semantic queries. This self-contained structure semantics allows
for powerful semantic queries enabling terse queries that are quick and easy to
specify. As hierarchical structures naturally grow larger over time, so does their
power with their ever-increasing inherent semantics. This allows the value of their
data and querying power to keep increasing naturally and nonlinearly as
described further below.

Hierarchical Disadvantages

Fixed inflexible structure

Inflexibility resulting in duplicated data

No M-to-M data structure support

While hierarchical structures offer the ability to model
structures, they are fixed making them inflexible. This inflexibility also
causes the problem of duplicate data because of the need to have different
physical structures comprised of the same data. This lack of data independence
means that M-to-M structures cannot be fully supported.

How Relational and Hierarchical Processing are Combined

The advent of the Left Outer Join in ANSI SQL-92 with its
hierarchical data preservation operation and its new ON clause replaces the
WHERE clause enabling specifying join criteria at each join point. This capability
enables full hierarchical processing naturally and inherently without ambiguity
problems. This also means that SQL's navigationless processing also continues
to work normally with hierarchical structured data. The LEFT Outer Join syntax
and the new ON clause allow the precise data modeling of full multipath
hierarchical structures while its hierarchical semantics perform hierarchical
processing. This complete SQL hierarchical processing is handled directly by
the ANSI SQL processor.

In addition and unexpectedly, this natural hierarchical
processing from the LEFT Outer Join also inherently supports the Lowest Common
Ancestor (LCA) processing required to support powerful multipath queries. This allows
the unrestricted query to reference any number of multiple pathways without
concern of the data structure. This overlapping of relational and hierarchical
processing allows for the creation and seamless processing of logical hierarchical
structures, which possess the capabilities of both relational and hierarchical
capabilities. This means that ANSI SQL can now naturally and dynamically define
logical hierarchical structures, which are not fixed. Logical hierarchical
structures defined by ANSI SQL keep all of their hierarchical principles while taking
on relational principles too. This produces results that are both relationally
and hierarchically accurate.

How Relational Structures Support Hierarchical Processing

Relational logical hierarchical processing can support the
best of both worlds, but how do flat tabular structures contained in and
processed in relational rowsets maintain and support hierarchical processing? They
also have to naturally support multiple pathways, which can also be variable in
length. This is performed by the hierarchical semantics modeling the multipath
hierarchical structure in flat relational structures. The newer LEFT Outer Join
with its ON clause operates hierarchically on the working rowset allowing for
the specific hierarchical processing of fragment portions of the structure by using
the multiple separate ON clauses as it is being built. The null data filler naturally
produced by the LEFT Outer Join allows for variable length pathways in
relational rowsets.

Combined Relational and Hierarchical Advantages

The following list of combined relational and hierarchical processing
advantages below allows the user to pick and choose from a list of characteristics
to match their specific need. Each characteristic usually has a good and bad characteristic
depending on its use. Relational and hierarchical capabilities can be
intermixed because hierarchical processing is actually a subset of relational
processing. Intermixed relational and hierarchical operations include the
seamless integration of relational and hierarchical data; combining unrelated fixed
hierarchical structures using a relational association table; and the direct
SQL hierarchical joining of hierarchical structures. Logical hierarchical
structures can also take on most relational and hierarchical advantages, which can
now support previously mutually exclusive capabilities. An example is relational
data independence while utilizing inherent hierarchical semantics.

Combined Relational and Hierarchical Disadvantages

The above list of advantages allows avoiding items from the list
of disadvantages below. The relational disadvantages below are avoided by selecting
from the hierarchical advantages above. And the hierarchical disadvantages
below are avoided by using the relational advantages above.

Synergy Derived New Capabilities

The synergy from the combined relational and hierarchical
processing above also produces powerful new capabilities and advancements in hierarchical
data processing. Utilizing relational processing's flexible dynamic querying along
with its basic hierarchical processing capability, hierarchical structures can
be dynamically and hierarchically mashedup. This processing still results in a valid
hierarchical structure to be query processed. By utilizing hierarchical
processing occurring naturally in relational rowsets, powerful new data
structure transformations and data structure virtualizations can also be
performed easily in SQL using the synergy of relational and hierarchical
processing.

Conclusion

ANSI SQL's inherent hierarchical processing offers the
capability to freely select from relational and hierarchical advantages and
capabilities. This paper is meant as an overview for this SQL hierarchical
processing topic. For more information on this topic, see a list and
description of SQL hierarchical and relational capabilities performed in ANSI
SQL described in my previous article, The
Top 10 SQL Hierarchical Data Processing Capabilities.

Advertiser Disclosure:
Some of the products that appear on this site are from companies from which QuinStreet receives compensation. This compensation may impact how and where products appear on this site including, for example, the order in which they appear. QuinStreet does not include all companies or all types of products available in the marketplace.