Abstract

Functional dependencies are an integral part of database theory and they form the basis for normalizing relational tables up to BCNF. With the increasing relevance of the data-centric aspects of XML, it is pertinent to study functional dependencies in the context of XML, which will form the basis for further studies into XML keys and normalization. In this work, we investigate the design of functional dependencies in XML databases. We propose FDXML, a notation and DTD for representing functional dependencies in XML. We observe that many databases are hierarchical in nature and the corresponding nested XML data may inevitably contain redundancy. We develop a model based on FDXML to estimate the amount of data replication in XML data. We show how functional dependencies in XML can be verified with a single pass through the XML data, and present supporting experimental results. A platform-independent framework is also drawn up to demonstrate how the techniques proposed in this work can enrich the semantics of XML.

This work was done while the author was on a research scholarship from the National University of Singapore.

For this paper, XML data refers to data represented in XML. It is not to be confused with the W3C Note XML-Data.