3
XML and Data Management
Motivation
First, most database vendors today offer universal database products that combine their relational DBMS and ORDBMS offerings into a single product. Second, an ORDBMS has a more expressive type system than an RDBMS. Third, an ORDBMS is better suited for storing and querying XML documents that may use a richer set of data types.

5
Motivation: RDBMS weaknesses
Poor representation of “real world” entities: normalization leads to relations that do not correspond to entities in the “real world”.
Semantic overloading: the relational model has only one construct for representing data and data relationships, the relation, so it is semantically overloaded.
Difficulty handling recursive queries: RDBMSs are poor at navigational access to data.
Limited operations: RDBMSs have only a fixed set of operations, which is difficult to extend.

7
Motivation: ORDBMS advantages
Code is held within the database as functions, procedures, or methods, so common functionality can be centralised rather than re-implemented by every application that uses the data.
BLOBs (Binary Large Objects) and CLOBs (Character Large Objects) store large unstructured values within the database, which allows storage of complex data such as multimedia.

8
Motivation: ORDBMS advantages
The ability to directly manipulate data stored in a relational database using an object programming language is called transparent persistence.
Object-relational mapping means less code to write.
It offers higher performance than embedded SQL or a call interface (JDBC, ODBC).

10
XORator mapping
The XORator (XML to OR Translator) algorithm is a practical demonstration of the use of XML data types. It takes advantage of an ORDBMS rather than an RDBMS. XORator uses Document Type Definitions (DTDs) to map XML documents to tables in an ORDBMS. An important part of this mapping is the assignment of a fragment of an XML document to a new XML data type, called XADT (XML Abstract Data Type).

13
XORator: DTD complexity
Simplify the DTD information to a form that makes the mapping process easier, using a set of transformations that reduce the number of nested expressions and the number of element items:
Flattening (to convert a nested definition into a flat representation): (e1, e2)* -> e1*, e2*
Simplification (to reduce multiple unary operators to a single unary operator): e1** -> e1*
Grouping (to group subelements that have the same name): e0; e1*; e1*; e2 -> e0; e1*; e2
In addition, e+ is transformed to e*.
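The transformations above can be sketched in a few lines of Python. This is only an illustration, not the XORator implementation: the nested-tuple encoding of a content model and the names `simplify`, `combine`, and `group` are assumptions made for the example. Flattening is treated as pushing the outer operator onto each member, (e1, e2)* -> e1*, e2*, and any repeated element name is collapsed to a single starred item, which goes slightly beyond the one grouping case listed on the slide.

```python
# A content model is either an element name (str),
# ("seq", [children]) for a sequence, or ("op", child, operator)
# where operator is one of "*", "+", "?".

def combine(outer, inner):
    """Merge two stacked unary operators into one (e** -> e*, e+ -> e*)."""
    outer = "*" if outer == "+" else outer
    inner = "*" if inner == "+" else inner
    if "*" in (outer, inner):
        return "*"
    return outer or inner


def group(items):
    """Keep one item per element name; a repeated name becomes name* (assumption)."""
    merged = {}
    for name, op in items:
        merged[name] = "*" if name in merged else op
    return list(merged.items())


def simplify(node, outer=""):
    """Return a flat list of (element name, operator) pairs."""
    if isinstance(node, str):
        return [(node, outer)]
    if node[0] == "seq":
        items = []
        for child in node[1]:
            # Flattening: the operator around the group is applied to each member.
            items.extend(simplify(child, outer))
        return group(items)
    _, child, op = node
    return simplify(child, combine(outer, op))


print(simplify(("op", ("seq", ["e1", "e2"]), "*")))  # [('e1', '*'), ('e2', '*')]
print(simplify(("op", ("op", "e1", "*"), "*")))      # [('e1', '*')]
```

An empty operator string means "exactly once", so the grouping example e0; e1*; e1*; e2 comes back as e0, e1*, e2.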

14
XORator: DTD -> OR schema
The simplified version of the previous DTD

15
XORator: DTD -> OR schema
We build a DTD graph to represent the structure of the DTD. Nodes in the DTD graph are elements, attributes, and operators. Elements that contain character data are duplicated in the graph to eliminate sharing.

16
XORator: DTD -> OR schema
Given a DTD graph, a relation is created for each node that satisfies any of the following conditions:
1) the node has an in-degree of zero
2) the node is recursive and has an in-degree greater than one
3) the node is the one chosen among a set of mutually recursive nodes that all have in-degree one.
All remaining nodes (nodes not mapped to a relation) are inlined as attributes under the relation created for their closest ancestor node in the DTD graph.
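The node-selection rules can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the DTD graph is reduced to a dictionary from element name to child names (operator and attribute nodes are left out), a node counts as recursive when it can reach itself, and for rule 3 the alphabetically first node of a cycle is picked, which is an arbitrary choice. The function names and the sample element names are hypothetical.

```python
def reachable(graph, start):
    """Nodes reachable from start by following at least one edge."""
    seen, stack = set(), list(graph.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def choose_relations(graph):
    """Split the nodes into (mapped to a relation, inlined)."""
    nodes = set(graph) | {c for children in graph.values() for c in children}
    in_degree = {n: 0 for n in nodes}
    for children in graph.values():
        for child in children:
            in_degree[child] += 1
    reach = {n: reachable(graph, n) for n in nodes}

    # Rule 1: nodes with in-degree zero.
    relations = {n for n in nodes if in_degree[n] == 0}
    # Rule 2: recursive nodes with in-degree greater than one.
    relations |= {n for n in nodes if n in reach[n] and in_degree[n] > 1}
    # Rule 3: one node per cycle that has no relation yet.
    for n in sorted(nodes):
        if n in reach[n]:
            cycle = {m for m in reach[n] if n in reach[m]}
            if not cycle & relations:
                relations.add(n)
    return relations, nodes - relations


graph = {
    "doc": ["section", "title"],
    "section": ["para", "section"],
    "para": ["note"],
    "note": ["para"],
}
relations, inlined = choose_relations(graph)
print(sorted(relations))  # ['doc', 'para', 'section']
print(sorted(inlined))    # ['note', 'title']
```

Here `doc` is selected by rule 1 and `section` and `para` by rule 2, while `title` and `note` are inlined under their closest ancestor relation.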

18
XORator: DTD -> OR schema
An XADT attribute can store a fragment of an XML document. The XORator algorithm allows mapping an entire subtree of the DTD graph to an attribute of type XADT.

20
XORator: XADT
One storage representation for the XADT is a compressed form of each XML fragment: element tags are mapped to integer codes and replaced by those codes. A small dictionary is stored along with the XML fragment to record the mapping between the integer codes and the actual element tag names. There are two implementations of the XADT: one that uses compression and one that does not.
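A minimal Python sketch of the tag-to-integer idea follows. It is not the XADT storage format: the function names `compress` and `decompress`, the numbering of codes from 1, and the regular expressions are assumptions for the example, and the fragment is assumed to contain no comments, CDATA sections, or processing instructions.

```python
import re

# Matches the start of an opening or closing tag and captures the tag name.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)")
# Matches the start of an encoded tag; XML names cannot begin with a digit,
# so an integer after "<" is unambiguous in well-formed input.
CODE = re.compile(r"<(/?)(\d+)")


def compress(fragment):
    """Replace tag names with integer codes; return (dictionary, encoded text)."""
    codes = {}

    def encode(match):
        slash, name = match.groups()
        code = codes.setdefault(name, len(codes) + 1)
        return f"<{slash}{code}"

    body = TAG.sub(encode, fragment)
    dictionary = {code: name for name, code in codes.items()}
    return dictionary, body


def decompress(dictionary, body):
    """Restore the original tag names from the stored dictionary."""
    return CODE.sub(lambda m: f"<{m.group(1)}{dictionary[int(m.group(2))]}", body)


fragment = "<SPEECH><SPEAKER>HAMLET</SPEAKER><LINE>my friend</LINE></SPEECH>"
dictionary, body = compress(fragment)
print(dictionary)  # {1: 'SPEECH', 2: 'SPEAKER', 3: 'LINE'}
print(body)        # <1><2>HAMLET</2><3>my friend</3></1>
```

The saving grows with the number of times each tag repeats, since the dictionary is stored once per fragment.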

21
XORator: XADT
The choice between the two XADT implementations is made during the document transformation process by monitoring the effectiveness of the compression technique. Compression is used only if the space efficiency is above a certain threshold value.
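The threshold decision might look like the following Python sketch. The slide does not say how space efficiency is measured or what the threshold is, so the formula (fraction of characters saved, counting the dictionary), the default threshold of 0.2, and the names `space_saving` and `pick_representation` are all assumptions for illustration.

```python
def space_saving(raw, body, dictionary):
    """Fraction of space saved by the encoded form, dictionary included."""
    # Assumed dictionary cost: tag name, its code, and two separator characters.
    dict_cost = sum(len(name) + len(str(code)) + 2 for code, name in dictionary.items())
    return 1 - (len(body) + dict_cost) / len(raw)


def pick_representation(raw, body, dictionary, threshold=0.2):
    """Return which XADT implementation to use for this fragment."""
    if space_saving(raw, body, dictionary, ) > threshold:
        return "compressed"
    return "uncompressed"


# A fragment with few repeated tags barely benefits from compression.
short_raw = "<SPEECH><SPEAKER>HAMLET</SPEAKER><LINE>my friend</LINE></SPEECH>"
short_body = "<1><2>HAMLET</2><3>my friend</3></1>"
short_dict = {1: "SPEECH", 2: "SPEAKER", 3: "LINE"}
print(pick_representation(short_raw, short_body, short_dict))  # uncompressed

# A fragment that repeats the same tag many times benefits much more.
long_raw = "<LINE>x</LINE>" * 50
long_body = "<1>x</1>" * 50
print(pick_representation(long_raw, long_body, {1: "LINE"}))   # compressed
```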

22
XORator: XADT
XADT getElm(XADT inXML, VARCHAR rootElm, VARCHAR searchElm, VARCHAR searchKey, INTEGER level): This method returns all rootElm elements that have searchElm within a depth of level from the rootElm.
INTEGER findKeyInElm(XADT inXML, VARCHAR searchElm, VARCHAR searchKey): This method examines all elements with the tag name searchElm in inXML and returns 1 if any of them has content that matches the searchKey keyword.
XADT getElmIndex(XADT inXML, VARCHAR parentElm, VARCHAR childElm, INTEGER startPos, INTEGER endPos): This method returns all childElm elements that are children of the parentElm elements and whose sibling order is between the startPos and endPos positions.
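To make the behaviour of two of these methods concrete, here is a rough Python approximation that works on plain XML strings with the standard library's `xml.etree.ElementTree`. It is not the XADT implementation. The Python names are hypothetical renderings of findKeyInElm and getElmIndex, a keyword "match" is assumed to mean substring containment, and positions are assumed to be 1-based and counted among the childElm siblings only.

```python
import xml.etree.ElementTree as ET


def _parse(fragment):
    # A fragment may have several top-level elements, so wrap it in one root.
    return ET.fromstring(f"<_xadt_wrapper>{fragment}</_xadt_wrapper>")


def _serialize(elem):
    # Serialize an element without the text that follows its closing tag.
    tail, elem.tail = elem.tail, None
    text = ET.tostring(elem, encoding="unicode")
    elem.tail = tail
    return text


def find_key_in_elm(in_xml, search_elm, search_key):
    """Return 1 if any search_elm element contains search_key, else 0."""
    for elem in _parse(in_xml).iter(search_elm):
        if search_key in "".join(elem.itertext()):
            return 1
    return 0


def get_elm_index(in_xml, parent_elm, child_elm, start_pos, end_pos):
    """Return child_elm children of parent_elm at positions start_pos..end_pos."""
    picked = []
    for parent in _parse(in_xml).iter(parent_elm):
        children = parent.findall(child_elm)  # direct children only
        picked.extend(children[start_pos - 1:end_pos])
    return "".join(_serialize(child) for child in picked)


speech = (
    "<SPEECH><SPEAKER>HAMLET</SPEAKER>"
    "<LINE>my good friend</LINE><LINE>farewell</LINE></SPEECH>"
)
print(find_key_in_elm(speech, "LINE", "friend"))      # 1
print(get_elm_index(speech, "SPEECH", "LINE", 2, 2))  # <LINE>farewell</LINE>
```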

23
XORator: XADT
This query retrieves the lines in acts that are spoken by the ‘SPEAKER’ ‘HAMLET’ and contain the keyword ‘friend’.

24
JDOM
JDOM is an open source, tree-based (like DOM), pure Java API for parsing, creating, manipulating, and serializing XML documents. JDOM represents an XML document as a tree composed of elements, attributes, comments, processing instructions, text nodes, CDATA sections, etc. JDOM is written in and for Java. It consistently uses the Java coding conventions and the class library, and it implements the Cloneable and Serializable interfaces.

25
JDOM
Xerces 1.4.4 is bundled with JDOM to parse XML documents. A JDOM tree is fully read-write. All parts of the tree can be moved, deleted, and added to, subject to the usual restrictions of XML. Unlike DOM, there are no read-only sections of the tree that cannot be changed.

32
JDO
Sun's Java Data Objects (JDO) standard allows you to persist Java objects. It supports transactions and multiple users. It differs from JDBC in that you don't have to think about SQL and "all that database stuff." It differs from serialization in that it allows multiple users and transactions. It allows Java developers to use their object model as a data model, so there is no need to spend time going between the "data" side and the "object" side.