XML stands for eXtensible Markup Language. It is classified as an extensible language because it allows its users to define their own tags.

XML was developed to provide a universal format for describing structured documents and data.

There are no fixed tags in XML; any user can define his own set of tags. The tags look similar to HTML tags, but they differ in how they are used.

Unlike HTML, which tags elements in Web pages for presentation by a browser, e.g. <b>Oracle</b>, XML tags elements as data, e.g. <company>Oracle</company>. In the HTML example, the <b> tag is a command to display the data within it in bold. But in the XML case, company could be a column name in a database and Oracle the column value.

Why do we use XML?

As XML is a W3C (World Wide Web Consortium) standard, software companies have widely accepted and implemented it in their products.

It is a fee-free open standard.

It represents data as platform-independent, language-independent text.

XML can be used with existing web protocols (such as HTTP and MIME) and mechanisms (such as URLs), and it does not impose any additional requirements.

XML can handle any kind of information, in high volumes, especially over the Internet and WWW.

It is Unicode-compatible, which means it can handle text in any language.

It is used as an interface touch-point between the majority of applications; XML is replacing age-old flat files for sending and receiving data between applications.

Building blocks of XML

XML documents are made up of the following building blocks:

Elements

Attributes

Entities

PCDATA

CDATA

What are Elements?

Elements are the main building blocks of XML documents.

XML elements could be "my_body" and "message" in the following example. Elements can contain text, other elements, or be empty.

<my_body>some text</my_body>
<message>some other text</message>

What are Attributes?

Attributes provide extra information about elements. Attributes are always placed inside the opening tag of an element, and they always come in name/value pairs. The following "images" element has additional information about a source file and its name:

Example: <images location="computer.gif" name="some image name"/>

In the above example, images is an element, whereas location and name are attributes.
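As a small illustration of elements and attributes, here is a sketch using Python's standard-library XML parser (the fragment below reuses the example elements from this section; the enclosing "root" element is added only so the fragment has a single document root):

```python
import xml.etree.ElementTree as ET

# A fragment containing the elements and attributes discussed above.
doc = """
<root>
  <my_body>some text</my_body>
  <message>some other text</message>
  <images location="computer.gif" name="some image name"/>
</root>
"""

root = ET.fromstring(doc)

# Element content is exposed as .text ...
print(root.find("my_body").text)     # some text

# ... and attributes as a name/value dictionary on the element.
images = root.find("images")
print(images.attrib["location"])     # computer.gif
print(images.attrib["name"])         # some image name
```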

What are Entities?

Some characters have a special meaning in XML. The less-than sign (<), for example, would be interpreted as the start of a tag. Such characters are written using the five predefined entities: &lt; (<), &gt; (>), &amp; (&), &quot; ("), and &apos; (').

An entity reference starts with an ampersand (&) and ends with a semicolon (;).

What is PCDATA?

PCDATA means parsed character data. Think of character data as the text found between the start tag and the end tag of an XML element.

PCDATA is text that WILL be parsed by a parser. The text will be examined by the parser for entities and markup. Tags inside the text will be treated as markup and entities will be expanded.

However, parsed character data should not contain any raw &, <, or > characters; these need to be represented by the &amp;, &lt;, and &gt; entities, respectively.
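Python's standard library can perform this entity substitution for you; a minimal sketch (the sample string is made up for illustration):

```python
from xml.sax.saxutils import escape, unescape

raw = 'if a < b & b > c then "done"'

# escape() replaces <, > and & with entity references; the quote
# characters are escaped on request via the extra entities dict.
encoded = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(encoded)
# if a &lt; b &amp; b &gt; c then &quot;done&quot;

# unescape() reverses the substitution.
print(unescape(encoded, {"&quot;": '"', "&apos;": "'"}))
```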

What is CDATA?

CDATA means character data. CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
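You can see this behaviour from any XML parser; a quick Python check (document contents invented for illustration):

```python
import xml.etree.ElementTree as ET

# Inside a CDATA section, markup characters are taken literally: the
# parser reports "<greeting>hello</greeting>" as plain text, not as a
# nested element, and "&" is not treated as the start of an entity.
doc = "<note><![CDATA[<greeting>hello</greeting> & more]]></note>"
note = ET.fromstring(doc)

print(note.text)          # <greeting>hello</greeting> & more
print(len(list(note)))    # 0 -- no child elements were created
```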

When a DTD is embedded inline in an XML file, the string starting from <!DOCTYPE note [ up to ]> is called the DTD (Document Type Definition).

What is DTD?

DTD (Document Type Definition) is a set of rules or grammar that we define to construct our own XML rules (also called a "vocabulary"). In other words, a DTD provides the rules that define the elements and structure of our new language.

This is comparable to defining table structures in Oracle for a new system. Just as we define the columns of a table, determine their datatypes, and decide whether each column allows nulls, the DTD defines the structure of the XML document.

A DTD can be declared inline inside an XML document, or as an external reference.
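As a sketch of the external-reference form, the document points at a separate DTD file (the element names below follow the "note" document type mentioned earlier; the exact elements are assumed for illustration):

```xml
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <to>John</to>
  <from>Jane</from>
  <body>Some reminder text</body>
</note>
```

where the file note.dtd contains the grammar itself:

```xml
<!ELEMENT note (to, from, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT body (#PCDATA)>
```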

Starting with Oracle 9i Release 1, Oracle provides datatypes that can store time intervals. The intervals can be specified in years, months, days, hours, minutes, and seconds. The new datatypes were introduced to comply with the SQL99 standard. Before Oracle 9i, programmers had to convert an interval to some seconds/milliseconds representation and store it in a NUMBER column.

Why do we need the new INTERVAL datatypes when we already have the DATE datatype, which can store date and time information? The answer is that an INTERVAL stores a span of time, not a single point in time. Isn’t this a good feature?

In Oracle 9i, the time interval datatypes INTERVAL YEAR TO MONTH and INTERVAL DAY TO SECOND were added, along with several other datatypes that deal more precisely with points in time. The TIMESTAMP, TIMESTAMP WITH TIME ZONE, and TIMESTAMP WITH LOCAL TIME ZONE datatypes all express time to fractions of a second, and the last two account for geographical location as well.

You can use the interval datatypes in both SQL and PL/SQL. They are specified the same way:

INTERVAL YEAR [(year_precision)] TO MONTH

INTERVAL DAY [(day_precision)] TO SECOND [(fractional_seconds_precision)]

There are defaults for the precision values: two digits for the year and day, and six digits for fractions of seconds. Interval literals are expressed by the word INTERVAL, followed by an expression in single quotes, and words that interpret the expression. YEAR TO MONTH interval literals use a hyphen (-) between the year and month. DAY TO SECOND interval literals use a space between the number of days and the time.

For example, this is a time interval of two years and six months:

INTERVAL '2-6' YEAR TO MONTH

This covers three days, 12 hours, 30 minutes, and 6.7 seconds:

INTERVAL '3 12:30:06.7' DAY TO SECOND (1)

Intervals can be positive or negative. They can be added to or subtracted from the various TIMESTAMP datatypes; the result is a new TIMESTAMP. They can also be added to or subtracted from each other, resulting in a new interval.

The interval datatypes can be used, for example, in a table that records the elapsed time of a test.

Unfortunately, the TO_CHAR function doesn't contain any format models that map to the individual pieces of interval datatypes. Instead, you can use the new EXTRACT function, with the syntax EXTRACT(element FROM interval), to extract the pieces and combine them.
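To make the decomposition concrete, here is an illustrative analogue in Python: a DAY TO SECOND interval modelled as a timedelta, broken into the same pieces EXTRACT would return (the interval value is the '3 12:30:06.7' literal from above):

```python
from datetime import timedelta

# Model INTERVAL '3 12:30:06.7' DAY TO SECOND as a timedelta.
elapsed = timedelta(days=3, hours=12, minutes=30, seconds=6.7)

# Decompose into the day/hour/minute/second pieces.
days = elapsed.days
hours, rem = divmod(elapsed.seconds, 3600)
minutes, seconds = divmod(rem, 60)

print(days, hours, minutes, seconds, elapsed.microseconds)
# 3 12 30 6 700000
```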

Have you ever wondered, after connecting to the database as SYS, whether you are connected as the regular user SYS or as SYS AS SYSDBA? The information is available in the view V$SESSION_CONNECT_INFO.

The index skip scan is a new execution plan in Oracle9i whereby a query can bypass the leading edge of a concatenated index and access the inner keys of a multi-column index.

For example, consider the following concatenated index:

create index sex_emp_id on emp (sex, emp_id);

Prior to Oracle9i, this index could only be used when both sex and emp_id were present in the SQL query, or when the sex column alone was specified. The following query would not be able to use the concatenated index:

select emp_id
from emp
where emp_id = 123;

The Oracle9i skip scan execution plan allows the concatenated index to be used even though sex is not specified in the SQL query. This means there may be no need to create a second index on the emp_id column. Oracle acknowledges that the index skip scan is not as fast as a direct index lookup, but states that the index skip scan is faster than a full-table scan (otherwise, why would Oracle introduce such a feature).

What Oracle does not mention is that the cardinality of the leading column has a direct impact on the speed of the index skip scan. In our example, the first column, sex, has only two distinct values (‘F’ and ‘M’).

While Oracle does not publish the internals of the index skip scan, we can infer from the execution plan that Oracle internally generates multiple probes, satisfying the query with one sub-query per distinct value of the leading column.

Oracle skip scan execution plan performance will decrease according to the number of unique values in the high order key. If the leading column were “state” with 50 values, Oracle would be issuing 50 index probes to retrieve the result set.
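The idea can be sketched in a few lines of Python, with a sorted list of (sex, emp_id) tuples standing in for the concatenated B-tree index; this is a simplified model of the inferred behaviour, not Oracle's actual implementation (the data is invented: odd emp_ids are 'F', even ones 'M'):

```python
import bisect

# A "concatenated index" on (sex, emp_id): a sorted list of key tuples.
index = sorted(("F" if emp_id % 2 else "M", emp_id)
               for emp_id in range(1, 1001))

def skip_scan(index, emp_id, leading_values=("F", "M")):
    """Probe the index once per distinct leading-column value,
    as if running one sub-query per value of SEX."""
    hits = []
    for sex in leading_values:
        pos = bisect.bisect_left(index, (sex, emp_id))
        if pos < len(index) and index[pos] == (sex, emp_id):
            hits.append(index[pos])
    return hits

print(skip_scan(index, 123))   # [('F', 123)]
```

With two leading values the scan costs two index probes; with a 50-value leading column like "state" it would cost 50, which is exactly the cardinality effect described above.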

Comments from the author: If you like Oracle tuning, you might enjoy my latest book “Oracle Tuning: The Definitive Reference” by Rampant TechPress. It’s only $41.95 (I don’t think it is right to charge a fortune for books!) and you can buy it right now at this link: http://www.rampant-books.com/book_2005_1_awr_proactive_tuning.htm

/*+ ALL_ROWS */

Explicitly chooses the cost-based approach to optimize a statement block with a goal of best throughput (that is, minimum total resource consumption)

/*+ CHOOSE */

Causes the optimizer to choose between the rule-based approach and the cost-based approach for a SQL statement based on the presence of statistics for the tables accessed by the statement

/*+ FIRST_ROWS */

Explicitly chooses the cost-based approach to optimize a statement block with a goal of best response time (minimum resource usage to return the first row). It also forces the optimizer to make use of an index, if available. There are other versions of the FIRST_ROWS hint. This hint is useful in an OLTP environment when the user cannot wait until the last row is fetched; it is often used in Java lookup screens. If the query performs aggregate calculations, this hint should not be used.

/*+ AND_EQUAL(table index index) */

Explicitly chooses an execution plan that uses an access path that merges the scans on several single-column indexes

/*+ CLUSTER(table) */

Explicitly chooses a cluster scan to access the specified table

/*+ FULL(table) */

Explicitly chooses a full table scan for the specified table

/*+ HASH(table) */

Explicitly chooses a hash scan to access the specified table

/*+ HASH_AJ(table) */

Transforms a NOT IN sub query into a hash anti join to access the specified table

/*+ HASH_SJ (table) */

Transforms a correlated EXISTS sub query into a hash semi-join to access the specified table

/*+ INDEX(table index) */

Explicitly chooses an index scan for the specified table

/*+ INDEX_ASC(table index) */

Explicitly chooses an ascending-range index scan for the specified table

/*+ INDEX_COMBINE(table index) */

If no indexes are given as arguments for the INDEX_COMBINE hint, the optimizer uses whatever Boolean combination of bitmap indexes has the best cost estimate. If particular indexes are given as arguments, the optimizer tries to use some Boolean combination of those particular bitmap indexes.

/*+ INDEX_DESC(table index) */

Explicitly chooses a descending-range index scan for the specified table

/*+ INDEX_FFS(table index) */

Causes a fast full index scan to be performed rather than a full table scan

/*+ MERGE_AJ (table) */

Transforms a NOT IN sub query into a merge anti-join to access the specified table

/*+ MERGE_SJ (table) */

Transforms a correlated EXISTS sub query into a merge semi-join to access the specified table

/*+ ROWID(table) */

Explicitly chooses a table scan by ROWID for the specified table

/*+ USE_CONCAT */

Forces combined OR conditions in the WHERE clause of a query to be transformed into a compound query using the UNION ALL set operator

/*+ ORDERED */

Causes Oracle to join tables in the order in which they appear in the FROM clause

/*+ STAR */

Forces the largest table to be joined last, using a nested-loops join on the index

/*+ DRIVING_SITE (table) */

Forces query execution to be done at a different site from that selected by Oracle

/*+ USE_HASH (table) */

Causes Oracle to join each specified table with another row source with a hash join

/*+ USE_MERGE (table) */

Causes Oracle to join each specified table with another row source with a sort-merge join

/*+ USE_NL (table) */

Causes Oracle to join each specified table to another row source with a nested-loops join using the specified table as the inner table

/*+ APPEND */ , /*+ NOAPPEND */

Specifies that data is simply appended (or not) to a table; existing free space is not used. Use these hints only following the INSERT keyword.

/*+ NOPARALLEL(table) */

Disables parallel scanning of a table, even if the table was created with a PARALLEL clause

/*+ PARALLEL(table, instances) */

This allows you to specify the desired number of concurrent slave processes that can be used for the operation. DELETE, INSERT, and UPDATE operations are considered for parallelization only if the session is in PARALLEL DML enabled mode. (Use ALTER SESSION ENABLE PARALLEL DML to enter this mode.)

/*+ PARALLEL_INDEX */

Allows you to parallelize fast full index scan for partitioned and non-partitioned indexes that have the PARALLEL attribute

/*+ NOPARALLEL_INDEX */

Overrides a PARALLEL attribute setting on an index

/*+ CACHE */

Specifies that the blocks retrieved for the table in the hint are placed at the most recently used end of the LRU list in the buffer cache when a full table scan is performed

/*+ NOCACHE */

Specifies that the blocks retrieved for this table are placed at the least recently used end of the LRU list in the buffer cache when a full table scan is performed

/*+ MERGE (table) */

Causes Oracle to evaluate complex views or sub queries before the surrounding query

/*+ NO_MERGE (table) */

Causes Oracle not to merge mergeable views

/*+ PUSH_JOIN_PRED (table) */

Causes the optimizer to evaluate, on a cost basis, whether or not to push individual join predicates into the view

/*+ NO_PUSH_JOIN_PRED (table) */

Prevents pushing of a join predicate into the view

/*+ PUSH_SUBQ */

Causes non merged sub queries to be evaluated at the earliest possible place in the execution plan

/*+ STAR_TRANSFORMATION */

Makes the optimizer use the best plan in which the transformation has been used.

If you've been developing applications on Oracle for a while, you've no doubt come across the concept of "bind variables". Bind variables are one of those Oracle concepts that are key to application performance.

To understand bind variables, consider an application that generates thousands of SELECT statements against a table; for example:

SELECT fname, lname, pcode FROM cust WHERE id = 674;

Each time the query is submitted, Oracle first checks in the shared pool to see whether this statement has been submitted before. If it has, the execution plan that this statement previously used is retrieved, and the SQL is executed. If the statement cannot be found in the shared pool, Oracle has to go through the process of parsing the statement, working out the various execution paths and coming up with an optimal access plan before it can be executed. This process is known as a "hard parse", and for OLTP applications it can actually take longer to carry out than the DML instruction itself.

When looking for a matching statement in the shared pool, only statements that exactly match the text of the statements are considered; so, if every SQL statement you submit is unique (in that the predicate changes each time, from id = 674 to id = 234 and so on), then you'll never get a match, and every statement you submit will need to be hard parsed. Hard parsing is very CPU intensive, and involves obtaining latches on key shared memory areas; whilst this might not affect a single program running against a small set of data, it can bring a multi-user system to its knees if hundreds of copies of the program are trying to hard parse statements at the same time.

The extra bonus with this problem is that contention caused by hard parsing is pretty much immune to measures such as increasing available memory or numbers of processors, as hard parsing statements is one thing Oracle can't do concurrently with many other operations, and it's a problem that often only comes to light when trying to scale up a development system from a single user working on a subset of records to many hundreds of users working on a full data set.

The way to get Oracle to reuse the execution plans for these statements is to use bind variables. Bind variables are "substitution" variables that are used in place of literals (such as 674, 234, 332) and that have the effect of sending exactly the same SQL text to Oracle every time the query is executed. For example, in our application, we would just submit

SELECT fname, lname, pcode FROM cust WHERE id = :cust_no;

and this time we would be able to reuse the execution plan every time, reducing the latch activity in the SGA, and therefore the total CPU activity, which has the effect of allowing our application to scale up to many users on a large dataset.

What we've done to the SELECT statement is take the literal value out of it and replace it with a placeholder (our bind variable), with SQL*Plus passing the value of the bind variable to Oracle when the statement is processed. This bit is fairly straightforward: you declare a bind variable in SQL*Plus, then reference the bind variable in the SELECT statement.
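The same placeholder mechanism exists in every database API. As an illustrative sketch (using Python's built-in sqlite3 as a stand-in for an Oracle driver, with invented table and data; sqlite3 uses ? where Oracle uses :name placeholders):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cust (id INTEGER PRIMARY KEY,"
    " fname TEXT, lname TEXT, pcode TEXT)")
conn.execute("INSERT INTO cust VALUES (674, 'Jane', 'Smith', 'AB1')")

# The literal is replaced by a placeholder; the driver passes the value
# separately, so the SQL text is identical on every call and the
# prepared statement can be reused.
row = conn.execute(
    "SELECT fname, lname, pcode FROM cust WHERE id = ?", (674,)
).fetchone()
print(row)   # ('Jane', 'Smith', 'AB1')
```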

Bind Variables in PL/SQL

Taking PL/SQL first of all, the good news is that PL/SQL itself takes care of most of the issues to do with bind variables, to the point where most code that you write already uses bind variables without you knowing: whenever you reference a PL/SQL variable inside an embedded SQL statement, the PL/SQL engine automatically turns it into a bind variable.

One thing to bear in mind, though, is that you can't substitute actual object names (tables, views, columns, etc.) with bind variables - you can only substitute literals. If an object name is generated at runtime, you'll still need to concatenate the strings, and the SQL will only match a statement already in the shared pool when the same object name comes up. However, whenever you're using dynamic SQL to build up the predicate part of a statement, use bind variables instead and you'll dramatically reduce the amount of latch contention going on.

The Performance Killer

Just to give you an idea of how huge a difference this can make performance-wise, you only need to run a very small test:
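A rough analogue of such a test can be run with Python's sqlite3 (table and data invented; sqlite3's internal statement cache plays the role of the shared pool here, so the effect is far smaller than in Oracle, where hard parsing also involves latch contention - but bind variables are typically measurably faster even here):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(10_000)])

def literals():
    # Every statement text is unique -> each one is parsed afresh.
    for i in range(10_000):
        conn.execute(f"SELECT val FROM t WHERE id = {i}").fetchone()

def binds():
    # One statement text -> the driver reuses the prepared statement.
    for i in range(10_000):
        conn.execute("SELECT val FROM t WHERE id = ?", (i,)).fetchone()

for fn in (literals, binds):
    t0 = time.perf_counter()
    fn()
    print(fn.__name__, round(time.perf_counter() - t0, 3), "seconds")
```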

That is pretty dramatic. The fact is that not only does this execute much faster (we spent more time PARSING our queries than actually EXECUTING them!), it will also let more users use your system simultaneously.

Bind Variables in VB, Java and other applications

The next question, though, is what about VB, Java, and other applications that fire SQL queries against an Oracle database? How do these use bind variables? Do you have to split your SQL into two statements, one to set the bind variable and one for the statement itself?

In fact, the answer to this is actually quite simple. When you put together an SQL statement using Java, or VB, or whatever, you usually use an API for accessing the database; ADO in the case of VB, JDBC in the case of Java. All of these APIs have built-in support for bind variables, and it's just a case of using this support rather than just concatenating a string yourself and submitting it to the database.

For example, Java has PreparedStatement, which allows the use of bind variables, and Statement, which uses the string concatenation approach. If you use the method that supports bind variables, the API itself passes the bind variable value to Oracle at runtime, and you just submit your SQL statement as normal; there's no need to separately pass the bind variable value to Oracle, and no additional work on your part. Support for bind variables isn't limited to Oracle - it's common to other RDBMS platforms such as Microsoft SQL Server, so there's no excuse for avoiding them on the grounds that they might be an Oracle-only feature.

Conclusion

Lastly, it's worth bearing in mind that there are some instances where bind variables are probably not appropriate: usually where, instead of your query being executed many times a second (as with OLTP systems), your query actually takes several seconds, minutes, or hours to execute - a situation you get in decision support and data warehousing. In this instance, the time taken to hard parse your query is only a small proportion of the total query execution time, and the benefit of avoiding a hard parse is probably outweighed by the reduction in information you're making available to the query optimizer. By substituting the actual predicate with a bind variable, you remove the optimizer's ability to compare your value with the data distribution in the column, which might make it opt for a full table scan or an index when this isn't appropriate. Oracle 9i helps deal with this using a feature known as bind variable peeking, which allows Oracle to look at the value behind a bind variable to help choose the best execution plan.

Another potential drawback with bind variables and data warehousing queries is that the use of bind variables disallows the potential for star transformations, taking away this powerful option for efficiently joining fact and dimension tables in a star schema.

PL/SQL "continue" keyword - This allows a C-like (or PowerBuilder-like) continue in a loop, skipping the remainder of an iteration without extra Boolean conditions. A nasty PL/SQL GOTO statement is no longer required to skip ahead within a loop.

begin
  for i in 1..3 loop
    dbms_output.put_line('i='||to_char(i));
    if ( i = 2 ) then
      continue;
    end if;
    dbms_output.put_line('Only if i is not equal to 2');
  end loop;
end;

Disabled state for PL/SQL - Another 11g new feature is a "disabled" state for PL/SQL objects (as opposed to "enabled" and "invalid" in dba_objects).

Improved PL/SQL stored procedure invalidation mechanism - A new 11g feature is fine-grained dependency tracking, reducing the number of objects that become invalid as a result of DDL. This means that when you add a column to a table, or a cursor to a package spec, you don't invalidate objects that depend on them. Possibly the coolest feature.

Scalable PL/SQL - The next scalable execution feature is automatic creation of "native" PL/SQL (and Java) code, with just one on/off parameter for each type. This apparently provides a 100% performance boost for pure PL/SQL code, and a 10%-30% performance boost for code containing SQL.

Enhanced PL/SQL warnings - The 11g PL/SQL compiler will issue a warning for a "when others" with no raise.

Stored procedure named notation - Named notation is now supported when calling a stored procedure from SQL.

New datatype: simple_integer - Always NOT NULL, wraps instead of overflowing, and is faster than PLS_INTEGER.

Specify trigger firing order.

Oracle 11g SQL New Features

oNew "pivot" SQL clause - The new "pivot" SQL clause will allow quick rollup, similar to an MS-Excel pivot table, where you can display multiple rows on one column with SQL. MS SQL Server 2005 also introduced a pivot clause.

The /*+ result_cache */ SQL hint - This suggests that the result data will be cached in the data buffers, and not the intermediate data blocks that were accessed to obtain the query results. You can cache SQL and PL/SQL results for super-fast subsequent retrieval. The "result cache" ties into the "scalable execution" concept. There are three areas of the result cache:

The SQL query result cache - This is an area of SGA memory for storing query results.

The PL/SQL function result cache - This result cache can store the results from a PL/SQL function call.

The OCI client result cache - This cache retains results from OCI calls, both for SQL queries or PL/SQL functions.
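The PL/SQL function result cache behaves much like memoization in any language; a rough sketch in Python (the function and its data are invented; Python's functools cache plays the role of the RESULT_CACHE directive):

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def lookup_rate(currency):
    """Stand-in for an expensive deterministic function
    (e.g. a PL/SQL function declared with RESULT_CACHE)."""
    global calls
    calls += 1                      # count actual executions of the body
    return {"USD": 1.0, "EUR": 1.1}[currency]

for _ in range(1000):
    lookup_rate("EUR")              # 999 of these hit the cache

print(calls)   # 1 -- the function body ran only once
```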

Scalable Execution - This 11g feature consists of a number of sub-features, the first of which is query result caching; this automatically caches the results of a SQL query, as opposed to the data blocks normally cached by the buffer cache, and works both client-side (OCI) and server-side - this was described as "the buffer cache taken to the next level". The DBA sets the size of the result cache and turns the feature on at a table level with the command "alter table DEPT cache results". The per-process cache is shared across multiple sessions and, at the client level, is available with all 11g OCI-based clients.

SQL Replay - Similar to the previous feature, but this captures and applies only the SQL workload, not the total workload.

Improved optimizer statistics collection speed - Oracle 11g has improved dbms_stats performance, allowing for an order of magnitude faster CBO statistics creation. Oracle 11g has also separated out the "gather" and "publish" operations, allowing CBO statistics to be retained for later use. Also, Oracle 11g introduces multi-column statistics to give the CBO the ability to more accurately select rows when the WHERE clause contains multi-column conditions or joins.

Dynamic SQL - DBMS_SQL is here to stay. It's faster and is being enhanced. DBMS_SQL and NDS can now accept CLOBs (no more 32K limit on NDS). A ref cursor can become a DBMS_SQL cursor and vice versa. DBMS_SQL now supports user-defined types and bulk operations.

Fully automatic SQL tuning - The 10g automatic tuning advisor makes tuning suggestions in the form of SQL profiles that will improve performance. You can tell 11g to automatically apply SQL profiles for statements where the suggested profile gives three times better performance than the existing statement. The performance comparisons are done by a new administrative task during a user-specified maintenance window.

Improved SQL Access Advisor - The 11g SQL Access Advisor gives partitioning advice, including advice on the new interval partitioning. Interval partitioning is an automated version of range partitioning, where new equally-sized partitions are automatically created when needed. Both range and interval partitions can exist for a single table, and range-partitioned tables can be converted to interval-partitioned tables.

Disclaimer

This blog contains notes about technology that I jot down and keep for my records. Some articles in this blog are not written by me; they are here for my reference only. In older posts the original links may be missing; this is not intentional. Thanks for understanding.