External Tables

External tables let you query data in a flat file as though the file were an Oracle table. In 9i, only read operations were permitted; in 10g, you can also write data out to an external table, although only by creating a new one; you can't write to an existing external table.

While external tables can be queried, they're not usable in many of the ways regular Oracle tables are. You cannot perform any DML operations (INSERT, UPDATE, DELETE) against an external table, and you can't create an index on one. External tables are largely used as a convenient way of moving data into and out of the database.

Oracle uses SQL*Loader functionality, through the ORACLE_LOADER access driver, to move data from the flat file into the database; it uses a Data Pump access driver to move data out of the database into a file in an Oracle-proprietary format, and back into the database from files of that format. While there are some behaviour differences and restrictions, you can think of external tables as a convenient, SQL-based way to use SQL*Loader and Data Pump functionality.

For example, suppose that you receive a daily .csv report from another department. Instead of writing a SQL*Loader script to import each day's .csv file into your database, you can simply create an external table and write an "insert ... select" SQL query to insert the data directly into your tables. Place the day's CSV file in the location specified in the external table definition, run the query, and you're done.

Since an external table's data lives in the operating system, its data file needs to be in a location Oracle can access. So the first step is to create a directory and grant access to it.

First create the directory in the operating system, or choose an existing directory. It must be a real directory, not a symlink. Make sure that the OS user that the Oracle binaries run as has read-write access to this directory. Note: Be sure not to use a directory you should be keeping secure, such as an Oracle datafile, program, log or configuration file directory. And if the data you'll be putting there is sensitive, make sure that other OS users don't have permissions on this directory.

You must actually move or copy the file to the data directory; symlinks won't cut it. Again, make sure that if the data is sensitive, only the Oracle user can read or write to it.

The next step is to create this directory in Oracle, and grant read/write access on it to the Oracle user who will be creating the external table. When you create the directory, be sure to use the directory's full path, and don't include any symlinks in the path; use the actual full path.
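As a sketch, assuming a hypothetical OS path /u01/app/oracle/xtern/mydata and a hypothetical user xtern_user who will own the external table:

```sql
-- Run as a DBA user; the path, directory name and grantee are illustrative.
CREATE OR REPLACE DIRECTORY xtern_data_dir
  AS '/u01/app/oracle/xtern/mydata';

GRANT READ, WRITE ON DIRECTORY xtern_data_dir TO xtern_user;
```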

The last step is to create the table. The CREATE TABLE statement for an external table has two parts. The first part, like a normal CREATE TABLE, has the table name and field specs. This is followed by a block of syntax specific to external tables, which lets you tell Oracle how to interpret the data in the external file.
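A sketch of such a statement for a comma-delimited employee report (the table and column names follow the employee-report example used later in this article; the directory object, delimiter and file name are assumptions):

```sql
CREATE TABLE xtern_empl_rpt
  ( empl_id    VARCHAR2(10),
    last_name  VARCHAR2(50),
    first_name VARCHAR2(50),
    ssn        VARCHAR2(9)
  )
  ORGANIZATION EXTERNAL
  ( TYPE ORACLE_LOADER
    DEFAULT DIRECTORY xtern_data_dir
    ACCESS PARAMETERS
    ( RECORDS DELIMITED BY NEWLINE
      FIELDS TERMINATED BY ','
    )
    LOCATION ('empl_info.csv')
  );
```

Everything after ORGANIZATION EXTERNAL is the external-table-specific block: the access driver, the directory holding the file, the parsing rules, and the file name itself.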

At this point, Oracle hasn't actually tried to load any data. It doesn't attempt to check the validity of many of the external-table-specific parameters you pass it. The CREATE TABLE statement will succeed even if the external data file you specify doesn't actually exist.

With the CREATE TABLE statement, you've created table metadata in the data dictionary and instructed Oracle how to direct the ORACLE_LOADER access driver to parse the data in the data file. Now, kick off the load by accessing the table:
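Any query will do; assuming the external table is named xtern_empl_rpt:

```sql
SELECT * FROM xtern_empl_rpt;
```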

Oracle used the ORACLE_LOADER driver to process the file, and just as with SQL*Loader, it's created a log file that you can inspect to see what just happened. The log file -- and the "bad" and "discard" files -- will have been written to the directory you specified as the "default directory" in your CREATE TABLE statement, and the file names default to tablename_ospid.

If Oracle was unable to process the data given the access parameters you specified, you'll get an error on the command line and in the log file, and there will also be a bad and/or discard file. (Note: if you're copying and pasting data into your external data file, be sure not to put a newline after the last record, or SQL*Loader will expect an additional record, and you'll get an error when you try to select from the external table.)

You may want to configure separate directories for the SQL*Loader output files -- the LOG file, the DISCARD file and the BAD file -- as well as for the external table data. You can lump all four in the same directory, as we did in the previous example, although it's a bad idea: a naming mishap could have you overwriting one external table's data file with another's bad file. I like to have one directory for data files, and one for log/bad/discard files:
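For example (directory names, paths and file names hypothetical), you'd create a second Oracle directory for the output files:

```sql
CREATE OR REPLACE DIRECTORY xtern_log_dir
  AS '/u01/app/oracle/xtern/log';

GRANT READ, WRITE ON DIRECTORY xtern_log_dir TO xtern_user;
```

and then point the BADFILE, LOGFILE and DISCARDFILE clauses at it in the table's access parameters:

```sql
ACCESS PARAMETERS
( RECORDS DELIMITED BY NEWLINE
  BADFILE xtern_log_dir:'empl_rpt.bad'
  LOGFILE xtern_log_dir:'empl_rpt.log'
  DISCARDFILE xtern_log_dir:'empl_rpt.dsc'
  FIELDS TERMINATED BY ','
)
```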

Again, these must be actual directories, not symlinks, and be sure to set the permissions appropriately. To eliminate the possibility of any naming mishap, you can grant READ access only on /.../data, and WRITE access only on /..../log, to the user creating the external tables.

You can use ALTER TABLE to change the access parameters without dropping and redefining the whole table:
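For instance, if the feed switched from comma- to pipe-delimited (table name hypothetical; note that ALTER TABLE replaces the entire access parameters clause, so restate all of it):

```sql
ALTER TABLE xtern_empl_rpt
  ACCESS PARAMETERS
  ( RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY '|'
  );
```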

Alternatively, you can set up the table so that no log, discard or bad files are generated. SELECTing data from the table will still fail if the maximum number of rejects is exceeded, just as in SQL*Loader. You can change the reject limit for an external table with an ALTER TABLE statement:
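A sketch of both, again using a hypothetical xtern_empl_rpt table:

```sql
-- Suppress the log, bad and discard files entirely
ALTER TABLE xtern_empl_rpt
  ACCESS PARAMETERS
  ( RECORDS DELIMITED BY NEWLINE
    NOBADFILE
    NOLOGFILE
    NODISCARDFILE
    FIELDS TERMINATED BY ','
  );

-- Change the reject limit (it defaults to 0)
ALTER TABLE xtern_empl_rpt REJECT LIMIT UNLIMITED;
```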

Where external tables really shine is in the ease with which you can load their data into your tables. A particularly nice feature is that you can use any valid function that the current Oracle user has rights on to transform the raw data before loading it into your database tables. For example, suppose you had a function, get_bday_from_ssn (ssn in varchar2), that looked up an employee's birth date given their SSN. You can use that function to populate a BIRTH_DATE column in your local database table in the same step as you load the data into it.
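With that function, the load and the transform become a single statement (the local table empl_info and the external table xtern_empl_rpt are the example names used elsewhere in this article):

```sql
INSERT INTO empl_info (empl_id, last_name, first_name, ssn, birth_dt)
  SELECT empl_id, last_name, first_name, ssn, get_bday_from_ssn(ssn)
  FROM   xtern_empl_rpt;
```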

Oracle 10g lets you create a new external table from data in your database; the data is written out to a flat file by the ORACLE_DATAPUMP access driver, in an Oracle-proprietary format that Data Pump can read. The syntax is similar to the CREATE TABLE... ORGANIZATION EXTERNAL above, but simpler: since you can't specify the data format, you can specify very few access parameters. The key difference is that you must specify the access driver, ORACLE_DATAPUMP, since the access driver defaults to ORACLE_LOADER.
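A sketch of such an unload (the new table name, directory object and dump file name are illustrative):

```sql
CREATE TABLE empl_info_unload
  ORGANIZATION EXTERNAL
  ( TYPE ORACLE_DATAPUMP
    DEFAULT DIRECTORY xtern_data_dir
    LOCATION ('empl_info.dmp')
  )
  AS SELECT * FROM empl_info;
```

The resulting empl_info.dmp file can then be moved to another database and queried there by defining an ORACLE_DATAPUMP external table over it.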

We've seen an introduction to loading and unloading data with external tables. External tables in 9i and 10g provide a convenient, seamless way to move data in and out of the database, integrating SQL*Loader and Data Pump functionality with the power, scriptability and ease of SQL statements. It's definitely worth considering external tables the next time you have a daily upload or download to arrange.

Natalka Roshak is a senior Oracle and Sybase database administrator, analyst, and architect. She is based in Kingston, Ontario and consults across North America. More of her scripts and tips can be found in her online DBA toolkit at http://toolkit.rdbms-insight.com/.

In 2002, PolyServe and XIOtech partnered to perform a huge Proof of Concept (audited by IDC) of a 10TB database on a 10 node cluster and ramped the active OLTP users count to 10,000 users. The project was called The Tens.

The entire setup was stored in PolyServe Matrix Server CFS, including the external tables. With RAC, external tables can be accessed via IPQO (Intra-node Parallel Query). The source data for the load was pipe-delimited and inserted with APPEND as select /*+ PARALLEL */ * from the external table.

The paper is a good reference for VLDB with Linux/RAC and can be obtained at:

This is definitely something I needed to know about since I have been asked to complete a task that this will be perfect for :) Thank you! Thank you!

Petty perhaps, when you've had something spoon-fed to you like this; but it would be a nice touch if articles with SQL examples gave you a shot without the SQL> prompt so you could cut and paste the entire code. As it was, I had to do 18 individual lines (okay, 16, I didn't cut and paste the parentheses).

Maybe a link to "download code" or something like that would be the solution.

Is there any kind of similar thing we can use with XML files? Presently I am using XSLT to transform the data into a text file and then loading it into the database through a control file using SQL*Loader. Is there any way we can directly fetch the XML file into the database?

This is a very nice article. Thanks Natalka.
However the data security is an issue - if only Oracle has access to the .csv file then all the users having access to the database have access to this file. Or am I missing something?

Am new to external table concept but got an idea from your well written article.
I have some doubts.

You have created the below table:
create table empl_info as
(select empl_id, last_name, first_name, ssn, get_bday_from_ssn (ssn) birth_dt
from xtern_empl_rpt)

Is it possible to add more columns to empl_info table apart from those present in the csv file?

Also, I would like to know the best way to dump data from multiple xls/csv files into oracle data tables? This dumping might be done daily as content of files will change. So shall we have multiple external tables for this?