4.1.1About System Properties

System properties are system-wide settings that enable you to tailor Oracle Content Server to your particular requirements. System properties are set during installation and are generally updated occasionally, or as needed, in contrast to other administration tools, which are used more regularly for maintenance of users and content.

Important:

Regardless of which method is used to modify system properties, you must restart the Oracle Content Server instance for any configuration changes to take effect.

There are several ways to interact with system properties:

The Admin Server enables you to configure a single Oracle Content Server instance. You also can enable and disable system components. The Admin Server can be accessed by using a web browser and selecting the Administration link.

Most system properties settings correspond to a configuration variable in one of the following configuration files:

IntradocDir/config/config.cfg

DomainHome/ucm/short-product-id/bin/intradoc.cfg

IntradocDir/search/search.cfg

It is recommended that you make changes to these files through the Admin Server or System Properties application to ensure that the settings are entered correctly. While it is possible to edit these files directly using a text editor, it might allow errors to be introduced. See the Oracle Fusion Middleware Idoc Script Reference Guide for more information on configuration variables.

There are many techniques for optimizing the performance of Oracle Content Server. One of the types of tuning involves changing default parameters and software settings that affect the core Oracle Content Server performance. System optimization and performance tuning is often accomplished by adjusting system settings and configuration variables or tuning resources such as databases and indexes.

For example, as the content in your Oracle Content Server instance increases, you might experience a shortage of available space. In this case, moving the vault, weblayout, and search index directories to another drive with more space can help alleviate storage problems. Moving these directories requires adding entries into the DomainHome/ucm/cs/bin/intradoc.cfg file.

You do not have to log in as the system administrator to access the System Properties application. You only need access to the local computer where the Oracle Content Server instance is installed.

If you plan to use the Batch Loader utility to update and insert a large number of files on your Oracle Content Server system simultaneously, you must create a batch load file. Two of the optional parameters that you can include in your batch load file are primaryOverrideFormat and alternateOverrideFormat. However, these options only work as parameters in the batch load file if you enable the IsOverrideFormat configuration variable. You can set this variable using the System Properties application.

4.1.2.1 Revision Label Sequence

The metadata field named Revision has a default revision number sequence of 1, 2, 3, 4, 5, and so forth. This number increments automatically for each revision of a document.

You can override the Revision default by changing the definition of the revision label. The revision label consists of two parts: a major and minor revision sequence. The Major Revision Label Sequence is the first number or letter and the Minor Revision Label Sequence follows. For example, in the revision sequence 1a, 1b, 1c, 2a, 2b, 2c, 3a, 3b, 3c, and so forth, the numbers 1, 2, 3 are the major revision sequence and a, b, c are the minor revision sequence.

4.1.2.2 Revision Label Ranges

Both the major and minor revision sequences are defined as a range of numbers or letters. The major sequence can have multiple ranges, while the minor sequence can only have one range.

The following are the restrictions on defining the range:

Numbers or letters can be used, but not both. For example, 1-10 is a valid range but A-10 is not a valid range.

Letter ranges can have only one letter. For example, A-Z is a valid range but AA-ZZ is not a valid range.

4.1.2.3 Revision Examples

The following are examples of different revision sequences and how you would define the major and minor revision entries in the config.cfg file.

4.1.2.4 Revision Configuration Settings

To change the default revision sequence manually in the IntradocDir/config/config.cfg file, enter the following name/value pairs:

MajorRevSeq=range1,range2,range3...

MinorRevSeq=range

where range1,range2,range3... and range are the defined range sequence.

4.1.2.5 Chunking Function

The Oracle Content Server Chunking function protects large data transfers from transfer failures by dividing data into chunks and transferring one chunk at a time. If a transfer fails, all chunks transferred to the Oracle Content Server instance before failure are saved, and the transfer can be resumed from the point of failure.

Note:

If the client session using the Chunking function is killed, either by timeout or by closing the client browser, the transfer will fail.

4.1.4 Configuring Internet Information

You must restart the Oracle Content Server instance for any configuration changes to take effect.

4.1.5 Configuring the Database

The Oracle Content Server system uses an Oracle WebLogic Server data source to communicate with the system relational database where metadata and other information is stored. The Oracle WebLogic Server Administration Console must be used to manage the database connection information for the system relational database, therefore JDBC user name and password information is not stored in the IntradocDir/config/config.cfg file, and it is not managed through the SystemProperties utility.

Note:

If you set database connection information for an Oracle WebLogic Server domain using the Oracle Content Server SystemProperties utility, the JDBC user name and password are encrypted and stored in an unspecified location.

For information about supported databases, see the "System Requirements and Supported Platform" document for your product on the Oracle Fusion Middleware Supported System Configurations page on Oracle Technology Network at

An IBM DB2 database does not support the keyword CONTAINS in search queries. The correct configuration of the Oracle Content Server instance for IBM DB2 searches requires the addition of the flag SSUseContains=false variable. To configure the Oracle Content Server instance:

Open a new browser window, and log in to the Oracle Content Server instance as system administrator.

In the navigation tree, choose Administration, then Admin Server.

In the option list for the Oracle Content Server instance, choose General Configuration.

The General Configuration page is displayed.

Add the following line in the Additional Configuration Variables area:

You must restart the Oracle Content Server instance for any configuration changes to take effect.

Caution:

If you do not use a Hostname filter, IP Address filter, or some other network-based security, you will have a security hole in your Oracle Content Server instance. For example, with no login, any user with in-depth knowledge of the system could create or modify any other user to have system administrator access.

Hostname filter or IP Address filter values must be set to allow communication with the Oracle Content Server instance in the following situations:

Running Inbound Refinery and PDF Converter (even on the same physical computer as the Oracle Content Server instance).

Transferring Oracle Content Server archives between computers.

Configurations where the web server and the Oracle Content Server instance are on different systems.

EJB-enhanced operations.

Using the IdcCommand or IdcCommandX utilities on a system separate from the Oracle Content Server instance. (You must change the default value and specify the IP address of the web server.)

4.1.7.1 Date Format

The default English-US locale uses two digits to represent the year (yy), where the year is interpreted to be between 1969 and 2068. In other words, 65 is considered to be 2065, not 1965. If you want years before 1969 to be interpreted correctly in the English-US locale, you must change the default date format for that locale to use four digits to represent years (yyyy).

This issue does not apply to the English-UK locale, which already uses four digits for the year.

To modify the default English-US data format:

Start the System Properties applet:

Windows operating system:

Select Start, then All Programs, then Oracle Content Server, then [Instance Name], then Utilities, then System Properties.

UNIX operating system:

The SystemProperties utility is located in the /bin subdirectory of the Oracle Content Server installation directory.

Open the Localization tab.

Choose the English-US entry in the list of locales, and click Edit.

The Configure Locale dialog is displayed.

Modify the date format to use four digits for the year (yyyy) rather than two (yy).

After you are done editing, click OK to close the Configure Locale dialog.

Click OK to apply the change and exit System Properties.

Stop and restart the Oracle Content Server instance for configuration changes to take effect.

4.1.7.2 Interface Language

The default interface language for the Oracle Content Server system can be specified in several ways:

Select a default language in the Localization tab of the System Properties utility, using the same basic procedure described in Section 4.1.7.1, "Date Format."

Use the Oracle Content Server navigation portal to selection Administration, then Localization. Select the checkbox for a default language from the list of Enabled Locales on the Localization Administration page.

4.1.8 Configuring Paths

You can use the System Properties: Paths Tab to change the location of the help browser, Java classpath, and the shared directory path. For security reasons, the Admin Server cannot be used to configure the path options. You must use the standalone application for this configuration.

You must restart the Oracle Content Server instance for configuration changes to take effect.

4.2.1 Variances in Indexing Tools and Methods

Oracle Content Server interfaces with a variety of indexing tools such as commercial search engines and databases. The indexing tool to use is chosen before installation based on the purpose and environment in which the Oracle Content Server system performs.

Each indexing tool provides full-text indexing and metadata-only indexing. Full-text indexing means that every word in a file is indexed, not only its metadata. Full-text indexing takes longer than metadata indexing; however, it can return a more comprehensive result set. Metadata-only indexing means that every word in the stored content information is indexed. Metadata-only indexing is faster than full-text indexing. By default an Oracle Content Server instance is configured to use metadata-only indexing.

4.2.2.2Updating the Search Index

4.2.2.3Rebuilding the Collection

On the Repository Manager screen, click the Indexer tab.

Click Start in the Collection Rebuild Cycle area.

Note:

OracleTextSearch provides a Fast Rebuild function, which you can use through the Repository Manager Indexer function if your site uses the OracleTextSearch feature. For details, see Section 7.1.3.2, "Fast Rebuild."

4.2.2.4Configuring the Update or Rebuild

To set the parameters for a search index update or collection rebuild:

On the Repository Manager screen, click the Indexer tab.

Click Configurein either the Automatic Update Cycle portion of the screen or the Collection Rebuild Cycle portion.

Specify the number of content items (files) per indexer batch. This is the maximum number of files that the search index will process simultaneously.

Specify the content items (files) per checkpoint. This is the number of files that will go through all relevant indexing states at a time. You can have multiple batches of files indexed per checkpoint.

Specify the indexer debug level. This is the amount of information pertaining to each file to display in the server window.

Click OK.

4.2.2.5 Text File Full-Text Indexing

If you have configured the Oracle Content Server instance to use DATABASE.FULLTEXT or ORACLETEXTSEARCH as your indexing engine, the Oracle Content Server system uses the Outside In Content Access module to export content to a text file upon check-in. The text file is then passed to the full-text indexer for full-text indexing.

Note:

When the Outside In Content Access module converts a PostScript file, the conversion process produces text that contains extra characters. Unfortunately, this creates a file that is full-text indexed but cannot be full-text searched.

If you use DATABASE.FULLTEXT, a full-text search can be problematic on large documents. By default, the maximum document size that is indexed is 10MB. This can be changed by setting the MaxIndexableFileSize configuration variable in the Oracle Content Server repository. The default is MaxIndexableFileSize=10485760. If larger documents require full-text indexing, the value of MaxIndexableFileSize should be increased.

4.2.2.6Disabling Full-Text Indexing

You might want to disable full-text indexing if, for example, you want to conserve file space or if you do not require full-text searching for specific content types. Even if you disable full-text indexing, metadata is still indexed.

To disable full-text indexing on specific files:

Define a format in the Configuration Manager screen named application/noindex.

When a user checks in a file that they do not want to be indexed, they should select the application/noindex format. This applies to standard files, batch loads, and archived revisions.

4.2.2.7 Indexing Native Files by Default

The search index uses weblayout files for indexing by default. In certain situations it may be useful to index native files by default instead of weblayout files. For example, if a converted PDF file cannot be extracted and indexed because of processing issues, the native Word document or an alternate type of document could be extracted and indexed. Another example is if the primary file cannot be indexed because it is an .exe file, but it has an alternate .txt file, then the alternate file could be indexed.

To have the search index use native files for indexing by default, set the following parameter in the Oracle Content Server instance:

UseNativeFormatInIndex=true

4.2.2.8 Indexing E-mail and Attachments

Oracle Content Server software supports indexing of e-mail and e-mail attachments (such as original files and zip files). E-mail messages are indexed by default, and if a message contains an attachment it is extracted and indexed as full text. There is no change on what gets returned on search result: if a search finds information in a document, the document metadata is returned.All e-mail attachments supported by Outside In Technology are supported by the search index.

4.2.3 Managing Zone Text Fields

The functionality described in this section is only available if you have installed and enabled the Database Search Contains Operator feature.

4.2.3.1 About Zone Text Fields

The Database Search Contains Operator feature enables you to use the Contains search operator to search text fields when performing Database and Database Full Text searches on SQL Server and Oracle. You must first enable the text fields that can be queried using the Contains search operator. These text fields are called zone text fields.

When a text field is added as a zone text field, the text within the field is parsed and a full-text index for the field is created in the database. The database performs all the work of creating the index, and the index is dropped from the database if the text field is disabled as a zone text field. Therefore, there is no need to rebuild the collection after enabling or disabling text fields as a zone text fields.

Important:

Changing a text field to a zone text field can be a very time-consuming operation. The amount of time it takes to parse the text and create the full-text index depends on the number of content items in the Oracle Content Server repository and the amount of text stored in the text field. However, after the text field has been indexed, you should not experience significant performance issues when updating and adding content items.

When a text field has been enabled as a zone text field, the Contains search operator is available for the text field on the Advanced Search page. It is represented as the Has Word option in the list next to the text field.

Click the left arrow button to move the text fields to the Zone Text Fields list.

Click Update.

Important:

Changing a text field to a zone text field can be a very time-consuming operation. The amount of time it takes to parse the text and create the full-text index depends on the number of content items in the Oracle Content Server repository and the amount of text stored in the text field. However, when the text field has been indexed, you should not experience significant performance issues when updating and adding content items.

To disable zone text fields, complete the following steps:

Select the zone text fields in the Zone Text Fields list. You can use the [Ctrl] and [Shift] keys on your keyboard to select multiple fields.

Click the right arrow button to move the text fields to the Text Fields list.

Click Update.

When enabling and disabling zone text fields, consider the following:

If you start making changes to the lists and you then want to revert to the last saved lists, click Reset.

Custom text fields (the Comments field and any customer created text fields) are shared between the Database and Database search engines, and therefore changing the status of these text fields for one search engine also applies the changes to the other search engine.

Standard text fields (Author, Content ID, Content Type, Title, and so on) can be enabled or disabled independently for each search engine.

The database performs all the work of creating the indexes, and the index are dropped from the database if the text fields are disabled as zone text fields. Therefore, there is no need to rebuild the collection after enabling or disabling text fields as a zone text fields.

You must disable a zone text field before the field can be deleted from the Oracle Content Server instance using Configuration Manager. If you delete an enabled zone text field using Configuration Manager and then click Update Database Design, you will receive an error.

Disabling the zone text field drops the index for the field from the database, allowing the field to be deleted from the database. As an alternative to disabling the zone text field, you could log in to the database and issue a command to drop the index for the field, and then delete the field.

You might want to disable all zone text fields before uninstalling the feature. Otherwise, you are not able to delete the zone text fields from the Oracle Content Server instance unless you reinstall the feature to disable the zone text fields or drop the indexes for the zone text fields from the database manually.

4.2.3.3 Changing the MinTextFullFieldLength Variable

By default, text fields with a field length of 20 characters or less are not included in the Text Fields list. You can change this setting by modifying the MinFullTextFieldLength configuration variable. To change this variable, complete the following steps:

Using a text editor, open the config.cfg file located in the IntradocDir/config/ directory.

Add the MinFullTextFieldLength configuration variable, and set its value (the default value is 21). For example:

MinFullTextFieldLength=16

Save your changes to the config.cfg file.

Restart the Oracle Content Server instance.

4.2.3.4 Disabling Database Search Contains Operator

Before disabling the feature, you might want to disable all zone text fields. The database contains an index for each enabled zone text field (the indexes are dropped when the zone text fields are disabled). If the database contains an index for a field, it will not let you delete the field from your Oracle Content Server instance using Configuration Manager. For more information, see Section 4.2.3.2, "Enabling and Disabling Zone Text Fields."

If you disable the feature and later want to delete a field that is enabled as a zone text field, you can use one of the following options:

Reinstall the feature, disable the zone text field, use Configuration Manager to delete the field, and uninstall the feature.

Log in to the database and issue a command to drop the index for the field, and then use Configuration Manager to delete the field.

4.2.4 Indexing with Databases

If your system is set up to provide indexing and searching capabilities with databases, your system integrator would have added one of the following lines in IntradocDir/config/config.cfg:

ORACLETEXTSEARCH is supported in Oracle Database version 11.1.0.7 and newer.

Full-text Searching:

SearchIndexerEngineName=DATABASE.FULLTEXT

DATABASE.FULLTEXT is supported in SQL Server, and in Oracle Database (all supported versions).

The dbfulltextsearch script appropriate for the supported database would then be run.

By default, full-text indexing is applied to all converted files.

By default, the Oracle Content Server full-text indexes files that are passed through or converted to any of the following formats:

Oracle Supported Formats

pdf

html

htm

xls

hcsp

text

txt

doc

rtf

ppt

MS SQL Supported Formats

text

txt

htm

html

doc

msword

ms-word

ms-powerpoint

ppt

ms-excel

xls

For example, if you want to convert your Microsoft Word (.doc) files to text files instead of PDF, you can specify this in the Configuration Manager. That is, when you use the File Formats option to map the .doc file extension to a text format, then this defines how the file is converted to a web viewable format. In this case, the text file is fully indexed before it is passed to the web site.

For example, if you have used the Configuration Manager's File Formats option to map Corel WordPerfect (.wpd) files to use a text format and a contributor selects the use default option in the Format field on the checkin page, the file will be converted to text and full-text indexed. If the contributor selects Corel WordPerfect Document, the file will be passed through in its native format and will not be full-text indexed.

When you use full-text searching, a search is case sensitive for metadata, and case insensitive for full text. For Content ID, however, lowercase letters are converted to uppercase letters, so Content ID can not be searched with lowercase letters.

4.2.4.1 Database-Supported File Formats

If you define a file format to PASSTHRU in the native format, and the format name contains one of the types listed above (such as application/ms-excel.native), the passed through native file will be full-text indexed by default.

Alternatively, you can use configuration variables to control whether a document is full-text indexed. To manage the full-text indexing and search of specific document format types, add applicable entries to IntradocDir/config/config.cfg, and save the file. Full-text indexing configuration variables include:

4.2.4.1.1 FormatMap

The FormatMap configuration variable controls whether files of a specific format should be included in the full-text search index. It is a comma-delimited list of all the formats that will be full-text indexed. The decision is made by taking the MIME type assigned to a file, splitting the MIME type apart at any slash (/) or period (.), and then checking if that value is in the FormatMap list.

For example, application/vnd.msword will turn into a list of three items:

application

vnd

msword

If FormatMap has msword in its list, then the indexer engine will attempt to full-text index the file. the comparison test is not case sensitive.

4.2.4.1.2 ExceptionFormatMap

The ExceptionFormatMap configuration variable is used to exclude document formats from the FormatMap test. Any format that satisfies the ExceptionFormatMap test will not be full-text indexed. This test is done after splitting the MIME format at slashes (/), but not periods(.). For example, if msword is included in the exceptions list, then the MIME format application/msword is excluded but not application/vnd.mssword.

4.2.5 Searching Content Using the Oracle Query Optimizer Component

The Oracle Query Optimizer component is installed (enabled) by default with the Oracle Content Server system. The functionality only works with the Oracle database.

4.2.5.1 About The Oracle Query Optimizer Component

Oracle database does not automatically select the best execution plan for certain types of user queries. To counter this, the Oracle Query Optimizer adds hints to queries that force Oracle database to perform searches more efficiently.

The hints are based on an intrinsic knowledge of Oracle Content Server's table data distribution and its index selectivity. To take advantage of this knowledge, Oracle Query Optimizer uses a pre-defined Hint Rules Table to analyze the database query and then add appropriate hints to the query. In turn, the added hints improve Oracle's search performance.

Oracle Query Optimizer takes advantage of Content Server's data distribution in database tables and its index selection preferences. Based on these characteristics, the Hint Rules Table included with Oracle Query Optimizer contains pre-defined rules. The feature uses these rules to analyze a database query and to add one or more appropriate hints to the query to optimize the search performance.

In very large collections containing millions of content items, the Oracle Content Server software generally has a difficult time selecting an appropriate optimization strategy to resolve even simple queries. To counteract this problem, Oracle Query Optimizer examines the submitted query and, based on its analysis, reformats the query by adding appropriate hints to optimize the search process. To add hints, the feature uses Oracle Content Server hints, The Hint Rules Table, and The Hint Cache.

4.2.5.2 Query Optimization Process

The stages of the optimization process are completed in the following sequence:

The submitted query is analyzed to verify if it contains one or more hints and, if so, determine the type of hint; see "Stage 1: Query Analysis".

If the query's WHERE clause does not contain a hint, the optimization feature must parse out the WHERE clause; see "Stage 2: Parsing".

After parsing, each condition in the query's WHERE clause is evaluated against the Hint Rules Table in an attempt to qualify the condition and normalize the query; see "Stage 3: Normalization".

After the WHERE clause conditions are qualified and the query is normalized, a hint is selected or retrieved from the hint cache; see "Stage 4: Select Hint".

4.2.5.2.1 Stage 1: Query Analysis

In this stage, a query is checked for both Oracle (native) and Content Server hints. This is determined based on the hint syntax: Query Hints Syntax. A query that contains Oracle hints is passed through. A query that contains Content Server hints bypasses Stage 2: Parsing and Stage 3: Normalization. If a query contains multiple Content Server hints, the best hint is chosen. Queries that do not contain any hints must be parsed and normalized.

4.2.5.2.2 Stage 2: Parsing

In this stage, a query that does not contain any hints is sent through the query parser and the WHERE clause is parsed out. A WHERE clause consists of one or more conditions joined with either AND or OR conjunctions. For each condition, the field name, operator, and field value are extracted. The AND/OR conjunctions of the clause are preserved; the parentheses are dropped. Conditions must use the following format:

fieldnameoperatorvalue

For example, a properly formatted condition would be dID = 3. An incorrect condition would be 3 = dID.

4.2.5.2.3 Stage 3: Normalization

In this stage, normalization simplifies conditions, finalizes query operators, and provides a stable view of the WHERE clause for additional steps. The result of the normalization process produces a base for generating the cache key and the list of fields to use to search for hints.

Note:

To establish which database tables and columns have indexes, the Hint Rules Table is defined on Content Server resources and on the running system.

Qualifying WHERE Clause Conditions:

Each condition in the WHERE clause is checked against The Hint Rules Table. If a condition's field name is included in the Hint Rules Table, then it is qualified and the condition is considered to be normalized. The condition contains its table name and alias. Then the normalized conditions are sorted to ensure that the same set of conditions is always listed consistently.

Discarding WHERE Clause Conditions During Normalization:

During normalization, the following conditions are not considered relevant and are eliminated from further processing:

Join conditions.

Conditions that contain subqueries.

Conditions whose field names do not have entries in the Hint Rules Table and cannot be qualified.

OR conditions that contain more than one field. For example:

(dSecurityGroup = 'Secure' or dDocAccount LIKE 'prj%')

Conditions that contain the LIKE operator whose value begins with a wildcard.

Reformatting WHERE Clause Conditions:

In the normalization step, the query conditions are rewritten to consolidate complex query conditions. OR conditions are reevaluated as follows:

If all the fields are the same and all the operators are equal (or all the operators are LIKE and no values begin with a wildcard), the conditions are combined and changed to an IN query.

If the fields are the same but have different operators, the conditions are combined and the generic operator is assigned.

If the fields are different, the conditions are dropped.

For example, during normalization, the following condition is reformatted:

(dReleaseState = 'Y' OR dReleaseState = 'O")

It is reformatted as follows:

dReleaseState IN ('Y', 'O')

Finding Potential Range Queries:

The parsed query is analyzed to find potential range queries that are then consolidated during the normalization process. For example, the conditions dIndate > date1 and dInDate < date2 are changed to one condition with the operator 'range'.

4.2.5.2.4 Stage 4: Select Hint

In this stage, the normalized conditions are checked against the hint cache. If one or more conditions have applicable hints in the cache, they are included. If applicable hints are not found in the cache, the conditions are analyzed and the preference orders are compared to determine the best possible hint.

4.2.5.3 How Reformatted Queries Optimize Searches

The majority of queries in an Oracle Content Server instance involve a small, targeted set of content items or return a hundred rows, at most. Oracle Content Server software can easily scale to millions of content items. However, testing on an Oracle database with a collection containing 10 million content items indicates that the execution plan that Oracle selects is not the most efficient. Oracle generally does not choose the best optimization strategies to resolve many queries, even some that are trivial. The following examples explain this issue:

Because a fairly selective index is available (dRevClassID_2 for Revisions.dRevClassID), this query should access dRevClassID_2 and perform a sort on the rows that match the dRevClassID. However, in this query example, Oracle chooses to use the Revisions.dID index.

This choice is actually worse than performing a full table scan on the Revisions table because it does a full index scan and accesses the table to obtain the dRevClassID for each row. Obviously, resolving the query using this execution plan does not work well when the Oracle Content Server repository has over 10 million content items. In this case, it requires approximately 500 seconds to return the results.

However, the performance improves dramatically when the query is modified by adding a hint as follows:

The query is modified by adding the following hint to the SELECT clause:

/*+ INDEX(Revisions dRevClassID_2)*/

This forces Oracle database to choose the dRevClassID_2 index instead of the index for Revisions.dID. Because no more than a few content items share dRevClassID in this example, the modified query returns the results instantly.

4.2.5.3.2 Example 2: Reformatting a Query by Adding Multiple Hints

In a typical Oracle Content Server instance, most documents have a 'Y' (released) status for the dReleaseState with a dInDate earlier than the current date. However, only a few documents have an 'N' (new, not yet indexed) status for the dReleaseState. The following query is searching for content items that have not yet been released:

SELECT dID
FROM Revisions
WHERE Revisions.dReleaseState = N'N' AND Revisions.dStatus in
(N'DONE', N'RELEASED', N'DELETED')
AND Revisions.dInDate<={ts '2005-02-23 17:46:38.321'}

4.2.5.4 Types of Recognized Hints

Oracle Content Server queries can be static queries defined in various resources, data sources with additional dynamic WHERE clauses, and dynamic queries that are ad-hoc or defined in the application such as Archiver. Static queries can be updated with Oracle database hints. However, it is nearly impossible to predefine hints for ad-hoc queries and dynamic WHERE clauses.

Content Server hints use a database-neutral hint syntax that supports multiple hints in the same query. An Oracle Content Server hint can be used in any query, data source, and WHERE clause. However, it cannot be combined with an Oracle database hint. If a query contains both types of hints, Oracle Query Optimizer will retain the Oracle database hint and ignore the Content Server hint.

4.2.5.5 Query Hints Syntax

During the optimization processing stages, Oracle Query Optimizer recognizes the distinct syntaxes of both types of hints and correspondingly processes the submitted query. For more detailed information, see Section 4.2.5.2, "Query Optimization Process."

4.2.5.5.1 Oracle Hint Syntax

An Oracle hint uses the following format:

/*+ hint */

For example:

/*+ Index(Revisions dID)*/

4.2.5.5.2 Oracle Content Server Hint Syntax

The Oracle Content Server hint syntax is database neutral and can support multiple Oracle Content Server hints in the same query. During the optimization process, Oracle Content Server hints are evaluated and the best hints are formatted and added back to the query.

During the optimization process, a query that includes one or more Oracle Content Server hints is not parsed. Only Oracle Content Server hints are considered when choosing indexes.

Oracle Content Server Hint Syntax:

When a query undergoes the optimization process, Oracle Content Server hints are added to the reformatted query using the following syntax:

/*$tableName[ aliasName]:columnName[:operator [:<value>]][, …]*/

Where:

Values enclosed in angle brackets (<value>) are required.

Values enclosed in brackets ([value]) are optional.

Ellipses (…) indicates a repetition of the previous expression(s).

Query Before Optimization Process:

SELECT *
FROM Revisions, DocTypes, RoleDefinition
WHERE /*$Revisions:dStatus*/(Revisions.dStatus<>'DELETED' AND Revisions.dStatus<>'EXPIRED' AND Revisions.dStatus<>'RELEASED') AND Revisions.dDocType = DocTypes.dDocType AND /*$Revisions:dReleaseState*/Revisions.dReleaseState<>'E' AND (Revisions.dSecurityGroup = RoleDefinition.dGroupName AND RoleDefinition.dRoleName = ? AND RoleDefinition.dPrivilege > 0)

Reformatted Query with Oracle Content Server Hints Added:

After the query has undergone the optimization process, both indexes are used and are added to the native indexes.

4.2.5.6 Additional Supported Sort Constructs

Using Oracle sort constructs in search query clauses allows users greater flexibility when performing a query. Sort constructs specify the row data in two or more tables to be extracted, sorted, and combined. Essentially, the sort constructs serve the purpose of limiting the number of rows that are returned. Oracle Query Optimizer recognizes the following sort constructs:

Group by: Sorts a set of records and specifies how to group the results.

Order by: Sorts a set of records and specifies whether the results are to be returned in ascending or descending order.

Inner join: Sorts a set of records by looking for and returning those that match.

Outer join: Sorts a set of records by looking for and returning those that do not match.

4.2.5.7 The Hint Rules Table

The Hint Rules Table contains the rules that the optimization feature uses to determine the proper hints to add to dynamic queries or data sources during the optimization process. Using the Edit Hint Rules Form, a hint rule can be defined for a particular field and operator. A hint rule can also be defined based on values or date/number ranges. The hint rule table is extensible by other components, and can be updated while the Oracle Content Server instance is running.

The Hint Rules Table is scheduled to reload every night, and when a rule is added or modified. The hint value is recalculated at each reload.

Important:

Although the Hint Rules Table includes a column allowing multiple indexes to be used with each other, in Oracle only the bitmap index can be combined. This is because the Hint Rules Table was designed for core Content Server functionality.

Therefore, it might not be sufficient for a system with components that create additional tables or add additional metadata fields, or both. However, the Hint Rules Table can be extended or overwritten by other components to provide knowledge of additional tables, indexes and fields.

Explanation of First Hint Rule:

For this rule, if the WHERE clause contains the following condition the PK_Revisions index is used and added as a hint to the optimized query:

Revisions.dID = some_value

Explanation of Second Hint Rule:

For this rule, if the WHERE clause contains either of the following conditions the dDocName index is used and added as a hint to the optimized query:

Revisions.dDocName = some_value
Revisions.dDocName LIKE 'some_value'

Explanation of Third Hint Rule:

For this rule, if the WHERE clause contains the following condition the condition does not meet the requirements and cannot be qualified:

dStatus = 'DONE'

However, if the WHERE clause contains the following condition the dStatus index is used and added as a hint to the optimized query.:

dStatus = 'RELEASED'

4.2.5.8 Hint Rules Table Column Descriptions

4.2.5.8.1 Key

This column contains the unique name to identify the rule. A component can use the unique key to overwrite a particular rule. This key is usually identical to its index name because the index name is unique in the same database schema.

By default, Oracle uses a B+ Tree (binary tree) as the indexing structure to provide efficient access to logical records. B+ Tree indexes are most useful for queries involving a small number of result rows or when the user needs to execute queries using varying criteria (such as equality and range conditions). Because B+ Tree indexes store the indexed data values, these indexes are useful as sources of data if the requested value is the stored value.

However, bitmapped indexes offer substantial performance improvements with minimal storage cost compared to the default B+ Tree indexes. Bitmapped indexes are particularly effective for searching columns with poor selectivity due to having very few distinct values. Also, a bitmap is built for each value including the NULL value (which means the NULL is indexed). Overall, using bitmapped indexes is very efficient because the index lookup process is a bit-level operation and allows access to multiple indexes.

Note:

Because hint rules can be overwritten, Oracle Query Optimizer does not allow you to add a hint rule using an existing key. Therefore, it is important when you are creating your bitmapped indexes for columns that you assign unique keys.

Oracle recommends that you use bitmapped indexes for the table columns listed below, and set the index name to the corresponding column name.

Revisions table:

dIndexerState

dReleaseState

dProcessingState

dIsCheckedOut

dSecurityGroup

dStatus

WorkflowDocuments table:

dWfDocState

4.2.5.8.2 Table

This column identifies the specific database table.

4.2.5.8.3 Column

This column identifies the specific column within the database table listed in the Table column.

4.2.5.8.4 Operators

This column is a comma-delimited list of allowable operators. For more information about the valid operator options, see the Operators field and menu on the Hint Rules Configuration Page. The hint rule's operator is important in the decision of whether a hint rule will be applied to a condition.

For example, if the WHERE clause contains the following condition using the PK_Revisions index would be a very valuable hint to include in an optimized query:

Revisions.dID = 3

However, if the WHERE clause contains the following condition then using the PK_Revisions index would not be useful:

Revisions.dID > 3

4.2.5.8.5 Index

This column identifies the specific index to use in the optimized query if the condition meets the hint rule requirements.

4.2.5.8.6 Order

This column contains the preferred order to use when the rule is included in the Hint Rules Table. The highest ordered rules in a query are given precedence when deciding which hint to use.

The order values include:

5: This value indicates that the specified index is unique or does not match more than 50 rows for any value. For example, specifying dID with the Revisions, Documents, or DocMeta tables.

4: This value indicates that the specified index should be somewhat less selective. The specified value should typically match a few rows and, at the very most, several hundred rows. For example, specifying dDocTitle with the Revisions table.

3: This value indicates that the specified index matches less than a thousand rows. For example, specifying dInDate or dOutDate.

2: This value indicates that the specified index matches less than ten thousand rows.

1: This value indicates that the specified index matches more than ten thousand rows.

4.2.5.8.7 Values

This column is Idoc scriptable. This column can only be defined when the Operators column value is one of the following:

in or notIn: When you use either of these operators, the value should be a comma-delimited list enclosed in parenthesis.

range: When you use this operator, the value must use one of the following formats:

The operators in or notIn can substitute for the operators equal and notEqual, respectively, along with their matching values. For more information about operator options, see the Operators field and menu on the Edit Hint Rules Form.

The following use cases demonstrate how this column provides additional flexibility to the hint rules:

Use Case 1: State or Status Table Columns

Table columns that indicate a state or status such as dReleaseState or dStatus are biased regarding the finished states. For example, dReleaseState is predisposed for 'Y' (released) or 'O' (old version). Likewise, dStatus is predisposed for RELEASED. Therefore, in WHERE clauses, conditions such as dReleaseState = Y or dStatus = RELEASED match the majority of rows in the Revisions table. Thus, indexes for these two columns are almost useless. Conversely, the condition dReleaseState = N (new, not yet indexed) matches only a few rows. Consequently, indexes on this column would be very helpful.

Use Case 2: Date or Number Table Columns

Table columns that indicate a date or number exhibit similar behavior to state or status. For example, the condition dInDate < <$dateCurrent()$> matches most of the table rows and makes indexes on this field irrelevant. However, the combined conditions dInDate < <$dateCurrent()$> AND dInDate > <$dateCurrent(-1)$> usually match only a small set of rows and would benefit from using the corresponding index as a hint.

4.2.5.8.8 AllowMultiple

This column indicates whether the defined index is used with other indexes. In Oracle, only the bitmap index can be combined.

4.2.5.8.9 Disabled

This column indicates whether a hint rule has been disabled. Any rule in the table can be enabled/disabled. If you disable a hint rule, a value of 'Y' is displayed. Existing rules can be disabled to match the current Content Server state.

For example, if a Content Server instance contains only a few distinct content revClasses, each revClass may have thousands of revisions. Therefore, the dRevClass_2 index is not very effective. In this case, this corresponding hint rule should be disabled and you should add one or more new rules with different preference orders.

Note:

Although any rule in the table can be enabled/disabled, only the rules that are added using the Edit Hint Rules Form can be removed. The default hint rules that are included with the Oracle Query Optimization feature can only be disabled; they cannot be removed.

4.2.5.9 Edit Hint Rules Form

The Edit Hint Rules Form provides a way to add, remove, enable, or disable rules using the Hint Rules Configuration Page. You can add a new rule to reflect new tables and indexes. Existing rules can be removed or disabled to match the current state of Content Server. If you select a hint rule from the hint rule table, the Edit Hint Rules Form fields are automatically populated with the applicable values.

The hint rules configuration table is scheduled to reload every night and whenever a new rule is added or an existing rule is modified. The hint value is recalculated at each reload.

Although any rule in the table can be enabled/disabled, only the rules that are added through the Edit Hint Rules Form can be removed. The default hint rules that are included with the Oracle Query Optimizer component can only be disabled; they cannot be removed.

4.2.5.10 The Hint Cache

Oracle Query Optimizer also contains a hint cache to store dynamically generated hints. For example, a hint derived from a parsed query or data source is cached to maintain persistence. In this way, the hint cache provides stability for queries and data sources.

The hint cache is used during the optimization process to select hints for queries that do not contain Oracle or Content Server hints. The hint cache provides a mechanism to fine tune query hints. In addition, administrator can check/edit cache and change hint for queries at run time.

The hint cache is stored to disk every two hours and is reloaded when the Oracle Content Server instance is started.

4.2.5.10.1 Reusing Hint Cache Entries

The same query matches the same cache entry regardless of its values unless the new value does not satisfy the hint rule conditions. Two examples are included below to demonstrate how the same hint cache entry can and cannot be used for multiple queries.

Example 1: Using Similar Hint Cache Entries

In the following two queries, the same hint cache entry is used because both queries match the hint rule requirements.

QueryA:

SELECT *
FROM Revisions
WHERE dDocName = 'name1'

QueryB:

SELECT *
FROM Revisions
WHERE dDocName = 'name2'

Example 2: Using Different Hint Cache Entries

In the following two queries, the same hint cache entry cannot be used because QueryB violates the requirements for the dReleaseState hint rule. The dReleaseState hint rule requires that the dReleaseState values are neither Y (released) nor O (old revision).

4.2.5.10.2 Hint Cache Management

In the hint cache, you can add a new entry, edit an existing entry, or remove an existing entry using the Hint Cache Updater Page. When adding or editing hint cache entries, you must use the Oracle Content Server Hint Syntax. The ability to manage the hint cache is very useful for fine tuning query hints. The example below demonstrates the benefits of fine tuning a hint cache entry.

Example: Batchloading Unindexed Content

If you have just batchloaded 100K content items into the Content Server and they are not yet indexed, the index-based query used above (Example 2: Using Different Hint Cache Entries) would match all of the batchloaded documents.

QueryA:

If most of the batchloaded documents have not been indexed, the dReleaseState index that is used in this query is not the best choice. For the best results in this case, you should fine tune the hint cache entry to use both the dReleaseState and the dStatus indexes. Use the Hint Cache Updater Page to update hint cache entries.

SELECT dID
FROM Revisions
WHERE Revisions.dReleaseState = N'N' AND Revisions.dStatus in (N'DONE', N'RELEASED', N'DELETED') AND Revisions.dInDate<={ts '2005-02-23 17:46:38.321'}

4.2.5.10.3 Default Capacity Algorithm

By default, the hint cache has a maximum capacity of 1000 hints. The hint cache uses the midpoint insertion least-recently-used (LRU) algorithm which is similar to the one used by Oracle and mySQL. A new entry is inserted into the middle of the queue and each subsequent execution moves the entry up one spot.

When the number of hints in the cache exceed the maximum capacity, the entry at the bottom of the queue is removed from the cache. Thus, the LRU algorithm ensures that the most recently executed query hints are in the upper levels of the queue.

4.2.5.10.4 Origin of Hint Cache Keys

The hint cache key is generated from the normalized query; see Section 4.2.5.2.3, "Stage 3: Normalization." It consists of the qualified columns (columns that are qualified by table/alias names) and columns that have a hint rule defined. The cache key excludes conditions that contain joins or subqueries.

The following example illustrates how the cache key is generated from a given query:

SELECT DocMeta.*, Documents.*, Revisions.*
FROM DocMeta, Documents, Revisions
WHERE DocMeta.dID = Revisions.dID AND Revisions.dID=Documents.dID AND Revisions.dDocName='abc' AND Revisions.dStatus<>'DELETED' AND (Revisions.dReleaseState='U' OR Revisions.dReleaseState='I' OR Revisions.dReleaseState='Y') AND Documents.dIsPrimary<>0

4.2.5.10.5 Hint Cache Persistence

The hint cache is designed to be persistent. To ensure the persistence, the hint cache is saved to the file system every two hours. The persisted hint cache is reloaded when the Oracle Content Server instance is started.

The Hint Rules Table is refreshed and the new hint rule is added. The modifications are effective immediately.

Delete the old hint rule.

4.2.5.11.3 Disabling Hint Rules

Although any rule in the table can be enabled/disabled, only the rules that are added through the Hint Rules Configuration Page can be removed. The default hint rules that are included with the Oracle Query Optimization feature can only be disabled; they cannot be removed.

All of the applicable fields in the Edit Hint Rules Form are populated with the hint rule's values.

Click the Disable button.

The Hint Rules Table is refreshed and 'Y' is displayed in the Disabled column, indicating that the hint rule is deactivated.

4.2.5.11.4 Enabling Hint Rules

Although any rule in the table can be enabled/disabled, only the rules that are added through the Hint Rules Configuration Page can be removed. The default hint rules that are included with the Oracle Query Optimization feature can only be disabled; they cannot be removed.

All of the applicable fields in the Edit Hint Rules Form are populated with the hint rule's values.

Click the Enable button.

The Hint Rules Table is refreshed and the Disabled column is clear, indicating that the hint rule is reactivated.

4.2.5.11.5 Removing Hint Rules

Although any rule in the table can be enabled/disabled, only the rules that are added through the Hint Rules Configuration Page can be removed. The default hint rules that are included with the Oracle Query Optimization feature can only be disabled; they cannot be removed.

4.2.5.12.3 Editing a Converted Data Source or Query

After the data source or query is converted, the results are displayed above the Use Data Source checkbox. Because the conversion process clears the fields, the converted query can only be modified by entering new information in the fields. To edit information for a data source or query, see the applicable sections in Section 4.2.5.12.1, "Converting a Data Source."

4.2.5.13.4 Modifying an Existing Hint Cache Query Using Data Source

The data source-related fields are displayed on the Hint Cache Updater page.

Select the desired data source from the DS Name menu.

The data source query is displayed in below the DS Name field.

Enter the applicable information for the additional parameters, WHERE clause, and hints.

Click Update Cache to overwrite the previous hint cache.

The results are displayed in a text box above the Use Data Source checkbox. To see an example of successfully adding a new hint to a query and updating the hint cache, see the screen capture included in this section.

4.2.5.13.5 Modifying an Existing Hint Cache Using a Query

To modify a hint cache using a query:

On the Hint Cache Page, ensure that the Use Data Source checkbox is unselected.

The data source-related fields are hidden from the Query Converter page.

Enter the applicable information.

Click Update Cache to overwrite the previous hint cache.

The results are displayed above the Use Data Source checkbox. In the screen capture note that the new hint was added and the hint cache was updated.

4.3.1 Introduction to the File Store System

With the release of version 11gR1, the Oracle Content Server system implemented a file store system for data management, replacing the traditional file system for storing and organizing content. The FileStoreProvider component exposes the file store functionality in the Oracle Content Server interface and allows additional configuration options. For example, you can configure the Oracle Content Server instance to use binary large object (BLOB) data types to store content in a database, instead of using a file system. This functionality offers several advantages:

Helps overcome limitations associated with directory structure and number of files per directory in a file system approach.

Aids in distributing content more easily across systems, for better scaling of the Oracle Content Server system.

Allows for different types of storage devices not commonly associated with a file system, for example, content addressed storage systems and write-only devices necessary in some business uses.

Caution:

The FileStoreProvider component is installed, enabled, and upgraded by default during Oracle Content Server deployment. It should not be uninstalled or disabled after the default file store is upgraded. For more information, see Section 4.3.2, "FileStoreProvider Upgrade."

4.3.1.1 Data Management

The Oracle Content Server system manages content by tracking the storage of electronic files and their associated metadata. It provides the ability for users to store and access their checked in files, any associated information, and any associated renditions. This section discusses the data management methods historically used by the Oracle Content Server system and how they are addressed with the FileStoreProvider component.

4.3.1.1.1 File Management

The first half of data management is storing electronic files checked into an Oracle Content Server repository. With the Oracle Content Server system, file storage has typically been done with a traditional file system, storing electronic files in a hierarchical directory structure that includes vault and weblayout directories. By using the revision information specified by the content type, security group, and account (if used), files and their associated renditions are placed into particular directories within the vault and weblayout directories. For example, the primary and alternate files specified at check in are stored in subdirectories in the vault directory. The specific file location is defined to be the following:

IntradocDir/vault/dDocType/account/dID.dExtension

In this path name, dDocType is the content type chosen by the user on check in, dID is the unique system-generated identification that identifies this revision, and dExtension is the extension of the file checked in. In this hierarchical model, the system uses the dDocType metadata field to distribute the files within the hierarchy established in the vault directory. Similarly, any web rendition is distributed across the hierarchy within the IntradocDir/weblayout/groups directory. The web rendition is the file served out of a web server, and in the historical file system storage method, could be the native file, the alternate file, or a web-viewable file generated by Inbound Refinery or some other conversion application.

This straightforward determination of file storage location is helpful to component and feature writers, helping them understand where files are located and how to manipulate them. However, it also has the effect of limiting storage management. Without careful management of the location metadata, directories can become saturated, causing the system to slow down.

4.3.1.1.2 Metadata Management

The second half of data management is storing metadata associated with an electronic file. With the Oracle Content Server system, metadata management has typically been done using a relational database, primarily involving three database tables. Metadata enables users to catalogue content and provides a means for creating file descriptors to facilitate finding it within the Oracle Content Server repository. For users, the retrieval is done by the Oracle Content Server system, and how and where the file is stored can be completely hidden. For component and feature writers, who may need to generate or manipulate files, the metadata provides a robust means of access.

4.3.1.1.3 File Stores

The traditional file system model historically used by the Oracle Content Server system limits scalability. As data management needs grow, adding extra storage devices to increase storage space is not conducive to easy file sharing through a web-based interface. Complex, nested file structures could slow performance. Suppressing the creation of a duplicate web-viewable file when the native file format could be used could be difficult. As a consequence of dealing with large systems, for example over 100 million content items, the Oracle Content Server system has shifted to using a file store. This offers the advantages of scalability, flexibility, and manageability.

4.3.1.2 FileStoreProvider Features

The FileStoreProvider component enables you to define data-driven rules to store and access content managed by the Oracle Content Server system. FileStoreProvider offers the following features:

The ability to relocate files easily.

The ability to have the web-viewable file be optional.

The ability to manage and control directory saturation.

The ability to integrate with third-party storage devices.

An API to use, extend, and enhance different storage paradigms.

With FileStoreProvider, checked-in content and associated metadata are examined and assigned a storage rule based on criteria established by a system administrator. Criteria can include metadata, profiles, or other considerations. The storage rule determines how vault and web files are stored by the Oracle Content Server system and how they are accessed by a web server.

4.3.2 FileStoreProvider Upgrade

The FileStoreProvider component is installed, enabled, and upgraded by default for a new Oracle Content Server instance (with no documents in it). The upgrade includes creation of metadata fields with default values for the file store system (DefaultFileStore). Earlier versions of the Oracle Content Server software did not automatically upgrade the file store system, and sites could choose to not use the file store system, to not upgrade to use FileStoreProvider, and to uninstall the FileStoreProvider component and metadata fields.

If you do not want to upgrade FileStoreProvider from your current settings, prior to installation you must enter the configuration variable FsAutoConfigure=false in the Additional Configuration Variables field on the General Configuration page of the Oracle Content Server Admin Server page.

4.3.2.1 DefaultFileStore Settings

An Oracle Content Server system containing no documents and with the FileStoreProvider component automatically upgraded uses these DefaultFileStore settings.

The dispersion field is added in Path information for the storage rule and can be edited. The Web-viewable Path and Web URL File Path fields cannot be edited. The dispersion rule is added in web paths at the $dispersion$ location.

The dispersion rule allows you to specify :b for base 64 encoding of that part of the URL. For example, the following dispersion rule encodes the two parts to be base 64:

$dRevClassID[-9:-6:0:b]/$dRevClassID[-6:-3:0:b]

An Oracle Content Server system containing documents and which does not have the FileStoreProvider component upgrade will return an informational message that the Revisions table is not empty, therefore dispersion for the default storage rule is not set for DefaultFileStore.

4.3.2.2 Empty Storage Rule

If a site has used an earlier version of the Oracle Content Server system without using the file store system, then upgraded and implemented the FileStoreProvider component, or a site has uninstalled the FileStoreProvider component completely and also removed the metadata fields added by FileStoreProvider, then when a user checks in a document it will not have an associated storage rule (no xStorageRule field). When FileStoreProvider is implemented after these types of situations, users will find that documents checked in before FileStoreProvider was implemented will have an empty xStorageRule field. To fix this situation, users must perform an Update to the Content Information for those documents. The documents will be updated to the default value of the xStorageRule field and will be moved to the location specified by the storage rule. For details on xStorageRule, see Section 4.3.3.1.2, "Oracle Content Server Metadata Fields."

4.3.3 Configuring FileStoreProvider

A file store for data management is used in the Oracle Content Server system instead of the traditional file system for storing and organizing content. The FileStoreProvider component is installed and enabled by default during Oracle Content Server deployment. The FileStoreProvider component automatically upgrades the default file store (DefaultFileStore) to make use of functionality exposed by the component, including modifying the web, vault, and web URL path expressions.

Note:

Partitions are not required to run the Oracle Content Server system, but any attempt to check in content before creating a partition, changing the vault path root, or creating a new, well-formed storage rule will fail. For more information, see Section 4.3.3.3, "Understanding FileStoreProvider Storage Principles", including the sections on storage rules and path construction.

Note:

Oracle WebLogic Server does not support configuring its web server for the Oracle Content Server instance to add a new virtual directory and alias to point to the weblayout directory for each partition that is created. Partitions can be used for the vault files, and partitions are supported for web files.

Caution:

Resource files should not be edited directly. Proper modification of resource files should be done within the Oracle Content Server user interface or through additional component development. For more information on component development, see Chapter 6, "Managing Components".

Three resource tables are used to define and handle file paths. The defaults for the PathMetaData Table and PathConstruction Table cover most scenarios. The StorageRules Table stores the values specified when a storage rule is defined. These three tables are provider-specific, and as such are defined in the provider.hda file of the defaultfilestore directory. The defaultfilestore directory is located in the IntradocDir/data/providers directory. A fourth table, the FileSystemFileStoreAlgorithmFilters Table, requires a component along with Java code to modify.

4.3.3.1.1 Database Options

In some situations, content stored in a database may have to be forced onto a file system. One example would be when Oracle Inbound Refinery must have access to a file for conversion. Files forced onto a file system are considered temporary cache. The following configuration values are used to control when the temporarily cached files are to be cleaned up. Note that the system only cleans up files that have an entry in the FileCache Table.

Variable

Description

FsCacheThreshold

Specifies the maximum cache size, in megabytes. The default is 100. When the threshold is met, the Oracle Content Server instance starts deleting files that are older than the minimum age, as specified by the FsMinimumFileCacheAge parameter.

FsCleanUpCacheDuringIndex

Specifies if the cache will be cleaned during the indexing cycle. The default is false.

FsCleanUpCacheIndexingMax

Specifies the number of cache files to delete in each indexing cycle, which limits the load on the cycle. The default is to delete all eligible cache files for the indexing cycle.

FsMaximumFileCacheAge

Specifies the maximum age at which files are cached, expressed in days. The default is 365.

FsMinimumFileCacheAge

Specifies the minimum age at which cached files can be deleted, expressed in days. The default is 1. This parameter is used in conjunction with the FsCacheThreshold parameter to determine when to delete cached files.

4.3.3.1.2 Oracle Content Server Metadata Fields

FileStoreProvider adds several Oracle Content Server metadata fields and makes additional options available for use in configuration files.

xPartitionId: This metadata field is used in conjunction with the PartitionList table to determine the root location of the content item files. It is recommended that this field be hidden on the user interface, because the partition selection algorithm provides a value.

xWebFlag: This metadata field is used to determine whether a content item has a web-viewable file. Consequently, if the system has content items that have only vault files, then removing this metadata field causes the system to expect the presence of a web-viewable and may cause harm to the system. The metadata field can be specified by the configuration value WebFlagColumn.

xStorageRule: This metadata field is used to track the rule that was used to determine how the file is to be stored. The metadata field may be specified by the configuration value StorageRuleField.

Note:

These metadata fields are added by FileStoreProvider on startup and if deleted are added again when the Oracle Content Server instance restarts. If the metadata fields must be permanently deleted, set the configuration variable FsAddExtraMetaFields=false in the intradoc.cfg file to disable the automatic creation of the fields. The intradoc.cfg file is located in the DomainHome/ucm/cs/bin directory.

Setting the Default Storage Directory

A StorageDir parameter can be set equal to a root directory, used for all partitions where the PartitionRoot column value has not been specified. In this case the storage directory and the partition name is used to create the PartitionRoot parameter. The StorageDir parameter is set in the intradoc.cfg file, located in the DomainHome/ucm/cs/bin directory.

Standard FileStoreProvider Variables

In the provider.hda located in the IntradocDir/data/providers/defaultfilestore directory, the following parameters and classes are standard for a file system store:

4.3.3.2 Working with FileStoreProvider

When the FileStoreProvider default file store is upgraded, checked-in content and associated metadata are examined and assigned a storage rule based on criteria established by the system administrator. Criteria can include metadata, profiles, or other considerations. The storage rule determines how vault and web files are stored and accessed by Oracle Content Server software and how they are accessed by a web server. Files can be stored in a database or placed on one or more file systems or storage media. Partitions can be created to help manage storage location, but are not required.

Caution:

The FileStoreProvider component should not be disabled once it has been used with the Oracle Content Server system.

4.3.3.2.1 Adding or Editing a Partition

You can create partitions to define additional root paths to files managed by the Oracle Content Server system but requiring storage in different locations or on different types of media. You create partitions using the Partition Listing Page. When a new partition is created, the Oracle Content Server system modifies the PartitionList resource table in the fsconfig.hda file, located in the IntradocDir/data/filestore/config/ directory.

Note:

Oracle WebLogic Server does not support configuring its web server for the Oracle Content Server instance to add a new virtual directory and alias to point to the weblayout directory for each partition that is created. Partitions can be used for the vault files, and partitions are supported for web files.

To add a partition to the Oracle Content Server instance, perform these steps:

If there are records associated with the storage rule being edited, then the following rules can not be modified: FsWeblayoutDir (the weblayout directory) and FsHttpWebRoot (the HttpWebRoot and URL prefix).

If the web root used in the web URL file path defined in the storage rule is something other than the default weblayout directory defined for the Oracle Content Server system, you must add an alias or virtual directory in your web server for the web root used in the storage rule. Otherwise, the Oracle Content Server system does not know where to access the file. For information on adding virtual directories to your web server, see the documentation that came with your web server.

4.3.3.3 Understanding FileStoreProvider Storage Principles

When a content item is checked in to the Oracle Content Server system, it consists of metadata, a primary file selected by the user, and potentially an alternate file. The alternate file may also be selected and checked in by the user, and is presumed to be a web-viewable file. In a file system approach to the Oracle Content Server system, the primary file is stored in the vault directory at the root of the DomainHome directory and is called the native file. If an alternate file is checked in, it is also stored in the vault, but is copied to the weblayout directory or passed to a conversion application, such as Oracle Inbound Refinery. If no alternate file is checked in, then the native file is copied from the vault directory to the weblayout directory, existing in two places. If no alternate file is checked in and Oracle Inbound Refinery is installed, a rendition of the native file could be created and stored in weblayout directory.

In a file system approach to the Oracle Content Server system, storing content in specified directories defines a path to the content. You can access content from a browser by using a static web URL file path, when you know the content is in a specific location, or using a dynamic Oracle Content Server service request, such as GET_FILE, when you do not. With FileStoreProvider, content may or may not be stored in a file system. Consequently, a new approach to defining paths to the content must be taken.

Depending on how you set up FileStoreProvider, you may or may not have a static web URL. By using a dynamic Oracle Content Server service request, you can access content when you do not know the specific location. With FileStoreProvider, the static web URL is defined as the web URL file, and the dynamic access is simply called the web URL. Using the FileStoreProvider user interface, you can configure only the static web URL file path. However, you can decide to have the static web URL done as an Oracle Content Server service request, essentially making it dynamic.

4.3.3.3.1 Using Storage Rules on Renditions to Determine Storage Class

When content is checked in, all versions of the content managed by the Oracle Content Server system are considered renditions. These renditions include the native file, web-viewable file, and any other files that may have been rendered by Inbound Refinery or third-party conversion applications.

Renditions are grouped together into a storage class, which determines where and how a rendition is accessed. Storage classes are grouped together into a storage rule, which defines the vault, web, and web URL path expressions, through a storage class. Additionally, a storage rule determines if a rendition is not stored, as in a web-less file store, or if it is stored in a different device, such as a database rather than a file system.

The following examples illustrate how storage rules can determine where and how different content items can be stored.

Example 1

A storage rule is defined as File system only on the Storage Rule Name Dialog and Is Webless File Store is selected. In this scenario, the system makes a copy of the primary files and places them in the weblayout directory.

This traditional file system storage example typically offers the advantage of faster access time to content when compared with database storage. This advantage diminishes if the file system hierarchy is complex or becomes saturated, or as the quantity of content items increases.

Example 2

A storage rule is defined as File system only on the Storage Rule Name Dialog and Is Webless File Store is selected. In this scenario, no copy is made of the primary files and so the native files are the only renditions. Requests for web-viewable files are routed to the native files stored in the vault.

Note:

The web-less option of FileStoreProvider can specify that no web rendition be created. When this is used in conjunction with Inbound Refinery, a web rendition is always created and stored in either the file system or the database, depending on the storage rule in effect.

This traditional file system storage example, like the previous one, offers the advantage of faster access time to content. It also saves on storage space by not copying a version of the content from the vault directory to the weblayout directory. Instead, it redirects web-viewable access to the content in the vault directory. This is useful if most of the native files checked in are in a web-viewable format, or if the Oracle Content Server system is being used to manage content that is not required to be viewed in a browser.

Example 3

A storage rule is defined as JDBC Storage on the Storage Rule Name Dialog and no selection is made from the Renditions choice list. In this scenario, both the vault and web files are stored in the database.

This database storage example offers the advantage of integrating repository management with database management for consistent backup and monitoring processes, and helps overcome limitations associated with directory structure and number of files per directory in a file system approach.

Important:

When necessary, content items stored in a database can be forced onto the file system, for example, during indexing or conversion. The files on the file system are treated as temporary cache and deleted following the parameters specified in the config.cfg file located in the IntradocDir/config directory. For more information on the parameters used, see Section 4.3.4.7, "FileCache Table."

Example 4

A storage rule is defined as JDBC Storage on the Storage Rule Name Dialog and Web Files is selected from the Renditions choice list. In this scenario, the vault files are stored in the database and the web files are permanently stored on the file system.

This mixed approach of storing native files in a database but web-viewable files on a file system offers the advantages of database storage in the previous example (integrated backup and monitoring, overcoming file system limitations) for the native files, while providing speedy web access to web-viewable renditions. Like the first example, this advantage can be diminished if the file system structure is overly complex, or the quantity of files is extreme.

4.3.3.3.2 Understanding Path Construction and URL Parsing

The path to content stored in the Oracle Content Server system is defined in the PathExpression column of the PathConstruction Table. Paths are made up of pieces, with each piece separated by a slash (/). Each piece can be made of a static string or a sequence of dynamic parts. A dynamic part is encapsulated by a dollar sign ($). A part can be calculated using an algorithm, Idoc Script variable, environment variable, or a metadata lookup, and can have the following interpretations:

It can be a field defined in the PathMetaData table. If it is defined in the PathMetaData table, it can be mapped to an algorithm, for example, $dDocType$.

If it has the prefix #env., it is an environment variable, for example, $#env.VaultDir$ or $#env.WeblayoutDir$.

When parsed, the path expression turns into five pieces, interpreted according to the rules specified in the PathMetaData table, as follows:

$PartitionRoot$: mapped to the partitionSelection algorithm and uses the xPartitionId as a lookup into the PartitionList table to determine the partition root

/vault/: a string, so no calculation or substitution

$dDocType$: by the PathMetaData table this is a look up in the file parameters

$dDocAccount$: this is mapped to a documentAccount algorithm which takes dDocAccount and parses it into the standard Oracle Content Server account presentation with all the appropriate delimiters

$dID$$ExtensionSeparator$$dExtension$: this piece has three parts:

$dID$: similar to dDocType, this is defined in the file parameters and is a required field

$ExtensionSeparator$: determined by an algorithm and by default it returns '.'

$dExtension$: similar to dDocType

In the standard configuration for the web-viewable path, the URL contains variables to add the partition root to the web-viewable path, security, dDocType, and dispersion information, as well as the dDocName, rendition, and extension information. FsWeblayoutDir denotes $#env.WeblayoutDir$ by default.

In the standard configuration for the web URL file path, the URL contains variables to add the partition root to the web-viewable path, security, dDocType, and dispersion information, as well as the dDocName, rendition, and extension information. FsHttpWebRoot denotes $HttpWebRoot$ by default.

The groups separator indicates to the Oracle Content Server system that the directories that follow are the name of the security group and account to which the content item belongs. Accounts are optional and consequently computed by an algorithm. After the security information is the documents separator, which is immediately followed by the dDocType. Dispersion is optional. The last part of the URL is the dDocName, its rendition and revision information, and its format extension.

Because the URL is expected in this format, the Oracle Content Server system can successfully extract metadata from it. More importantly, it can determine the security information for the content item and derive the access privileges for a particular user.

The parsing guidelines have been expanded to allow for dispersion in the web directory. When $dRevClass$ is encountered, the system processes the dispersion information, then continues with dDocName and dWebExtension as before. This means that the system can now successfully parse URLs of the form:

4.3.4.1 PartitionList Table

The PartitionList table defines the partitions that are available for the partitionSelection algorithm. The table is defined in the fsconfig.hda file, located in the DomainHome/ucm/cs/data/filestore/config/ directory, and modified using the Add/Edit Partition Page in the Oracle Content Server user interface. The columns of the table are used as follows:

Column

Description

PartitionName

Specifies the name of the partition. This name is referenced in the path expression.

PartitionRoot

An argument passed into the partitionSelection algorithm.

IsActive

Determines if the partition is currently active and accepts new files.

CapacityCheckInterval

Specifies the interval in seconds used in determining the available disk space. This may not work on all platforms.

SlackBytes

Determines if there is sufficient space on a partition to store content. If the available space is lower than the slack bytes, the partition is deactivated and no longer used for contribution.

DuplicationMethods

Specifies how native files are treated when not converted to a web-viewable rendition.

copy (default): copies the native file to the web path.

link: Resolves the web path to the native file in the vault

Copy and Link rely on functionality of the operating system on which the Oracle Content Server instance is installed. As such, not all methods are available on all platforms

4.3.4.2 StorageRules Table

The StorageRules table defines the rules used for storing content items. The rule specifies which path expression to use for which storage class, and how content items are to be stored.

The table is defined in the provider.hda file, located in the DomainHome/ucm/cs/data/providers/defaultfilestore directory, and it can be modified using the Storage Rule Name Dialog in the Oracle Content Server user interface. The columns of the table are used as follows:

Column

Description

StorageRule

The name of the storage rule. Computed from a dynamic include and stored in the xStorageRule metadata field of a content item.

StorageType

Determines the storage implementation.

FileStorage: files are stored on the file system

JdbcStorage: files are stored in the database

IsWeblessStore

Used to specify if system allows web-less files.

true: By default, newly created content items do not have a web-viewable file. In certain circumstances it is necessary to insist on a web-viewable file. In such situations, an argument in the calling code can be used to specify that a web-viewable file must be created. Information regarding whether there is a web-viewable file is stored in the xWebFlag metadata field.

false: By default, newly created content items do have a web-viewable file.

RenditionsOnFileSystem

Used by JdbcStorage to determine if any files are to be stored on the file system instead of the database.

4.3.4.3 PathMetaData Table

The PathMetaData table defines what metadata is used to determine the location of a file. The metadata may come directly from a content item's metadata, or be calculated using an algorithm. The PathMetaData table is defined in the provider.hda file of the defaultfilestore directory. The defaultfilestore directory is located in the DomainHome/ucm/cs/data/providers/ directory.

The columns of the table are used as follows:

Column

Description

FieldName

Name of the field as it appears in the path expression.

GenerationAlgorithm

Specifies the algorithm used to resolve or compute the value for the field.

RequiredForStorage

Defines for which storage class the metadata is required.

#all: Both vault and web-viewable renditions require the metadata

web: Just the web-viewable rendition requires the metadata

vault: Just the native file rendition requires the metadata

The field is optional for all renditions not specified. Consequently, if this column is empty, then the metadata field is optional for all renditions or storage classes. If an algorithm has been specified, this value is empty. The algorithm uses the value specified in the ArgumentFields column to dictate which fields are required.

Arguments

Optional arguments passed into the algorithm specified in the GenerationAlgorithm field.

ArgumentFields

A comma-delimited list of fields required by the arguments defined in the Arguments column, and consequently required by the algorithm specified in the GenerationAlgorithm field.

weburl: Generated by the Oracle Content Server system. Tends to be GET_FILE.

weburl.file: Nicely constructed URL used to access the web-viewable rendition in a browser.

dispersion: Variable for dispersion of content on the file system.

FsWeblayoutDir: Variable for the web-viewable path for the weblayout directory.

FsHttpWebRoot: Variable for the web URL file path for the weblayout directory

PathExpression

Defines the path.

AutoCreateLimit

Specifies the depth of the directories that may be created.

StorageRule

Specifies to which storage rule this path construction belongs.

4.3.4.5 FileSystemFileStoreAlgorithmFilters Table

The FileSystemFileStoreAlgorithmFilters table is used to map an algorithm name to an implementation of the FilterImplementor interface. The algorithm can be referenced in the PathMetaData Table and is used to calculate the desired path field. The class implementing the algorithm must return the required metadata fields it uses for calculation, when the file parameters object is null. Through the ExecutionContext, the doFilter method is passed in information about the field, content item, and file store provider that initiated the call. In particular, for the file system provider, the algorithm will be passed the following information through the ExecutionContext. Bear in mind that other file store providers may choose to pass in more or possibly different information.

4.3.4.6 FileStorage Table

The FileStorage table is added to the Oracle Content Server system when FileStoreProvider is installed. It is used exclusively by the JdbcStorage storage type, when content is stored in a database. The FileStorage table contains the renditions of content items and uses the dID of the content item and rendition to uniquely identify what renditions belong to which content item.

4.3.4.7 FileCache Table

The FileCache table is added to the Oracle Content Server system when FileStoreProvider is installed. It is used exclusively by the JdbcStorage storage type to remember which renditions have been placed on a file system. Renditions stored in a database are placed on a file system when required for a specific event, for example indexing or conversion. These files are often temporary and deleted after a specified interval as part of a scheduled event.

4.3.5 FileStoreProvider Sample Implementations

This section list the contents of the tables contained in the provider definition file (provider.hda) for each of the examples. The provider.hda file does not need to be edited manually. Proper modification of the provider.hda file should be done within the Oracle Content Server user interface using the Add/Edit Partition Page, or through additional component development. The provided default options for other resource tables, such as PathMetaData Table, PathConstruction Table, and FileSystemFileStoreAlgorithmFilters Table, should have sufficient flexibility for most scenarios.

4.3.5.2.2 Defining the Path Construction

The second step is to define the path construction for each of the storage classes for the rule. In general, the last part of the path should be standard for all usage examples. If not, then the Oracle Content Server system does not work well with hcs* files. However, the root path can be changed without affecting functionality, assuming that changing the web URL file path root is properly acknowledged by the web server as a Oracle Content Server web root.

This setup only looks at the web path expression, which differs from the web URL only in its root. In other words, the web path is an absolute path on the file system, while the web URL is a URL served up by a web server.

Used by the PathMetaData table. This is a required field and must consequently be provided by the caller or descriptor creator. It is part of a content item's metadata information.

$dDocAccount$

This is mapped to a documentAccount algorithm which takes dDocAccount and parses it into the standard Oracle Content Server account presentation with all the appropriate delimiters.

/documents/

String.

$dDocType$

Used by the PathMetaData table. This is a required field and must consequently be provided by the caller or descriptor creator. It is part of a content item's metadata information.

$dispersion$

Implements dispersion of content on the file system.

!edisp

Indicates that dispersion has ended at the point where this marker is placed. Even if dispersion is empty there will be an ~edisp marker.

$dDocName$

Used by the PathMetaData table. This is a required field and must consequently be provided by the caller or descriptor creator. It is part of a content item's metadata information.

$RenditionSpecifier$

This is provided by the renditionSpecifier, which is only of interest if the system is creating additional renditions such as thumbnails. Otherwise, this returns an empty string.

$RevisionLabel$

The revision label is provided by the revisionLabel algorithm which, depending on the status of the content item, adds a '~dRevLabel' to the path.

$ExtensionSeparator$

The extensionSeparator algorithm is used here and by default it returns '.'.

$dWebExtension$

The dWebExtension is a required field for the web and web URL storage classes and is passed in through the file parameters.

4.3.5.3 Configuration for a Webless or Optional Web Store

In this example, the previous example storage rule is configured to have IsWeblessStore set to true and consequently the web-viewable file will not be created by default. However, if the document is processed through Inbound Refinery or WebForms or any other component that requires a web-viewable, the web file will be created. The location of the files is as above in the standard configuration. However, because a file might not have a web rendition, the web URL path must be adjusted. Also, note the use of weburl.file. This is used to compute the URL when the web-viewable actually exists. The metadata field xWebFlag is used to determine how the file is to be served up in the browser.

4.3.5.4 Configuration for Database Storage

To store files in the database, you need a storage rule that is of type JdbcStorage. By default, all content items belonging to this rule have their files stored in the database. However, even though the files are stored in the database, there is the presumption of an underlying file system and the system may need to temporarily cache a file on the file system. In particular, this may happen for indexing or for some conversions.

Tech Tip:

A rule can be configured to always store renditions belonging to a given storage class on the file system. This is most useful for systems that store vault files in the database, but web files on the file system.

4.3.5.4.1 Defining the Storage Rule

In the default rule below, all files are stored in the database, while the filesInWeb rule stores the vault files in the database and the web files on the file system.

4.3.5.5 Altered Path Construction and Algorithms

The previous examples have kept the file paths consistent with the standard configuration. For very large implementations, this can result in directory saturation and slow performance. The following examples aid in dispersing files over several storage options.

4.3.5.5.1 Using Partitioning

FileStoreProvider makes it easy to use partitions to create a sparser directory structure. By default, the xPartitionId metadata field is used and becomes a part of a content item revision's metadata information. It is recommended that this field is disabled on the Oracle Content Server user interface, instead letting the partition selection algorithm determine the partition to use. The partition selection algorithm looks at all the active partitions, and as a new content enters the system, the partitions are selected in order. Each partition has an entry in the PartitionList Table and can be declared active. The PartitionRoot is calculated from xPartitionId, where the value is a look up key into the PartitionList table. If no xPartitionId is specified, the system finds the next available and active partition and uses this value for the location calculation. The xPartitionId is then stored as part of the content item's metadata.

To use the partition selection, define the vault storage class in the PathConstruction table as follows:

Partitions can be deactivated using the "Add/Edit Partition Page" at any time if a system administrator needs to close a partition to contribution, for example if maintenance is required on the storage device.

4.3.5.5.2 Adding a Partition to the Weblayout Path

This example shows how to partition both vault and weblayout directories, and also maintain valid web URL file paths.

Add the partition root to the web-viewable path and web URL file path, and edit the variables $FsWeblayoutDir$ and $FsHttpWebRoot$ on the Storage Rule Name Dialog.

In order to keep the web URL file path consistent with the web-viewable path in the weblayout directory, the variable xPartitionId is used so that partition1 or partition2 is correctly replaced when creating the web URL file path.

Ensure that the web-viewable path and the web URL file path evaluate into the same path.

$FsWeblayoutDir$ represents $PartitionRoot$/weblayout/. For partition1 this evaluates to $#env.WeblayoutDir$/partition1/weblayout/. For partition2 this evaluates to $#env.WeblayoutDir$/partition2/weblayout/.

$FsHttpWebRoot$ represents $HttpWebRoot$/$xPartitionId$/weblayout/. For partition1 this evaluates to $HttpWebRoot$/partition1/weblayout/. For partition2 this evaluates to $HttpWebRoot$/partition2/weblayout/.

If you set up the partitions (partition1 and partition2) to use the partition root of $#env.VaultDir$/partition1 and $#env.VaultDir$/partition2 instead of the $#env.WeblayoutDir$ and $HttpWebRoot$ settings, then the weblayout file will end up stored in the vault directory. It then can be used only for partitioning the vault files.

4.3.5.5.3 Limiting the Number Files in a Directory

Another way of dispersing files is to alter the path so that files get partitioned out by the dRevClassID of the content item. In the example below, the directories are limited to 10,000 files plus extra files for additional renditions.

If your path expression contains $RevClassID[-12:-10:0]/$dRevClassID[-10:-8:0]$/$dRevClassID[-8:-4:0]$ and $dRevClassID is 1234567890, the result is 00/12/3456.

Note the $dRevClassID[-12:-10:0] in the path expression. This is interpreted as follows:

Get the characters starting at 12 back from the end of the string until you get the character 10 back from the end of the string.

Pad the resulting string to length 2, which 12-10, with 0 characters.

4.4 Mapping URLs with WebUrlMapPlugin

An Oracle Content Server system uses an Oracle WebLogic server, which has a built-in web server, to filter pages through a web browser. User requests are authenticated with the Oracle WebLogic server user store and communicated with the Oracle Content Server system.

The WebUrlMapPlugin component enables you to map shortened URLs to other URLs in g] using a substitution script for the mapping, which also enables you to map long URLs to abbreviated versions. The WebUrlMapPlugin component is installed (enabled) by default with the Oracle Content Server instance.

4.4.1 Script Construction

The shortened URLs that you can create generally use the following format:

http://myhostname.com/prefix/suffix

The actual mapping process is based on the part of the URL that follows the host name portion. To resolve the shortened URL, the Oracle Content Server instance compares the prefix to those in the list of defined WebUrlMapPlugin entries. If a match exists, the Oracle Content Server instance uses the map script that corresponds to the matching prefix to display the applicable document or Oracle Content Server page. For more information about the suffix, see Section 4.4.2, "Supported Variables for Referencing."

To construct a URL mapping entry using the WebUrlMaps Screen, you must establish a prefix and define the corresponding map.

Prefix

The prefix portion of the mapping entry is any abbreviation you want to use to identify URLs of a certain form. For example, if you want your short URL to return the dynamic conversions of documents, you can use idc as your prefix (for example, the abbreviated form of dynamic converter).

When you create your prefix, do not enter a slash (/) character at the beginning of it because the Oracle Content Server instance removes the first slash from the incoming URL before the prefix test is performed.

Caution:

Include a slash (/) at the end of your URL map prefix. Otherwise, your mappings can apply to many more URLs and interfere with standard Oracle Content Server operations.

Map

The map portion of the mapping entry is the Idoc Script code that the Oracle Content Server instance uses to resolve the shortened URL. You can use substitution tags (<!--$variable-->) in the map portion. Examples include:

<!--$cgipath-->

<!--$internetuser-->

<!--$suffix-->

These substitution tags are variables that refer to the applicable parameters of a URL.

Simple 'if' constructions are also supported. For example, the following script segment performs a test to determine whether a value exists and is not empty:

<!--$if myconfigvar-->something<!--$endif-->

4.4.2 Supported Variables for Referencing

The map portion of the URL mapping entry uses the following standard variables for referencing:

The CGI path

This is the current CGI path of the Oracle WebLogic Server web server filter's configured Oracle Content Server instance. The web server filter is configured to provide both communication and security for this Oracle Content Server instance. A typical example is /idcm1/idcplg.

The 'suffix' parameter

The value of the suffix variable (<!--$suffix-->) is derived from the part of the URL that follows the preliminary mapping 'prefix' and before the question mark (?). Any slashes (/) at the beginning of the suffix are removed before being substituted into this variable. For example, in the following URL, 'dc' is the mapping prefix followed by the suffix.

http://myhostname.com/dc/mydocumentname

After removing the slash, mydocumentname is used as the value for the suffix variable that is used as a substitution tag in the map portion of the mapping entry. Also, the suffix variable does not include any CGI parameters. Therefore, in the following URL, mydocumentname is still used as the suffix variable's value.

http://myhostname.com/dc/mydocumentname?a=1

To enforce the slash separation between the prefix and suffix, add the slash at the end of your prefix abbreviation.

Any plugin variable

For example, you could use the construct <!--$internetuser--> to substitute for the user ID of the currently logged-in user.

4.4.3 Add/Edit URL Mapping Entries

Enter the appropriate values in the Prefix and Map fields to edit the existing mapping entries, or define new entries, or both.

Click Update.

The screen refreshes and the Prefix and Map field values are saved. If all of the displayed fields are populated, two additional Prefix and Map field pairs are displayed after the screen is redisplayed.

Important:

The WebUrlMapPlugin feature is designed to support hundreds of mapping entries. However, be aware that thousands of mapping entries will impact performance of the web server.

4.4.4 Mapping Examples

4.4.4.1 Info Update Form

You can define a web URL mapping script that enables you to create a shortened URL to generate the Info Update Form for existing content items. You can write the mapping script to allow users to enter any identification variable for a particular document. For example, all URLs with the following format:

4.5.1About Providers

A provider is an Application Programming Interface (API) that establishes connection between the Oracle Content Server instance and outside entities. These entities can be:

Oracle WebLogic Server

LDAP servers

databases

server sockets

file store system

Inbound Refinery

4.5.1.1 Oracle Content Server Providers

By default, an Oracle Content Server instance has three system providers:

SystemDatabase: The system database.

SystemServerSocket: A server socket that listens for browser requests.

DefaultFileStore: A file store system.

In addition, you can create the following types of providers:

Outgoing: A connection initiated to an outside entity. You can use this type to communicate between Oracle Content Server instances. If you want to use SSL with an outgoing provider, see details in Section 4.5.1.3, "Security Providers."

Database: An information repository server that provides an API for connecting and communicating with it. This retrieves information and enables information to be changed in the database. Examples of this type are system databases.

Incoming: A connection initiated from an outside entity like a browser or client application. The provider listens on a specified port to be aware of incoming connections. If you want to use SSL with an incoming provider, see details in Section 4.5.1.3, "Security Providers."

Preview: An outgoing provider connection to Oracle Content Publisher, for use with the optional HTML Preview feature.

LDAP: A connection initiated to an LDAP (Lightweight Directory Access Protocol) server for managing external user access to the Oracle Content Server instance. This type of provider is supported by the ActiveDirectoryLdap component, which is installed (disabled) by default during Oracle Content Server installation. As of 11g Release 1 (11.1.1) its functionality is mostly superseded by JpsUserProvider, in particular for nested group support.

HTTP: A connection that allows communication between Oracle Content Server instances using the HTTP protocol. This type of provider requires the Proxy Credentials Extension component, which is installed (enabled) by default during Oracle Content Server installation.

JpsUserProvider: A connection to an Oracle WebLogic Server instance. This provider uses Java Platform Security (JPS) to perform user authentication, user authorization, and retrieval of user metadata through an Oracle WebLogic Server instance. This type of provider is supported by the JpsUserProvider component, which is installed (enabled) by default with the Oracle Content Server instance.

4.5.1.2Choosing an Appropriate Provider

The different types of providers described in the previous section are added under specific circumstances to work with various other Oracle products or utilities. The following subsections describe those conditions and the particular provider types that must be added in each scenario.

4.5.1.2.1When to Add an Outgoing Provider

Outgoing providers are added to use the Oracle Content Server Archiver utility and Inbound Refinery. If you want to use SSL or keepalive with an outgoing provider, see details in Section 4.5.1.3, "Security Providers."

Archiver Utility (Oracle Content Server): The Archiver is a utility within the core Oracle Content Server product that enables system administrators to copy and remove content and store it for future use. Users can query a set of content from the Oracle Content Server instance and export it to an archive. Archives can then be imported to other Oracle Content Server instances or can be imported back to the same instance with changed metadata fields.

An outgoing provider is required to use the Archiver Transfer feature, which is used to archive content across a firewall or between two systems that do not share a file system. For more information about the Transfer feature, the different types of transfers and the outgoing provider requirements, see the Managing Migration chapter for more information.

Inbound Refinery: The Inbound Refinery server processes content checked in to Oracle Content Server and converts it to specified formats. An outgoing connection to the Inbound Refinery server is necessary for communication with Oracle Content Server. For details, see Oracle Fusion Middleware Administrator's Guide for Conversion.

4.5.1.2.2When to Add a Database Provider

Database providers are added to use external databases.

Frequently, it is desirable or necessary to perform database queries on databases that are not the default Oracle Content Server database. In this case, customized database providers can be created that make it possible to access any data from any application, regardless of which database management system is handling the data. Using customized database providers to integrate external databases into a Oracle Content Server system, search results can be combined and viewed on a single search screen. Additionally, data can be imported from these external database sources.

Create an Oracle Content Server database provider to connect directly to the database through a JDBC connection, without using an Oracle WebLogic Server data source. This mode is provided for instances with pre-existing connections in their configurations.

4.5.1.2.3When to Add an Incoming Provider

Incoming providers are added to use WebDAV support and the Oracle Content Server Archiver utility. If you want to use SSL or keepalive with an incoming provider, see details in Section 4.5.1.3, "Security Providers."

Oracle WebDAV Support: With version 6.2 of Oracle Content Server, you could implement WebDAV (Web-Based Distributed Authoring and Versioning) support using an incoming provider and the Oracle Content Server integrated Tomcat servlet engine. In Oracle Content Server version 7.0 and later, however, WebDAV support is provided by a custom feature, so the provider and servlet engine are no longer necessary.

Archiver Utility (Oracle Content Server): The Archiver is a utility within the core Oracle Content Server product that enables system administrators to copy and remove content and store it for future use. Users can query a set of content from the Oracle Content Server instance and export, import, or replicate to another instance, or change metadata fields. Tasks most frequently performed involve transfer, backup, and reorganization of information within the system.

Generally, when data or content items are moved from one repository to another, the Archiver utility uses a push technology to relocate the files. However, occasionally your system might require that the files be pulled rather than pushed. In this case, an incoming provider must be created. For additional reference information about incoming providers and each specific field, see Section A.1.6.6, "Incoming Provider Page."

Consulting Services are required perform this operation.

4.5.1.2.4When to Add a Preview Provider

Preview providers are added to use HTML Preview and Content Categorizer.

HTML Preview: HTML Preview is a feature that provides users with instant feedback on how their content will display on the published web site. This feature enables users to modify the original content before it is actually checked in. HTML Preview also helps users ensure that correct metadata has been assigned to the content. During the installation process, a preview provider must be created. For additional overview and installation information about HTML Preview, see the Oracle Fusion Middleware Application Administrator's Guide for Content Server.

Content Categorizer: Content Categorizer suggests metadata values for documents being checked into Oracle Content Server or for existing documents that need to have metadata reapplied. For Content Categorizer to recognize structural properties of a document, the file must be converted to XML.

If you are using Content Publisher to set up a template for the required XML conversion process, the HTML Preview feature must be configured as a preview provider. (HTML Preview is a feature that enables users to preview their content and see what the converted output from Content Publisher will look like.)

4.5.1.2.5When to Add an LDAP Provider

Lightweight Directory Access Protocol (LDAP) is a directory service protocol that runs over TCP/IP. It provides high-level functionality to manage resources within a network and works with Oracle Content Server to manage security and user authentication. The LDAP directory service model is based on a collection of attributes and is used to access information stored in an information directory. As such, LDAP is used to validate a set of user name and password credentials against an authentication source. This process will grant privileges to a user to give them access to web resources.

An LDAP server provides a single source for user-related information that can be accessed from applications such as Oracle Content Server and other Oracle product modules. Instead of maintaining user information within the Oracle Content Server instance, you can integrate an LDAP directory to authenticate user credentials to the Oracle Content Server instance.

If you decide to use an LDAP server (other than Active Directory, which can be integrated directly with the Oracle Content Server instance), you must create an LDAP provider to set up communication between the Oracle Content Server instance and the LDAP server. When properly configured, the LDAP provider authorizes external users through the mapping properties that are linked to role assignments and account permissions (defined on the Ldap Provider page).

Although not required, you are encouraged to have Consulting Services assist you with creating an LDAP security model and deploying the LDAP integration. Contact your sales representative for more information.

LDAP integration is also useful with the following content management products and architectures:

Portlets on WebSphere: WebSphere users can access Oracle Content Server through the Oracle Content Integration Suite. This portal interface enables users and developers to retrieve, view, and download Oracle Content Server content items based on full text or metadata search queries. When using the Content Integration Suite, the WebSphere Application Server is recommended. If you are using a WebSphere Portal Server, the Oracle Content Portal Suite is a recommended addition to the Content Integration Suite.

The Content Integration Suite connects directly to the Oracle Content Server instance instead of the database. This direct connection avoids the authentication step at the web server and enables the developer total control over the authentication and authorization of users. The advantage is you can authenticate users at the Content Integration Suite layer however you want. You can integrate with an LDAP server at the application server level, or you can ask the Oracle Content Server instance to validate the passwords for you.

For more information about using WebSphere with the Content Integration Suite and the Content Portal Suite, see the documentation provided with the WebSphere Portal Server, WebSphere Application Server, Oracle Content Integration Suite, and Oracle Content Portal Suite.

Content Tracker: Content Tracker is a system that is built from a collection of software features that, when combined, enable users to use a standard browser to track content usage through an integrated set of analytical tools. The data provided by the Oracle Content Server instance is derived from logged data that includes web server log data, Oracle Content Server data, and user information. Content Tracker accesses this data, performs analysis on it, and produces descriptive reports. Integrating an LDAP directory server with Content Tracker is optional. However, if LDAP is used, an LDAP provider must be created.

As of 11g Release 1 (11.1.1), LDAP provider functionality is superseded by JpsUserProvider, in particular for cases such as nested group support.

The JpsUserProvider component is installed and enabled as a system component when the Oracle Content Server instance is installed against an Oracle WebLogic Server instance. It also is available as a standard Oracle Content Server component. You can edit the JpsUserProvider configuration from the Providers page in the Oracle Content Server instance. The connection configuration also can be edited through the jps-config.xml file to use identity and credential stores.

It is unlikely that a site would ever add a JpsUserProvider in addition to the system-defined JpsUserProvider. Adding another such provider could cause problems for the Oracle Content Server installation.

If you want to authenticate against a JPS store, JpsUserProvider can be used to share the same security storage as another application on an Oracle WebLogic Server instance. For example, you could use JpsUserProvider to share security storage with Image and Processing Manager software installed on an Oracle WebLogic Server instance.

4.5.1.3.1 About Security Providers

The SecurityProviders component can be used to add security by extending the functionality of basic incoming and outgoing socket providers with two new types of providers:

Secure Socket Layer (SSL) provider

Keepalive provider

Appropriate use of security providers, along with keys and certificates, can improve the security for network and Internet communication with the Oracle Content Server instance. Benefits of using the SecurityProviders component include the following:

Security providers enable use of certificates for socket or server authentication.

Keepalive and connection pooling logic help avoid SSL expense overhead by reducing the amount of SSL socket creation and teardown.

The SecurityProviders component is installed (enabled) by default with the Oracle Content Server instance.

References

To use the SecurityProviders component it is necessary to be familiar with socket providers, security and authentication, SSL, keepalive, and other aspects of security for network communication. The following sources of information can be useful in working with the SecurityProviders component:

This online document is available from Sun Microsystems at www.sun.com. It contains an extensive Related Documents section that includes web links to reference books, security standards, government security policies and regulations, and a list of books on cryptography and SSL.

keytool Key and Certification Management Tool

This online document is available from Sun Microsystems at www.sun.com.

RSA's Public Key Cryptography Standards

This online document is available from RSA at www.rsasecurity.com.

RSA's Cryptography FAQ

This online document is available from RSA at www.rsasecurity.com.

SSL Certificate FAQ

This online document is available from The Linux Documentation Project at www.tldp.org.

Terminology

The following table shows definitions for some of the security terms used in this section. For detailed information refer to the list of information references or to security and authentication standards sources.

Term

Description

Certificate

A digital signature that verifies the identity and public key for an entity (a person or company). A certificate can be issued by a Certification Authority or by an individual entity.

Certificate Authority (CA)

An entity that issues certificates for other entities, and is recognized as a well-known and trusted source for certificates, such as VeriSign and Thawte.

Keystore

A file or database of information for keys, used for authentication processing.

Private key

Information packaged as a key that is known only to the entity that issues it. Private keys are used in generating signatures.

Public key

Information packaged as a key that is publicly associated with an entity. Public keys are used in verifying signatures.

SSL

Secure Socket Layer, a protocol for secure network communication using a combination of public and secret key technology.

Truststore

A file or database of keys that the trust manager has determined can be trusted.

4.5.1.3.2 Planning to Use Security Providers

It is recommended that you determine how you want to use security providers before implementing SSL socket providers or keepalive socket providers. Examine the keepalive and SSL connection types and determine whether additional configuration is required to use the security providers you select, such as a need to create keystores or a truststore. See the additional sources of information listed in Section 4.5.1.3.1, "About Security Providers."

The following sections provide more information about the SSL and keepalive provider types, including the Java classes used to control the behavior of the provider types, and additional configuration that may be necessary.

The keepalive feature enables persistent connections and the pooling of socket connections for service requests. The setup for keepalive connections is most useful in situations where connection setup and teardown can take a considerable amount of time, and you want to minimize the time spent on that activity. The SecurityProviders component provides two keepalive socket providers: incoming and outgoing.

The following Java classes are used to set up the keepalive incoming socket provider:

Java Class

Description

Provider Class

idc.provider.ExtendedSocketIncomingProvider

Connection Class

idc.provider.KeepaliveSocketIncomingConnection

Server Thread Class

idc.server.KeepaliveIdcServerThread

The following Java classes are used to set up the keepalive outgoing socket provider:

Java Class

Description

Provider Class

idc.provider.KeepaliveSocketOutgomingProvider

Connection Class

idc.provider.KeepaliveSocketOutgomingConnection

Request Class

idc.server.KeepaliveServerRequest

SSL Connections

The SSL provider setup enables the use of SSL connections in a keepalive environment. This setup is recommended over a simple SSL provider setup because it helps minimize the cost of SSL socket setup and teardown. The SecurityProviders component provides two SSL socket providers with keepalive: incoming and outgoing.

The following Java classes are used to set up the SSL keepalive incoming socket provider:

Java Class

Description

Provider Class

idc.provider.ssl.SSLSocketIncomingProvider

Connection Class

idc.provider.KeepaliveSocketIncomingConnection

Server Thread Class

idc.server.KeepaliveIdcServerThread

The following Java classes are used to set up the keepalive SSL outgoing socket provider:

Java Class

Description

Provider Class

idc.provider.KeepaliveSocketOutgoingProvider

Connection Class

idc.provider.ssl.SSLSocketOutgoingConnection

Request Class

idc.provider.KeepaliveServerRequest

Additional Configuration

Depending on which type of security provider you choose, there can be additional required configuration.

Keepalive and SSL outgoing providers—The Add Provider page includes a Num Connections field, which specifies the number of connections to pool.

Request Client Auth option—If clients are able, they should authenticate themselves when they make a connection.

Require Client Auth option—Clients must authenticate themselves in order to make a connection.

SSL providers may also require setup of a keystore or keystores, and a truststore, for both the client and server, depending on the value of the Request Client Auth option, the value of the Require Client Auth option, and what type of Certification Authority signed the certificates handled by these options. For information on keystores and truststore refer to Section 4.5.1.3.3, "Keystores and Truststore."

4.5.1.3.3 Keystores and Truststore

SSL providers may require use of keystores and may require a truststore. Keystores are files that hold public and secret key information for use in SSL. A truststore contains certificates that have been determined to be trusted. If a certificate used on the server and client is signed by a well-known Certification Authority (CA) such as VeriSign or Thawte, then a truststore is not necessary, because the default JVM truststore contains the certificates of these CAs. Truststores are needed when certificates used by the SSL providers are self-signed or signed by a private CA. If SSL providers require keystores, and a truststore, then they must be created and managed.

The following sections provide overview information about keystores and truststore.

The following examples present different situations and uses for keystores and a truststore.

The server requires a keystore containing a signed SSL certificate in order to create SSL sockets.

The server requests or requires client authentication, which may require a truststore. If the client's certificate is not signed by a well-known CA, then the server will need a truststore containing that CA's certificate.

The server requests or requires client authentication, which may require that the client have a keystore in which it stores the certificate the client presents for authentication.

The server uses a certificate that hasn't been signed by a well-known CA, therefore the client will require a truststore that contains the server's certificate.

Specifying Keystore and Truststore Information

In order to use keystore and truststore information, the SSL incoming and outgoing providers require that a file named sslconfig.hda be set up in the providers directory (next to the provider.hda file). The sslconfig.hda file contains configuration information you specify for your keystore and truststore. It has a format similar to the following example. For security reasons, there is no web interface to assist in editing this file; all edits must be done manually using a text editor. Make certain no trailing spaces are included at the end of each line of this or any .hda file.

This section describes the basic process for generating a keystore. You must determine the specific requirements and names for keys and keystores you want to create for your SSL providers. You can store keystore files wherever you want, because the sslconfig.hda file contains full paths for its KeystoreFile config settings. However, it is recommended that keystore files are stored in the IntradocDir/data/providers/provider_name directory (next to the provider.hda and sslconfig.hda files) or in the IntradocDir/config/ directory. Aliases and passwords are set using the provider page in the Oracle Content Server instance.

For detailed information on how to use the keytool utility to generate a keystore, see the document titled keytool Keyand Certification Management Tool, available online from Sun at www.sun.com.

Note:

The Java keytool utility has a feature that prevent direct interaction with private keys. This feature means that a certificate that is generated using keytool is "stuck" in the keystore; there is no way to retrieve the private key portion of the certificate. Inversely, there is no way for keytool to import a pre-existing certificate into a Java keystore.

The Portecle Java keystore allows the import and export of private keys from Java keystores. For information refer to portecle.sourceforge.net.

To use keytool you must have the utility in your path when you enter the command.

Create a key in a keystore. The following command-line example shows how to create a key entry with the name alias in a keystore with the name keystore. This command prompts for a keystore password, for information that will be used to generate the key, and for a password for the key itself. If the password on the key is different from the password on the keystore, then the values KeystoreAlias and KeystoreAliasPassword are required to retrieve the key.

keytool -genkey -v -alias alias -keystore keystore

Generate a certificate signing request.The following command-line example shows how to generate a certificate signing request for the key entry named alias in the keystore named keystore, which is then stored in the file named csr_file. This file can be sent to a CA to be signed.

keytool -certreq -v -alias alias -keystore keystore -file csr_file

Import the CA's certificate into the keystore. The keytool checks the chain of trust on the user's certificate upon import. If the certificate was signed by a CA that is not well-known and the keytool knows nothing about the CA, the certificate is rejected. Therefore any certificate from a CA that is not well-known must first be imported into the keystore to permit the user's certificate to successfully be imported in the next step. The following command line example shows how to import a certificate in a file named cert_file into the keystore named keystore:

keytool -import -v -alias ca_alias -keystore keystore -file cert_file

Import the signed certificate back into the keystore. Once the certificate signing request has been received by a CA and the signed certificate is sent back from the CA, the certificate can be read into the keystore entry identified by alias.The following command line example shows how to import the signed certificate.

This section describes the basic process for generating a truststore. A truststore is necessary when an SSL provider uses keys that have not been signed by a well-known Certification Authority. A truststore contains only public certificates that have been verified by the person managing the truststore (the trust manager) for the Oracle Content Server instance. You must determine the specific requirements and name for the truststore you want to create. You can store a truststore file wherever you want, because the sslconfig.hda file contains a full path for a TruststoreFile config setting. However, it is recommended that a truststore file is stored in the IntradocDir/data/providers/provider_name directory (next to the provider.hda and sslconfig.hda files) or in the IntradocDir/config/ directory.

For detailed information on how to use the keytool utility to generate a truststore refer to the document titled keytool Key and Certification Management Tool, available online from Sun at www.sun.com.

To use keytool you must have the utility in your path when you enter the command.

Click Add. The Providers Page is displayed, with the new provider added to the Providers table.

Restart the Oracle Content Server instance.

4.5.2.7 Adding a JPS User Provider

A defaut JPS User provider, which integrates with Oracle JPS (for an Oracle WebLogic Server instance), is provided with installation of the Oracle Content Server system. It is unlikely that another JPS User provider would be needed. However, if you need to add a user provider, follow these steps:

Using a web browser, access your Oracle Content Server home page and select Administration, then Providers.

Ensure that you intend to delete the provider and not just edit the information. When you delete a provider, the provider name and all of its related information is permanently removed from the Providers table.