Login

Developing and Implementing Applications

This article, the first of three parts, focuses on the design and creation of applications that use the database. It is excerpted from chapter five of the book Oracle Database 10g DBA Handbook, written by Kevin Loney and Bob Bryla (McGraw-Hill/Osborne, 2005; ISBN: 0072231459).

Managing application development can be a difficult process. From a DBA’s perspective, the best way to manage the development process is to become an integral part of teams involved in the process. In this chapter, you will learn the guidelines for migrating applications into databases and the technical details needed for implementation, including the sizing of database objects.

This chapter focuses on the design and creation of applications that use the database. These activities should be integrated with the database-planning activities described in Chapter 3 and Chapter 4. The following chapters in this part of the book address the monitoring and tuning activities that follow the database creation.

Implementing an application in a database by merely running a series of create table commands fails to integrate the creation process with the other major areas (planning, monitoring, and tuning). The DBA must be involved in the application development process in order to correctly design the database that will support the end product. The methods described in this chapter will also provide important information for structuring the database monitoring and tuning efforts.

The first section of this chapter addresses overall design and implementation considerations that directly address performance. The following sections focus on implementation details such as resource management, using stored outlines, sizing tables and indexes, quiescing the database for maintenance activities, and managing packaged applications.

Tuning by Design: Best Practices

At least 50 percent of the time—conservatively—performance problems are designed into an application. During the design of the application and the related database structures, the application architects may not know all the ways in which the business will use the application data over time. As a result, there may be some components whose performance is poor during the initial release, whereas other problems will appear later as the business usage of the application changes.

In some cases, the fix will be relatively straightforward—changing an initialization parameter, adding an index, or rescheduling large operations. In other cases, the problem cannot be fixed without altering the application architecture. For example, an application may be designed to heavily reuse functions for all data access—so that functions call other functions, which call additional functions, even to perform the simplest database actions. As a result, a single database call may result in tens of thousands of function calls and database accesses. Such an application will usually not scale well; as more users are added to the system, the CPU burden of the number of executions per user will slow the performance for the individual users. Tuning the individual SQL statements executed as part of that application may yield little performance benefit; the statements themselves may be well-tuned already. Rather, it is the sheer number of executions that leads to the performance problem.

The following best practices may seem overly simplistic, but they are violated over and over in database applications, and those violations directly result in performance problems. There are always exceptions to the rules—the next change to your software or environment may allow you to violate the rules without affecting your performance. In general, though, following these rules will allow you to meet performance requirements as the application usage increases.

{mospagebreak title=Do As Little As Possible}

End users do not care, in general, if the underlying database structures are fully normalized to Third Normal Form or if they are laid out in compliance with object-oriented standards. Users want to perform a business process, and the database application should be a tool that helps that business process complete as quickly as possible. The focus of your design should not be the achievement of theoretical design perfection; it should always be on the end user’s ability to do his or her job. Therefore, you should simplify the processes involved at every step in the application.

This can be a difficult point to negotiate with application development teams. If application development teams insist on perfectly normalized data models, DBAs should point out the number of database steps involved in even the simplest transaction. For example, inserts for a complex transaction (such as a line item for an invoice) may involve many code table lookups as well as multiple inserts. For a single user this may not present a problem, but with many concurrent users this design may lead to performance issues or locking issues. From a performance-planning perspective, inserts should involve as few tables as possible, and queries should retrieve data that is already stored in a format that is as close as possible to the final format requested by the users. Fully normalized databases and object-oriented designs tend to require a high number of joins during complex queries. Although you should strive to maintain a manageable data model, the first emphasis should be on the functionality of the application and its ability to meet the business’s performance needs.

In Your Application Design, Strive to Eliminate Logical Reads

In the past, there was a heavy focus on eliminating physical reads—and although this is still a good idea, no physical reads occur unless logical reads require them.

Let’s take a simple example. Select the current time from DUAL. If you select down to the second level, the value will change 86,400 times per day. Yet there are application designers who repeatedly perform this query, executing it millions of times per day. Such a query likely performs few physical reads throughout the day. Therefore, if you are focused solely on tuning the physical I/O, you would likely disregard it. However, it can significantly impact the performance of the application. How? By using the CPU resources available. Each execution of the query will force Oracle to perform work, using processing power to find and return the correct data. As more and more users execute the command repeatedly, you may find that the number of logical reads used by the query exceeds all other queries. In some cases, multiple processors on the server are dedicated to servicing repeated small queries of this sort. If multiple users need to read the same data, you should store it in a table or in a package variable.

Consider the following real-world example. A programmer wanted to implement a pause in a program, forcing it to wait 30 seconds between the completion of two steps. Because the performance of the environment would not be consistent over time, the programmer coded the routine in the following format (shown in pseudo-code):

Is this a reasonable approach? Absolutely not! It will do what the developer wanted, but at a significant cost to the application. What’s more, there is nothing a database administrator can do to improve its performance. In this case, the cost will not be due to I/O activity—the DUAL table will stay in the instance’s memory area—but rather due to CPU activity. Every time this program is run, by every user, the database will spend 30 seconds consuming as many CPU resources as the system can support. In this particular case the select SysDate from DUAL query accounts for over 40 percent of all the CPU time used by the application. All of that CPU time is wasted. Tuning the database initialization parameters will not solve the problem. Tuning the individual SQL statement will not help; the application design must be revised to eliminate the needless execution of commands. For instance, in this case the developer could have used a sleep command at the operating system level or within a program to enforce the same behavior without the database accesses.

For those who favor tuning based on the buffer cache hit ratio, this database has a hit ratio of almost 100 percent due to the high number of completely unnecessary logical reads without related physical reads. The buffer cache hit ratio compares the number of logical reads to the number of physical reads; if 10 percent of the logical reads require physical reads, the buffer cache hit ratio is 90 percent. Low hit ratios identify databases that perform a high number of physical reads; extremely high hit ratios such as found in this example may identify databases that perform an excessive number of logical reads. You must look beyond the buffer cache hit ratio to the commands that are generating the logical reads and the physical reads.

Remember that you are tuning an application, not a query. When tuning database operations, you may need to combine multiple queries into a single procedure so that the database can be visited once rather than multiple times for each screen. This bundled-query approach is particularly relevant for “thin-client” applications that rely on multiple application tiers. Look for queries that are interrelated based on the values they return, and see if there are opportunities to transform them into single blocks of code. The goal is not to make a monolithic query that will never complete; the goal is to avoid doing work that does not need to be done. In this case, the constant back-and-forth communication between the database server, the application server, and the end user’s computer is targeted for tuning.

This problem is commonly seen on complex data-entry forms in which each field displayed on the screen is populated via a separate query. Each of those queries is a separate trip to the database. As with the example in the previous section, the database is forced to execute large numbers of related queries. Even if each of those queries is tuned, the burden from the number of commands—multiplied by the number of users—will consume the CPU resources available on the server. Such a design may also impact the network usage, but the network is seldom the problem—the issue is the number of times the database is accessed.

Within your packages and procedures, you should strive to eliminate unnecessary database accesses. Store commonly needed values in local variables instead of repeatedly querying the database. If you don’t need to make a trip to the database for information, don’t make it. That sounds simple, but you would be amazed at how often applications fail to consider this advice.

There is no initialization parameter that can make this change take effect. It is a design issue and requires the active involvement of developers, designers, DBAs, and application users in the application performance planning and tuning process.

For Reporting Systems, Store the Data the Way the Users Will Query It

If you know the queries that will be executed—such as via parameterized reports—you should strive to store the data so that Oracle will do as little work as possible to transform the format of the data in your tables into the format presented to the user. This may require the creation and maintenance of materialized views or reporting tables. That maintenance is, of course, extra work for the database to perform—but it is performed in batch mode and does not directly affect the end user. The end user, on the other hand, benefits from the ability to perform the query faster. The database as a whole will perform fewer logical and physical reads because the accesses to the base tables to populate and refresh the materialized views are performed infrequently when compared to the end-user queries against the views.

Avoid Repeated Connections to the Database

Opening a database connection may take more time than the commands you execute within that connection. If you need to connect to the database, keep the connection open and reuse the connection. Within the database you may be able to use stored procedures, packages, and other methods to maintain connections while you are performing your processing.

One application designer took normalization to the extreme, moving all code tables into their own database. As a result, most operations in the order-processing system repeatedly opened database links to access the code tables, thus severely hampering the performance of the application. Again, tuning the database initialization parameters is not going to lead to the greatest performance benefit; the application is slow by design.

Use the Right Indexes

In an effort to eliminate physical reads, some application developers create numerous indexes on every table. Aside from their impact on data load times, many of the indexes may never be needed to support queries. In OLTP applications, you should not use bitmap indexes; if a column has few distinct values, you should consider leaving it unindexed. The optimizer supports “skip-scan” index accesses, so it may choose an index on a set of columns even if the leading column of the index is not a limiting condition for the query.

Do It As Simply As Possible

Once you have eliminated the performance costs of unnecessary logical reads, unneeded database trips, unmanaged connections, and inappropriate indexes, take a look at the commands that remain.

{mospagebreak title=Go Atomic}

You can use SQL to combine many steps into one large query. In some cases, this may benefit your application—you can create stored procedures and reuse the code and thus reduce the number of database trips performed. However, you can take this too far, creating large queries that fail to complete quickly enough. These queries commonly include multiple sets of grouping operations, inline views, and complex multirow calculations against millions of rows.

If you are performing batch operations, you may be able to break such a query into its atomic components, creating temporary tables to store the data from each step. If you have an operation that takes hours to complete, you almost always can find a way to break it into smaller component parts. Divide and conquer the performance problem.

For example, a batch operation may combine data from multiple tables, perform joins and sorts, and then insert the result into a table. On a small scale, this may perform satisfactorily. On a large scale, you may have to divide this operation into multiple steps:

Create a work table. Insert rows into it from one of the source tables for the query, selecting only those rows and columns that you care about later in the process.

Create a second work table for the columns and rows from the second table.

Create any needed indexes on the work tables. Note that all the steps to this point can be parallelized—the inserts, the queries of the source tables, and the creation of the indexes.

Perform the join, again parallelized. The join output may go into another work table.

Perform any sorts needed. Sort as little data as possible.

Insert the data into the target table.

Why go through all these steps? Because you can tune them individually, you may be able to tune them to complete much faster individually than Oracle can complete them as a single command. For batch operations, you should consider making the steps as simple as possible. You will need to manage the space allocated for the work tables, but this approach can generate significant benefits to your batch-processing performance.

Eliminate Unnecessary Sorts

As part of the example in the preceding section, the sort operation was performed last. In general, sort operations are inappropriate for OLTP applications. Sort operations do not return any rows to the user until the entire set of rows is sorted. Row operations, on the other hand, return rows to the user as soon as those rows are available.

Consider the following simple test: Perform a full table scan of a large table. As soon as the query starts to execute, the first rows are displayed. Now, perform the same full table scan but add an order by clause on an unindexed column. No rows will be displayed until all the rows have been sorted. Why does this happen? Because for the second query Oracle performs a SORT ORDER BY operation on the results of the full table scan. Because it is a set operation, the set must be completed before the next operation is performed.

Now, imagine an application in which there are many queries executed within a procedure. Each of the queries has an order by clause. This turns into a series of nested sorts—no operation can start until the one before it completes.

Notethat union operations perform sorts. If it is appropriate for the business logic, use a union all operation in place of a union, because a union all does not perform a sort (because it does not eliminate duplicates).

NOTE

Aunion all operation does not eliminate duplicate rows from the result set, so it may generate different results than a union.

Eliminate the Need to Query Undo Segments

When performing a query, Oracle will need to maintain a read-consistent image of the rows queried. If a row is modified by another user, the database will need to query the undo segment to see the row as it existed at the time your query began. Application designs that call for queries to frequently access data that others may be changing at the same time force the database to do more work—it has to look in multiple locations for one piece of data. Again, this is a design issue. DBAs may be able to configure the undo segment areas to reduce the possibility of queries encountering errors, but correcting the fundamental problem requires a change to the application design.

Tell the Database What It Needs to Know

Oracle’s optimizer relies on statistics when it evaluates the thousands of possible paths to take during the execution of a query. How you manage those statistics can significantly impact the performance of your queries.

Keep Your Statistics Updated

How often should you gather statistics? With each major change to the data in your tables, you should reanalyze the tables. If you have partitioned the tables, you can analyze them on a partition-by-partition basis. As of Oracle Database 10g, you can use the Automatic Statistics Gathering feature to automate the collection of statistics. By default, that process gathers statistics during a maintenance window from 10 P.M to 6 A.M. each night and all day on weekends.

Because the analysis job is usually a batch operation performed after hours, you can tune it by improving sort and full table scan performance at the session level. If you are performing the analysis manually, increase the settings for the DB_FILE_MULTIBLOCK_READ_COUNT and PGA_AGGREGATE_TARGET parameters within your session prior to gathering the statistics. If you are not using PGA_AGGREGATE_TARGET, increase SORT_AREA_SIZE instead. The result will be enhanced performance for the sorts and full table scans the analysis performs.

Hint Where Needed

In most cases, the cost-based optimizer (CBO) selects the most efficient execution path for queries. However, you may have information about a better path. You may give Oracle a hint to influence the join operations, the overall query goal, the specific indexes used, or the parallelism of the query.

Maximize the Throughput in the Environment

In an ideal environment, there is never a need to query information outside the buffer cache; all of the data stays in memory all of the time. Unless you are working with a very small database, however, this is not a realistic approach. In this section, you will see guidelines for maximizing the throughput of the environment.

Use Disk Caching

If Oracle cannot find the data it needs in the buffer cache or PGA, it performs a physical read. But how many of the physical reads actually reach the disk? If you use disk caching, you may be able to prevent 90 percent or more of the access requests for the most-needed blocks. If the database buffer cache hit ratio is 90 percent, you are accessing the disks 10 percent of the time—and if the disk cache prevents 90 percent of those requests from reaching the disk, your effective hit ratio is 99 percent. Oracle’s internal statistics do not reflect this improvement; you will need to work with your disk administrators to configure and monitor the disk cache.

Use a Larger Database Block Size

There is only one reason not to use the largest block size available in your environment for a new database: if you cannot support a greater number of users performing updates and inserts against a single block. Other than that, increasing the database block size should improve the performance of almost everything in your application. Larger database block sizes help keep indexes from splitting levels and help keep more data in memory longer. If you are experiencing buffer busy waits during inserts, increase the settings for the freelists parameter setting at the object level (if you are using Automatic Segment Space Management, the freelists parameter does not apply).

{mospagebreak title=Store Data Efficiently at the Block Level}

Oracle stores blocks of data in memory. It is in your best interest to make sure those blocks are as densely packed with data as possible. If your data storage is inefficient at the block level, you will not gain as much benefit as you can from the caches available in the database.

By default, the pctused parameter is set to 40 for all database blocks, and pctfree is set to 10. If you use the defaults, then as rows are added to the table, they will also be added to a block until that block is 90 percent full; at that point the block will be removed from the “freelist” and all new inserts will use other blocks in the table. Updates of the rows in the block will use the space reserved by the pctfree setting. Rows may then be deleted from the block, but the block will not be added back to the freelist until the space usage within the block drops below the pctused setting. This means that in applications that feature many delete and insert operations of rows, it is common to find many blocks using just slightly above the pctused value of each block. In that case, each block is just over 40 percent used, so each block in the buffer cache is only that full—resulting in a significant increase in the number of blocks requested to complete each command. If your application performs many delete and insert operations, you should consider increasing pctused so the block will be re-added to the freelist as quickly as possible.

NOTE

If you use Automatic Segment Space Management, thepctusedsetting is managed by the database.

If the pctfree setting is too low, updates may force Oracle to move the row (called a migrated row). In some cases, row chaining is inevitable, such as when your row length is greater than your database block size or when you have more than 254 columns in a table. When row chaining and migration occur, each access of a row will require accessing multiple blocks, thus impacting the number of logical reads required for each command. You can detect row chaining by analyzing the table and then checking its statistics via USER_TABLES.

Design to Throughput, Not Disk Space

Take an application that is running on eight 9GB disks and move it to a single 72GB disk. Will the application run faster or slower? In general, it will run slower because the throughput of the single disk is unlikely to be equal to the combined throughput of the eight separate disks. Rather than designing your disk layout based on the space available (a common method), design it based on the throughput of the disks available. You may decide to use only part of each disk. The remaining space on the disk will not be used by the production application unless the throughput available for that disk improves.

Avoid the Use of the Temporary Segments

Whenever possible, perform all sorts in memory. Any operation that writes to the temporary segments is potentially wasting resources. Oracle uses temporary segments when the SORT_ AREA_SIZE parameter (or PGA_AGGREGATE_TARGET, if it is used) does not allocate enough memory to support the sorting requirements of operations. Sorting operations include index creations, order by clauses, statistics gathering, group by operations, and some joins. As noted earlier in this chapter, you should strive to sort as few rows as possible. When performing the sorts that remain, perform them in memory.

Favor Fewer, Faster Processors

Given the choice, use a small number of fast processors in place of a larger number of slower processors. The operating system will have fewer processing queues to manage and will generally perform better.

Divide and Conquer Your Data

If you cannot avoid performing expensive operations on your database, you can attempt to split the work into more manageable chunks. Often you can severely limit the number of rows acted on by your operations, substantially improving performance.

Use Partitions

Partitions can benefit end users, DBAs, and application support personnel. For end users, there are two potential benefits: improved query performance and improved availability for the database. Query performance may improve because of partitionelimination. The optimizer knows what partitions may contain the data requested by a query. As a result, the partitions that will not participate are eliminated from the query process. Because fewer logical and physical reads are needed, the query should complete faster.

NOTE

The Partitioning Option is an extra-cost option for the Enterprise Edition of the database software.

The availability improves because of the benefits partitions generate for DBAs and application support personnel. Many administrative functions can be performed on single partitions, allowing the rest of the table to be unaffected. For example, you can truncate a single partition of a table. You can split a partition, move it to a different tablespace, or switch it with an existing table (so that the previously independent table is then considered a partition). You can gather statistics on one partition at a time. All these capabilities narrow the scope of administrative functions, reducing their impact on the availability of the database as a whole.

Use Materialized Views

You can use materialized views to divide the types of operations users perform against your tables. When you create a materialized view, you can direct users to query the materialized view directly or you can rely on Oracle’s query rewrite capability to redirect queries to the materialized view. As a result, you will have two copies of the data—one that services the input of new transactional data, and a second (the materialized view) that services queries. As a result, you can take one of them offline for maintenance without affecting the availability of the other. Also, the materialized view can pre-join tables and pre-generate aggregations so user queries perform as little work as possible.

Use Parallelism

Almost every major operation can be parallelized—including queries, inserts, object creations, and data loads. The parallel options allow you to involve multiple processors in the execution of a single command, effectively dividing the command into multiple smaller coordinated commands. As a result, the command may perform better. You can specify a degree of parallelism at the object level and can override it via hints in your queries.

{mospagebreak title=Test Correctly}

In most development methodologies, application testing has multiple phases, including module testing, full system testing, and performance stress testing. Many times, the full system test and performance stress test are not performed adequately due to time constraints as the application nears its delivery deadline. The result is that applications are released into production without any way to guarantee that the functionality and performance of the application as a whole will meet the needs of the users. This is a serious and significant flaw and should not be tolerated by any user of the application. Users do not need just one component of the application to function properly; they need the entire application to work properly in support of a business process. If they cannot do a day’s worth of business in a day, the application fails.

This is a key tenet regarding identifying the need for tuning: If the application slows the speed of the business process, it should be tuned. The tests you perform must be able to determine if the application will hinder the speed of the business process under the expected production load.

Test with Large Volumes of Data

As described earlier in this chapter, objects within the database function differently after they have been used for some time. For example, the pctfree and pctused settings may make it likely that blocks will be only half-used or rows will be chained. Each of these causes performance problems that will only be seen after the application has been used for some time.

A further problem with data volume concerns indexes. As a B*tree index grows in size, it may split internally—the level of entries within the index increases. As a result, you can picture the new level as being an index within the index. The additional level in the index increases the negative effect of the index on data load rates. You will not see this impact until after the index is split. Applications that work acceptably for the first week or two in production only to suddenly falter after the data volume reaches critical levels do not support the business needs. In testing, there is no substitute for production data loaded at production rates while the tables already contain a substantial amount of data.

Test with Many Concurrent Users

Testing with a single user does not reflect the expected production usage of most database applications. You must be able to determine if concurrent users will encounter deadlocks, data consistency issues, or performance problems. For example, suppose an application module uses a work table during its processing. Rows are inserted into the table, manipulated, and then queried. A separate application module does similar processing—and uses the same table. When executed at the same time, the two processes attempt to use each other’s data. Unless you are testing with multiple users executing multiple application functions simultaneously, you may not discover this problem and the business data errors it will generate.

Testing with many concurrent users will also help to identify areas in the application where users frequently use undo segments to complete their queries, thus impacting performance.

Test the Impact of Indexes on Your Load Times

Every insert, update, or delete of an indexed column may be slower than the same transaction against an unindexed table. There are some exceptions—sorted data has much less of an impact, for example—but the rule is generally true. The impact is dependent on your operating environment, the data structures involved, and the degree to which the data is sorted.

How many rows per second can you insert in your environment? Perform a series of simple tests. Create a table with no indexes and insert a large number of rows into it. Repeat the tests to reduce the impact of physical reads on the timing results. Calculate the number of rows inserted per second. In most environments you can insert tens of thousands of rows per second into the database. Perform the same test in your other database environments so you can identify any that are significantly different from the others.

Now consider your application. Are you able to insert rows into your tables via your application at anywhere near the rate you just calculated? Many applications run at less than 5 percent of the rate the environment will support. They are bogged down by unneeded indexes or the type of code design issues described earlier in this chapter. If your application’s load rate decreases—say, from 40 rows per second to 20 rows per second—your tuning focus should not be solely on how that decrease occurred but also on how the application managed to get only 40 rows per second inserted in an environment that supports thousands of rows inserted per second.

Make All Tests Repeatable

Most regulated industries have standards for tests. Their standards are so reasonable that all testing efforts should follow them. Among the standards is that all tests must be repeatable. To be compliant with the standards, you must be able to re-create the data set used, the exact action performed, the exact result expected, and the exact result seen and recorded. Pre-production tests for validation of the application must be performed on the production hardware. Moving the application to different hardware requires retesting the application. The tester and the business users must sign off on all tests.

Most people, on hearing those restrictions, would agree that they are good steps to take in any testing process. Indeed, your business users may be expecting that the people developing the application are following such standards, even if they are not required by the industry. But are they followed? And if not, then why not? The two commonly cited reasons for not following such standards are time and cost. Such tests require planning, personnel resources, business user involvement, and time for execution and documentation. Testing on production-caliber hardware may require the purchase of additional servers. Those are the most evident costs—but what is the business cost of failing to perform such tests? The testing requirements for validated systems in some health industries were implemented because those systems directly impact the integrity of critical products such as the safety of the blood supply. If your business has critical components served by your application (and if it does not, then why are you building the application?), you must consider the costs of insufficient, rushed testing and communicate those potential costs to the business users. The evaluation of the risks of incorrect data or unacceptably slow performance must involve the business users. In turn, that may lead to an extended deadline to support proper testing.

In many cases, the rushed testing cycle occurs because a testing standard was not in place at the start of the project. If there is a consistent, thorough, and well-documented testing standard in place at the enterprise level when the project starts, the testing cycle will be shorter when it is finally executed. Testers will have known long in advance that repeatable data sets will be needed. Templates for tests will be available. If there is an issue with any test result, or if the application needs to be retested following a change, the test can be repeated. Also, the application users will know that the testing is robust enough to simulate the production usage of the application. If the system fails the tests for performance reasons, the problem may be a design issue (as described in the previous sections) or a problem with an individual query.

{mospagebreak title=Standard Deliverables}

How do you know if an application is ready to be migrated to a production environment? The application development methodology must clearly define, both in format and in level of detail, the required deliverables for each stage of the life cycle. These should include specifications for each of the following items:

Entity relationship diagram

Physical database diagram

Space requirements

Tuning goals for queries and transaction processing

Security requirements

Data requirements

Query execution plans

Acceptance test procedures

In the following sections, you will see descriptions of each of these items.

Entity Relationship Diagram

The entity relationship (E-R) diagram illustrates the relationships that have been identified among the entities that make up the application. E-R diagrams are critical for providing an understanding of the goals of the system. They also help to identify interface points with other applications and to ensure consistency in definitions across the enterprise.

Physical Database Diagram

A physical database diagram shows the physical tables generated from the entities and the columns generated from the defined attributes in the logical model. A physical database diagramming tool is usually capable of generating the DDL necessary to create the application’s objects.

You can use the physical database diagram to identify tables that are most likely to be involved in transactions. You should also be able to identify which tables are commonly used together during a data entry or query operation. You can use this information to effectively plan the distribution of these tables (and their indexes) across the available physical devices to reduce the amount of I/O contention encountered.

In data warehousing applications, the physical database diagram should show the aggregations and materialized views accessed by user queries. Although they contain derived data, they are critical components of the data access path and must be documented.

Space Requirements

The space requirements deliverable should show the initial space requirements for each database table and index. The recommendations for the proper size of tables, clusters, and indexes are shown in the “Sizing Database Objects” section, later in this chapter.

Changes to the application design may have significant impact on the application’s performance. Application design choices may also directly affect your ability to tune the application. Because application design has such a great effect on the DBA’s ability to tune its performance, the DBA must be involved in the design process.

You must identify the performance goals of a system before it goes into production. The role of expectation in perception cannot be overemphasized. If the users have an expectation that the system will be at least as fast as an existing system, anything less will be unacceptable. The estimated response time for each of the most-used components of the application must be defined and approved.

It is important during this process to establish two sets of goals: reasonable goals and “stretch” goals. Stretch goals represent the results of concentrated efforts to go beyond the hardware and software constraints that limit the system’s performance. Maintaining two sets of performance goals helps to focus efforts on those goals that are truly mission-critical versus those that are beyond the scope of the core system deliverables. In terms of the goals, you should establish control boundaries for query and transaction performance; the application performance will be judged to be “out of control” if the control boundaries are crossed.

Security Requirements

The development team must specify the account structure the application will use, including the ownership of all objects in the application and the manner in which privileges will be granted. All roles and privileges must be clearly defined. The deliverables from this section will be used to generate the account and privilege structure of the production application (see Chapter 10 for a full review of Oracle’s security capabilities).

Depending on the application, you may need to specify the account usage for batch accounts separately from that of online accounts. For example, the batch accounts may use the database’s autologin features, whereas the online users have to manually sign in. Your security plans for the application must support both types of users.

Like the space requirements deliverable, security planning is an area in which the DBA’s involvement is critical. The DBA should be able to design an implementation that meets the application’s needs while fitting in with the enterprise database security plan.

Data Requirements

The methods for data entry and retrieval must be clearly defined. Data-entry methods must be tested and verified while the application is in the test environment. Any special data-archiving requirements of the application must also be documented because they will be application specific.

You must also describe the backup and recovery requirements for the application. These requirements can then be compared to the enterprise database backup plans (see Chapter 12 for guidelines). Any database recovery requirements that go beyond the site’s standard will require modifying the site’s backup standard or adding a module to accommodate the application’s needs.

Query Execution Plans

Execution plans are the steps that the database will go through while executing queries. They are generated via the explain plan or set autotrace command, as described in Chapter 8. Recording the execution plans for the most important queries against the database will aid in planning the index usage and tuning goals for the application. Generating them prior to production implementation will simplify tuning efforts and identify potential performance problems before the application is released. Generating the explain plans for your most important queries will also facilitate the process of performing code reviews of the application.

If you are implementing a third-party application, you may not have visibility to all the SQL commands the application is generating. As described in Chapter 9, you can use the STATSPACK utility to monitor the most resource-intensive queries performed between two points in time. You can take STATSPACK “snapshots” before and after a test period and then evaluate the execution paths for the most common or most resource-intensive queries during a test period. See Chapter 9 for details on STATSPACK implementation and the use of the repository introduced with Oracle Database 10g.

Acceptance Test Procedures

Developers and users should very clearly define what functionality and performance goals must be achieved before the application can be migrated to production. These goals will form the foundation of the test procedures that will be executed against the application while it is in the test environment.

The procedures should also describe how to deal with unmet goals. The procedures should very clearly list the functional goals that must be met before the system can move forward. A second list of noncritical functional goals should also be provided. This separation of functional capabilities will aid in both resolving scheduling conflicts and structuring appropriate tests.

NOTE

As part of acceptance testing, all interfaces to the application should be tested and their input and output verified.

Resource Management and Stored Outlines

You can use stored outlines to migrate execution paths between instances, and you can use the Database Resource Manager to control the allocation of system resources among database users. Stored outlines and resource management are important components in a managed development environment. The Database Resource Manager gives DBAs more control over the allocation of system resources than is possible with operating system controls alone.

NOTE

As of Oracle 10g, you can use SQL profiles to further refine the execution path selected.