Category Archives: DB2

Post navigation

This jdbc information is based on Netezza (7.2.0) JDBC for InfoSphere Information Server11.5. so, here are a few pointers for building an IBM InfoSphere Information Server (IIS) isjdbc.config file.

Where to place JAR files

For Infosphere Information Server installs, as a standard practice, create a custom jdbc file in the install path. And install any download Jar file not already installed by other applications in the jdbc folder. Usually, jdbc folder path looks something like this:

Effective Practices

Use of an Enterprise Scheduler

The scheduling service in InfoSphere information Server (IIS) leverages the operating system (OS) scheduler, the common enterprise scheduler can provide these capabilities beyond those of a common OS scheduler:

Improved ability to monitor and alert for mission critical process that may be delayed or failing

Improved ability to automate disparate process requirements within and across systems

Improved load balancing to optimize use of resources or to compensate for loss of a given resource

Improved scalability and adaptability to infrastructure or application environment changes

Use of data Source Timestamps

When they exist or can be added to data, ‘created’ and ‘last updated’ timestamps can greatly reduce the impact of Change Data Capture (CDC) operations. Especially, if the data warehouse, data model and load process store that last success run time of CDC jobs. This reduces the number of rows required to be processed and reduces the load on the RDBMS and/or ETL application server. Leveraging ‘created’ and ‘last updated’ can, also, greatly reduce processing time required to perform the same CDC processes.

Event Based Scheduling

Event based scheduling, when coupled with an Enterprise scheduler, can increase data availability, distribute work opportunistically. Event based scheduling can allow all or part of a process stream to begin as soon as predecessor data sources have completed the requisite processes. This can allow processes to begin soon as possible, which can reduce resource bottlenecks and contention. This, potentially, allows data to be made available earlier than a static time based schedule. Event based scheduling can also delay processing, should the source system requisite processing completion be delayed; thereby, improving data accuracy in the receiving system.

Integrated RDBMS Maintenance

Integrating RDBMS Maintenance into the process job stream can perform on demand optimization as the processes move through their flow, improving performance. Items such as indexing, distribution, and grooming, maintenance at key points ensures that the data structures are optimized for follow on processes to consume.

Application Server and Storage Space Monitoring and Maintenance

Monitoring and actively clearing disk space can not only improves overall performance, and reduce costs, but it also improves application stability.

Data Retention Strategies

Data Retention strategies, an often overlooked form of data maintenance, which deals with establishing policies ensure only truly necessary data is kept and that information by essential category, which is no longer necessary is purged to limit legal liability, limit data growth, storage costs, and improve RDBMS performance.

Use Standard Practices

Use of standard practices both, application and industry, allows experienced resources to more readily understand the major application activities, their relationships, dependency, design and code. This facilitates resourcing and support over the life cycle of the application.

Until 1996, the Central Processing Unit (CPU) and core were accentually the same thing. These days, a CPU’s relationship could be 1-to-1 or 1-to many cores. Therefore, cores are essentially subcomponents of the CPU architecture.

Tuning SQL is one of those skills, which is part art and part science. However, there are a few fundamental approaches, which can help ensure optimal SQL select statement performance.

Structuring your SQL

By Structuring SQL Statements, much performance can be gained through good SQL statement organization and sound logic.

Where Clause Concepts:

Use criteria ordering and Set Theory thinking. SQL can be coupled with set-theory to aid conception of the operations being conducted. Order your selection criteria to execute criteria which arrives at the smallest possible row set first. Doing so, reduces the volume of rows to be processed by follow-on operations. This does require an understanding of the data relationships to be effective.

SQL First Select Criteria Pie Chart

Join Rules (equi joins, etc.)

When constructing your joins, consider these rules:

Join on keys and indexed columns: The efficiency of your program improves when tables are joined based on indexed columns, rather than on non-indexed ones.

Use Joins Instead of Subqueries: A join can be more efficient than a correlated subquery or a subquery using IN. Use caution when specifying ORDER BY with a Join: When the results of a join must be sorted, limiting the ORDER BY to columns of a single table can cause the database to avoid a sort.

Provide Adequate Search Criteria: When possible, provide additional search criteria in the WHERE clause for every table in a join. These criteria are in addition to the join criteria, which are mandatory to avoid Cartesian products

Order of Operations SQL & “PEMDAS”

To improve your SQL, careful attention needs to be paid to the mathematical order of operations; especially, parentheses since they not only set the order of operation, but also the boundaries of each subset operation.

PEMAS is “Parentheses, Exponents, Multiplication and Division, and Addition and Subtraction”.

Use parentheses () to group and specify the order of execution. SQL observes the normal rules of arithmetic operator precedence.

Precedence

Operator(s)

Operation(s)

Notes

1

( )

Parentheses

If the parentheses are nested, the expression in the innermost pair is evaluated first. If there are several un-nested parentheses, then parentheses are evaluated left to right.

2

*
/
%

Multiplication
Division
Modulus

If there are several, evaluation is left to right.

3

+
–

Addition
Subtraction

If there are several, evaluation is left to right.

Index Leveraging (criteria ordering, hints, append, etc.)

Avoid Full Table Scans: within the scope of a SQL statement, there are many conditions that will cause the SQL optimizer to invoke a full-table scan. Avoid Queries:

with NULL Conditions (Is NUll, Is Not NUll)

Against Unindexed Columns

with Like Conditions

with Not Equals Condition (<>, !=, not in)

with use built-in Function (to_char, substr, decode, UPPER)

Use UNION ALL instead of UNION if business rules allow

UNION: Specifies that multiple result sets are to be combined and returned as a single result set. Query optimizer performs extra work to return to avoid duplicate rows.

UNION ALL: Incorporates all rows into the results. This includes duplicates. Query optimizer just needs to concatenate the result sets with no extra work

Use stored procedures instead of ad hoc queries when possible. Stored procedures are precompiled and cached

Avoid cursor use when possible

Select only the rows needed

Use NOLOCK hint in select statement to avoid blocking

Commit transactions in smaller batches

Whenever possible use tables instead of views

Make sure comparison columns whether using JOIN or WHERE clause are exactly same data type. For example if we are comparing Varchar column to nchar columns the query optimizer has to do a CONVERT before comparing the values

Note: You do not necessarily need to remove all full table scans from your query’s execution plan. Tables with few rows, few columns, or thin columns may fit into few database blocks. In this case, a full table scan will always be the most efficient access

If we consider Materialized Views (MV) in their simplest form, as a point in time stored query result, then materialized views serve two primary purposes Performance Optimization and Semantics Simplification.

Performance Optimization

There are several ways in which materialized views can improve performance:

FacilitateDatabase Optimizers: in some databases can be partitioning and indexing which are considered by database optimizers. Also, some databases, in which more than one materialized view has been applied to a table, the database optimizer will consider the all the associated materialized views when optimizing queries.

Reduced Network Workloads: by the use of database replication and/or mass deployment techniques to materialized views, they can be distributed to more local proximity to the consumers, thereby, reducing the data volume across the network and provided business continuation/disaster recovery capabilities, should the primary site become temporarily unavailable.

Precalculation and/or Preaggregation: Performing calculation and aggregation of information upon creation of a materialized view, eliminates the need to perform these functions on an on-demand basis as various consumers submit requests.

Data Subsets: by applying filters to eliminate unnecessary data (e.g. history data no long in common reporting use) or unnecessary data attributes (e.g. unused columns or columns intended for other information purposes) the impact of filter for these items is reduced and is effectively eliminated for consumers of the materialized view.

Semantics Simplification

Materialized views can be used to simplify the semantics provided to information consumers with Ad hoc capabilities and/or to simplify the construction of reporting and analytics objects. Depending on the database and/or integration tools in use to create them, materialized views can simplify the consumer experience by:

Reduce or Eliminate Join Coding: When constructed materialized views can perform the joins and populate the materialized view with the value results of from the join table, thereby, eliminating the need for the consumer to perform this function as an ad hoc user or in the semantics of reporting and analytics tools

Pre-application of Business Rules: When constructed materialized views can apply business rules to facilitate queries by adding indicator flags and preapplying special business logic to data and populating the materialized view with the value results, thus, eliminating the need for the consumer to perform this function as an ad hoc user or in the semantics of reporting and analytics tools.

Precalculation and/or Pre-aggregation: Performing calculation and aggregation of information upon creation of a materialized view allow the consumer to use the results without need to build the calculations and/or aggregations in to the ad hoc query or in the semantics of reporting and analytics tools. This also, helps to ensure information accuracy and consistency.

Data Subsets: By prefiltering the data during the creation of the view unnecessary or unused data and columns are not available to consumers and do not need to be filter out of ad hoc queries or in the semantics of reporting and analytics tools

When working with different databases syntax can cause questions and confusion. Recent having been ask what the difference was between a left join and a left outer join, a subject which I hadn’t thought about in a while, I thought a simple explanation might be in order. Actually, there is no difference between a left join and a left outer join, other than syntax. Both perform the exact same operation in SQL, where the Left (Outer) Join will retain those rows for which there was a match in both tables and, also, retain those rows which exist only in the left (controlling) table of the join.