OracleBIBlog

Friday, June 18, 2010

It's a dark and stormy Friday night, and as you drift off into slumberland in your warm, dry and cozy bed, a thought pops up in your mind: "If a butterfly flaps its wings in Costa Rica tonight, will the CFO's Executive Financial Summary dashboard still work?" Suddenly you are wide awake and feeling a bit of paranoia. You jump out of bed, fire up your laptop, log into the corporate VPN and then into OBIEE...

This sounds like the beginning of a bad BI horror story -- and hopefully for all of you a scenario like this truly is just that - a story. But we've all experienced some variation of this scene. The OBIEE equivalent of butterflies do flap their wings in Costa Rica, and the result does sometimes alter the OBIEE equivalent of weather patterns in Japan. Change the name of a presentation column... or the mapping rules of a logical column... or the aggregation rules of a logical table... or a minor extraction rule in the warehouse ETL logic... or even simply merge your local RPD work into a MUD repository... and you risk causing an error in a report that might not be noticed until days after the change -- usually when the report is most needed.

The truth is that an OBIEE system, like any other software implementation, has some inherent fragility that requires rigorous regression testing strategies to keep it running smoothly. But while most "traditional" software systems have clearly defined behaviors that lend themselves well to specific testing strategies, a Business Intelligence application by nature does not always have completely predictable outcomes -- especially true in OBIEE, given the complexity of the BI Server and, for example, how it constructs queries against the underlying data.

Given the complexity of the OBIEE architecture and the fleeting nature of the data, establishing an adequate regression testing strategy is a challenge.

One excellent approach is to identify commonly executed queries with Usage Tracking (or simply capture a specific request's logical query), then execute the corresponding logical SQL using nqcmd.exe. The limitation is that nqcmd can report only two results: success or failure. In most cases this is entirely adequate.
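As a sketch of how such a run might be scripted, the following wraps nqcmd in Python and treats its exit code as the pass/fail signal. The DSN, credentials, and file names below are hypothetical examples, not values from the original post:

```python
import subprocess

def build_nqcmd_command(dsn, user, password, sql_file, out_file):
    """Assemble the nqcmd.exe invocation for one logical-SQL test file.

    -d: BI Server DSN, -u/-p: credentials, -s: input logical SQL file,
    -o: output file for the result set.
    """
    return ["nqcmd.exe", "-d", dsn, "-u", user, "-p", password,
            "-s", sql_file, "-o", out_file]

def run_regression_test(dsn, user, password, sql_file, out_file):
    """Execute one logical query; nqcmd exits non-zero on failure, so the
    return code gives us the two available results: success or failure."""
    result = subprocess.run(build_nqcmd_command(dsn, user, password,
                                                sql_file, out_file))
    return result.returncode == 0
```

Scheduling this per captured logical query (one .sql file each) gives a simple nightly smoke test of the most-used reports.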

Another approach -- favored, of course, by Oracle -- is to implement their Application Testing Suite, which looks like a promising tool, but frankly I have not heard of any usage in the real world -- if anyone has direct experience with this tool, feel free to respond. Moreover, it goes without saying that license fees would be a significant factor in evaluating its total ROI.

A third way to address regression testing in OBIEE is to set up a series of webcat (HTTP) requests that can be executed on a scheduled basis by an automated web testing utility. Several such utilities exist - LoadRunner being the most prevalent - but as with Oracle's offering, most require licensing.

One popular web testing utility distributed as open-source software and therefore free of licensing costs has gained some following among the developer community: a Firefox plugin called Selenium (the IDE flavor, to be precise), which can execute a wide range of web page interactions and combine them into test scripts. These test scripts can then be assembled into test suites and exported to a full-fledged program in a variety of commonly used languages/frameworks (C#, Java, Perl, PHP, Python, Ruby), at which point this program can be scheduled for execution just like any other.

While Selenium is highly flexible, fairly easy to use and very cheap (at least in terms of licensing costs), it also has some drawbacks, particularly when used to test an OBIEE system. In the hopes of blazing the "Selenium for OBIEE" trail, I have attempted my own small POC by writing some basic test scripts against the Paint webcat. In doing so I identified some gotchas and workarounds that will at the very least help you successfully construct OBIEE test scripts using this very capable tool. I will explore those workarounds and advanced configurations in later posts. But my overall opinion is that Selenium is a very capable testing tool which deserves serious consideration for use in an OBIEE environment.

Here are what I consider general strengths of Selenium:

Flexible

Can execute a variety of actions: open URLs, click links, interact with prompts, execute JavaScript ...

Can be designed to respond to specified conditions either by failing OR by simply logging and continuing
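Selenium IDE records each interaction as a (command, target, value) row, and the exported program replays those rows. A minimal sketch of that structure in Python follows; the driver here is a stand-in stub, and the dashboard URL and element locators are hypothetical examples, not taken from a real OBIEE webcat:

```python
# Selenium IDE represents a test script as (command, target, value) rows.
# This sketch replays such rows against any driver object that exposes
# open/type/click methods -- here a logging stub, in practice Selenium.

SCRIPT = [
    ("open", "saw.dll?Dashboard", ""),       # hypothetical OBIEE entry URL
    ("type", "logonid", "Administrator"),    # hypothetical login field id
    ("type", "logonpassword", "secret"),
    ("click", "idLogon", ""),
]

def replay(driver, script):
    """Dispatch each recorded step to the matching driver method."""
    for command, target, value in script:
        if command == "open":
            driver.open(target)
        elif command == "type":
            driver.type(target, value)
        elif command == "click":
            driver.click(target)
        else:
            raise ValueError("unsupported command: %s" % command)

class LoggingDriver:
    """Stand-in driver that just records calls, useful for dry runs."""
    def __init__(self):
        self.log = []
    def open(self, target):
        self.log.append(("open", target))
    def type(self, target, value):
        self.log.append(("type", target, value))
    def click(self, target):
        self.log.append(("click", target))
```

The same table-driven shape is what the IDE's exported Java/Python/Ruby programs look like under the hood, which is why scripts assemble so naturally into suites.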

Friday, June 4, 2010

Recently I worked on a project that used Non-System Session variables for some of the reports. While on the project I learned a nice trick for modifying the value of these variables in Answers, to perform testing without having to jump out to a dashboard.

In this sample Answers report I'm using a Non-System Session variable called ReportDate to calculate each employee's most recent department and compensation for head count purposes. The report below shows the head count and compensation numbers as of May 31, 2010, since that is what I defaulted the session variable to in the initialization block.

Now I can modify the value of the ReportDate session variable by clicking on the "Advanced" tab at the top and then scrolling down to the "Prefix" portion of the page. Within the "Prefix" box I can enter the following syntax to change the date to January 1, 2009: SET VARIABLE ReportDate='01/01/2009';

Here is the new screen shot of the report showing the break out of employees for January 1, 2009.

If you need to modify multiple Non-System Session Variable values, simply delimit them with commas, like so: SET VARIABLE ReportDate='01/01/2009', SessionVariable2='Something', SessionVariable3='SomethingElse';
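A small helper to assemble that Prefix clause for any number of variables can be sketched as follows (the variable names are just the examples from above; quoting of values is an assumption that holds for simple string/date values):

```python
def build_set_variable_prefix(variables):
    """Build the Advanced-tab Prefix clause for one or more non-system
    session variables, e.g.:
    SET VARIABLE ReportDate='01/01/2009', SessionVariable2='Something';
    """
    assignments = ", ".join(
        "%s='%s'" % (name, value) for name, value in variables.items()
    )
    return "SET VARIABLE %s;" % assignments
```

This is handy when a test harness needs to cycle one report through many as-of dates.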

Data compression should drive aggregation (at least 10x, preferably closer to 100x). This could be implemented for the Service Request tables.

Try aggregation based on dimensions instead of facts.

Aggregate to a ratio of 50:1.

Caching Strategy

A proper caching strategy needs to be implemented, as this is among the best practices for optimization. Caching should be implemented based on user groups.

Set the physical table cache property on the master table, not on its aliases.

Set cache persistence time based on the refresh schedule.

Ideally, do not select "Cache never expires" in the physical table properties.

Monitor the cache logs for user sessions, and analyze queries across various users to optimize cache usage.

Use ODBC extension functions to purge the cache, which has the following advantages:

Analytics Cache Purge ODBC Extension Functions

ETL routines can proactively call ODBC extension functions to manage cache. This ensures that obsolete cache is purged as soon as possible

Purging Cache by Logical Query is a new capability. This will be particularly helpful in purging cache associated with logical queries that are used in Cache Seeding.

Purging the entire cache set in one command is a new capability that can be used in various scenarios including development and testing environments.

SAPurgeCacheByQuery purges a cache entry that exactly matches the logical query plan derived from a specified query. This function takes one, and only one, parameter representing the query text. A related setting is USE_ADVANCED_HIT_DETECTION: the default value, NO, preserves the pre-existing cache hit detection behavior, while YES invokes a two-pass algorithm that examines a greater search space and is potentially more expensive than the default. Set the value to YES when you wish to maximize the number of cache hits.
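The purge calls themselves are issued as ODBC statements against the BI Server (for example through nqcmd, as an ETL post-step). A sketch of assembling the call text in Python follows; the doubling of single quotes is standard SQL-literal escaping and is an assumption worth verifying against your BI Server version:

```python
def purge_cache_by_query_call(logical_sql):
    """Format the SAPurgeCacheByQuery ODBC extension call for a given
    logical query. Single quotes in the query are doubled (SQL-literal
    escaping -- an assumption, verify against your environment)."""
    escaped = logical_sql.replace("'", "''")
    return "Call SAPurgeCacheByQuery('%s')" % escaped

def purge_all_cache_call():
    """Format the purge-everything call, useful in development and
    testing environments."""
    return "Call SAPurgeAllCache()"
```

Feeding these strings to nqcmd after each ETL load is the "proactive purge" pattern described above.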

1. Database Archive strategy

The data in the database needs to be archived from time to time in order to manage the data size. The idea is to keep only a certain amount of data in the warehouse and archive the rest. For example, keep only the last 12 months of data in the warehouse, and archive older data.

This database archive strategy will help limit the number of rows in tables with huge data volumes.
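As an illustration of the 12-month rule, the archive cutoff date can be computed as follows (a sketch; the retention period is a parameter, and whole months are retained):

```python
from datetime import date

def archive_cutoff(today, months_to_keep=12):
    """Return the first day of the oldest month to retain; rows dated
    before this cutoff are candidates for archiving."""
    # Count back months_to_keep - 1 whole months from the current month.
    total_months = today.year * 12 + (today.month - 1) - (months_to_keep - 1)
    year, month = divmod(total_months, 12)
    return date(year, month + 1, 1)
```

For example, with a retention of 12 months, a run on June 18, 2010 would keep July 2009 through June 2010 and archive everything before July 1, 2009.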

2. Database Partition strategy

A good database partition strategy will help query performance, since the query will only need to look at a specific partition to obtain the result. For example: the Organization table can be partitioned based on the Active Flag; the Service Request table can be partitioned based on X_CALC_FLG = 'Y'.

3. Database Indexes

A database index should be added for the most frequently queried columns. Indexes should also be rebuilt from time to time to ensure their effectiveness. The ALTER INDEX ... MONITORING USAGE command can be used to collect index usage statistics over a period of time.

Increase Parallel Query by configuring the database server for an optimal degree of parallel processing.

Given sufficient resources, this can greatly improve query performance.

Make sure that cost-based optimization is enabled on databases

Set the tablespace to at least the same size as the transactional database, and make sure the temporary tablespace has adequate space.

Create histogram statistics for these indexes to enable the optimizer to better plan queries for critical tasks.

The ETL implementation team should use the analyze table commands specific to their database environment, executed after the respective session has completed; this can be achieved using the post-session command property of each session.

Try to break up ETL processing so that data is extracted and staged remotely, then compressed for further local processing:

Remote ETL servers extract and stage to a local flat file, then compress and send the flat file to the central ETL server.

Central ETL Server performs the ‘stage to data warehouse’ process
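The remote-staging step above amounts to: write the extract to a flat file, compress it, ship it. A minimal sketch of the compression step using only the Python standard library (file names are examples):

```python
import gzip
import shutil

def compress_staged_file(flat_file_path):
    """Gzip a staged extract file for transfer to the central ETL server;
    returns the path of the compressed file."""
    compressed_path = flat_file_path + ".gz"
    with open(flat_file_path, "rb") as src, \
         gzip.open(compressed_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams, so large extracts fit in memory
    return compressed_path
```

Delimited extract files typically compress very well, which is what makes the "compress then send" split between remote and central servers pay off.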

Manage I/O traffic: manage the input and output accesses to disk storage by striping the disk storage. The best and simplest action is to install disk storage arrays (RAID); the second best is to stripe volumes using a Logical Volume Manager.

Defragment the tables into which the data is extracted.

See how much memory is available on the server and set the values for SORT_AREA_SIZE and SORT_AREA_RETAINED_SIZE fairly high; 20, 30 or 40 MB per session is not uncommon in a data warehouse situation. Also set HASH_AREA_SIZE fairly large.


Disclaimer

Opinions expressed are entirely our own and do not reflect the position of BI Consulting Group, Oracle or any other corporation. Do NOT take anything written here, unless explicitly mentioned otherwise, to be BICG policy.