OLAP AND OLAP SERVER DEFINITIONS

OLAP: ON-LINE ANALYTICAL PROCESSING

On-Line Analytical Processing (OLAP) is a category of software technology that
enables
analysts, managers and executives to gain insight into data through fast, consistent,
interactive access to a wide variety of possible views of information that has been
transformed from raw data to reflect the real dimensionality of the enterprise as
understood by the user.

calculations and modeling applied across dimensions, through
hierarchies
and/or
across members

trend analysis over sequential time periods

slicing subsets for on-screen viewing

drill-down to deeper levels of consolidation

reach-through to underlying detail data

rotation to new dimensional comparisons in the viewing area

OLAP is implemented in a multi-user client/server mode and offers consistently
rapid
response to queries, regardless of database size and complexity. OLAP helps the
user
synthesize enterprise information through comparative, personalized viewing, as
well
as
through analysis of historical and projected data in various "what-if" data model
scenarios.
This is achieved through use of an OLAP Server.

OLAP SERVER

An OLAP server is a high-capacity, multi-user data manipulation engine
specifically
designed to support and operate on multi-dimensional data structures. A multi-
dimensional structure is arranged so that every data item is located and accessed
based
on
the intersection of the dimension members which define that item. The design of
the
server and the structure of the data are optimized for rapid ad-hoc information
retrieval
in
any orientation, as well as for fast, flexible calculation and transformation of raw
data
based on formulaic relationships. The OLAP Server may either physically stage the
processed multi-dimensional information to deliver consistent and rapid response
times
to
end users, or it may populate its data structures in real-time from relational or other
databases, or offer a choice of both. Given the current state of technology and the
end
user requirement for consistent and rapid response times, staging the
multi-dimensional
data in the OLAP Server is often the preferred method.

Definitions:

The objective of multi-dimensional analysis is for end users to gain insight into the
meaning contained in databases. The multi-dimensional approach to analysis aligns
the
data content with the analyst's mental model, hence reducing confusion and
lowering
the
incidence of erroneous interpretations. It also eases navigating the database,
screening
for
a particular subset of data, asking for the data in a particular orientation and
defining
analytical calculations. Furthermore, because the data is physically stored in a
multi-
dimensional structure, the speed of these operations is many times faster and more
consistent than is possible in other database structures. This combination of
simplicity
and
speed is one of the key benefits of multi-dimensional analysis.

A group of data cells arranged by the dimensions of the data. For example, a
spreadsheet
exemplifies a two-dimensional array with the data cells arranged in rows and
columns,
each being a dimension. A three-dimensional array can be visualized as a cube with
each
dimension forming a side of the cube, including any slice parallel with that side.
Higher
dimensional arrays have no physical metaphor, but they organize the data in the way
users
think of their enterprise. Typical enterprise dimensions are time, measures,
products,
geographical regions, sales channels, etc.
Synonyms: Multi-dimensional Structure, Cube, Hypercube

A calculated member is a member of a dimension whose value is determined from
other
members' values (e.g., by application of a mathematical or logical operation).
Calculated
members may be part of the OLAP server database or may have been specified by
the
user
during an interactive session. A calculated member is any member that is not an
input
member.

A single datapoint that occurs at the intersection defined by selecting one member
from
each dimension in a multi-dimensional array. For example, if the dimensions are
measures,
time, product and geography, then the dimension members: Sales, January 1994,
Candy
Bars and United States specify a precise intersection along all dimensions that
uniquely
identifies a single data cell, which contains the value of candy bar sales in the United
States for the month of January 1994.
See: Member Combination

Members of a dimension that are included in a calculation to produce a
consolidated
total
for a parent member. Children may themselves be consolidated levels, which
requires
that
they have children. A member may be a child for more than one parent, and a
child's
multiple parents may not necessarily be at the same hierarchical level, thereby
allowing
complex, multiple hierarchical aggregations within any dimension.

Multi-dimensional databases generally have hierarchies or formula-based
relationships
of
data within each dimension. Consolidation involves computing all of these data
relationships for one or more dimensions, for example, adding up all Departments
to
get
Total Division data. While such relationships are normally summations, any type of
computational relationship or formula might be defined.
Synonyms: Roll-up, Aggregate
See: Formula, Hierarchical Relationships, Children, Parents

Derived data is produced by applying calculations to input data at the time the
request
for
that data is made, i.e., the data has not been pre-computed and stored on the
database.
The purpose of using derived data is to save storage space and calculation time,
particularly for calculated data that may be infrequently called for or that is
susceptible
to
a high degree of interactive personalization by the user. The tradeoff is slower
retrievals.
See: Pre-calculated Data

A dimension is a structural attribute of a cube that is a list of members, all of which
are
of
a similar type in the user's perception of the data. For example, all months,
quarters,
years, etc., make up a time dimension; likewise all cities, regions, countries, etc.,
make
up
a geography dimension. A dimension acts as an index for identifying values within
a
multi-dimensional array. If one member of the dimension is selected, then the
remaining
dimensions in which a range of members (or all members) are selected defines a
sub-cube.
If all but two dimensions have a single member selected, the remaining two
dimensions
define a spreadsheet (or a "slice" or a "page"). If all dimensions have a single
member
selected, then a single cell is defined. Dimensions offer a very concise, intuitive
way
of
organizing and selecting data for retrieval, exploration and analysis.

Drilling down or up is a specific analytical technique whereby the user navigates
among
levels of data ranging from the most summarized (up) to the most detailed (down).
The
drilling paths may be defined by the hierarchies within dimensions or other
relationships
that may be dynamic within or between dimensions. For example, when viewing
sales
data for North America, a drill-down operation in the Region dimension would then
display Canada, the eastern United States and the Western United States. A further
drill-
down on Canada might display Toronto, Vancouver, Montreal, etc.

A formula is a database object, which is a calculation, rule or other expression for
manipulating the data within a multi-dimensional database. Formulae define
relationships
among members. Formulae are used by OLAP database builders to provide great
richness
of content to the server database. Formulae are used by end users to model
enterprise
relationships and to personalize the data for greater visualization and insight.

Formulae with all operands within a dimension are common, even in non-OLAP
systems:
e.g., Profit = Sales - Expense might appear in a simple spreadsheet product. In an
OLAP
system, such a calculation rule would normally calculate Profit for all combinations
of
the
other dimensions in the cube (e.g., for all Products, for all Regions, for all Time
Periods,
etc.) using the respective Revenue and Expense data from those same dimensions.
Part
of
the power of an OLAP system is the extensive multi-dimensional application of
such
a
simply stated rule, which could be specified by the OLAP application builder or
created
by
the end user in an interactive session.
The true analytical power of an OLAP server, however, is evidenced in its ability to
evaluate formulae where there are members from more than one dimension. An
example
is a multi-dimensional allocation rule used in business unit profitability applications.
If,
for
example, a company has a Business Unit dimension and one of the business units
(XYZ)
is
funding a special advertising campaign for Product A, and the other business units
which
also sell Product A are willing to share the advertising costs in proportion to their
sales
of
the product, then the formula would be:

ADVERTISING EXPENSE = (PRODUCT A SALES/TOTAL CORPORATION
PRODUCT A SALES) *
ADVERTISING EXPENSE FOR PRODUCT A FOR BUSINESS UNIT
XYZ

Here, Advertising is from the Measures dimension wherever it intersects with other
dimensions (e.g., Business Unit, Product), but Product A Sales is more specific; it is
Sales
from the Measures dimension restricted to the Product A member from the Product
dimension. The Advertising Expense to be shared is the Advertising Expense for
Product
A spent by Business Unit XYZ that the business units which have non-zero sales of
Product A agreed to share. These references to several dimensions within the same
rule
make it a Cross-Dimensional Formula.
GENERATION,
HIERARCHICAL

Two members of a hierarchy have the same generation if they have the same
number
of
ancestors leading to the top. For example, the top member of a dimension is from
Generation 1. There may be two or more members in Generation 1 if there are
multiple
hierarchies in the dimension.
NOTE: The terms generation and level are both necessary to describe sub-groups of
dimension members, since, for example, although two siblings share the same
parent
and
are therefore of the same generation, they won't be from the same level if one of the
siblings has a child and the other doesn't.
Synonyms: Peer, Sibling
See: Level, Hierarchical Relationships, Parent, Children

Any dimension's members may be organized based on parent-child relationships,
typically
where a parent member represents the consolidation of the members which are its
children.
The result is a hierarchy, and the parent/child relationships are hierarchical
relationships.

Members of a dimension with hierarchies are at the same level if, within their
hierarchy,
they have the same maximum number of descendants in any single path below. For
example, in an Accounts dimension which consists of general ledger accounts, all of
the
detail accounts are Level 0 members. The accounts one level higher are Level 1,
their
parents are Level 2, etc. It can happen that a parent has two or more children
which
are
different levels, in which case the parent's level is defined as one higher than the
level
of
the child with the highest level.
See: Generation, Hierarchical

A dimension member is a discrete name or identifier used to identify a data item's
position
and description within a dimension. For example, January 1989 or 1Qtr93 are
typical
examples of members of a Time dimension. Wholesale, Retail, etc., are typical
examples
of members of a Distribution Channel dimension.
Synonyms: Position, Item, Attribute

A special data item which indicates that the data in this cell does not exist. This
may
be
because the member combination is not meaningful (e.g., snowmobiles may not be
sold
in
Miami) or has never been entered. Missing data is similar to a null value or N/A,
but
is
not the same as a zero value.

A computer language that allows one to specify which data to retrieve out of a
cube.
The
user process for this type of query is usually called slicing and dicing. The result of
a
multi-dimensional query is either a cell, a two-dimensional slice, or a
multi-dimensional
sub-cube.

Navigation is a term used to describe the processes employed by users to explore a
cube
interactively by drilling, rotating and screening, usually using a graphical OLAP
client
connected to an OLAP server.

Nesting is a display technique used to show the results of a multi-dimensional query
that
returns a sub-cube, i.e., more than a two-dimensional slice or page. The
column/row
labels will display the extra dimensionality of the output by nesting the labels
describing
the members of each dimension.
For example, the display's columns may be:

January

February

March

Actual

Budget

Actual

Budget

Actual

Budget

Prod A

Prod B

Prod A

Prod B

Prod A

Prod B

Prod A

Prod B

Prod A

Prod B

Prod A

Prod B

These columns contain three dimensions, nested in the user's preferred arrangement.
Likewise, a report's rows may contain nested dimensions:

End user applications that can request slices from OLAP servers and provide two-
dimensional or multi-dimensional displays, user modifications, selections, ranking,
calculations, etc., for visualization and navigation purposes. OLAP clients may be
as
simple as a spreadsheet program retrieving a slice for further work by a
spreadsheet-
literate user or as high-functioned as a financial modeling or sales analysis
application.

A page dimension is generally used to describe a dimension which is not one of the
two
dimensions of the page being displayed, but for which a member has been selected
to
define the specific page requested for display. All page dimensions must have a
specific
member chosen in order to define the appropriate page for display.

The page display is the current orientation for viewing a multi-dimensional slice.
The
horizontal dimension(s) run across the display, defining the column dimension(s).
The
vertical dimension(s) run down the display, defining the contents of the row
dimension(s).
The page dimension-member selections define which page is currently displayed. A
page
is much like a spreadsheet, and may in fact have been delivered to a spreadsheet
product
where each cell can be further modified by the user.

Pre-calculated data is data in output member cells that are computed prior to, and
in
anticipation of, ad-hoc requests. Pre-calculation usually results in faster response to
queries at the expense of storage. Data that is not pre-calculated must be
calculated
at
query time.
See: Derived Data/Members, Output Data

Reach through is a means of extending the data accessible to the end user beyond
that
which is stored in the OLAP server. A reach through is performed when the OLAP
server
recognizes that it needs additional data and automatically queries and retrieves the
data
from a data warehouse or OLTP system.

To change the dimensional orientation of a report or page display. For example,
rotating
may consist of swapping the rows and columns, or moving one of the row
dimensions
into
the column dimension, or swapping an off-spreadsheet dimension with one of the
dimensions in the page display (either to become one of the new rows or columns),
etc.
A
specific example of the first case would be taking a report that has Time across (the
columns) and Products down (the rows) and rotating it into a report that has
Product
across and Time down.
An example of the second case would be to change a report which has Measures
and
Products down and Time across into a report with Measures down and Time over
Products across. An example of the third case would be taking a report that has
Time
across and Product down and changing it into a report that has Time across and
Geography down.
Synonym: Pivot

Restricting the view of database objects to a specified subset. Further operations,
such
as
update or retrieve, will affect only the cells in the specified subset. For example,
scoping
allows users to retrieve or update only the sales data values for the first quarter in
the
east
region, if that is the only data they wish to receive.

A selection is a process whereby a criterion is evaluated against the data or
members
of
a
dimension in order to restrict the set of data retrieved. Examples of selections
include
the
top ten salespersons by revenue, data from the east region only and all products
with
margins greater than 20 percent.
Synonyms: Condition, Screen, Filter

A slice is a subset of a multi-dimensional array corresponding to a single value for
one
or
more members of the dimensions not in the subset. For example, if the member
Actuals
is
selected from the Scenario dimension, then the sub-cube of all the remaining
dimensions
is
the slice that is specified. The data omitted from this slice would be any data
associated
with the non-selected members of the Scenario dimension, for example Budget,
Variance,
Forecast, etc. From an end user perspective, the term slice most often refers to a
two-
dimensional page selected from the cube.

A multi-dimensional data set is sparse if a relatively high percentage of the possible
combinations (intersections) of the members from the data set's dimensions contain
missing data. The total possible number of intersections can be computed by
multiplying
together the number of members in each dimension. Data sets containing one
percent,
.01
percent, or even smaller percentages of the possible data exist and are quite
common.
See: Dense