MDX Essentials: Basic Set Functions: The Filter() Function

Monday Feb 9th 2004 by William Pearson

Share:

Filter out records you do not want without compromising the efficiencies of MDX sets. In this lesson, we will explore using the powerful Filter() function to return subsets of larger sets, based upon conditions we supply.

About the Series ...

This is the sixteenth article of the series, MDX Essentials.
The series is designed to provide hands-on application of the fundamentals of
the Multidimensional Expressions (MDX) language, with each tutorial
progressively adding features designed to meet specific real-world needs.

What We Accomplished in our Last Article

In the last article of
the series, Basic Set
Functions: The Except() Function,
we explored the useful Except() function, whose purpose is to return the
difference between two sets. In addition to discussing the purpose and
operation of the Except() function, we focused on the treatment of
duplicates by the function. We examined the syntax surrounding its uses, and
illustrated its application in practice exercises, providing hands-on exposure
to the use of Except().

We first undertook a
multi-step example in which we exposed default handling of duplicates by the
function, then explored an additional example where we practiced the use of the
ALL flag to override the Except() function's default duplicate
handling. Throughout the practice exercises, we discussed the results we
obtained with each step's execution, remarking on the distinguishing
characteristics of optional flag settings.

Introduction

In
this lesson, we will expose what many consider one of the most useful functions
in the MDX arsenal, the Filter() function. The general purpose of the Filter()
function is to allow us to filter out parts of a set that we do not need in a
given situation, and to return a subset of a larger set as a result. Uses of
the Filter() function, as with many MDX functions, can range from the
sublimely simple to the impressively advanced, and it can be used in many
innovative ways. The objective, of course, is the support of precise analysis
to meet our business needs. We will see in this article how the Filter()
function is a prime example of the efficiency and precision we can attain by using
judiciously chosen functions from our MDX toolsets.

Along
with an introduction to the Filter() function, this lesson will include:

an examination of the syntax surrounding the function;

illustrative examples of the uses of the function in practice
exercises;

a brief discussion of the MDX results we obtain in the
practice examples.

The Filter() Function

The Filter() function, according to
the Analysis Services Books Online, "returns the set resulting from
filtering a specified set based on a search condition." That is, it
returns a subset of the specified set based upon the action of a filter,
whose operation is selectively enforced by a condition in the function.
As we shall see, the uses of the function are varied, and its innovative
employment with the correct logic can result in the efficient delivery of very
precise results.

We will examine the syntax for the Filter() function
in general, and then will undertake practices exercises to reinforce the concepts.
We will first create an unfiltered query, and then build to the placement of
the function into the query, to meet an illustrative business requirement. In
this way, we will be able to see clearly how Filter() operates to create
a subset from a set that is already known to us.

Next, we'll explore a more complex scenario where we make
innovative use of the Filter() function to meet a more sophisticated
requirement. This will give us a richer understanding of the potential that the
Filter() function offers, and allow us to examine it from a couple of
different perspectives.

Discussion

Filter() allows us to filter a set based upon
a particular search condition. In other words, the function
allows us to return a subset of a set that the query extracts from a
cube, based upon whether it meets certain criteria. Members of the retained
subset are returned in their natural order. While the basic use of the Filter()
function is quite straightforward, we can use it to return complex results, as
potential combinations with other functions and operators are myriad.

Examples that occur in a business scenario might include the
need to see only the members of a group of employees whose organizational
salary costs have increased over the prior year, or to perhaps return a list of
customers (out of the larger set of all customers) whose average monthly
expenditures at our retail outlets have decreased.

Let's look at some syntax illustrations to further clarify
the operation of Filter().

Syntax

Syntactically, the set
to be returned by the Filter() function is placed within the parentheses
to the right of Filter, and separated by a comma from the specified
criteria that the returned subset is to meet, or the search condition
applied. The syntax is shown in the following string:

Filter(<<set>>, <search condition>)

Remarks

The Filter() function returns the tuples of the
specified set that meet the criteria of the search condition. The
returned subset represents the portion of the larger, specified set that evaluates
as "true" with regard to the logical expression contained in the
Boolean search condition.

The order of the returned subset is the same as that found
in the larger specified set. If there are no tuples in the specified set for
which the search condition is true, an empty set is returned by the function.

The following example expression illustrates use of the Filter()
function in its most basic form. Suppose we are asked by a group of FoodMart
information consumers to present total Warehouse Sales for the cities in
which we operate warehouses, where the total 1998 sales of each city is
$ 10,000 and above. We might approach this need with an expression similar to
this:

The query, with the measure Warehouse
Sales specified for the column, the results of the Filter() function
specified for the rows, and the year 1998 from the Time dimensionused as slicer, would result in the return of the set depicted in Table
1.

Warehouse

Sales

Vancouver

21,730.73

Mexico City

10,662.54

San Andres

22,291.58

Acapulco

23,817.12

Orizaba

20,294.02

Camacho

23,140.65

Hidalgo

14,279.90

Los Angeles

23,998.14

San Diego

19,462.39

Portland

25,343.95

Salem

29,796.57

Seattle

26,692.80

Tacoma

30,336.79

Table 1: Results of a Query with Filter(), Selecting
Warehouse Sales as the Measure and 1998 as the Slicer

In the example above, we use the Filter() function,
in combination with the .Members function (see MDX
Members: Introducing Members and Member Functionsfor a tutorial
on this and related family functions), to enumerate the cities for whom the
data reflects warehouse sales of greater than $ 9,999.

We will get a chance to see these components in operation
again, as we practice the use of the Filter() function in the section
that follows. We will look at a relatively simple instance, then a slightly
more complex one, as we explore the use of the syntax we have discussed, within
the steps of our examples.

Practice

Let's reinforce our understanding of the basics we have
covered so far, by using the Filter() function in a manner that
illustrates its operation in reducing a specified set to return only a
subset whose tuples meet its search condition. As we have done
throughout the series, we will employ the MDX Sample Application for
constructing and executing the MDX we examine, and for viewing the result
datasets we obtain.

1.
Start the MDX
Sample Application.

2.
Clear the top
area (the Query pane) of any queries or remnants that might appear.

3.
Ensure that FoodMart
2000 is selected as the database name in the DB box of the toolbar.

4.
Select the Warehouse
cube in the Cube drop-down list box.

We will next compose a straightforward query, to gain an
understanding of the use of the Filter() function to return a subset of
a specified set that we will illustrate through a business need scenario. Let's
say that a group of information consumers within the organization come to us
with a requirement: they wish to see a simple report that lists the Product
Categories that have experienced a decrease in shipping volume
between the first and second quarters of 1998. Our query
will focus on Shipping Units, a value that is captured monthly within
the FoodMart organization and which is stored in the Warehouse cube.

First, we will create a core query to retrieve the data for
all Product Categories. This will generate a result dataset which we can
next compare to the same query with the Filter() function in place, to
reinforce our understanding of its operation.

NOTE:A word of explanation for
those that might be joining the MDX Essentials
series with this article: The layout I use in my code illustrations might seem
unlike that of other references (indeed, it's rare to find any two references
that illustrate MDX expressions and queries with identical presentations!); my
intent here is to attempt to make the components of the coding easily
understandable by organizing them in a clear manner.

The
code will execute in virtually any case, whether it is arranged exactly as
shown or not, assuming that the elements contained in the example are typed in
correctly. I have learned from experience with large audiences that there are,
beyond doubt, a few "purists" out there that maintain there is a "correct"
way to arrange the code - and everything else; I don't belong to any such "clubs,"
however, and am focused more on clarity in my examples. Please do not take my
eccentricities as being reflective of any "manual of style."

We see the total Units Shipped volumefor all Product Categories, returned for the first two quarters
of year 1998. A cursory review reveals that some categories experienced
increases in volumes shipped between quarters, while others experienced a
decline in volume.

We are
thus placing the Filter() function within our rows axis. The modified
query appears in the Query pane as depicted in Illustration 2.

Illustration 2: The Query - with the Addition of the
Filter() Function

9.
Replace the
existing comment line for the query with the following:

-- MDX16-2: Tutorial Query No. 2

10.
Execute the
query by clicking the Run Query button on the toolbar.

The filtered
dataset is returned, as shown in Illustration 3.

Illustration 3: Result Dataset - With Filter() Function in
Place

Our modified query contains a Filter() function similar to the one we presented in the Remarks section above, but with a slight complication. The example not only demonstrates the operation of the function, but also accomplishes a complementary objective: It shows how we can easily compare the same measure under different dimensional scenarios - that is, in two different sets. In the current example, Units Shipped in two different quarters is compared by adding a Boolean expression as the search condition.

11. Select File -> SaveAs, and name the query MDX16-2. Leave the query open, once again.

As we have already noted, the Filter() function can be used in conjunction with other functions to produce more complex results. Let's compound the business requirement we outlined earlier, and create an illustrative example of just such a case.

Let's say that the information consumers, while satisfied with the response we have supplied using the previous query, have been emboldened to ask for more. (This is probably a scenario well known to readers who act in a support capacity of this nature in the business world.) The consumers now state that they want to see the list of products that have experienced a decrease in shipping volume, as before, but this time they would like to see the products concerned within the context of the warehouses involved. In other words, they wish to see the products whose shipping volumes declined, grouped by the warehouses involved.

We can handle this easily with the introduction of a CrossJoin() function in our specification of the rows axis. To do so, we will take the following steps:

Our modified query meets the objective established by the information consumers, and displays the products with declining shipping volumes over the Q1 and Q2 operating periods, by the warehouse location. We can thus see that, regardless of the "compound" nature of the set in our Filter() function, the results are returned with accuracy and completeness.

15. Select File -> SaveAs, and name the query MDX16-3.

Many far more sophisticated uses for the Filter() function are possible, and, as we build our functional toolset within the MDX Essentials series in the months ahead, we will employ the function in numerous ways. It is often more efficient to ask for subset of a larger set than to specify each of the targeted set's member tuples.

Summary...

In this lesson, we explored the powerful Filter() function, whose general purpose is to filter out parts of a set to return a precise subset to meet our needs. The Filter() function offers many flexible and innovative uses, and, as we have seen, can incorporate the operation of additional functions within its search condition to allow us to build substantial logic into our filtering efforts. The Filter() function is an excellent tool for the support of precise analysis.

In addition to discussing the purpose and operation of the Filter() function, we illustrated its application in a couple of practice exercises. We began with an unfiltered query, to which we then applied Filter(), to show its effectiveness in meeting a business need to return a subset of the initially unfiltered dataset. We then substituted a more complex Filter() function into our query to provide a solution within a second example, illustrating the way that we can combine other functions with a Filter() function to return more sophisticated datasets. Finally, throughout the creation and execution of the practice examples, we discussed the results we obtained to confirm our understanding of the function and its operation.