ClientDataSet Aggregates and GroupState

Abstract: This article describes how to use aggregates to calculate simple statistics, as well as how to use group state to improve your user interfaces.

One of the advantages to using a ClientDataSet in your applications is the
large number of features it enables. In this
article I continue my series on using ClientDataSets with a look at both
aggregates and group state. Aggregates are objects that can perform the automatic
calculation of basic descriptive statistics based on the data stored in a
ClientDataSet. Group state, by comparison, is information that identifies the
relative position of a record within a group of records, based on an index.
Together, these two features permit you to add easy-to-maintain capabilities to
your applications.

If you are unfamiliar with either aggregates or group state, you might be
wondering why I am covering these two features together in this article. The
answer is simple. Both are associated with grouping level, which is an
index-related feature. Because the discussion of aggregates necessarily involves
grouping level, the coverage of group state is a natural addition. This article
begins with a look at aggregates. Group state is covered in a later section.

Understanding Aggregates

An aggregate is an object that can automatically perform simple descriptive
statistical calculations across one or more records in a ClientDataSet. For
example, imagine that you have a
ClientDataSet that contains a list of all purchases by your customers. If each
record contains a field that identifies the customer, the number of items
purchased, and the total value of the purchase, an aggregate can calculate the
sum of all purchases across all records in the table. Yet another aggregate can
calculate the average number of items purchased by each customer. In all, a
total of five statistics are supported by aggregates. These are count, minimum,
maximum, sum, and average.

There are two types of objects that you can use to create aggregates:
TAggregate objects and TAggregateField objects. A TAggregate is a
TCollectionItem descendant, and a TAggregateField is a descendent of the TField
class. While these two aggregate types are similar in how you configure them,
they differ in their use. Specifically, a TAggregateField, because it is a
TField descendent, can be associated with a data-aware control, permitting the
value of the aggregate to be displayed automatically. By comparison, a
TAggregate is an object whose value you must explicitly read at runtime.

One characteristic shared by both types of aggregates is that they require
quite a few specific steps to configure. If you have never used aggregates in
the past, be patient. If your aggregates do not appear to
work at first, you probably missed one or more steps. However, after you get
comfortable configuring aggregates, you will find that they are relatively
simple to use.

Because TAggregateField instances are somewhat easier to use, they will be
discussed in the following section. Using TAggregates is discussed later in this
article.

Creating Aggregate Fields

Aggregate fields are virtual, persistent fields. While they are similar to other virtual,
persistent fields, such as calculated and lookup fields, there is one very
important difference. Specifically, introducing one or more aggregate fields
does not preclude the automatic, runtime creation of dynamic fields. By
comparison, creating at least one other type of persistent field, such as a data
field, lookup field, or calculated field, prevents the ClientDataSet from
creating other TFields for that ClientDataSet at runtime. As a result, it is
always safe to create aggregate fields at design-time, whether or not you intend to instantiate
any other TField instances at design-time.

As mentioned earlier, adding an aggregate field requires a number of specific
steps in order to configure it correctly. These are:

Add the aggregate field to a ClientDataSet. This can be done at
design-time using the Fields Editor, or at runtime using the
TAggregateField's constructor.

Set the aggregate field's Expression property to define the calculation
that the aggregate will perform

Set the aggregate field's IndexName property to identify the index to base
grouping level on

Set the aggregate field's GroupingLevel property to identify which records
to perform the aggregation across

Set the aggregate field's Active property to True to activate it

Set the aggregate field's Visible property to True

Set the AggregatesActive property of the ClientDataSet to which the
aggregate is associated to True

Because there are so many steps here, it is best to discuss creating
aggregate fields using an example. Use the following steps in Delphi or Kylix to
create a simple project to which an aggregate field will be added.

Create a new project.

Add to your main form a DBNavigator, a DBGrid, a ClientDataSet, and a
DataSource.

Set the Align property of the DBNavigator to alTop, and the Align property
of the DBGrid to Client. Next, set the DataSource property of both the
DBNavigator and the DBGrid to DataSource1. Now set the DataSet property of
the DataSource to ClientDataSet1.

Set the Filename property of the ClientDataSet to the Orders.cds table (or
Orders.xml),
located in Borland's shared data directory. If you installed your software
using the default installation paths in Windows, you will find this file in c:Program
FilesCommon FilesBorland SharedData.

Your main form should now look something like the following:

Adding the Aggregate Field

At design-time, you add aggregate fields using the ClientDataSet's Fields
Editor. Use the following steps to add an aggregate field

Right-click the ClientDataSet and select Fields Editor.

Right-click the Fields Editor and select New Field (or press Ctrl-N). Delphi displays the
New Field dialog box.

Set Name to CustomerTotal and select the Aggregate radio button in
the Field type area. Your New Field dialog box should now look like the
following

Click OK to close the New Field dialog box. You will now see the newly
added aggregate field in the Fields Editor, as shown here.

Defining the Aggregate Expression

The Expression property of an aggregate defines the calculation the
aggregate will perform. This expression can consist of constants, field
values, and aggregate functions. The aggregate functions are AVG, MIN, MAX, SUM,
and COUNT. For example, to define a calculation that will total the AmountPaid
field in the Orders.cds table, you use the following expression:

SUM(AmountPaid)

The argument of the
aggregate function can include two or more fields in an expression, if you like.
For example, if you have two fields in your table, one named Quantity and the
other named Price, you can use the following expression:

SUM(Quantity * Price)

The expression can also include constants. For example, if the tax rate is
8.25%, you can create an aggregate that calculates total plus tax, using
something similar to this:

SUM(Total * 1.0825)

You can also
set the Expression property to perform an operation on two aggregate functions,
as shown here

MIN(SaleDate) - MIN(ShipDate)

as well as perform an operation between an expression function and a
constant, as in the following

MAX(ShipDate) + 30

You cannot, however, include an aggregate function as the expression of
another aggregate function. For example, the following is illegal:

SUM(AVG(AmountPaid)) //illegal

Nor can you use a calculation between an aggregate function and a field.
For example, if Quantity is the name of a field, the following expression is
illegal:

SUM(Price) * Quantity //illegal

In
this particular case, we want to calculate the total of the AmountPaid field. To
do this, use the following steps:

Select the aggregate field in the Fields Editor.

Using the Object Inspector, set the Expression property to SUM(AmountPaid) and
its Currency property to True.

Setting Aggregate Index and Grouping Level

An aggregate needs to know across which records it will perform the
calculation. This is done using the IndexName and GroupingLevel properties of
the aggregate. Actually, if you want to perform a calculation across all records
in a ClientDataSet, you can leave IndexName blank, and GroupingLevel set to 0.

If you want the aggregate to perform its calculation across groups of
records, you must have a persistent index whose initial fields define the group.
For example, if you want to calculate the sum of the AmountPaid field separately
for each customer, and a customer is identified by a field name CustNo, you must
set IndexName to the name of a persistent index whose first field is CustNo. If
you want to perform the calculation for each customer for each purchase date,
and you have fields named CustNo and SaleDate, you must set IndexName to the
name of a persistent index that has CustNo and SaleDate as its first two fields.

The persistent index whose name you assign to the IndexName property can have
more fields than the number of fields you want to group on. This is where
GroupingLevel comes in. You set GroupingLevel to the number of fields of the
index that you want to treat as a group. For example, imagine that you set
IndexName to an index based on the CustNo, SaleDate, and PurchaseType fields. If
you set GroupingLevel to 0, the aggregate calculation will be performed across
all records in the ClientDataSet. Setting GroupingLevel to 1 performs the
calculation for each customer (since CustNo is the first field in the index).
Setting GroupingLevel to 2 will perform the calculation for each customer for
each sale date (since these are the first two fields in the index).

It is interesting to note that the TIndexDef class, the class used to define
a persistent index,
also has a GroupingLevel property. If you set this property for the index, the index
will contain
additional information about record grouping. So long as you are setting an
aggregate's GroupingLevel to a value greater than 0, you can improve the performance
of the aggregate by setting the persistent index's GroupingLevel to a value at
least as high as the aggregate's GroupingLevel. Note, however, that a persistent
index whose GroupingLevel property is set to a value greater than 0 takes a little
longer to generate and update, since it must also produce the grouping information.
This overhead is minimal, but should be considered if the speed of index
generation and maintenance is a concern.

The following steps walk you through the process of creating a persistent
index on the CustNo field, and then setting the aggregate field to use this
index with a grouping level of 1.

Select the ClientDataSet in the Object Inspector and select its IndexDefs
property. Click the ellipsis button of the IndexDefs property to display the
IndexDefs collection editor.

Click the Add New button in the IndexDefs collection editor toolbar to add
a new persistent index.

Select the newly added index in the IndexDefs collection editor. Using the
Object Inspector, set the Name property of this IndexDef to CustIdx,
its Fields property to CustNo, and its GroupingLevel property to 1.
Close the IndexDefs collection editor.

With the ClientDataSet still selected, set its IndexName property to CustIdx.

Next, using the Fields Editor, once again select the aggregate field. Set
its IndexName property to CustIdx, and its GroupingLevel property to 1. The
Object Inspector should look something like the following

Making the Aggregate Field Available

The aggregate field is almost ready. In order for it to work, you must
set the aggregate field's Active property and its Visible property to True. In
addition, you must set the ClientDataSet's AggregatesActive property to True.
After doing this, the aggregate will be automatically calculated when the
ClientDataSet is made active.

With aggregate fields, there is one more step, which is associating the
aggregate with a data-aware control (if this is what you want to do). The
following steps demonstrate how to activate the aggregate, as well as make it
visible in the DBGrid.

With the aggregate field selected in the Object Inspector, set its Active
property to True and its Visible property to True.

Next, select the ClientDataSet and set its AggregatesActive property to
True and its Active property to True.

Now, right-click the DBGrid and select Columns. This causes the Columns
collection editor to be displayed.

Click the Add All button on the Columns collection editor toolbar to add
persistent columns for each dynamic field in the ClientDataSet.

Now click the Add New button on the Columns collection editor toolbar to
add one more TColumn.

With this new TColumn selected, set its FieldName property to
CustomerTotal. In order to see this calculated field easily, drag the new
column to a higher position in the Columns collection editor. For example,
move this new column to the third position within the Columns collection
editor.

That's it. If you have followed all of these steps, your newly
added aggregate field should be visible in the third column of your DBGrid,
as shown in the following figure.

A couple of additional comments about active aggregates are in order here. First, the
ClientDataSet's AggregatesActive property is one that you might find yourself
turning on and off at runtime. Setting AggregatesActive to False is extremely
useful when you must add, remove, or change a number of records at runtime. If
you make changes to a ClientDataSet's data, and these changes affect the
aggregate calculation, these changes will be much slower if AggregatesActive is
True, since the aggregate calculations will be updated with each and every
change. After making your changes, setting AggregatesActive to True will cause
the aggregates to be recalculated.

Rather than turning all aggregates off or on, the Active property of
individual aggregates can be manipulated at runtime. This can be useful if you
have many aggregates, but only need one or two to be updated during changes to
the ClientDataSet. Subsequently turning other aggregates back on will
immediately trigger their recalculation. At runtime you can read the
ClientDataSet's ActiveAggs TList property to see which aggregates are currently
active for a given grouping level.

Creating Aggregate Collection Items

Aggregate collection items, like aggregate fields, perform the automatic
calculation of simple descriptive statistics. However, unlike aggregate fields,
they must be read at runtime in order to use their value. Aggregate collection
items cannot be
hooked up to data-aware controls. But with that exception in mind, nearly all
other aspects of the configuration of aggregate collection items is the same as
for aggregate fields.

The following steps demonstrate how to add and use aggregate collection items
in a project. These steps assume that you have been following along with the
steps provided earlier in this article to define the aggregate field.

Select the ClientDataSet in the Object Inspector and select its Aggregates
property. Click the ellipsis button for the Aggregates property to display the
Aggregates collection editor.

Click the Add New button on the Aggregates collection editor's toolbar to add
two aggregates to your ClientDataSet.

Select the first aggregate in the Aggregates collection editor. Using the
Object Inspector, set the aggregate's Expression property to AVG(AmountPaid), its
AggregateName property to CustAvg, its IndexName property to CustIdx,
its GroupingLevel property to 1, its Active property to True, and its
Visible property to True.

Select the second aggregate in the Aggregates collection editor. Using the
Object Inspector, set its Expression property to MIN(SaleDate), its
AggregateName property to FirstSale, its IndexName property to CustIdx,
its GroupingLevel property to 1, its Active property to True, and its
Visible property to True.

Add a PopupMenu from the Standard page of the component palette to your
project. Using the Menu Designer (double-click the PopupMenu to display this
editor), add a single MenuItem, setting its caption to About this customer.

Set the PopupMenu property of the DBGrid to PopUpMenu1.

Finally, add the following event handler to the Add this customer MenuItem:

procedure TForm1.Aboutthiscustomer1Click(Sender: TObject);begin
ShowMessage('The average sale to this customer is '
+
Format('%.2m',
[StrToFloat(ClientDataSet1.Aggregates[0].Value)]) +'. The first sale to this customer was on '+
DateToStr(ClientDataSet1.Aggregates[1].Value));end;

If you now run this project, your main form should
look something like that shown in the follow figure.

To see the values calculated by the aggregate collection items, right-click a
record and select About this customer. The displayed dialog box should look
something like the following figure:

Understanding Group State

Group state refers to the relative position of a given record within its group. Using
group state you can learn whether a given record is the first record in its
group (given the current index), the last record in a group, neither the last nor
the first record in the group, or the only record in the group. You determine group
state for a particular record by calling the ClientDataSet's GetGroupState
method. This method has the following syntax:

function GetGroupState(Level: Integer): TGroupPosInds;

When
you call GetGroupState, you pass an integer indicating grouping level. Passing a
value of 0 (zero) to GetGroupState will return the current record's relative
position within the entire dataset. Passing a value of 1 will return the current
record's group state with respect to the first field of the current index,
passing a value of 2 will return the current record's group state with respect to
the first two fields of the current index, and so on.

GetGroupState returns a set of TGroupPosInd flags. TGroupPosInd is declared
as follows:

TGroupPosInd = (gbFirst, gbMiddle, gbLast);

As
should be obvious, if the current record is the first record in the group,
GetGroupState will return a set containing the gbFirst flag. If the record is the
last record in the group, this set will contain gbLast. When GetGroupState is
called for a record somewhere in the middle of a group, the gbMiddle flag is
returned. Finally, if the current record is the only record in the group,
GetGroupState returns a set containing both the gbFirst and gbLast flags.

GetGroupState can be particularly useful for suppressing redundant
information when displaying a ClientDataSet's data in a multi-record view, like
that provided by the DBGrid component. For example, consider the preceding figure of the main form.
Notice that the CustomerTotal aggregate field
value is displayed for each and every record, even though it is being calculated
on a customer-by-customer basis. Not only is the redundant aggregate data
unnecessary, it makes reading the data more difficult.

Using GetGroupState you can test whether or not a particular record is the
first record for the group, and if so, display the value for
CustomerTotal field. For records that are not the first record in their group
(based on the CustIdx index), you simply skip printing. Determining group state
and suppressing or displaying the data can be achieved by adding an OnGetText
event handler to the CustomerTotal aggregate field. The following is an example
of how this event handler might
look:

If you want to also suppress the CustNo field, you must add persistent fields
for all of the fields in the ClientDataSet that you want to appear in the grid, and then add the following event
handler to the CustNo field's OnGetText event handler:

The following figure shows the main form from the running CDSAggs project,
which demonstrates the techniques described is the article. Notice that the
CustNo and CustomerTotals fields are displayed only for the first record in each
group (when grouped on CustNo). You can download this sample project from Code
Central by clicking
here.

About the Author

Cary Jensen is President of Jensen Data Systems, Inc., a Texas-based training
and consulting company that won the 2002 Delphi Informant Magazine Readers
Choice award for Best Training. He is the author and presenter for Delphi
Developer Days (www.DelphiDeveloperDays.com), an information-packed Delphi
(TM) seminar series that tours North America and Europe, and Delphi Developer
Days Power Workshops, focused Delphi (TM) training. Cary is also an
award-winning, best-selling co-author of eighteen books, including Building
Kylix Applications (2001, Osborne/McGraw-Hill), Oracle JDeveloper (1999, Oracle
Press), JBuilder Essentials (1998, Osborne/McGraw-Hill), and Delphi In Depth
(1996, Osborne/McGraw-Hill). For information about onsite training and
consulting you can contact Cary at cjensen@jensendatasystems.com, or visit his
Web site at www.JensenDataSystems.com.

Click here for a
listing of upcoming seminars, workshops, and conferences where Cary Jensen is
presenting.