MS Access for the Business Environment: Access Query Techniques: Using the TOP Keyword

Monday Jan 5th 2004 by William Pearson

Share:

Find the first (n) records in an
Access query. In this lesson, Bill Pearson explores the use of the TOP keyword
to return a specific number or percentage of records at the top or bottom of an
ordered query result dataset.

About the Series ...

This article continues the series, MS Access for the
Business Environment. The primary focus of this series is an
examination of business uses for the MS Access relational database management
system. The series is designed to provide guidance in the practical application
of data and database concepts to meet specific needs in the business world. The
majority of the procedures I demonstrate in this article and going forward will
be undertaken with MS Access 2003, although most of the concepts that we
explore in the series will apply to earlier versions of MS Access, as well.

Introduction to this
Tutorial

In our
last lesson, Access
Query Techniques: Subqueries, Part I,we
introduced subqueries within the context of MS Access, and practiced their
creation and use in rendering a result dataset to meet illustrative business
needs. As a part of our exploration of subqueries, we examined the syntax
surrounding the use of a straightforward sample subquery; we then began an
illustrative, hands-on example of the use of a subquery in a multi-step
practice exercise. After evolving an initial query, we created a second query,
within which we nested the first, to demonstrate the operation of a subquery in
our practice exercise. Finally, we briefly discussed various aspects of the
results datasets that we obtained in each step of the practice examples.

Before moving to our next
article, Access Query Techniques: Subqueries,
Part II,we will
first consider a keyword whose use we will see again in the second half of our
exploration of subqueries, as well as elsewhere. We will overview the TOP
keyword, and get a feel for its operation in an undistracted scenario, before
proceeding into the creation of a subquery that leverages the keyword to
illustrate more sophistication in the use of subqueries than we saw in Part
I.

The objective of this lesson is
to introduce the TOP keyword as it is used in Microsoft Jet SQL, and to
explore ways we can use it to meet illustrative business needs. To accomplish
this, we will follow our approach in previous lessons, and perform the
following activities surrounding our examination of the TOP keyword:

an examination of the syntax surrounding the use of TOP;

an illustrative, hands-on example of the use of TOP in a practice
exercise;

a similar example where we use the optional PERCENT
keyword, in conjunction with TOP, to return a percentage, versus a
number, of rows in an illustrative dataset;

a brief discussion of the results datasets that we obtain in each
of the practice examples.

We will use the Northwind sample database that
installs with MS Access for purposes of this lesson, in an attempt to make our
discussion useful to virtually anyone who has access to the application.

Introduction to the TOP
Keyword

The need to limit a list to a "top" number or
percentage is quite common in business, as it is in many other facets of life.
The "top ten" customers with regard to revenue, for example, for a
given year would be a useful list for most organizations, in that it would be
likely to present a clear idea of which customers have contributed the most to
the all-important profit margins. By the same token, a "bottom"
number or percentage can prove useful, as well; it can serve, for example, as
an indicator of a good place to initiate steps to improve performance, or to
take other remedial or preventative action.

We can use the TOP capability in the QBE pane of MS Access
by simply typing (or selecting one of the defaults for) the number by which we
wish to narrow a query's results dataset into the combo box provided for that
purpose, as shown in Illustration 1, should we be using the Design view
approach to writing a simple query to avoid writing "direct SQL." As
we mentioned in our last lesson, however, the more real life business requirements
we encounter, the more likely we will find ourselves in scenarios where we need
to understand the SQL that is constructed behind the queries we create within
the graphical environments. This lesson will focus on the construction of the
SQL required to use the TOP keyword, along with the optional PERCENT
keyword, to help us to leverage the TOP capabilities to their fullest,
and to prepare us for their use in Access Query
Techniques: Subqueries, Part II.

Illustration 1: Using TOP from the Top Values Combo Box in
the Design View

Discussion

As we noted in the introduction,
the TOP keyword allows us to restrict the number of rows returned in a
results dataset to a number that we specify. The TOP keyword works in
conjunction with the ORDER BY clause, and, in fact, is rendered useless
without it. For example, if we order our results by a given value, in descending
order, and we use TOP to specify twenty values, then it is obvious
that the twenty values returned, being the top twenty in the results dataset,
are the largest of the entire set.

Should we have specified ascending in the ORDER BY
clause, the "top twenty" returned by the same TOP construction
would actually be the bottom twenty values of the entire results
dataset, as the first twenty entries in the ordered set would begin with the
lowest value and continue in ascending order through the twentieth value.

It thus becomes clear why TOP is ineffective without
the ORDER BY clause: the absence of order in the returned dataset would
mean that the "top twenty" records selected from our dataset would be
in no particular sequence, resulting in a haphazard selection of twenty records
with no discernable pattern.

Using the PERCENT keyword in conjunction with the TOP
keyword allows us to return a percentage of dataset rows, in
contrast to the number of rows that we get when we use TOP alone.
But like the solitary TOP keyword, the combination of TOP and PERCENT
returns its results from the top or the bottom of the dataset, depending upon
the selection within the ORDER BY clause.

Syntax

Let's take a look an example
query, where we employ the TOP keyword as a means of examining the
syntax involved:

SELECT TOP 10 ProductName, UnitPrice
FROM Products
ORDER BY UnitPrice DESC

The
example query delivers the dataset shown in Illustration 2.

Illustration 2: Example Query Results - TOP Keyword

In this
simple example, we are asking that the top ten records, in terms of Unit Price,
be delivered from the population of the Products table; this results
because we have asked for the top ten in a list of Products, which is
sorted from highest to lowest with respect to Unit Price. The ten
highest priced Products, therefore, compose the result dataset.

Now let's
inject the PERCENT keyword into our example, to explore the syntax
surrounding its use.

SELECT TOP 10 PERCENT ProductName, UnitPrice
FROM Products
ORDER BY UnitPrice DESC

The
modified query delivers the dataset shown in Illustration 3.

Illustration 3: Example Query Results - TOP and PERCENT
Keywords

We have
modified our query to request the records that compose ten percent of the rows
in the Products table. This turns out to be ten
percent of 77 total rows (a simple Select * FROM Products shows 77
to be the total number of records in the results dataset), or, as our
illustration indicates, eight rows (the closest we can get with whole rows)
from our ordered range.

Let's move into a couple of hands-on illustrations to reinforce
our understanding of these concepts, using the Northwind sample database.

Practice

We will again reinforce our understanding of the rudiments
by working through a multi-step practice example that illustrates the concepts
in operation.

As we have in past lessons, we will create a simple query,
then evolve it to expose each of the concepts that we have discussed. The
query that we build will be our tool for constructing and executing the SQL we
examine together, and for viewing the results datasets we obtain.

Let's say, for purposes of our exercise, that we have been
approached by organizational information consumers with a special year-end business
requirement: Accounting, at the direction of the external auditors, has asked
that we help them to provide information needed for a year-end, operational
cutoff test that the auditors would like to conduct. While the audit team has
the backup for the sales figures that the organization has generated for the
year 1997, they wish to verify the reports that they have for accuracy and
completeness.

As a part of their test work, the audit team wishes to
review a sample list of the last independently recorded sales shipments for the
year, to ensure that they are properly reflected in the organization's
financial ledgers. To accomplish this, they want a report from the database
showing the last twenty orders of 1997 - by shipment date, (revenue is properly
recorded when earned, meaning that the shipping date is more relevant
than the order date.) They will use this sample list to verify that the sales
transactions to which the shipping records relate appear to be properly booked
in the financial system.

To summarize, we need to generate a list of the last twenty
shipments of 1997. This turns out to provide another excellent use of the TOP
keyword, as we will soon see. We will build the query through multiple steps,
so as to comment on its various components in a manner that reinforces the
concepts involved. First, we will create a core SELECT query, and then
we will add the refinements required to meet the information consumer's needs.

We will start MS
Access and proceed, taking the following steps:

1.
Go to the Start button on the PC, and then navigate to the Microsoft
Access icon, as we have in previous lessons in this series.

2.
Click the icon
to start MS Access.

MS
Access opens, and may display the initial dialog. If so, close it.

3.
Select File
-> Open from the top menu, and navigate
to the Northwind sample database (the file might also be accessed from
the Open a File menu atop the task pane, if it has not been disabled
previously, at the right side of the main window in MS Access.)

4.
Select Northwind.mdb.

The splash screen may appear.

NOTE: As we have noted earlier in the series,
we can preclude the appearance of the splash screen each time we enter the
sample Northwind database by checking the "Don't show this
screen again." checkbox. For now, we will leave it unchecked.

5.
Click OK.

The
splash screen disappears, and is replaced by the Main Switchboard.

6.
Click the Display
Database Window, or get there by an alternative approach.

We
arrive at the Database Window, which appears as depicted in Illustration
4.

Illustration 4: Inside Access, Northwind Main Switchboard

7.
Click Queries,
under Objects in the Database window.

The existing queries
appear.

8.
Click the New
button on the Database window toolbar.

The New
Query dialog appears, as shown in Illustration 5.

Illustration 5: The New Query Dialog

9.
Ensuring that
the Design View option is selected, click OK.

The Select
Query dialog appears by default, with the Show Table dialog
appearing in front, as shown in Illustration 6.

Illustration 6: The Select Query and Show Table Dialogs

10.
Click Close
on the Show Table dialog.

The Show
Table dialog closes, leaving only the Select Query dialog present.

We will now move to SQL view, as we did in our
previous lesson, because we wish to use direct SQL, and to work with
queries at a level that goes beyond working with Design view.

11.
Select SQL
View using the View Selector button in the main toolbar (it appears
under the File item on the main menu).

The SQL view editor appears, complete with a SELECT
keyword in place, followed by the ubiquitous ending character for MS Access
queries, the semicolon (";").

We have again arrived at a point where we can enter, display
and / or modify a query using SQL directly, as we will throughout this lesson.
As we found in earlier lessons, we can do many things here that might prove
difficult (if not impossible) in Design view, or within the realm of wizardry.
While we can construct many queries within the query design grid, SQL view
is clearly the environment from which to gain a basic understanding of SQL.

We will begin by composing a simple core SELECT
query, to generate a result dataset from which we will meet the core consumer
requirement; we will begin by listing all transactions as a means of
establishing the layout of the final results dataset, whereby we want to
provide the shipping date, order and customer identification, and the total amount
of the order shipped. It appears that the data we need can be found within the Orders
and Order Details tables, and so our query will center on these tables
throughout the session.

We can run the query at this stage, after saving it using File
in the main menu.

13.
Save the query
as ACC08-01.

14.
Select Query
--> Run from the main menu.

The resulting dataset appears as partially shown in Illustration
7. We note that 2,155 records are returned, per the counter at the
bottom of the Data view.

Illustration 7: The Core Query Results Dataset (Partial
View)

We have assembled our basic SELECT query, and have run
it to obtain the full population of the information fields we require from the Orders
and Order Details tables. Our objective in doing is to get a layout
established for the results we ultimately seek, and to ascertain that the
calculation we have inserted for the Total amount of each record is
working correctly (a quick test of the math will indicate that the Total
calculation is, indeed, accurate). Beginning our query development in this way
will also allow us to obtain a familiarity for the underlying data as we
proceed.

We note that the above query delivers a record set
containing multiple instances of the same Order ID in many cases. This
is because the Order Details table breaks order information into a
separate line item for each product, so that, for example, if a customer
purchases three products in the same order, three records will exist with the
same Order ID in the Order Details table.

Our information consumers want a single line item for each
order, so we will take care of that feature next.

15.
Ensure that
the query is saved as ACC08-01.

Let's modify the query
to generate summed totals by Order ID. We will do this via two
additions to our existing SQL, as shown in the next step.

16.
Shift back to SQL
view, once again, if necessary.

17.
Add the
following, immediately before UnitPrice,and just after
ShipCountry,in the SELECT clause:

SUM(

18.
Insert a right
parenthesis ")" between "Discount)" and "AS Total".

The
objective is simply to enclose the calculation within the parentheses of a SUM
function. The affected line within the SELECT clause should appear
as below:

We
note that the Order IDs are combined into single summed records. We can
see also that the total number of records returned in the dataset is 830,
which serves to confirm the fact that the product transactions with the same Order
IDs have been combined.

22.
Save thequery
as ACC08-02.

Our
next step will be to limit the transactions to the year 1997. Our
results dataset currently displays the combined Order information for
all years in the table. We can restrict the data to the year under examination
with the addition of a simple WHERE clause, as we shall see.

23.
Shift back to SQL
view, once again.

24.
Add the following
WHERE clause between the existing FROM and GROUP BY
clauses:

WHERE ShippedDate Between #1/1/1997#
And #12/31/1997#

This
will restrict the records returned to those representing transactions with Shipped
Dates that lie between January 1st and December 31st,
inclusive.

We can
see that the transactions now appear to be limited to those that occurred in
1997, a circumstance that is also confirmed by the fact that the total number
of records has again decreased markedly, this time to just under 400.

We are
now ready to insert the TOP keyword to drive the generation of the "top"
twenty records. Recall that the ORDER BY clause actually makes this
work. We are simply going to order our existing result dataset by Shipped
Date, descending, so that we obtain the records by date, most recent to
oldest. Then, when we insert the TOP keyword, we are essentially asking
for the "top twenty" of that list (which is ordered, again, most
recent to oldest). We are thus saying "give me the twenty most recent"
- or the twenty last - transactions that occurred in 1997.

27.
Shift back to SQL
view, once again.

28.
Add the
following to the SELECT clause, to the immediate right of the word SELECT:

TOP 20

29.
Add the
following ORDER BY clause to the end of the query (be sure to remove the
semicolon (";") that ends the line above, if it is present).

We
immediately see the effects of the TOP keyword, together with the ORDER
BY clause through which it is driven. We see, as well, how we can use TOP
with dates, as well as the more intuitive number data types, to deliver a
results dataset from a range of dated transactions. An examination of the
number of records in the dataset, however, presents a perplexing phenomenon:
the counter indicates "22," instead of the twenty records we
might have expected.

The
reason for the apparent deviation from the expected results is simply a
manifestation of the behavior of the TOP keyword in the event of a "tie."
When equal values are present in the range from which we are selecting the "top"
records, all records that have the equal value are returned.

To
prove this concept, take the following step.

32.
Select SQL
View using the View Selector button once again.

The Query
appears.

33.
Change the 20
after TOP to 19.

34.
Save thequery
as ACC08-05.

35.
Select Query
--> Run from the main menu.

The resulting dataset appears as partially shown in Illustration
14, and is limited to nineteen rows, because the nineteenth Shipped Date
from the end of 1997 has only one occurrence, 16-Dec-1997.

Illustration 14: Results of Modification of the TOP
Keyword

Now
let's take a quick look at the TOP keyword used in conjunction with the
optional PERCENT keyword.

36.
Select SQL
View using the View Selector button for a final time.

The SQL view editor appears.

37.
Change the 19
after TOP to 10.

38.
Add the word PERCENT
immediately to the right of the 10 inserted above.

The PERCENT
keyword in the above case instructs the query to return 10 percent of the
rows (that is, approximately 40 rows, as we shall see).

39.
Change the
last word in the query (DESC in the ORDER BY clause) to ASC.

In
this example, ASC directs that the ORDER of the range be from
January, 1997 dates to December 1997 dates. TOP 10 PERCENT therefore
will return the "top" ten percent of the rows from the January
/ beginning end of the range.

40.
Save thequery
as ACC08-06

The
query should now resemble that shown in Illustration 15.

Illustration 15: The SELECT Query with TOP Modifications and
PERCENT Add

41.
Select Query
--> Run from the main menu.

The resulting dataset appears as partially shown in Illustration
16.

Illustration 16: Results of Modification of the TOP
Keyword

We see that the first forty transactions for 1997 appear. A
quick test of the accuracy of the result dataset can be performed by simply
running the query again with the TOP 10 PERCENT keyword combination
removed. The result set that returns indicates a total row count of
approximately 398 rows. As we stated in the Syntax
section above, the records returned using the TOP and PERCENT
combination is composed of the stated numeric percent (ten, in this
case) of the total rows (398 per the "proof" test), in whole
rows (for a total of forty rows).

NOTE: Another way to see the total population that we have subjected
to our keyword combination would be simply to substitute 100 (or "100
percent) in the place of the 10 in TOP 10 PERCENT. The result
dataset again returns 398 records.

42.
Close the query, without saving
if you performed the test step, to return it to the TOP 10 PERCENT
example we saved as ACC08-06.

Within
the context of the practice example we have explored, we can easily see the usefulness
of the TOP keyword, together with its optional "modifier" PERCENT,
to help us to meet the business needs of information consumers. While we took
a look at only a couple of possible uses, there are many scenarios where these
keywords can be invaluable in helping us to meet our objectives. We will
provide a further illustration of such an instance, where we can use TOP
within a subquery scenario, in our next article, and perhaps in prospective
articles over the months to come.

Conclusion and Summary

In this article, we introduceda keyword that will again take
the stage in our next article, Access Query
Techniques: Subqueries, Part II, which continues the
exploration of subqueries we began in our previous lesson. We overviewed the TOP
keyword, both with and without the optional PERCENT keyword, to examine
its construction and operation independently from other distractions, to
prepare us for its use in our next lesson. In Part II, we will
create a subquery that leverages the keyword to illustrate more sophistication
in the use of subqueries than we saw in our introduction to them in Part I.

We introduced the TOP keyword as it is used in
Microsoft Jet SQL, and explored ways we can use it to meet illustrative business
needs. To accomplish this, we examined the syntax surrounding the use of the TOP,and the optional PERCENT, keywords, then undertook illustrative
practice exercises where we received hands-on exposure to the concepts. We
then discussed the results datasets that we obtained in each of the practice
examples.