Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

I am having issues with the statement below. The main purpose of the query is to pull a list of customers into an Access db from ODBC (our customers are stored in DB2 on IBM i). The query is intended to look for a customer (OM01U1.OM01015) that has not had a delivery (RS2101F.DELDATE) in over 13 months but is not already suppressed (OM01U1.OM01068). I do need all of the columns in order to research the equipment and financial records so I can't remove them. The statement is pulling from several tables though. I believe that the issue may be with the "Group By". I am also getting duplicate outlets (OM01U1.OM01015) when I do receive results. I only need to see the outlet one time with it's last delivery date (RS2101F.DELDATE). Please don't beat me up too bad, I did not create this query I just sort of inherited it. I know it is way off and there is probably a better way for me to get the information I need. I appreciate any help guys!

What error are you getting? What is happening that you aren't expecting? Usually when I have a query problem with multiple tables, I peel back the query and start with the main table. I then slowly add the additional tables as I confirm the expected results each step along the way.
–
Mike WillsAug 13 '12 at 20:54

1

Before answering, please note that this question is tagged as iSeries, meaning the data is stored in DB2 for IBM i. Thank you.
–
WarrenTAug 13 '12 at 23:01

Please do us a favor next time, and give an example without quite so many extraneous columns. Once we've seen 2 columns used the same way, we dont need to see 20. The principle is the same. It just makes it harder to plow through the example, and get to a good solution. ;-)
–
WarrenTAug 18 '12 at 12:40

If some of these files might not have matching rows,
then those files might need to use LEFT JOIN,
and the values for the columns for that table will be default values or null,
depending on the file definition.
Columns in tables on this system are most frequently defined as NOT NULL WITH DEFAULT.

You have a GROUP BY query, but did not specify any aggregate function. You will want the MAX() function on your DELDATE. You should isolate this part of the query into a "common table expression" [CTE], which starts as "WITH".

WITH r2 as
(SELECT OUTNUM,
max(DELDATE) as lastdelivery
FROM RS2101F
WHERE DELDATE >= (thirteenMonthsAgo)
GROUP BY OUTNUM
)
SELECT ...
...
FROM OMO1U1
LEFT JOIN r2 ON OM01U1.OM01015 = r2.OUTNUM
LEFT JOIN CF30 ...
...

DELDATE appears to be in cyymmdd format, where 'c' is 0 for 1900's and 1 for 2000's. You may want to calculate the value for <thirteenMonthsAgo> in your program, rather than with an SQL expression, unless you have a UDF for that.

I believe that the joins are working now. I am getting an error message for the "r2" now in the WITH statement. What is the r2? The error message states it's not a valid token that I can use "ALL CS CHG NON RR RS UR NC"?
–
KNGAug 14 '12 at 17:29

r2 is the "correlation name" of the common table expression that follows in parentheses. r2 can then be used as a table reference, as in the join, where it represents the table that would result from the common table expression.
–
WarrenTAug 14 '12 at 23:44

When you are overwhelmed with a long complex query, involving so many pieces, it is often a good strategy to break it down into smaller parts, get them working, then build it back up, in small stages. Focus first on a "mind-sized bite". Don't bite off more than you can chew in a single mouthful. Start simple, and add things back in layers.

You probably want to isolate your grouping logic away from the main query. Grouping by dozens of fields is rarely necessary, and if you think about it, does not model reality too well. Making sure long lists of fields match up also makes the code more prone to overlooked errors as well.

One strategy to break things down into smaller, more manageable chunks, could be to put some of the functionality into an SQL View, or in MS-Access terms, a query. If you have everything working that way, then you can decide if you want to try to merge it all back into one large query. But sometimes its handy to have those re-usable components hanging around for next time.

Also, when you start by working with smaller, simpler queries, you will find it much easier to find accidental syntax errors, such as a missing comma for example.

You can also employ subqueries, or common table expressions, to help break things into chunks, or to combine chunks you have already worked out. Personally, I tend to think of common table expressions as being a handy step-wise building block. It might look something like this:

WITH a as
( -- first step
select ...
...
), b as
( -- second step
select ...
...
)
SELECT ... -- third step
FROM a
JOIN b on a.key1 = b.ky1
LEFT JOIN tablec as c on b.ky2 = c.xyz

[When it comes to commas, I often like to put them at the start of lines, rather than at the end, making them (or their absence) easier to spot. Your eye can easily look down a line at the beginning, but when at the end it's not as fast and simple. I find it also makes it easier to add lines at the end of a group, or move lines around. It may look odd at first, but it has helped, since small errors like a missing comma can be so hard to spot sometimes.]

If you strip the grouping logic that you have shown, removing the other tables, it looks like this.

This still has some problems. You said in the question that you wanted your final result set to have one row per Outlet.
But this will give you one summary record for every combination of values in all of those fields in your GROUP BY clause. I doubt that is truly what you want. It would be conceivable that you might have wanted one row per piece of equipment, but you are using COUNT() on the equipment so that seems out. Since you said you wanted one row per Outlet, that should be the only column in your GROUP BY clause. Every other column not in your GROUP BY clause must use an aggregate function. Some of them it may make sense to use MAX(), others might not make sense, and for these you may want to consider whether those columns should come from another source or perhaps do not make sense to include at the Outlet level at all. It looks like you already worked on this in your updated query, but Status and Last_Maint_Date remain. Last_Maint_Date seems a natural for MAX(). Is status there only because you want to exclude certain statuses? If so simply move that to your WHERE clause, otherwise perhaps you can use MAX() or MIN() on it.

@WarrenT
Ok, that makes sense now. I am not sure if I am putting the statement in the correct place but I keep getting errors for the "WITH" statement. I have narrowed it down to the EC01 records (Equipment) that are causing the duplicates. It is pulling the last maintenance record for each piece of equipment attached to an outlet. I just need the most recent maintenance date out of all of the outlet's maintenance dates (EC01_LASMTCDAT). What statement should I use to only pull the outlet's last EC01_LASMTCDAT (last maintenance date)? Here is my updated query...

SELECT
OM01U1.OM01041 as Loc,
OM01U1.OM01015 AS Outlet,
OM01U1.OM01945 AS OLTyp,
OM01U1.OM01052 AS Outlet_Name,
OM01U1.OM01054A AS Street_Number,
OM01U1.OM01054C AS Street_Name,
OM01U1.OM01055A AS City,
OM01U1.OM01055B AS State,
OM01U1.OM01106 AS Zip,
OM01U1.OM01058A AS Area_Code,
OM01U1.OM01058C AS Phone,
OM01U1.OM01037 AS Channel,
OM01U1.OM01926 AS USA_Type,
OM01U1.OM01078 AS Key_Acct,
OM01U1.OM01110 AS TRDGRP,
OM01U1.OM01034 AS Trade_Name,
OM01U1.OM01248 AS DTC,
OM01U1.OM01073 AS Sales_Route,
OM01U1.OM01068 AS Sup_CDE,
OM01U1.OM01982 AS Sup_Reason,
OM01U1.OM01067 AS Sup_Date,
OM01U1.OM01065 AS Creation_DT,
OM01U1.OM01066 AS Update_DT,
OM01U1.OM01141 AS CAN_NUM,
OC02_RNTCTCSTA AS OC_CONT_STATUS,
CF30_CNSTYP AS CF_CONTYP,
EC01_SYSSTA AS EC_STATUS,
RS2101F.DELDATE AS RS_LAST_DEL_DT,
EC01_LASMTCDAT AS EC_LAST_MNT_DT,
COUNT
(EC01_EQUNUM) AS EC_EQUIP,
COUNT
(EC01_EQUACY) AS EC_ACCESSORY,
COUNT
(EC01_EQUSPP) AS EC_COMPONENT
FROM
OM01U1
JOIN RS2101F ON OM01U1.OM01015 = RS2101F.OUTNUM
LEFT JOIN CF30 ON OM01U1.OM01015 = CF30_OUTNUM
LEFT JOIN EC01 ON OM01U1.OM01015 = EC01_EQUOUTNUM
LEFT JOIN OC02 ON OM01U1.OM01015 = OC02_OUTNUM
GROUP BY
EC01_LASMTCDAT,EC01_SYSSTA,EC01_EQUNUM,EC01_EQUSPP,EC01_EQUACY,CF30_CNSTYP,OM01U1.OM01141,OM01U1.OM01066,OM01U1.OM01065,OM01U1.OM01067,OM01U1.OM01982,OM01U1.OM01068,OM01U1.OM01073,OM01U1.OM01248,OM01U1.OM01034,OM01U1.OM01110,OM01U1.OM01078,OM01U1.OM01926,OM01U1.OM01037,OM01U1.OM01058C,OM01U1.OM01058A,OM01U1.OM01106,OM01U1.OM01055B,OM01U1.OM01055A,OM01U1.OM01054C,OM01U1.OM01054A,OM01U1.OM01052,OM01U1.OM01945,OM01U1.OM01015,OM01U1.OM01041,RS2101F.DELDATE,OC02_RNTCTCSTA
HAVING
RS2101F.DELDATE < 1110815
AND EC01_LASMTCDAT < 1110815
AND OM01U1.OM01068<>'S'
AND OM01U1.OM01015<>9999999
AND OM01U1.OM01945 Not In ('CRC','CRD','PER','PMO')
AND OM01U1.OM01110<>'CCNA'
AND OM01U1.OM01248<>'114'
AND OM01U1.OM01073 <> '398'