WEBINAR:

On-Demand

Dataflow Diagrams

Question: [Joe Celko's Dataflow Diagrams Puzzle]

Tom Bragg posted a version of this problem on the
CASE Forum on CompuServe. You have a table of dataflow
diagrams (DFDs), which has the name of the diagram,
the names of the bubbles in each diagram and the labels
on the flow lines. It looks like this:

What we want to find is what flows. Do NOT go into
each bubble within the diagrams. This will be part
of a diagram validation routine that will search for
missing dataflows. To make this easier, assume that
all bubbles should have all flows. This would mean
that (Proc1, input) is missing the 'facts' flow, and
that (Proc1, output) is missing the 'opinions' flow.

Basically, it makes all possible combinations of diagrams,
and flows, then removes the ones we already have. The bad
news is that you will probably have to do this with VIEWs
in most current SQL products.

Another SQL-92 query would be:

SELECT F1.diagram, F1.bubble, F2.flow
FROM (SELECT F1.diagram, F1.bubble FROM DFD AS F1
CROSS JOIN
SELECT DISTINCT F2.flow
FROM DFD AS F2
WHERE flow NOT IN (SELECT F3.flow
FROM DFD AS F3
WHERE F3.diagram = F1.diagram
AND F3.bubble = F1.bubble)
ORDER BY F1.diagram, F1.bubble, F2.flow;

Or to answer the puzzle in SQL-89, you will need to use VIEWs:

-- build a set of all the flows
CREATE VIEW AllDFDFlows (flow)
AS SELECT DISTINCT flow FROM DFD;
-- attach all the flows to each row of the original table
CREATE VIEW NewDFD (diagram, bubble, flow, missingflow)
SELECT DISTINCT F1.diagram, F1.bubble, F1.flow, F2.flow
FROM DFD AS F1, AllDFDFlows AS F2
WHERE F1.flow <> F2.flow;
-- Show me the (diagram, bubble) pairs and missing flow
-- where the missing flow was not somewhere in the flow column
-- of the pair.
SELECT DISTINCT diagram, bubble, missingflow
FROM NewDFD AS ND1
WHERE NOT EXISTS (SELECT *
FROM NewDFD AS ND2
WHERE ND1.diagram = ND2.diagram
AND ND1.bubble = ND2.bubble
AND ND1.flow = ND2.missingflow)
ORDER BY diagram, bubble, missingflow;

I probably overdid the DISTINCTs, but you can experiment
with it for execution speed. This should still run faster
than moving all the rows across the network.