General Bits is a column loosely based on the PostgreSQL mailing list
pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see
www.PostgreSQL.org.

Survey Time

PostgreSQL Survey

05-Jan-2004

It is time again for another survey. This survey is about
how you use PostgreSQL. Please cut and paste the questions
into your editor to answer the questions and email your reply to
generalbits@varlena.com.
The responses will be anonymously compiled and republished if possible.

1. What PostgreSQL Client API do you use the most?

2. To what extent do you or your company use PostgreSQL for its object or
object relational features? For example, user defined functions, user
defined data types, inheritance, pre-defined non-traditional datatypes
such as the geometric types, various indexing methods such as GiST, RTREE,
tsearch2.

3. What is your favorite new feature of PostgreSQL 7.4?

4. Should there be a standard worldwide PostgreSQL training course?

5. Does your business utilize a PostgreSQL support company? If so, which one?

6. What version of PostgreSQL do you currently use? If you do not use
7.4 yet what are the technical and/or business reasons that you have
not upgraded yet,

7. What features would make PostgreSQL more competitive with Oracle or DB2?

8. Would you or your company utilize a company specializing in PostgreSQL hosting?

9. Are you or your company using (PostgreSQL with) Open Office?

10. What procedural language do you use with PostgreSQL the most?

11. How important is PostgreSQL certification to you and/or your company?

12. What database did you use before PostgreSQL? What other databases
do you currently use?

13. Do you or your company use MySQL because it was designed for speed?

14. What feature in PostgreSQL do you leverage the most?

reliability

extensibility/object-relational design

ACID compliance

ability to handle large amounts of data

Cost

15. What platform do you primarily run PostgreSQL on?

16. What do you think would attract the most new PostgreSQL users?

17. What is your preferred PostgreSQL administration tool?

18. What is your preferred PostgreSQL visualization or design tool?

19. What is your backup strategy for production?

20. What is your fail over strategy for production?

Contributors:
elein at varlena.com

Union, Intersect and Except Basics

Merging, Intersecting and Excepting Rows in Tables

02-Jan-2004

Using a UNION is like stacking two (or more) tables or selections together.
There must be the same number of columns in each selection
and they must be compatible types.

Table 1

Table 2

These examples tables of employees and contractors
have the same table structure.
In the data, note that there is a contractor
'Dave' and an employee 'Dave' that started on the same day.
(The Classic Too Many Daves problem :-) (In real life there
would be more information to qualify the Daves uniquely,
however, to show variations on UNION we'll keep this anomaly.)

Notice that Andrew began as a contractor and then became an employee.
There have been no duplicate row issues with Andrew because the
rows are not exact duplicates--they have different
start and end dates. If the query involved only
names instead of names and dates, then Andrew also
would have been a duplicate.

In order to see all of the rows from both tables
regardless of duplicates, use UNION ALL. This
concatenates the selections instead of performing
a formal union on them.

The Too Many Daves problem as it turns out was
a mistake. Dave is an employee not a contractor.
So now we want to know which rows in
employees have equivalent rows in contractors
so we can verify and correct the mistake and see if there
are any others like it.
To do this, use INTERSECT instead of JOIN.

To eliminate this data set you can use the INTERSECT
statement nested in the WHERE IN clause of the DELETE.
Why are we using the nested subquery
SELECT name FROM (SELECT ... INTERSECT ...)?
Try it without the extra level to see.

Back to the original table (except for the extra Dave).
Suppose we wanted to see all the employees who were
once contractors. INTERSECT would do that. And what
about employees who were never contractors? You can
use EXCEPT for that question. Notice that we expect
all of the employees from the employee table except
Andrew. (Remember Andrew who started out as a contractor?)

INTERSECT and EXCEPT also eliminate duplicate rows, if any, in the
result set. If you want to keep duplicates, you can use
the ALL qualifier on INTERSECT and EXCEPT as you would with UNION.

Contributors:
elein@varlena.com

Join Basics

How to Join Queries and Tables

02-Jan-2004

A JOIN is a way of selecting columns from two or more tables
into the same target list, putting the selections side
by side instead of stacking their rows as was done for a UNION.
Usually, there are overlapping columns in each table which
are important as they specify the JOIN criteria--how the rows
are constructed. How to specify the JOIN criteria is what
you need to know to use JOINS properly.

Table 1

Table 2

The simplest JOIN is also the simplest one to write by mistake
and it is only sometimes useful.

By not specifying any JOIN either in the WHERE clause or in the
FROM clause, a Cartesian product is created. If the number table
had 2 rows and the letter table had 3 rows, then
the result of this query would be 2 * 3 rows. Each of the
2 rows in numbers will create 3 rows in the result.

Usually, the rows in one table are linked to one or more rows
in the joined table by designating a column in each that is
the same or a condition which is true.

Lets us first create some sample data to look at simple examples
of the other kinds of JOINS.
We have a table of rugby events, a table of teams and
a table of matches played at events. Notice we are entering
a match for which an event does not exist and a team with
no matches. Normally for this schema, we'd define FOREIGN KEYS
to ensure that all matches had teams and events.

There are several kinds of JOINs. We've seen the CROSS JOIN.
Now let us take a look at the others: INNER and OUTER.
All of these join types are both concepts and SQL commands. That is,
you can implement the JOINs with WHERE clauses and subselects
and UNIONS or with explicit JOIN statements.

The explicit form of JOINS use NATURAL, ON or USING to say how the
rows tables in the table should be joined.

An INNER JOIN shows all rows in the first table which have corresponding
rows in the second table. If there is a row in the
first table without a counter part in the second it
is not shown. The same is true for the second table.

An OUTER JOIN is one where if there is not a matching join column in one
of the tables, the NULL values are put into the columns that would have
represented the other table. There are three variations of outer joins:
LEFT, RIGHT and FULL. In a LEFT OUTER JOIN, all of the columns and rows from the
left table are shown, but where there is not a corresponding entry in the
right table, those columns are set to NULL. In a RIGHT OUTER JOIN, it is
the same except that all of the columns and rows in the right table are shown
and the left table values may be NULL. A FULL OUTER JOIN is a combination
of the LEFT and RIGHT JOIN. All of the columns and rows for both tables
are shown and where there is not a corresponding entry in the left or the
right table, those columns are set to NULL.

Inner Join

Left Outer Join

Right Outer Join

Full Outer Join

Table 1

Table 2

Table 1

Table 2

Table 1

Table 2

Table 1

Table 2

KEY:

Table 1

Join Column

Table 2

NULL Column

The following statements show all of the different ways
to perform an INNER JOIN. They show how to
perform an INNER JOIN using a WHERE clause and using
ON, USING and NATURAL. The JOIN links the matches'
event id to the event's event id and so brings in the
event name and date of the match. Notice that the match
for event 3 is not shown. An INNER JOIN only shows rows
where there is a match on the join column(s).

With the WHERE clause, just specify equality of the column(s)
you wish to join on. With a JOIN USING you specify the
common names of columns in both tables. A NATURAL join
is a short hand for JOIN USING which specifies that the
join should be on all of the commonly named columns.
A JOIN ON allows you to specify an expression as the join
criteria. This helps when the join column names are different
in the two tables or if you need to add more complex logic.

You can have more than two table joined in using any JOIN type.
(And of course, each table could also be a subquery, but that is
a different article.) Because the column names are different for the
team ids in the matches table and the teams table, we used a
JOIN ON condition. This could also have been a WHERE condition,
but it could not have been a USING condition or a NATURAL join.

An example of a LEFT OUTER JOIN would be to select all of the matches
for all of the events. We are using the same query as the one
above, however we are specifying a LEFT join between matches and events.
This should show us the all of matches, even if there is no qualifying
event. In this case, we should see the match between the Blues and
the Amazons with event id 4.

A RIGHT OUTER JOIN select all of the rows in the second table and sets columns
in the first table to NULL where there is no match. Joins are commutative
so the following query should give the same results as the one above by
switching LEFT for RIGHT and matches for events. Try it.

A FULL OUTER JOIN is a combination of a LEFT and RIGHT OUTER JOIN.
A FULL OUTER JOIN could be used to show all of the matches and all
of the events whether or not there are corresponding events or matches.
In this query we expect to see both the match without an event
as above and the event without matches (2004 Division 1).
Again we will just modify the original query. The difference here
is that the two joins on the teams table will also have to be
changed to FULL OUTER JOINS in order to see any event rows
without matches. If we do not make this change, then even though
the matches and events are a full join, the two team joins are
INNER JOINS and eliminate rows which do not match. Note also,
that by making this multi-join query all use FULL OUTER JOINS
the team with no matches will also be shown.