Requirements

You will need libpq installed, including its development headers.

Documentation

This extension provides an interface to the PostgreSQL relational database.

Connection management

[procedure](connect CONNECTION-SPEC [TYPE-PARSERS [TYPE-UNPARSERS]])

Opens a connection to the database given in CONNECTION-SPEC, which should be either a PostgreSQL connection string or an alist with entries consisting of a symbol and a value. The symbols should be connection keywords recognized by PostgreSQL's connection function. See the list of PQconnectdbParams parameter keywords in the PostgreSQL documentation. At the time of writing, they are host, hostaddr, port, dbname, user, password, connect_timeout, options, sslmode, service.

Using the alist notation is recommended; when available (when using libpq from Postgres 9.0 or later), PQconnectStartParams will be used. This prevents parsing errors when keys or values contain "special" characters like equals signs or single quotes. This also adds a layer of security for when connection specifier components come from an untrusted source.

TYPE-PARSERS is an optional alist that maps PostgreSQL type names to parser procedures; TYPE-UNPARSERS is an optional alist that maps predicates to unparser procedures. They default to (default-type-parsers) and (default-type-unparsers), respectively (see below).

The return value is a connection-object.

Also note that while these bindings use the non-blocking interface to connect to PostgreSQL, if you specify a hostname (using the host keyword), the function might not be able to yield to other threads because the resolver will block.

Note: You cannot use the same connection from multiple threads. If you need to talk to the same server from different threads, simply create a second connection.
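A minimal sketch of opening and closing a connection with the recommended alist notation. Every value in the spec (host, database, user, password) is a placeholder, and call-with-connection is a hypothetical helper, not part of the egg. The sketch assumes CHICKEN 4's (use postgresql); on CHICKEN 5 that line would be (import postgresql).

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Placeholder connection settings; adjust them for your own server.
(define example-connection-spec
  '((host . "localhost")
    (dbname . "test")
    (user . "alice")
    ;; In alist form the quote and equals sign need no manual escaping.
    (password . "pa'ss=word")))

;; Hypothetical helper: open a connection, hand it to PROC, then disconnect.
;; Note that this simple version does not disconnect if PROC raises an error.
(define (call-with-connection spec proc)
  (let* ((conn (connect spec))
         (result (proc conn)))
    (disconnect conn)
    result))
```

For instance, (call-with-connection example-connection-spec connection?) should yield a true value once the server accepts the connection.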

[procedure](disconnect CONNECTION)

Disconnects from the given CONNECTION.

[procedure](reset-connection CONNECTION)

Resets the CONNECTION; that is, reopens it with the same connection spec that was given when the original CONNECTION was opened.

[procedure](type-parsers CONNECTION)

Retrieve the alist of type parsers associated with the CONNECTION.

[procedure](type-unparsers CONNECTION)

Retrieve the alist of type unparsers associated with the CONNECTION.

[procedure](connection? OBJECT)

Returns true if OBJECT is a PostgreSQL connection-object.

Query procedures

[procedure](query CONN QUERY . PARAMS)

Execute QUERY on connection CONN and return a result object. The result object can be read out by several procedures, ranging from the low-level value-at to the high-level row-fold. See the High-level API and Low-level result API sections below for information on how to read out result objects.

QUERY can either be a string or a symbol. If it is a string, it should contain an SQL query to execute. This query can contain placeholders like $1, $2 etc, which refer to positional arguments in PARAMS. For example:
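A minimal sketch of a query with one positional parameter. The helper name greeting is made up for illustration, and CONN is assumed to be an already opened connection. value-at, described in the low-level result API below, reads the first value of the first row.

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Hypothetical helper; CONN must be an open connection.
(define (greeting conn name)
  ;; $1 refers to the first (and only) element of PARAMS, here NAME.
  ;; The value is sent separately from the SQL text, so no escaping is needed.
  (value-at (query conn "SELECT 'Hello, ' || $1::text" name)))
```

Because NAME never becomes part of the SQL string, a value containing quotes cannot change the query's structure.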

If QUERY is a symbol, it must match the name of a prepared statement you created earlier. The number of parameters passed as PARAMS must match the number of placeholders used when the statement was prepared.

To create a prepared statement, use the query procedure with an SQL PREPARE statement. The placeholders in that statement are not filled in until execute time. No parameters can be sent at preparation time, which is a limitation of the PostgreSQL protocol itself. You could use escape-string if you really must pass in dynamic values when preparing the statement.
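A small sketch of preparing a statement and then executing it by name. The statement name mystmt and the helper name are arbitrary choices for this illustration; CONN is assumed to be an open connection.

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Hypothetical helper; CONN must be an open connection.
(define (prepare-and-run conn)
  ;; PREPARE takes no parameters; $1 and $2 stay unbound until execution.
  ;; Preparing the same name twice on one connection is an error in PostgreSQL.
  (query conn "PREPARE mystmt (text, int) AS SELECT $1, $2")
  ;; A symbol as QUERY executes the prepared statement with these two PARAMS.
  ;; value-at with column index 1 reads the second value of the first row.
  (value-at (query conn 'mystmt "hello" 2) 1))
```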

PostgreSQL types are automatically converted to corresponding Scheme types. This can be extended to support your own user-defined types; see the Type conversion section below for more information on how to do that.

[procedure](query* CONN QUERY [PARAMS] [format: FORMAT] [raw: RAW?])

A less convenient but slightly more powerful version of the query procedure; PARAMS must now be a list (instead of rest-arguments). FORMAT is a symbol specifying how to return the resultset: either as binary or text (the default). RAW is a boolean which defines whether the PARAMS should be treated "raw" or first passed through the unparsers associated with CONN. If they are treated "raw", they must all be strings, blobs or sql-null objects.
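A brief sketch of query* with a parameter list and raw parameters. The helper name is hypothetical and CONN is assumed to be an open connection.

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Hypothetical helper; CONN must be an open connection.
(define (lookup-with-raw-params conn)
  ;; PARAMS is a single list here, not rest-arguments.
  ;; With raw: #t the parameters bypass the unparsers, so each one
  ;; must already be a string, blob or sql-null object.
  (query* conn "SELECT $1::text, $2::int4" (list "hello" "42") raw: #t))
```

The raw flag only concerns how PARAMS are sent; the returned result object is read out in the same way as one returned by query.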

High-level API

[procedure](row-fold KONS KNIL RES)

[procedure](row-fold* KONS KNIL RES)

This is the fundamental result set iterator. It calls (kons row seed) for every row, where row is the list of values in the current row and seed is the accumulated result from previous calls (initially knil), i.e., its pattern looks like (KONS ROWN ... (KONS ROW2 (KONS ROW1 KNIL))). It returns the final accumulated result.

The starred version works the same, except it calls (kons rowN-col1 rowN-col2 ... seed) instead of (kons rowN seed), so the procedure must know how many columns you have in the result set.
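A short sketch of both iterators, taking the argument order (KONS KNIL RESULT) as an assumption. The helper names and the generate_series queries are illustrative only; CONN is assumed to be an open connection.

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Adds up the first column: each ROW is a list of that row's values.
(define (sum-of-first-column conn)
  (row-fold (lambda (row seed) (+ (car row) seed))
            0
            (query conn "SELECT generate_series(1, 3)")))

;; The starred variant receives the two columns as separate arguments.
;; Each row is consed onto the seed, so the last row ends up first.
(define (pairs-in-reverse conn)
  (row-fold* (lambda (n doubled seed) (cons (cons n doubled) seed))
             '()
             (query conn "SELECT n, n * 2 FROM generate_series(1, 3) AS n")))
```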

[procedure](column-fold KONS KNIL RES)

[procedure](column-fold* KONS KNIL RES)

As row-fold/row-fold*, except this iterates sideways through the columns instead of lengthways through the rows, calling KONS with all values in all the rows of the current column, from left to right.

The starred version is much less useful here since you often don't know the number of returned columns, but it is provided for consistency.

[procedure](row-map PROC RES)

Maps rows to lists by applying PROC to every row and placing its result in the result list at the position corresponding to that of the row. This procedure is not guaranteed to walk the result set in any particular order, so do not rely on the order in which PROC will be called.

Transaction management

[procedure](with-transaction CONN THUNK [isolation: LEVEL] [access: MODE])

Execute THUNK within a BEGIN TRANSACTION block, and return the value of thunk.

The transaction is committed if thunk returns a true value. If an exception occurs during thunk, or thunk returns #f, or the commit fails, the transaction will be rolled back. If this rollback fails, that is a critical error and you should likely abort.

Nested applications of with-transaction are supported -- only those statements executed within THUNK are committed or rolled back by any with-transaction call, as you would expect.

However, escaping or re-entering the dynamic extent of thunk will not commit or rollback the in-progress transaction, so it is highly discouraged to jump out of a transaction. You will definitely run into trouble, unless you can ensure that no other statements will be executed on this connection until the outermost with-transaction returns normally.

If you provide LEVEL (which can be the symbol read-committed or serializable) this will set the transaction isolation mode for the transaction. If you provide MODE (which can be the symbol read/write or read-only) this will set the access mode for the transaction.

LEVEL is only allowed in the outermost transaction (when in-transaction? returns #f); if you provide it in an inner transaction, an error is raised. In subtransactions, MODE can only be "downgraded" to read-only from inside a read/write transaction, but you can't "upgrade" to read/write from a read-only transaction.
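A minimal sketch of a transaction, assuming the call shape (with-transaction CONN THUNK) with the optional LEVEL and MODE left out. The accounts table, its columns and the helper name transfer! are hypothetical; CONN is assumed to be an open connection.

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Hypothetical helper; moves AMOUNT between two rows of an assumed table.
(define (transfer! conn from to amount)
  (with-transaction conn
    (lambda ()
      (query conn "UPDATE accounts SET balance = balance - $1 WHERE id = $2"
             amount from)
      (query conn "UPDATE accounts SET balance = balance + $1 WHERE id = $2"
             amount to)
      ;; Returning a true value commits; #f or an exception rolls back.
      #t)))
```

Both updates take effect together or not at all, which is the usual reason to wrap related statements in one thunk.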

[procedure](in-transaction? CONN)

Returns #t if there is currently a transaction in progress on the connection CONN. Returns #f if no transaction is in progress.

Low-level result API

This API allows you to inspect result objects on the individual row and column level.

[procedure](result? OBJ)

Returns #t when OBJ is a result object, #f otherwise.

[procedure](clear-result! RES)

Directly clean up all memory used by the result object. This is normally deferred until garbage collection, but it's made available for when you want more control over when results are released.

[procedure](value-at RES [COLUMN [ROW]] [raw: RAW])

Returns the value at the specified COLUMN and ROW. It is parsed by an appropriate parser unless RAW is specified and #t. If RAW is true, the value is either a string, blob or an sql-null object. Otherwise, it depends on the parsers.

If ROW or COLUMN are not specified, they default to zero. This makes for more convenient syntax if you're just reading out a result of a query which always has one row or even one value.

[procedure](row-values RES [ROW] [raw: RAW])

Returns a list of all the columns' values at the given ROW number. If ROW is omitted, it defaults to zero. If RAW is true, the values are either strings, blobs or sql-null objects. Otherwise, it depends on the parsers.

[procedure](column-values RES [COLUMN] [raw: RAW])

Returns a list of all the rows' values at the given COLUMN number. If COLUMN is omitted, it defaults to zero. If RAW is true, the values are either strings, blobs or sql-null objects. Otherwise, it depends on the parsers.

[procedure](row-alist RES [ROW] [raw: RAW])

Returns an alist of the values at the given ROW number. The keys of the alist are made up by the matching column names, as symbols.

If ROW is omitted, it defaults to zero. If RAW is true, the values are either strings, blobs or sql-null objects. Otherwise, it depends on the parsers.
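A small sketch that reads one result object with several of the accessors above. The query and the helper name are illustrative only; CONN is assumed to be an open connection.

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Hypothetical helper; collects a few views of the same result object.
(define (describe-first-row conn)
  (let ((res (query conn "SELECT 1 AS id, 'two'::text AS name")))
    (list (value-at res)          ; column 0, row 0 by default
          (value-at res 1 0)      ; column 1, row 0
          (column-values res 1)   ; every row's value in column 1
          (row-alist res))))      ; alist keyed by the symbols id and name
```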

[procedure](affected-rows RES)

For INSERT or UPDATE statements, this returns the number of rows affected by the statement that RES is a result for. Otherwise it's zero.

[procedure](inserted-oid RES)

For INSERT statements resulting in a single record being inserted, this returns the OID (a number) assigned to the newly inserted row. Returns #f for non-INSERT or multi-row INSERTs, or INSERTs into tables without OIDs.

[procedure](row-count RES)

Returns the number of rows in the result set.

[procedure](column-count RES)

Returns the number of columns in the result set.

[procedure](column-index RES COLUMN)

Returns the index of COLUMN in the result set. COLUMN should be a symbol indicating the column name.

[procedure](column-name RES INDEX)

Returns the name of the column (a symbol) at the position in the result set specified by INDEX. This is its aliased name in the result set.

[procedure](column-names RES)

Returns a list of all the column names (symbols) in the result set. The position in the list reflects the position of the column in the result set.

[procedure](column-format RES INDEX)

Returns the format of the column at INDEX, which is a symbol, either text or binary. This determines whether the value returned by value-at will be a string or a blob.

[procedure](column-type RES INDEX)

Returns the OID (an integer) of the column at INDEX.

[procedure](column-type-modifier RES INDEX)

Returns a type-specific modifier (a number) for the column at INDEX, or #f if the type has no modifier.

[procedure](table-oid RES INDEX)

Returns the OID (a number) of the table from which the result column at INDEX originated, or #f if the column is not a simple reference to a table column.

[procedure](table-column-index RES INDEX)

Returns the column number (within its table) of the column making up the query result column at the position specified by INDEX.

Note: This procedure returns indexes starting at zero, as one would expect. However, the underlying C function PQftablecol is one-based. This might trip up experienced Postgres hackers.

Value escaping

To embed arbitrary values in query strings, you must escape them first, to protect yourself from SQL injection bugs. This is not required if you use positional arguments (the PARAMS arguments in the query procedures).

[procedure](escape-string CONNECTION STRING)

Escapes special characters in STRING which are otherwise interpreted by the SQL parser, obeying the CONNECTION's encoding and escaping settings, using the escaping syntax for string contexts. This does NOT add surrounding quotes to the string; that's up to you to add.

Example:

;; This prevents people from changing a query's parse tree.
;; For example, they could try to turn a query like
;; SELECT * FROM USERS WHERE id='x'
;; into
;; SELECT * FROM USERS WHERE id='1' OR '1'='1'
;; by quoting the value for X, you get the intended parse tree:
;; SELECT * FROM USERS WHERE id='1'' OR ''1''=''1'
(escape-string conn "1' OR '1'='1") => "1'' OR ''1''=''1"

;; Depending on the value of standard_conforming_strings you might also get
(escape-string conn "1' OR '1'='1") => "1\\' OR \\'1\\'=\\'1"

;; Of course, when using these strings you still need to surround
;; the output of escape-string with single quotes

[procedure](quote-identifier CONNECTION STRING)

Escapes special characters in STRING which are otherwise interpreted by the SQL parser, obeying the CONNECTION's encoding settings and escaping settings, using the escaping syntax for identifier context. Identifiers are table names, aliases etc. Surrounding double quotes will be added.

This procedure corresponds to PQescapeIdentifier, but the name was changed to reflect the fact that it performs escaping and adds quotation marks around the string.

NOTE: This procedure is only available when the egg is built against the libpq from PostgreSQL 9.0 or later. If you are using an older version, this will raise an (exn postgresql unsupported-version) error with an upgrade message.

Example:

;; Spaces are normally not allowed in table names, but when you
;; quote them, they are allowed
(quote-identifier conn "a table with spaces") => "\"a table with spaces\""

;; Can't use a column or table called order because it is a
;; reserved word. However, escaping it makes it usable.
(quote-identifier conn "order") => "\"order\""

;; Table names are case-insensitive and always implicitly downcased.
;; If you need to access a table with a capital in its name,
;; quoting the table also helps:
(quote-identifier conn "Foo") => "\"Foo\""
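Identifiers cannot be passed as positional arguments, so this is where quote-identifier is typically needed. The following is a sketch only: count-rows is a hypothetical helper, and CONN is assumed to be an open connection built against libpq 9.0 or later.

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Hypothetical helper: counts the rows of a table whose name is only
;; known at run time.
(define (count-rows conn table-name)
  (value-at
   (query conn
          ;; The identifier is quoted before being spliced into the SQL text.
          (string-append "SELECT count(*) FROM "
                         (quote-identifier conn table-name)))))
```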

[procedure](escape-bytea CONNECTION STRING)

Quotes special characters in STRING which are otherwise interpreted by the SQL parser. This differs from escape-string in that some bytes are doubly encoded so they can be used for bytea columns.

This is required because of a technicality; PostgreSQL first parses the string value as a string, and then casts this string to bytea, interpreting another layer of escape codes.

For example, E'a\\000bcd' is first converted to 'a\000bcd' by the text parser, and then interpreted by the bytea parser as an "a" followed by a NUL byte, followed by "bcd". In Scheme, the value returned by (escape-bytea conn "a\x00bcd") is "a\\\\000bcd". Yes, that's a lot of backslashes :)

[procedure](unescape-bytea STRING)

This unescapes a bytea result from the server. It is not the inverse of escape-bytea, because string values returned by the server are not escaped for the text parser (i.e., step one in the encoding process described under escape-bytea is skipped).

COPY support

High-level COPY API

This API is experimental and as such should be expected to change. If you have suggestions on how to improve the API, please let me know!

This is the fundamental COPY TO STDOUT iterator. It calls (kons data seed) for every row of COPY data returned by QUERY, where data is either a string or a blob depending on whether the COPY query asked for text or binary data, and seed is the accumulated result from previous calls (initially knil), i.e., its pattern looks like (KONS DATAN ... (KONS DATA2 (KONS DATA1 KNIL))). It returns the final accumulated result.

Warning: It is not recommended to use this when the returned data is very big. It is usually much cheaper (memory-wise) to use copy-query-fold and reverse the result object, if the object's type supports that.

Maps COPY TO STDOUT output rows from QUERY to lists by calling PROC on each data row returned by the server. If the QUERY asked for binary data, the data supplied to PROC will be in blob form. Otherwise, the data will be provided as strings.

Like with-output-to-copy-query, except it calls PROC with one argument (the copy port) instead of parameterizing CURRENT-OUTPUT-PORT.

Low-level COPY API

This API is close to the C API. It requires you to first execute a COPY query (using the query procedure), and then you can put or get data from the connection. You cannot run other queries while the connection is in COPY state.

[procedure](put-copy-data CONNECTION DATA)

Put copy data on the CONNECTION. DATA is either a string, a blob or a u8vector and should be in the format expected by the server.

[procedure](put-copy-end CONNECTION [ERROR-MESSAGE])

This ends the COPY process. If ERROR-MESSAGE is supplied and not #f, the data sent up till now is discarded by the server and an error message is triggered on the server. If ERROR-MESSAGE is not supplied or #f, the server will commit the copied data to the target table and succeed.

A result object is returned upon success. This result object is currently not useful.
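A minimal sketch of sending rows with the low-level API. The items table and its two columns are hypothetical, as is the helper name; CONN is assumed to be an open connection. The data lines follow PostgreSQL's default text COPY format (tab-separated columns, one newline-terminated row per line).

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Hypothetical helper; "items" is an assumed table with columns id and name.
(define (copy-in-two-rows conn)
  ;; Put the connection into COPY state first.
  (query conn "COPY items (id, name) FROM STDIN")
  (put-copy-data conn "1\tfirst\n")
  (put-copy-data conn "2\tsecond\n")
  ;; No ERROR-MESSAGE, so the server keeps the copied rows.
  (put-copy-end conn))
```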

[procedure](get-copy-data CONNECTION [format: FORMAT])

Obtain one row of COPY data from the server. The data's contents will be in the format indicated by the server. If FORMAT is 'text, the data will be returned as a string; if it is 'binary, it will be returned as a blob. The user is responsible for providing the right format to match the output format of the query sent earlier.

After the last row is received, this procedure returns a result object (which can be detected by calling result? on it).
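A sketch of reading all rows until the terminating result object arrives. The items table and the helper name are hypothetical; CONN is assumed to be an open connection, and the default text format is used.

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Hypothetical helper; returns the COPY rows as a list of strings.
(define (copy-out-rows conn)
  (query conn "COPY items (id, name) TO STDOUT")
  (let loop ((rows '()))
    (let ((data (get-copy-data conn)))
      (if (result? data)
          ;; A result object marks the end of the COPY data.
          (reverse rows)
          (loop (cons data rows))))))
```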

Constants

[constant]invalid-oid

Represents the numeric value of the invalid Oid. Rarely useful, except perhaps when doing low-level operations in the system catalog.

Error handling

condition: postgresql

A condition of kind (exn <subtype> postgresql) is signaled when an error occurs. The <subtype> is one of the following:

query

There was an error while executing a statement or query.

parse

Something went wrong in a parser.

unparse

Something went wrong in an unparser.

i/o

Something went wrong while trying to read from or write to the connection.

connect

Something went wrong during (re)connections. This includes errors during connection reset.

bounds

An out of bounds error happened (e.g., trying to read from a nonexistent column or row index).

type

Invalid type was passed by the user.

domain

A value was passed in an inappropriate context.

unsupported-version

An operation was performed which is not supported by the client library.

internal

A truly unexpected error occurred (unrecognised status codes, etc).

There will always be a subtype, but for historical reasons the postgresql component of this condition contains properties related to query errors. This may change in a future version.

You'll always find all of these properties in the postgresql component of the condition, but most may have a #f value.

severity

One of the symbols error, fatal, panic, warning, notice, debug, info, log. Unfortunately, this symbol may also be translated/localised, so you should not dispatch on it in code; use error-class and error-code for that. Always present in query type subconditions.

error-class

A string representing a PostgreSQL error class (the first two characters of error-code). Always present in query type subconditions.

error-code

A string representing the full PostgreSQL error code (including the code class prefix). See the PostgreSQL documentation for a description of error codes and error classes. Always present in query type subconditions.

message-detail

A secondary (to the usual exn message property) message with extra detail about the problem.

message-hint

A string with a suggestion about what to do about the problem.

statement-position

An integer indicating an error cursor position as an index into the original statement string. The first character has index 1, and positions are measured in characters, not bytes.

context

An indication of the context in which the error occurred. Presently this includes a call stack traceback of active PL functions. The trace is one entry per line, most recent first.

source-file

The file name of the PostgreSQL source-code location where the error was reported.

source-line

A string containing the line number of the PostgreSQL source-code location where the error was reported.

source-function

The name of the source-code function reporting the error.

internal-query

A string containing the source text of an "internally generated" command where the error occurred (for example when you called a PL/PGSQL function which generates a query).

internal-position

An integer indicating the position in internal-query where the error occurred.
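A sketch of catching these conditions with CHICKEN's condition-case and reading a few of the properties listed above. try-query is a hypothetical helper and CONN is assumed to be an open connection. condition-case matches a clause when the condition carries all of the listed kinds, so the order in which the kinds are written does not matter.

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Hypothetical helper: runs SQL and turns PostgreSQL errors into lists.
(define (try-query conn sql)
  (condition-case (query conn sql)
    ;; Errors reported by the server while executing the statement.
    (e (exn postgresql query)
       (list 'query-error
             (get-condition-property e 'postgresql 'error-code)
             (get-condition-property e 'exn 'message)))
    ;; Any other subtype (connect, i/o, parse, ...).
    (e (exn postgresql)
       (list 'other-postgresql-error
             (get-condition-property e 'exn 'message)))))
```

For example, selecting from a table that does not exist would be expected to land in the first clause with error code "42P01" (undefined_table in the PostgreSQL error code list).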

Type conversion

Type information is read from the database the first time you connect to it. Result set values are either text or binary (or sql null). If they are text, they are converted to Scheme objects by type parsers, as described below. If they are binary, they will be returned as unprocessed blobs (which you can then convert to u8vectors or strings).

Parsers

[parameter](default-type-parsers [ALIST])

Postgres result values are always just strings, but it is possible to map these to real Scheme objects. With this parameter, you can map your own custom PostgreSQL data types to Scheme data types.

The alist is a mapping of Postgres type names (strings) to procedures accepting a string and returning a Scheme object of the desired type.

The parsers can also be set per connection with the TYPE-PARSERS argument of the connect procedure.

The default parsers are described below. For anything where no parser is found, the value is returned verbatim (which is always a string, or a blob in case of binary data).

Array and composite (row) types are automatically handled; unless a type-specific parser is defined, a parser is automatically created by combining the parsers for their constituent elements.

[procedure](update-type-parsers! CONN [TYPE-PARSERS])

As described above, type information is extracted from the system catalog whenever you initiate a new connection. However, there is a bootstrapping problem when you are defining custom data types. You must first connect before you can define your custom data types. But the type parsers do not have the information for this new type yet, so you must update them.

To do this, you can call update-type-parsers!. This procedure updates all the type parsers originally associated with connection CONN. By providing the optional TYPE-PARSERS, you can override the existing type parsers for this connection with new ones, otherwise the old ones are just refreshed.
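A sketch of the bootstrapping sequence described above. The enum type mood and the helper name are hypothetical; CONN is assumed to be an open connection whose user may create types.

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Hypothetical helper: defines a type, then teaches CONN how to parse it.
(define (register-mood-type! conn)
  (query conn "CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy')")
  ;; Refresh the type information and add a parser for the new type name,
  ;; keeping the parsers that were already associated with the connection.
  (update-type-parsers! conn
                        (cons (cons "mood" string->symbol)
                              (type-parsers conn))))
```

After this call, values of the assumed mood type should come back from queries on CONN as symbols instead of strings.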

[procedure](bool-parser STR)

Returns #t if the string equals "t", #f otherwise.

[procedure](bytea-parser STR)

Returns a u8vector containing the bytes in STR, after unescaping it using unescape-bytea.

[procedure](char-parser STR)

Returns the first character in STR.

[procedure](numeric-parser STR)

Returns STR converted to a number using decimal representation. If STR could not be converted to a number, raises an error.

[procedure](make-array-parser SUBPARSER [DELIMITER])

Returns a procedure that can be used to parse arrays containing elements that SUBPARSER parses. It will split the elements using the DELIMITER character, which defaults to #\,.

For example, to create a parser for arrays of integers, use (make-array-parser numeric-parser).

[procedure](make-composite-parser SUBPARSERS)

Returns a procedure that can be used to parse composite values (aka "row values"). It will use the list of SUBPARSERS to parse each element in the row by looking up the parser at the matching position in the list. For example, to create a parser for rows containing an integer and a boolean, use (make-composite-parser (list numeric-parser bool-parser)).

Unparsers

[parameter](default-type-unparsers [ALIST])

Just as PostgreSQL types are converted to Scheme types in result sets, Scheme types need to be converted to PostgreSQL types when providing positional parameters to queries. For this, the library uses type unparsers. Just like type parsers, you can override them either per-connection using the TYPE-UNPARSERS parameter to the connect procedure, or globally by changing a parameter.

This alist is a mapping of predicates to unparsers. Predicates are procedures which accept a Scheme object and return a true value if the object is of the type for which the unparser is intended. Unparsers are procedures which accept two arguments: the connection object and the Scheme object to unparse. Unparsers return either a string, a blob or an sql-null object to be used in the query.

It is not necessary to reload type unparsers after defining a new data type in the database.

Order matters; the type unparser alist is traversed from left to right, trying predicates in order and invoking the unparser linked to the first predicate that does not return #f. If none of the predicates match, the type must be of string, blob or sql-null type. If not, the query procedure will raise an error.
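A sketch of extending the default unparsers so that symbols can be passed as query parameters. symbol-unparser, custom-type-unparsers and connect-with-symbol-support are hypothetical names introduced for this illustration.

```scheme
(use postgresql) ; CHICKEN 4 syntax; assumed here

;; Unparsers take the connection and the object; this one returns a string.
(define (symbol-unparser conn sym)
  (symbol->string sym))

;; Prepending puts the new predicate first, so it is tried before the defaults.
(define custom-type-unparsers
  (cons (cons symbol? symbol-unparser)
        (default-type-unparsers)))

;; Per-connection override through the TYPE-UNPARSERS argument of connect.
(define (connect-with-symbol-support spec)
  (connect spec (default-type-parsers) custom-type-unparsers))
```

Because the alist is searched from left to right, a more specific predicate should be placed before a more general one that would also accept the same objects.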

[procedure](update-type-unparsers! CONN TYPE-UNPARSERS)

Similar to update-type-parsers!, this procedure allows you to update all the type unparsers originally associated with connection CONN.

[procedure](bool-unparser CONN B)

Returns "TRUE" for true values and "FALSE" for #f.

[procedure](vector-unparser CONN V)

Returns a string representing an array containing the objects in the vector V. The elements of V are unparsed recursively by their respective subparsers. It is the responsibility of the program to use correct values for an array; the elements should all be of the same type and, if they are vectors themselves, all vectors should have the same length and recursive vector depth. Otherwise, you will get an error from postgresql.

[procedure](list-unparser CONN L)

Returns a string representing a composite object (aka row value) containing the objects in the list L. The elements of L are unparsed recursively by their respective subparsers.

2.0.5 - Some bugfixes and pq:escape-string by Reed Sheridan; adapted to SRFI-69 hash-tables

2.0.4 - Changed usage of hygienic macros in setup script

2.0.3 - Bugfixes.

2.0.0 - Interface improvements. (Backward-incompatible.)

1.2.1 - Non-blocking queries.

1.2.0 - Optimizations, minor fixes and cleanups.

License

Copyright (C) 2008-2013 Peter Bex
Copyright (C) 2004 Johannes Grødem <johs@copyleft.no>
Redistribution and use in source and binary forms, with or without
modification, is permitted.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
DAMAGE.