This document describes the syntax that the SQL Parser for EasyMOD accepts. While it is largely MySQL based, not all options available in the MySQL syntax are supported. phpBB supports several SQL servers with its Database Abstraction Layer (DBAL) implementation, hence a minimal set of options available to perform DB operations needs to be defined. This is the main goal of the SQL Parser, that is provide support for as much options as possible while making sure all those options are portable.

NOTE: These standards are under development and most importantly open to discussion. However, as being said, whatever is implemented should be portable to, at least, all SQL servers supported by phpBB.

Syntax Conventions

This section describes the general rules applied to SQL statements and their components:

No explicit rule is applied to case. However, keywords and identifiers are case insensitive for almost all SQL servers. While the parser accepts any lettercase, we recommend use of keywords written in uppercase and identifiers in lowercase, for readability, as well as to prevent possible portability related issues. All SQL statements generated by the SQL Parser will use keywords in uppercase and identifiers written as found in the input. Read the Identifiers section for other rules applied to identifiers, though.

Syntax Diagrams and Tables used in this document adhere the following conventions:

CONVENTION USED FOR
---------------- ----------------------------------------
UPPERCASE SQL keywords.
lowercase User supplied elements such as SQL identifiers or constants.
<label> Name for a block of syntax. This convention is used to group
portions of lengthy blocks of syntax that can be used in more
than one place.
| (vertical bar) Separating optional syntax items within brackets or braces
Only one of the specified items can be choosen.
[ ] (brackets) Optional syntax items (brackets aren't part of the input).
{ } (braces) Required syntax items (braces aren't part of the input).
[,...] The preceding item can be repeated n number of times
(separated by commas).
[ ...] The preceding item can be repeated n number of times
(separated by blanks).

SQL statements must be terminated with a semicolon (;). Note the semicolon is required, even if only one statement is specified or for the last one.

SQL statements can be splitted into several lines.

Several forms of comments are allowed (described below).

Indentifiers

Table, index and column names are user supplied identifiers. To minimize portability related issues, all user supplied identifiers are subject to the following rules:

Backticked (quoted) identifiers: Some SQL servers allow to use non-ASCII characters, or even Reserved Keywords, when identifiers are backticked (MySQL) or double quoted (other SQL servers). However, even if backticked, a valid identifier for MySQL can be considered illegal on other SQL servers. Therefore, backticks are ignored and the rest of the rules enforced to minimize portability related issues.

Reserved Keywords: Can't be used. To minimize portability related issues identifiers are checked against a huge list of reserved keywords extracted from the following sources:

Standard SQL string constants are arbitrary sequences of characters delimited by single quotes.

ie. 'This is a standard SQL string constant'

Double quotes are used in SQL to quote identifiers, except MySQL, which uses backticks for this purpose. MySQL (therefore this SQL parser too) also accepts double quotes to delimit string constants.

ie. "This is NOT standard, but still considered a string constant by MySQL, and so it is here"

Quotes used as string delimiters may be escaped by themselves or by backslashes.

ie. 'Here''s an example' 'Here\'s another example'

Character set specifiers and collations are not supported.

Numeric Constants:

Numeric constants include Intergers and Floating-Point Numbers. Numeric constants may or may not be enclosed in quotation marks. However, the SQL parser will remove the quotes when possible to prevent the SQL server from performing redundant and unnecessary type conversions. It is not always possible for the parser to know if a numeric constant should be un-quoted (for instance, when setting a default value in an ALTER statement), therefore it is recommended to not quote numbers.

Integer constants are represented by a sequence of numbers (0-9), optionally preceded by a hyphen (-) to indicate a negative value. Integers are accepted in the following format:

Simple SQL comments and C-style comments are allowed. Comments are removed from the input before parsing the SQL statements. Note comment markers (described below) inside of string constants should be (in fact, are) considered part of the constant.

Simple SQL comments:

Simple SQL comments are introduced by two consecutive hyphens (--) or by a hash character (#). All input from (and including) any of these comment markers until the end of the line is ignored.

The hash (#), when found on the first character of a line, is used by the MOD Template to delimit MOD Actions (note hashes preceded only by whitespaces may still be considered MOD action delimiters). However, you can still use the hash in the middle of a line or at end of a SQL statement to add comments.

ALTER TABLE phpbb_users ADD user_flag1 SMALLINT; # This is a SQL comment.
ALTER TABLE phpbb_users ADD user_flag2 SMALLINT; -- This is a SQL comment.
#
# This is NOT a SQL comment!
# Note it will mark the end of the SQL action in the MOD Template!

C-style comments:

C-style comments start with /* and end with */ and may affect multiple lines.

(D) is an optional argument used to specify display width for integer data types.(S) is an optional argument used to specify field size, in chars or bytes depending on the data type.L in (L[,D]) is the total number of digits (including decimals).D in (L[,D]) is the total number of digits after the decimal point.
Note (L[,D]) May also be specified for floating-point data types. However, it is not recommended since the meaning of these arguments may vary between SQL servers.

The following tables detail assumed storage sizes used for integer types:

The column "DW" means "Display Width", which is the maximum recommended value to use as the (D) optional argument for integer types. This value is used by the parser to decide if an unsigned integer should be promoted to a higher size type when converting SQL statements for SQL servers that have no support for the unsigned attribute, such as PostgreSQL, MSSQL, etc.

The length attribute (next to the column name and enclosed between parentheses) is optional for CHAR, VARCHAR, BINARY and VARBINARY data types and required for TINYTEXT, TEXT, MEDIUMTEXT and LONGTEXT data types.

The order attribute is optional and defaults to ASC.

For the column_name syntax, please refer to the Identifiers chapter.

Last edited by markus_petrux on Sun Jan 01, 2006 5:13 am, edited 4 times in total.