Debugging SQL Series: Syntax Errors

When writing SQL queries there are many places where a small mistype can cause you to receive an error. We will review some of the most common errors due to syntax mistakes. Note: this article will cover how common syntax errors affect Postgresql, if you are using a different type of SQL please check the documentation for these scenarios.

Spelling

Misspellings are the most common cause for error in SQL. Unfortunately, SQL will not autocorrect mistyped keywords, tables, columns, or values.

Check keyword spelling by referring to the documentation for the type of SQL you are using.

Check table spelling by referring to the database schema.

Check column spelling by referring to the database schema or doing SELECT * FROM the table you are trying to check the column name on. NOTE: putting limit 1 on that query will make it run quickly.

Check value spelling by doing SELECT * FROM table GROUP BY the column where the value is in. Then look through this column to find the value you were trying to match.

Single vs. Double quotes

A very common, yet not so obvious syntax error you can make is to use the wrong type of quotation marks in your query. This can change the meaning of what you are referencing inside the quotes.

Examine the following two examples:

Single quotes indicate you are referencing that text string

Double quotes indicate you are referencing a table name or a field.

Different types of SQL handle a single quote ‘ and a double quote “ differently so if you get an error such as “ERROR: column “age” does not exist,” try switching to using the other type of quotation around the text and if that does not work consult the documentation for that specific type of SQL.

Capitalization

If you reference a column or table and put it in quotes and the capitalization does not match what it is in the database, you will get an error since that table or column does not exist in the database. If you do not use quotes to reference the column or table it will run as expected.

For the table above the keyword select, the table facebook, and the column name without quotes can be written as:

You can get the same result with whatever use of upper and lower case letter:

Table and column with quotes must be written lowercase since that is how they are in the table above:

There are some capitalization conventions that should be followed though they are not required. The main one being that the keywords for clauses should be written out in all caps. This helps with the readability of the query.

While this capitalization of Select and From will work:

This is all capitalization of SELECT and FROM is preferred:

Data Types

A common syntax error is not using the correct data type when comparing a value in a field with some constant in your query. Sometimes when a number is stored in a database it is stored as a string. This would cause the following query to error.

The query below will produce an error because the data type of 27 in the WHERE clause is numeric but the field age is a text data type:

To get the answer we expect we must put 27 in double quotes “27”

Clause Order

When writing a SQL query you have to place the clauses you use in this order.

SELECT

FROM

JOIN

WHERE

GROUP BY

HAVING

ORDER BY

LIMIT

You do not need to include all of these clauses in a SQL query but you do need at least the first two. The most common mistake is placing the HAVING clause ahead of the GROUP BY clause. Or placing the WHERE clause after the GROUP BY.

WHERE filters rows before they are grouped.

HAVING filters the newly grouped rows.

Conclusion

Be careful when writing a SQL query. If you get an error first check for these common syntax errors before delving more deeply into what might be the cause.

About Matt David

Hi! I'm Matt David. I have over 7 years industry experience using data and currently work at Udacity as Product Lead for the School of Data Science. Data has become a prerequisite skill set for more and more non-data jobs and I am passionate about making data concepts more easily understood to increase data literacy for everybody.