In MySQL, ' is not for comments. Even if it were, there is no space before it so it can only end the string that precedes it.
– Lightness Races in OrbitSep 1 '11 at 12:14

40

As far as XKCD goes, if there is any question about some of the comics you can always go to Explain XKCD and have your answer figured out. There is even a XKCD wiki, which is very helpful for some tricky comics like XKCD geohashing
– AnatoliSep 14 '11 at 19:56

58

+1 for the fun question that we all need on StackOverflow once in a while. :)
– CodingInCirclesOct 29 '12 at 20:15

This is the naive way to add text input into a query, and is very bad, as you will see.

After the values from the first name, middle name textbox FNMName.Text (which is Robert'); DROP TABLE STUDENTS; --) and the last name textbox LName.Text (let's call it Derper) are concatenated with the rest of the query, the result is now actually two queries separated by the statement terminator (semicolon). The second query has been injected into the first. When the code executes this query against the database, it will look like this

The ' in the student's name is not a comment, it's the closing string delimiter. Since the student's name is a string, it's needed syntactically to complete the hypothetical query. Injection attacks only work when the SQL query they inject results in valid SQL.

Mmm, the WHERE with parentheses around the arguments is rather unusual, but at least it avoids a syntax error... :-)
– PhiLhoDec 2 '08 at 20:00

58

@PhiLho: If the original statement were an INSERT, then the parenthesis would make more sense. It would also explain why the database connection isn't in read-only mode.
– dan04Aug 10 '10 at 4:02

2

As @dan04 explains, the parenthesis makes more sense with an INSERT. Thinking backwards, the SELECT would not run anyway as the Insert of the Little Bobby Tables in the table would have already dropped the table.
– ypercubeᵀᴹJan 21 '13 at 21:41

8

Actually, in this example the first query ("add a new record...") will fail because Students expects more than just the one column (the original/correct statement provided two columns). That said, the presence of the second column is helpful to show why commenting is required; and since one cannot change Bobby's name, it's probably best leaving as-is with little more than this observation as a footnote.
– eggyalApr 27 '13 at 23:38

5

Bobby's last name - or at least his mother's, is Roberts, per Explain XKCD. I'm not sure that correcting that would improve the answer clarity, though.
– WBTAug 3 '16 at 18:29

As everyone else has pointed out already, the '); closes the original statement and then a second statement follows. Most frameworks, including languages like PHP, have default security settings by now that don't allow multiple statements in one SQL string. In PHP, for example, you can only run multiple statements in one SQL string by using the mysqli_multi_query function.

You can, however, manipulate an existing SQL statement via SQL injection without having to add a second statement. Let's say you have a login system which checks a username and a password with this simple select:

If you provide peter as the username and secret as the password, the resulting SQL string would look like this:

SELECT * FROM users WHERE username='peter' and (password='secret')

Everything's fine. Now imagine you provide this string as the password:

' OR '1'='1

Then the resulting SQL string would be this:

SELECT * FROM users WHERE username='peter' and (password='' OR '1'='1')

That would enable you to log in to any account without knowing the password. So you don't need to be able to use two statements in order to use SQL injection, although you can do more destructive things if you are able to supply multiple statements.

TL;DR

-- The application accepts input, in this case 'Nancy', without attempting to
-- sanitize the input, such as by escaping special characters
school=> INSERT INTO students VALUES ('Nancy');
INSERT 0 1
-- SQL injection occurs when input into a database command is manipulated to
-- cause the database server to execute arbitrary SQL
school=> INSERT INTO students VALUES ('Robert'); DROP TABLE students; --');
INSERT 0 1
DROP TABLE
-- The student records are now gone - it could have been even worse!
school=> SELECT * FROM students;
ERROR: relation "students" does not exist
LINE 1: SELECT * FROM students;
^

This drops (deletes) the student table.

(All code examples in this answer were run on a PostgreSQL 9.1.2 database server.)

To make it clear what's happening, let's try this with a simple table containing only the name field and add a single row:

The SQL injection here is the result of the name of the student terminating the statement and including a separate DROP TABLE command; the two dashes at the end of the input are intended to comment out any leftover code that would otherwise cause an error. The last line of the output confirms that the database server has dropped the table.

It's important to notice that during the INSERT operation the application isn't checking the input for any special characters, and is therefore allowing arbitrary input to be entered into the SQL command. This means that a malicious user can insert, into a field normally intended for user input, special symbols such as quotes along with arbitrary SQL code to cause the database system to execute it, hence SQL injection.

SQL injection is the database equivalent of a remote arbitrary code execution vulnerability in an operating system or application. The potential impact of a successful SQL injection attack cannot be underestimated--depending on the database system and application configuration, it can be used by an attacker to cause data loss (as in this case), gain unauthorized access to data, or even execute arbitrary code on the host machine itself.

As noted by the XKCD comic, one way of protecting against SQL injection attacks is to sanitize database inputs, such as by escaping special characters, so that they cannot modify the underlying SQL command and therefore cannot cause execution of arbitrary SQL code. If you use parameterized queries, such as by using SqlParameter in ADO.NET, the input will, at minimum, be automatically sanitized to guard against SQL injection.

In this case, ' is not a comment character. It's used to delimit string literals. The comic artist is banking on the idea that the school in question has dynamic sql somewhere that looks something like this:

So now the ' character ends the string literal before the programmer was expecting it. Combined with the ; character to end the statement, an attacker can now add whatever sql they want. The -- comment at the end is to make sure any remaining sql in the original statement does not prevent the query from compiling on the server.

FWIW, I also think the comic in question has an important detail wrong: if you're thinking about sanitizing your database inputs, as the comic suggests, you're still doing it wrong. Instead, you should think in terms of quarantining your database inputs, and the correct way to do this is via parameterized queries.

Thank you for your interest in this question.
Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).