Category Archives: SQL

I haven’t written a SQL article in, well, quite awhile, and I guess this won’t really count as one either. Anyways, I tossed this up over on the SQL Team forum and I liked it so I’m putting it here.

The solution relied on a couple of arcane concepts and functions. First, I used PARSENAME which parses a string separated by periods. It’s basically the same idea as SUBSTRING_INDEX in MySQL only a bit less flexible because of the period.

I’ve talked about walking the tree before but, in essence, it breaks down a delimited string by joining a table of numbers and using that sequence to get, say, the third delimited value.

You have the following string.

12345678 SCINC, SCNRQ, SRPPR

The first part is the ORDER_ID and the rest of the string is the SKU’s attached to that order. Yeah, it’s a bit weird but you see things like this in converted older system or in some really ad hoc reporting tools.

Here’s what it does. @var1 does two things: it converts the comma/space to a period so PARSENAME can use it and it also removes the ORDER_ID field and using the variable makes it easier to explain. This is what it does:

Finally, this was for a single record. If we had full order entry system you would simply join the walkers table to the order data and limit the results to the number SKU’s for each record. My code does this but since it’s only for a single record it might seem confusing. Anyways, a full outer join or a non equi join would fit the bill.

I hope this isn’t too confusing. I did it on the fly and didn’t sequence it as well as I have in most of my examples. Honestly, I just liked the solution and wanted to add it here. Too often, I think, you see people run to a function to solve this kind of problem, especially on the SQL Server side of things when there are other ways to solve the problem.

The classic way to create row numbers in MySQL is to use a variable and increment it for each row. I did an example earlier showing it this way. In SQL Server, you just use ROW_NUMBER() or one of its variants. But, just for fun here is another way which was in the SQL Cookbook.

The SQL Cookbook also had an interesting approach, they used a COUNT(*). It’s a weird solution because it can only work on an ordered list of some sort. For example, if we alphabetized our list of countries it would work as follows:

I dunno, weird solution but it’s what he did, and it does work in some situations. I read that section really quick so maybe I missed something, something really obvious but I’ll stick to the variable for now.

Honestly, I don’t think I’ve ever done this outside of a book, but, it’s something that comes up so it’s worth mentioning. Fortunately, everyone happily works the same way so I’m just going to do this in MySQL.

Suppose we have two identical tables and we want to copy data from one to another. We could use an external tool, like BCP, or mysqldump, or we could just use a INSERT / SELECT as follows;

It would have been nice if it had dropped this error first, before I modified the column, but MySQL isn’t know for solid/clean errors. Anyways, just make sure that if you are switching columns around, for some reason, that you don’t bonk the data types.

Also, you must have an equal number of columns. For example, this will not work.

Creating a table is, more or less, similar among the three variants that I write about. You create a table and you roll with it. However, the CTAS statement (CREATE TABLE AS) is handled somewhat differently amongst them. So, I’m going to hit that one up.

Really, I’m just writing this up because outside of MySQL I always forget this syntax. Also, when you create a table this way, it won’t include

Originally, I’d planned this post to be about using CASE in an UPDATE. However, strangely, unexpectedly, it turned out that the big 3 all used the same syntax. So, this will just be a quick overview of the differences between SAS and MySQL / SQL Server using INSERT.

Here’s some SAS code that creates a table, inserts some data and then updates it.

There are two things worth noting in this code. First the INT data type. SAS accepts it, 9 others according to the Study Guide which it will then convert to one of the two SAS data types: number and character. In this case, INT is converted to a number.

I, finally, got smart and created a work database for MySQL and work schema for SQL Server. It should help my code look a bit more consistent, even if work. means one thing in SAS, another in MySQL and still another in SQL Server.

The other thing, which is what led me to write this post, is that SAS uses a VALUES clause before each row in an INSERT. As you’ll see in the MySQL code (it’s the same with SQL Server) we only use a single VALUES clause.

Interestingly, you can recreate a table, albeit this is a temporary table, by running a CREATE TABLE on top of it, which is why there are only 3 values in this table. If you have multiple columns in the table, you treat them exactly like you would an UPDATE statement.

“We’re developing a new citizenry. One that will be very selective about cereals and automobiles, but won’t be able to think.”
Rod Serling

SAS is particularly vague about what ALL does in the case of INTERSECT but it’s essentially the same as UNION and UNION ALL. In other words, it returns all records that match both tables in the INTERSECT rather than only unique records which is what a simple INTERSECT would do.

SQL Server does not support INTERSECT ALL and MySQL does not support INTERSECT at all.

I hope this is clear to anyone reading this. I always found SET operators really clean. Unfortunately, we’re about to enter the Twilight Zone as SAS has one that is unique to them, but then, that’s “situation normal” for SAS.

As you can see OUTER UNION simply appends the data and creates new columns for the second table. The other thing it does is set values to missing in the new, and old, rows. Bizarrely, the result set doesn’t seem to care if you have the same name. It just happily returns the new column. Fortunately, SAS has a solution for this: CORR.

CORR

I’m going to deal more with CORR in a future post but this is a useful place to put it because it’s about the only way that a OUTER UNION makes sense. What CORR does is to combine records that have the same column name. For example, we could change the prior query as follows:

That’s really clean compared to most of my posts, and, it’s going to stay that way. The reason? I’m not going to mess with SAS and I’m going to let MySQL go right ahead not doing it. If you decide to do it in MySQL, and one of these days, I’ll do it, you need to get a bit creative with a JOIN or a couple of other options such as a correlated subquery and NOT EXISTS.

OK, I talked myself into it, but only because I literally just did this the other day.

First, I’ll create a new table that has just the records where sex = ‘M’

I had to write that about 10 times because I kept forgetting to change the SELECT clause. Yeah, for me, SELECT = ‘HARD’, EXISTS = ‘Easy’.

The only part of this query that really matters is the WHERE clause. Essentially it uses a correlated subquery to compare the two data sets and where the record NOT EXISTS in allMale, it is returned to the result set. It’s not very clean, and all things being equal, I’d rather use EXCEPT.

OK, the SELECT matters too. I have to use the DISTINCT there so the query won’t return duplicate values, or, as the documentation I’ve read for EXCEPT says “unique values only”.

And just for kicks, Oracle doesn’t have an EXCEPT operator either. Over on that part of the ranch, they roll with MINUS.

Fortunately, I’m done with EXCEPT, except, until I do the “SAS is special” operator article.

The easiest set operator to deal with is a UNION. It’s basically a data set appended to a data set. It has a couple of minor quirks, column naming, ordering, and then, of course, SAS picked the path least chosen, but for the most part, anytime you see a UNION, or a UNION ALL, it’s going to work the same way.

As you can see the second SELECT is added to the first set. Typically, the data types do not have to match, as automatic type conversion will take place. Except, of course, if you are working with SAS, where the following will happen:

proc sql;
SELECT name, age
FROM mysql.admit
UNION
SELECT age, name
FROM mysql.admit;
quit;
ERROR: Column 1 from the first contributor of UNION is not the same type as its counterpart from the second.
ERROR: Column 2 from the first contributor of UNION is not the same type as its counterpart from the second.

A straight UNION will remove duplicates similar to what a DISTINCT does in a SELECT statement.

If you, instead, write the query as a UNION ALL it will include all rows, including any duplicates. This applies in SAS, although, they approach the ALL keyword slightly differently but I’ll deal with that in a later article.

You can also alias a UNION, however, the alias must be applied to the first SELECT statement’s columns. You cannot apply it to later columns, well, you can, but as you can see it won’t work. For example:

mysql> SELECT 1,2
-> UNION
-> SELECT 1,2,3;
ERROR 1222 (21000): The used SELECT statements have a different number of column

In other words, SQL Server, and MySQL, expect that a UNION operator will have the same number of columns. SAS, doesn’t care, and simply NULLs, or in SAS terminology, sets those fields to missing.

SET operators, all things considered, are really easy to use, assuming SAS doesn’t get involved and even there, as long as you stay with a UNION or UNION ALL, it’s sort of, maybe, kind of, close.

As far as the SAS Advanced test goes, I think I’ll keep in mind that SAS allows a different number of columns, and, it doesn’t allow different data types in a column. It also has other quirks but I’ll deal with those in later posts.