My Links

Advertisement

News

Welcome to my weblog. My name is Jeff Smith, I am software developer in Boston, MA and I was recently named a 2009 SQL Server MVP. Check in frequently for tips, tricks, commentary and ideas on SQL Server and .NET programming.

A occasional question seen in the forums, which was just recently asked today, is:

"I know I can use

SELECT * FROM table

to get all of the columns from a table, but is there a way to write

SELECT * minus columnXFROM Table

to get all of the columns except for certain ones?"

Now, my goal isn't to debate whether not "SELECT *" is bad or good or should be used or not. The fact of the matter is, people use it all the time because it is quicker and shorter than typing out all of the column names. People are lazy, right? (myself included.) And these lazy people often would like to return all of the columns from a table except for one or two without typing them all out. So, why isn't this concept allowed? Would it make any sense?

Here's my usual response:

First, the answer is no, SQL does not support that. You must specify what you want.

Second, if you ask me "why not? It would be great!", let me ask you this in return:

"If SQL did support that syntax, and you executed

SELECT * minus ColumnX
FROM Table

but columnX did not exist in that table, what should happen?"

I think it's kind of an interesting thing to think about. Should an error occur? Or should it just happily return the results, since we didn't want ColumnX anyway? Maybe it should issue a warning? I really don't know ... Would we all ever be able to agree on a definitive, logical way to handle this?

Here's another way to think about it: What if there were a command called "DontExec" that simply didn't execute the stored procedure specified. What should happen if you call DontExec on a stored procedure that doesn't exist?

That sounds kind of silly, I know, but it is the same basic thing to consider as if you had a "* minus Column" option in a SELECT statement. Shorter to type? Sure. Does it make any sense? Not really.

By the way -- the third part of my response is usually this: the easiest solution is to simply use SSMS, QA, or EM to assist you in building your column list by using the scripting features or the query builder if you don't want to type it all out.

(UPDATE: As Denis points out in the comments, it's actually even easier. Now you really have no excuse to use *. )

Frankly, the only time someone should be selecting * is for some adhoc data discovery; in which case, the inclusion of some extraneous columns shouldn't matter.

Production level code should explicitly list the columns to be returned. The principal reason for this is to insure against schema changes. When a field is added to a table, select * will return it -- sometimes with unexpected results.

With the myriad of IDEs available for database developers, including the entire list of columns should not be a burden

If you run a query like that often you should create a view that gives you just what you want. Otherwise specify what you want when you type the query. You can even create sql files with the queries that you use often. But I don't see a compelling reason for the except clause.

I use "Denis the SQL Menace"'s process to avoid the "SELECT *" : in Mgmt Studio (same applies to Query Analyzer) highlight the table you're doing the SELECT on, press CTLR+D (to put the results in grid-mode), then press ALT+F1 (to get the table metadata) then highlight the column_name column in the 2nd table & you can paste it into your query window & do the grunt work of replacing newlines with ", " to build the column list.

I also remap CTRL+1 keystroke to sp_helptext so I can quickly view the contents of views/functions/sprocs w/out having to navigate the object explorer tree.

I don't think the question is misguided at all. Incorporating "except" logic into the SELECT statement would dramatically improve readability.

Suppose I have 26 columns, each named after a letter in the alphabet. I want all but one of them. This could be important if, say, one columns is freeform text that takes up a lot of space.

I could type: select a,b,c,d,e,f,g,h,i,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,zBut which field did I leave out? Not obvious, is it?Wouldn't it be clearer o say this? select * except jReadable code is much easier to maintain.

I appreciate the suggestion of using a tool to autogenerate the query, but such tools aren't available in every environment, and won't produce code as readable as "select * except j".

IMO, it should not throw an exception. It's analogous to C# or Java ArrayList. If you have a list, and you ask, remove(x), but x is not in the list, it does nothing. Your only goal is that x not be in the list. If it wasn't in the list to begin with, then throwing an exception is just annoying because now you first have to check, if (list.contains(x)) list.remove(x), which is just dumb.

C# kind of screwed up Dictionary<U,T> this way, that if you say Object y = dict[x], but x is not in the dictionary, it will throw an exception, so then you end up using dict.TryGetValue(x)... lame. But Hashtable ht[x] returns null like a good boy.

In the projects my team codes, we do a lot of stuff with business objects, and have API abstraction layers that return business objects. So, we use "select *" most of the time because most of the data stored in the tables were from user input, and this data has to be in the business object.

re: SyntaxI think "minus" is fine, as long as you could do:select *, minus(COL_X, COL_Y) from table ...just as long as you don't have to type minus infront of every column. Or maybe even:select *, -(COL_X, COL_Y) from table ...

rkippen -- close, but not quite. you are forgetting something. in your examples, x is data, it is not code! x is not an object in your application, or a table or column in your database, it is simply a piece of data. That is the key difference!

Well, the most annoying part about not being able to do this is in INSTEAD OF triggers.Sure we don't want the SELECT * in the usual code, but in an INSTEAD OF trigger MSSQL forces you to recreate your original order since it totally lacks BEFORE triggers.

So it's VERY annoying to have to update that trigger every time the table needs updating. It would be extremely handy to just be able to sayINSERT INTO xSELECT * FROM inserted MINUS (keyColumn)

After all, keyColumn that's autoID will prevent that from working if it's included, forcing you to fully list every field, which has to be maintained for something that ya really should have a before trigger for.

Is there really a performance issue using * instead of a column list if you need all columns and the only way to find out all the names of the columns from your application code is to loop through a load of reflection data and build a column list string? There is also a performance issue in transmitting a huge number of column names to the database.

Further more in a well design software package you would not expect the schema to change for no reason, only through the software upgrading the database so there is really no need to protect yourself in this instance from selecting extra columns which might lead to an unexpected result. In fact if all database code is handled by a few utility classes in the model layer then it will insulate the rest of the application from unexpected data.

Just my 2ps worth, and I would LOVE a feature that allowed me to select all but one column.

I searched for how to not include only specific columns in a query because the design I'm working on could be aided by it. I know most DBA's and DB developers will cringe (and their heads may explode) at this concept, but this would be monumentally useful in a dynamic data environment. What I mean by this is you have a primary table with a number of fixed columns, and a secondary table that is joined to the primary via a common key. This secondary table can contain n number of columns, as it is customizable. The result is that joining the tables repeats the key (once for the primary table, once for the secondary table). This introduces problems in the application logic if it is expected to accommodate the dynamic nature of the data. The options become to either generate a SP that can dynamically generate the correct SELECT statement, or force the application to look for duplicate key fields and not include them in the result set, which adds a layer of complexity to the binding scheme. Is this whole concept solid in the world of relational database theory? Probably not, but as the worlds of static, strongly typed data tries to meet the needs of a truly object oriented programming model, the expectations for the data store have to be modified.