My Links

Advertisement

News

Welcome to my weblog. My name is Jeff Smith, I am software developer in Boston, MA and I was recently named a 2009 SQL Server MVP. Check in frequently for tips, tricks, commentary and ideas on SQL Server and .NET programming.

I see many people have the need to pass in multiple pieces of information to stored procedures in a parameter, often as a CSV string. For example, you might have a stored procedure that will return customer info for all customer ID's passed in a CSV string, called something like this:

exec GetCustInfo '123,322,12,9122,1'

That's all fine and good, but it can take a little bit of work to process the string, find the commas, strip the whitespace, and then use the results. The extra work is required because of the CSV format chosen for the input parameter -- it creates a little more difficulty because you need to parse it to find commas, and to make sure you don't miss the last item or the first item passed due to the lack of surrounding commas on those items.

But remember -- YOU write the stored procedure, and YOU can specify the how data should be passed to it. CSV's look nice to us humans, but why not require that a fixed length string is passed in with the customer ID's? You can specify that your string requires ID's to be exactly 4 characters long, padded with spaces, and all concatenated together:

exec GetCustInfo '123 322 12 91221 '

This is much easier to parse, and it is no harder for the client to build this string when calling the procedure. In fact, it is probably much easier for the client because you don't have to deal with making sure you don't leave a beginning or trailing comma on the string. Sure, it doesn't look as nice as a comma-separated string of ID's, but who cares -- no one ever sees this, and the computer certainly doesn't care which method is more readable to human eyes !

When using a fixed length string, the # of elements passed equals the length of the string divided by the length of each element. You can easily check to make sure the input is valid, and it is trivial to join to a Tally table (a table of numbers, starting at 1, permanently stored in your database) to parse the elements immediately and quickly:

And that's pretty much it -- put the SELECT into a temp table or table variable, or join to your Customers table, do whatever you need. You may need to add an RTRIM() or a CONVERT() to get the datatype of your ID columns, but that is also very easily done. The point is -- you are the one deciding the specification for your stored procedure, you decide the rules, so make it easier, quicker and more direct by using a simpler format for the input strings if possible.

I've seen so many questions in the forum at SQLTeam about parsing CSV strings and asking “why doesn't WHERE ID IN (@CSV) work ?” that I thought it would be worthwhile to take a minute to provide some ideas on an easier way of solving this problem.

Good programmers are almost always a little lazy by nature -- they want to write shorter, more direct, more efficient, and more easily maintainable code. These are all good things!

That's exactly it. We needed more than just ids or numbers. Sure, we do have some cases where we take lists of numbers (or better, lists of guids, which are fixed length by definition), but a good portion of our code dealing with input CSVs needs to take in a list representing some structure, with one or more variable length strings. That's all I meant by saying that you can't always make the assumptions necessary for doing fixed-length splitting.

You're absolutely correct that the formatting required for the input list is a defined contract, and a caller could certainly break that contract even in a delimited implementation. It wasn't my intent to imply that using fixed length strings is more fragile than using delimiters, just that it doesn't give you as much flexibility (in exchange for speed, of course).

Great point, Todd -- I probably should have mentioned right in the beginning that this really only works for "non-freeform" data like ID's or Codes or GUID's -- I can see that it might not have been clear that that was the perspective I was taking.

(it realyl bothers me that I just used the word "that" twice in a row -- is that proper Enligh?)