T-SQL String Manipulation Tips and Techniques, Part 2

Last month, I started a two-part series about string manipulation tips and techniques. I continue the series this month, focusing on type conversions. I discuss how to handle cases in which you need to convert a character string value to another type when there's a possibility for the conversion not to succeed. I show how to handle this need in SQL Server 2008 R2 and earlier, as well as introduce new capabilities in this area in SQL Server 2012. (See "T-SQL String Manipulation Tips and Techniques, Part 1").

Converting a Character String to a Different Type

Consider a situation in which you're given a character string input (call it @str) that represents an integer. You're supposed to convert it to an integer typed value. The tricky part is that there's no assurance that the string will in fact successfully convert-for example, because of data quality issues. In case the value isn't convertible, you're supposed to return a NULL. Your solution isn't supposed to generate an error.

In SQL Server 2008 R2 and earlier, this isn't a straightforward task. One way to handle it is to use a CASE expression with a number of predicates in the WHEN clause that ensure that the value is convertible, and only if it is, actually convert it in the THEN clause. Here's how such a CASE expression might look:

DECLARE @str AS VARCHAR(100) = ' -1759 '; SELECT CASE WHEN ISNUMERIC(@str) = 1 AND @str NOT LIKE '%[^0-9!-+ ]%' ESCAPE '!' AND CAST(@str AS NUMERIC(38, 0)) BETWEEN -2147483648 AND 2147483647 THEN CAST(@str AS INT) END;

The first predicate uses the ISNUMERIC function to ensure that the input is convertible to some numeric type. Note, though, that ISNUMERIC returns 1 even in cases in which the input isn't convertible to an integer but rather to some other numeric type (e.g., NUMERIC, FLOAT, MONEY). For example, in all of the following three cases, ISNUMERIC returns 1:

The second predicate further verifies that there are no symbols besides digits, plus and minus signs, and spaces. Still, the value could be greater than the maximum INT value 2147483647 or less than the minimum -2147483648. Namely, it could successfully convert to a NUMERIC(38, 0) type but not to INT. So the last predicate converts it to NUMERIC(38, 0) and checks that the result falls within the range supported by INT. If all three predicates evaluate to true, the THEN clause can safely convert the input to INT. Otherwise, the implicit ELSE NULL will have the CASE expression return a NULL. The solution is a bit convoluted, as you can see.

In SQL Server 2012, life is good and simple. To achieve the same task, you simply use a new function called TRY_CONVERT, like so:

SELECT TRY_CONVERT(INT, @str);

If the input is convertible to the target type, the function converts it; otherwise, it returns a NULL.

Avoiding Conversion Failures in Dynamic Schema Scenarios

There might also be cases in which you need to convert values from a character string to a different type as part of a query in a dynamic schema scenario. Avoiding conversion failures in those cases isn't always a trivial thing. To demonstrate what I mean, I'll use a table called Properties that the code in Listing 1 (below) creates and populates with sample data. This table holds object property values.

As you can see in Listing 1 (below), the val attribute is of a VARCHAR(500) data type, and the type attribute indicates the data type of the value. You're guaranteed that the value is convertible to the data type indicated by the attribute type. So, for example, if the type is 'INT', the value is convertible to INT.

Suppose that you need to query all rows where the type is of the integer family ('TINYINT', 'SMALLINT', 'INT', 'BIGINT'), and out of all integer values, filter only those that are greater than 10. You issue the following query:

SELECT name, type, CAST(val AS BIGINT) AS val FROM dbo.Properties WHERE type IN ('TINYINT', 'SMALLINT', 'INT', 'BIGINT') AND CAST(val AS BIGINT) > 10;

You assume that SQL Server supports a short circuit concept: First, it evaluates the predicate type IN ('TINYINT', 'SMALLINT', 'INT', 'BIGINT'), and only if it's true, proceeds to evaluate the predicate CAST(val AS BIGINT) > 10. You assume that if the first predicate isn't true, there's no reason for SQL Server to proceed to the second predicate, and hence no reason for it to fail on a conversion error.

Reality is different, though. SQL supports a concept called all-at-once, meaning that all expressions that appear in the same logical query processing phase (e.g., the WHERE filtering phase) are conceptually evaluated at the same point in time. This means that SQL Server can start with whichever predicate it prefers, not necessarily in left to right order. So SQL Server might choose which predicate to start with based on optimization considerations. In other words, this code might fail, and if it does, it's not considered a bug. Sure enough, when I ran this code on my system, it failed with the error that Figure 1 shows.

The fix that will let you avoid conversion errors is to use the techniques I described in the previous section. In SQL Server 2008 R2 and earlier, you can use a CASE expression and actually perform the conversion only if you know that the type is of the integer family, or otherwise return a NULL. Here's the revised query code:

SELECT name, type, CASE WHEN type IN ('TINYINT', 'SMALLINT', 'INT', 'BIGINT') THEN CAST(val AS BIGINT) END AS val FROM dbo.Properties WHERE type IN ('TINYINT', 'SMALLINT', 'INT', 'BIGINT') AND CASE WHEN type IN ('TINYINT', 'SMALLINT', 'INT', 'BIGINT') THEN CAST(val AS BIGINT) END > 10;

In SQL Server 2012, the solution is much simpler using the new TRY_CONVERT function, like so:

Suppose that you decide to define a view that filters only integer properties, like the one in Listing 2 (below).

According to logical query processing in SQL, the WHERE filtering phase is supposed to be evaluated prior to the SELECT phase, so you assume that the conversions should take place only after integer values have been filtered. You then submit the following query against the view:

SELECT name, type, val FROM dbo.IntegerProperties WHERE val > 10;

According to standard SQL, this query isn't supposed to fail. But for optimization reasons, SQL Server expands the definition of the view and rearranges the logic to query the underlying table directly. So behind the scenes, this query actually translates to:

SELECT name, type, CAST(val AS BIGINT) AS val FROM dbo.Properties WHERE type IN ('TINYINT', 'SMALLINT', 'INT', 'BIGINT') AND CAST(val AS BIGINT) > 10;

Based on what I explained previously, you already know that this form might fail.

To fix this problem in SQL Server 2008 R2 and earlier, you need to alter the view's definition to use a CASE expression, as in Listing 3 (below). Try the same query again, and this time it shouldn't fail:

SELECT name, type, val FROM dbo.IntegerProperties WHERE val > 10;

In SQL Server 2012, in the view definition, you can use the simpler fix with the TRY_CONVERT function, as in Listing 4 (below).

Type Conversion Made Simple

In SQL Server 2008 R2 and earlier, converting a character string input to a different type isn't a trivial task when there's a possibility for the value not to be convertible. You can use a CASE expression with some tricky logic to first check whether the value is convertible, and only when ensuring that it is, actually convert. SQL Server 2012 simplifies this need by introducing the TRY_CONVERT function. If the value isn't convertible to the target type, the function returns a NULL, as opposed to generating an error. You can also use a CASE expression or the TRY_CONVERT function to get around conversion failures in dynamic schema scenarios.

From the Blogs

My initial goal in writing this series of posts was to outline some of the concerns surrounding Availability Groups (AGs) and SQL Server Agent Jobs – and call out how there is virtually no guidance from Microsoft on this front and then detail some of the pitfalls and options available for tackling this problem domain. I initially expected this series of posts to have between 25 and 30 posts – according to some of the early outlines I created ‘way back when’....More

Throughout this series of posts I’ve taken a somewhat pessimistic view of how SQL Server Agent jobs are managed within most organizations – meaning that most of the code and examples I’ve provided up until this point were based on assumptions about how CHANGE to jobs is managed. That pessimism, to date, has come in two forms:...More

In this series of posts I’ve called out some of the concerns related to SQL Server AlwaysOn Availability Groups and their interaction with SQL Server Agent jobs – both in the form of Batch Jobs (see post #3) and backups....More