Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

We have a SQL generator that emits SQL conditional statements generically for specified fields (which for the sake of discussion: we will label as myField).

So, if myField is of type NVARCHAR, we can do a comparison of said field against a string like so: myField = 'foo'. However, this does not work for fields of type NTEXT. Thus, we have to do the comparison with a cast: CAST(myField as NVARCHAR(MAX)) = 'foo'. This will in fact work if myField is of type NVARCHAR or NTEXT.

This brings me to my question: what is the performance hit of doing the aforementioned cast on a field that is already of type NVARCHAR? My hope is that SQL Server is smart enough to dynamically recognize that myField is already of type NVARCHAR (effectively turning the CAST into a NO-OP).

This question came from our site for professional and enthusiast programmers.

A quick note for anyone finding this question: NTEXT (and TEXT and IMAGE) are officially deprecated and due to be removed in some future version of SQL Server (though IIRC they still work in SQL1014), so you should be using NVARCHR(MAX) (or VARCHAR(MAX) or VARBINARY(MAX)) instead. Replacing the NTEXT column with an NVARCHAR(MAX) one in this instance would remove the need for the cast as the comparison can be done directly with that type, and there are other potential efficiency gains here and elsewhere too. Unfortunately you can't index a *(MAX) column, but you can't a TEXT/NTEXT one either.
–
David SpillettAug 1 '14 at 14:25

If the cast of the column is to the same datatype but greater length and the seek predicate is a string literal it causes an index scan. This is obviously to be avoided.

If the cast of the column is to the same datatype and the same or greater length and the seek predicate is a local variable it adds a compute scalar operator to the execution plan. This calls GetRangeThroughConvert and outputs a range.

SELECT *
INTO #test
FROM [master].[dbo].[spt_values]
CREATE NONCLUSTERED INDEX [ixname] ON #test
(
[name] ASC
)
DECLARE @name NVARCHAR(MAX)
SET @name = 'rpc'
SELECT name
FROM #test
WHERE CAST(name AS NVARCHAR(35))= @name --Cast the same and local variable
SELECT name
FROM #test
WHERE CAST(name AS NVARCHAR(MAX))=@name --Cast to longer and local variable
SELECT name
FROM #test
WHERE CAST(name AS NVARCHAR(35))='rpc' --Cast the same and literal
SELECT name
FROM #test
WHERE CAST(name AS NVARCHAR(MAX))='rpc' --Cast to longer and literal

Does not break SQL's ability to use indexes if indexes exist, and if they don't exist, may result in logging a missing index.

Similarly, when casting from int to tinyint or bigint to int etc, the cast function does not stop SQL from using indexes IF the optimiser knows that the cast operation does not change the sort order of the 2 comparable datatypes.

Here are a bunch of tests you can run and view actual plan using Adventureworks2008R2

select count(*) from Sales.SalesOrderDetail where SalesOrderID = 8 --1
select top 10 * from Sales.SalesOrderDetail where cast(SalesOrderID as tinyint) = 8 --2
select top 10 * from Sales.SalesOrderDetail where cast(SalesOrderID as bigint) = 8 --3
select top 10 SalesOrderID from Sales.SalesOrderDetail where cast(ModifiedDate as date) = '19780322' --4
select top 10 SalesOrderID from Sales.SalesOrderDetail where convert(date,ModifiedDate) = '19780322' --5
select top 10 SalesOrderID from Sales.SalesOrderDetail where cast(ModifiedDate as varchar(20)) = '1978' --6 -- THIS WILL NOT USE INDEX
select SalesOrderID from Sales.SalesOrderDetail where cast(ModifiedDate as date) between '19780101' and '19780109' --7

In general the CAST will kill performance because it invalidates any use of index seeks as Martin Smith's last example shows. CASTing to nvarchar(max) or to a different length means a different data type: the fact it's all nvarchar is irrelevant.

On top of that, the datatype of the right hand side of the compare matters too. If it's a local variable or parameter of a different length then one side will be implicitly CAST to the widest of the 2 datatypes (see datatype precedence).

Basically, if you have a general CAST to nvarchar(max) it will bollix things up. I would consider fixing the use of ntext before I added CAST all over.

Edit:

Martin Smith's first query invalidates most of this answer. Perhaps behaviour has improved SQL Server 2005 to 2008.. i need to investigate more