November 22, 2006

Dealing with CVErr Values in .NET – Part I: The Problem

First of all, I want to thank Dennis for setting me up as a guest author on his “.NET & Excel” blog, which I think is off to a fantastic start. I’m honored to be able to contribute.

I didn’t really chose this topic “Dealing with CVErr Values in .NET,” it chose me. I was experimenting with some User Defined Functions (UDFs) created using a Managed COM Automation Add-in and discovered some serious incompatibilities between Excel’s error values, such as #Value! or #N/A, and .NET’s ability to read and write such values.

I searched using Google extensively, checked for threads on MSDN’s VSTO Forum and even looked specifically for blog posts by Carter, Lippert and/or Whitechapel… but I could find nothing. Dennis searched through the newsgroups and found only one highly-technical thread, which did not appear to resolve the problem.

So I had to go it alone… And the following are my thoughts on some of the issues involved with CVErr values when using .NET and Excel. This turned out to be a little long, so it has been split into two parts:

In this post I’ll discuss some of the key issues/problems with CVErr values when accessed via .NET and then in the next post we’ll see what options we have for dealing with them.

Background:

A CVErr value is a special Variant value that can be used in the Component Object Model (COM) to indicate an error. It is a method of “returning an error value” as opposed to “raising an error.” For more on this subject you can read about the CVErr() Function in Microsoft Office documentation. What is important for us to note is that this approach to error handling is now obsolete and is not included in .NET Framework.

However, as Excel VBA programmers know, Microsoft Excel uses CVErr values to indicate cells holding error values such as #Value!, #N/A or #Div0!. For example, Excel has an enumerated type named xlCVErr within which the field xlErrNa is defined as an Integer value of 2042. Using VBA or Visual Basic 6.0, a call to CVErr(2042) or CVErr(xlErrNa) returns an N/A error value that can be held within a Variant. When displayed on the spreadsheet, the result is “#N/A” (without the quotes).

Using VBA or VB 6.0 there are a few different ways available to us for assigning a CVErr value to a range. For example, if we define a range as follows:
Dim rng As Excel.Range
Set rng = xlApp.Range("A1")

Then the following two lines in VBA or VB6 would both successfully assign #N/A to the range “A1″:
rng.Value = CVErr(xlErrNa)
rng.Value = CVErr(2042)

But note that in Excel, we also have a couple of additional “tricks” available to us in order to assign a CVErr value. For example, the following two Range.Formula property assignments also result in the range holding a #N/A value:
rng.Formula = "=NA()"
rng.Formula = "#N/A"

The first assignment “=NA()” works because the formula evaluates using the =NA() worksheet function provided by Excel, which returns a #N/A value. The second assignment operates by assigning the formula “#N/A”, which directly evaluates to #N/A.

This all works as expected when working in a COM environment such as VBA or VB 6.0. However, in .NET, the concept of the CVErr values has been excluded. And since the CVErr values present in VBA/VB6 are special Variant values that are not part of an enumerated type, they cannot be directly passed to or from .NET. Instead, when passed from COM to .NET these CVErr values are converted to unsigned UInt32 values which C# and VB.NET both interpret as signed Int32 values. For example, when using C# or VB.NET, #N/A is converted to the Integer value of -2146826246 and #Value! is converted to -2146826273.

The result of the above is that that the rngA1 will hold #N/A, but that rngB1 will hold -2146826246 instead! What happens is that the #N/A value returned by rngA1.Value is coerced to an Integer by .NET and then this Integer is assigned to the rngB1.Value.

So now we have the situation where even a simple Range.Value assignment no longer works properly. Errors won’t propagate as #N/A values. Wildly incorrect results can be created as .NET code converts CVErr values to valid, large negative integer results. This is not good…

Note that the calls to CVErr(2042) or CVErr(xlErrNa) cannot be made within .NET, for the CVErr() method is not included within the Microsoft.VisualBasic namespace. And if they could be made, the results would not be interpretable by .NET in any case. In .NET, such a method could only return an Integer value such as -2146826246 for #N/A or -2146826273 for #Value!, etc.

In addition, the IsError() function, which in VB6/VBA is designed to return True if a Variant holds a CVErr value or Missing, has been completely changed in .NET to now check if an object passed in inherits from the System.Exception class. This is so misleading compared to its previous functionality within VBA/VB6 that I think it should have been removed as was IsMissing(), IsNull(), etc.

Preliminary Thoughts:

Ok, so what can we do about it? The answer is that it’s not easy. My initial solutions were clunky, imperfect and/or required a COM intermediary – either a COM DLL or VBA project – acting as a crutch. Fortunately there is a solution to this conundrum, which I’ll get to in the next post, but I think it’s worth discussing some of the preliminary approaches just to get a sense of how difficult this issue can be to the uninitiated.

As an example, consider the situation where we wish to sum all the cell values in a range. Generally, we can do so efficiently by assigning the Range.Value to a two dimensional Object(,) array and then iterating through the array’s elements. The following VB.NET code illustrates:
Dim rng As Excel.Range = xlApp.Range("A1:Z100")
Dim array2D As Object(,) = CType(rng.Value, Object(,))
Dim sum as Double = 0
For r As Integer = 1 To rng.Rows.Count
For c As Integer = 1 To rng.Columns.Count
sum += CType(array2D(r,c), Double)
Next c
Next r
Message.Show(sum.ToString)

The above runs just fine… Or does it?

What if we wished to throw an exception if we encountered a #N/A or any other CVErr value? Or what if we wished to quietly ignore all error values and only sum valid numeric data? Directly speaking, we can not, not with .NET. No CVErr values could ever come through the assignment of Dim array2D As Object(,) = rng.Value. Instead, CVErr values such as #N/A would be converted to large negative values such as -2146826246. And these values would have a disastrous effect on the summed result.

So what can be done? Here are a few initial options that come to mind:

The “IsProbablyCVErr” Approach

It would seem that the fastest and easiest approach when using .NET to test for CVErr values is to check for the seven specific CVErr results such as -2146826246 for #N/A, -2146826273 for #Value!, etc. In essence, we could create an IsProbablyCVErr() method that returns True for any of the key values into which a CVErr can be coerced when marshaled from COM to .NET:
Function IsProbablyCVErr(ByVal obj As Object) _
As Boolean
Const ErrDiv0 As Integer = -2146826281
Const ErrNA As Integer = -2146826246
Const ErrName As Integer = -2146826259
Const ErrNull As Integer = -2146826288
Const ErrNum As Integer = -2146826252
Const ErrRef As Integer = -2146826265
Const ErrValue As Integer = -2146826273
Select Case CType(obj, Integer)
Case ErrDiv0, ErrNA, ErrName, ErrNull, _
ErrNum, ErrRef, ErrValue
Return True
Case Else
Return False
End Select
End Function

Probabilistically this is a good bet, and is our fastest-executing solution. And the odds of these large negative integer results occurring due to a legitimate non-error calculation are remote. On the other hand, having the “remote chance” of a “very bad result” is understandably unacceptable to most, so what can we do?

To change the probability of an error from “unlikely” to have a problem to “impossible” we can attempt a few things:

The Test Range.Value And Range.Text Approach

We can loop through the individual cells, checking if the Range.Value = -2146826246 and then check if Range.Text = “#N/A”. If both of these conditions are True, then this guarantees that we have a #N/A result. However, there are two problems with this idea. The first is that looping through the cells individually instead of looping through the two dimensional array that is returned from the Range.Value property is massively slower, on the order of 50-fold. The other issue is that this is not fool-proof since the Range.Text property can return “####” or the like if the column width is too narrow. In short, this approach will run slowly and is not guaranteed to work in all circumstances.

The Copy-Paste And Test Approach

A third approach is that we can copy the range to a hidden, temporary worksheet using Range.PasteSpecial(), pasting values only. Then one can create two different arrays returned by (a) tempRange.Value, and (b) tempRange.Formula. Given these two arrays we can then loop through the values array checking for -2146826246 and then we can confirm that this is in fact an actual #N/A value by checking if the respective formula holds the string value “#N/A”. This should operate very quickly and at first blush would seem to be our most efficient “fool-proof” solution. (Fortunately, we can actually do much better, as we will see in the next post.)

The problems we face are compounded when our code is operating from within a User Defined Function. UDF code operates under stricter conditions than does standard Automation code and so the situation becomes a little trickier. If one is using standard Automation code, that is, code which is not operating within a UDF, then the “Copy-Paste And Test Approach”, above, is the best idea we’ve considered so far. However, if calculating values of a Range from within a UDF then this approach cannot be used since the copy-paste operation would not be permitted.

Things get even more complicated when we wish to write a CVErr value such as #N/A from .NET to COM. If using Automation code, we can make use of the Range.Formula property in order to assign a CVErr value. For example, as we saw earlier, the following VB.NET code assigns #N/A to the range “A1″:
Dim rng As Excel.Range = xlApp.Range("A1")
rng.Formula = "#N/A"

However, what if we had a User Defined Function from which we needed to return a CVErr value such as #N/A? A UDF cannot make direct formula or value assignments. The UDF can only return a value which the calculation engine places within the caller range. But it does not appear that a UDF written in .NET could return a CVErr value because the .NET Framework does not even seem to know what a CVErr value is, converting them into Int32 values. We cannot have our UDF simply return an Integer and hope that it gets converted to a CVErr, can we? No, an Integer such as -2146826246 returned by a UDF would get stored as a large negative value in the cell, not as a CVErr value such as #N/A.

Do note that a User Defined Function created with .NET can force a #Value! result to be returned by intentionally throwing an exception. However, a #Value! result is generally used to connote an invalid input. If the inputs are valid, but there is no data at the desired data location, then a UDF should return #N/A. Unfortunately, if we need a UDF created in .NET to return #N/A, #Num! or other CVErr value, then we would seem to be completely dead in the water.

Hi Dennis, no problem, very glad to contribute. :)
> “For me it’s remarkable that the error handling in UDFs has become an issue. After all, in classic VB and VBA we’re used to solve it in a smooth way.”
I was not “surprised” at first, because I kind of knew that CVErr values were gone. But the more I thought about it, the more intractible it became. I am very glad that in the end there is a solution, at least for us Excel developers. Other COM environments, however, are probably in a more instractible situation, unfortunately.
— Mike