The current project I am working on requires writing large amount of data to Excel worksheets. In this type of situation, I create an array with all the data I want to write, and set the value of the entire target range at once. I know from experience that this method is much faster than writing cells one by one, but I was curious about how much faster, so I wrote a little test, writing larger and larger chunks of data and measuring the speed of both methods:

Clearly, the array approach is the way to go, performing close to 1000 times faster per cell. It also seems to improve as size increases, but that would require a bit more careful testing.

However, one additional thing I needed to do was to format the data, using NumberFormat as well as font, borders and color fills, and I thought I would use the same approach – and I observed a significant performance degradation.

Here is the benchmark I ran, comparing writing NumberFormat by array vs. cell by cell:

The cell-by-cell version performs about the same writing values or number formats; however, the array version works about 100 times worse for NumberFormat compared to Value2. It still runs way faster than the cell-by-cell approach, but it’s not night-and-day any more.

Fortunately, when you are writing large amount of data like this, chances are, you are really writing records to a worksheet. And while every cell could potentially have a different value, the format is likely consistent, either by row or by column. That is, every cell in a column probably has the same number format. In that case, we have an alternative, which is to apply the format to an entire range at once, like this:

The format by column runs initially as fast as the array-based approach, but its time remains roughly constant as we increase the number of rows, making it an increasingly attractive option as the number of rows increases.

How do you handle writing large amounts of data to Excel? Any Jedi tricks you care to share?

And for completeness, here is the code I used to run my tests; I used an Action delegate in my test loop, which allowed me to easily swap the functions I wanted to compare – feel free to comment and criticize!

Hi Dennis,It's a good question, and I don't have an answer just yet - I'll try to compare these approaches soon. I have actually never played with ADO, and see this approach mentioned regularly, thank you for giving me an excuse to look into it Mathias

That's a really nice approach but i'm wondering about the suitability of compiling a full-blown C# app when a vbScript (using ADO) could do the same thing. (You can have it in JScript too). You'll have to go through those old ASP books to find the solution. I have the ASP bible. Very useful.

Hi Ivan,Thank you for the feedback - you bring up some fair points. Yes, you could do all the above using VBA. I would probably not write a C# app to do "just" this, but I was interested specifically in comparing how these approaches compared, in C#. That being said, you are the second person who brings up ADO now, so I should really add this to the mix, and compare!Mathias

Hi Ashfaq,thank you for your question. The purpose of the WriteEvaluation method is to produce a summary of each of the writes to a worksheet. To compare each writing episode, I compute the number of cells written per millisecond. The overall program uses various methods to write to a worksheet, and increase the number of cells to write, and measure the time it takes for each method and each size. WriteEvaluation receives the stopWatch - the timer which recorded how long it took to write - and the number of cells that has been written, and writes out to the console the performance.Hope this helps!Mathias

That's intriguing. I tried this code with various versions of the Excel interop in a console app, and it worked just fine, so this is not it. When I looked up the msdn reference for Worksheet.Range, it indicates that the corresponding assembly is Microsoft.Office.Tools.Excel.v9.0. Can you try out to add that reference, and see if that solves the problem for you? I am having a hard time reproducing the issue, but I'd be interested in hearing whether this solved it or not! msdn.microsoft.com/.../...heet.range(v=VS.90).aspx

Hi Michael,Actually, my project references and using statements are fairly straightforward, I can't see any obvious issue there - I wonder if this has to do with VSTO4 being installed, a different version of the Interop, and/or Visual Studio 2010. What version of Office and VS are you using? The source of the difference is unfortunately a bit painful to track down, but I would like to be able to solve that mystery

Hi Anton,Not totally sure I understand your question. WriteNumberFormatByColumn is used to write data column by column, it is called in the Main method (the second version), in the evaluation loop. Hope this helps!Mathias

Error 1 The type or namespace name 'Office' does not exist in the namespace 'Microsoft' (are you missing an assembly reference?) C:\Documents and Settings\in211249\Local Settings\Application Data\Temporary Projects\WindowsFormsApplication1\Form1.cs 3 26 WindowsFormsApplication1

I never knew about 2D array writing to Excel. It saved my lot of time but after seeing this solution. I will change my previous implementation which was cell by cell. Because simple I love this solution.

Would it perhaps be faster if you removed the boxing/unboxing penalty per cell? I'm not sure if that is even possible but sending explicit types, i.e., string if your field is intended to be a string would theoretically prevent Excel from having to box/unbox.

Can a progrramme be written for the following situation? in C or C+ or C# or dot net or any other language

1. Nine numbers 1,2,3,4,5,6,7,8and 9 are to be filled in 12 adjacent cells in a row in an excel worksheet.

2. A cell can be blank. It can contain one or more numbers.3. In a row a number can appear only once.4. In a cell the order of numbers does not matter. ie., 2,3,5, is the same as 2,5,3 or 3,5,2, or 3,2,5 or 5,2,3 or 5,3,2

How many rows will be needed to fill the cells in all possible combinations with the conditions stated above

Hi Mathias, In the WriteArray method when i actually assign the actual value from the dataset rows instead of data[row - 1, column - 1] = "Test", i get the insufficient memory error for large number of rows for ex: 50K rows. But, it works just with value "Test". Any idea where i am missing?