I _Really_ Don't Know

A low-frequency blog by Rob Styles


Generating an Excel file from .Net should be easy right? MS have good reason to have you using their formats as the de-facto standard and make it easy, huh.

Nope. Not really.

It's a bit trickier than you might think.

Obviously with Office XP onwards, Excel supports XML and rendering the workbook as xml is fairly trivial, but I want to support aged machines running Excel 97. This means the BIFF binary format.

Starting with MSDN you find the obvious articles on using the Office Primary Interop Assemblies to interact with the Excel object model and create a workbook that way. Other articles out on the web suggest creating your own interop assemblies... Good Plan. I wonder why MS went to all the trouble of writing the PIAs then?

The PIA approach works and the code is simple enough, but because it's COM Interop you get objects passed back out rather than the types you want, which means you have to cast stuff to what you're expecting. This makes the code messier to read, it's distracting and more importantly it moves type checking from a compile time task to a runtime task making it harder to find bugs.

The big issue, though, is performance. The interop is slow. My tests tasks were to produce a roughly 100 line spreadsheet with some basic formatting of just a few cells and then write it out to disk. That shouldn't take more than oooh, TWENTY ONE SECONDS :-O and that's after I've taken out the seven seconds it takes to construct the Excel object.

There has to be a better way, or even just another way! Which, of course, there is.

There are a handful of commercial products:

Aspose.Excel; SyncFusion's ExcelRW; SoftArtisan's ExcelWriter5. to name just the ones I shortlisted.

And they are all pretty good. There are performance differences, with Syncfusion coming in at around one second to generate the test workbook and Aspose.Excel coming in fastest at 91mS with very little variation.

But some of the programming interfaces are quirky. ExcelWriter exposes a whole load of interesting stuff when referenced, such as a load of nums all starting with __MIDL and a load of com.sun.java.collections classes that suggest it wasn't written entirely for .Net.

Again, Aspose worked the best for me as they chose to index the sheet, row, column and cell indices from 0, yep zero, just like everything else in C#. Syncfusion, on the other hand indexed them all from 1. :-/

But I'm still looking for a good free, open-source, C# implementation to use. I found a little VB6 version, ported to VB.Net on Planet Source Code, but it isn't very well written; it swallows exceptions and one or two other things. It also exposes the details of BIFF8 format a little too literally for my liking. Oh, and I couldn't get it to do formatting, but that's probably just me being dumb.

So, for the project I'm on we'll probably buy one, but maybe I should offer some time to koogra.