Creating CSV Files

Comma-separated values (CSV) files can be used to transfer information between systems that share no other common interface. CSV files can be read using standard functionality from the .NET framework, but creating them requires a custom class.

Comma-Separated Values

In the last article I described how you can use the TextFieldParser class to read comma-separated values (CSV) files. This powerful class allows you to read the data from such files without worrying about escaped or quoted values, blank lines or human-readable comments. Unfortunately, it does not provide any means to create a CSV file.

In this article we will construct a class that generates CSV files from string array data. We'll make the new class work in a similar manner to existing .NET framework classes that can be used to create text files, whilst removing the need to consider the inclusion of special characters. We'll also permit consumers of the class to add comments into generated files. The resultant CSV data will be compatible with the TextFieldParser class, so you will be able to create and read CSV files easily.

Creating the Class

In the downloadable source code, which you can obtain using the link at the start of this article, I've created a console application containing the CSV writer class. This allows you to run the code and generate a simple file. For real-world projects you will probably want to create a class library for the new class, or add it to one of your existing utility libraries.

We could create our new class and inject an object into it to control the interaction with the file system. This would allow us to keep the interface very small and restrict the consumer to calling methods for inserting CSV rows or comments only. Alternatively, we could extend an existing .NET framework class using inheritance. This would allow us to add only the new methods and properties that we require. It would mean that the final type would provide more flexibility but with increased risk that a consumer could use base class functionality to create an invalid CSV file. For this article I have chosen the second approach; the new class is a subclass of StreamWriter.
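
For comparison, the first option (composition) might look something like the following minimal sketch. The name ComposedCsvWriter and its WriteRow method are hypothetical and do not appear in the downloadable source. Quoting and escaping are deliberately left out.

```csharp
using System;
using System.IO;

// Hypothetical composition-based alternative (not the approach used in this article).
// The TextWriter is injected, so consumers can only call the members exposed here.
public sealed class ComposedCsvWriter : IDisposable
{
    private readonly TextWriter _writer;

    public ComposedCsvWriter(TextWriter writer)
    {
        if (writer == null) throw new ArgumentNullException("writer");
        _writer = writer;
    }

    // Writes one row, joining the fields with commas.
    // Assumption: fields contain no special characters, so no quoting is applied.
    public void WriteRow(params string[] fields)
    {
        _writer.WriteLine(string.Join(",", fields));
    }

    public void Dispose()
    {
        _writer.Dispose();
    }
}
```

Because the underlying writer is private, a consumer cannot call Write or WriteLine directly and so cannot bypass the CSV formatting. The price is that every useful StreamWriter member has to be re-exposed by hand.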

To create the CsvWriter type, add a new class file and modify the automatically created declaration as follows:

public class CsvWriter : StreamWriter
{
}

To provide access to the namespace containing the StreamWriter class, include the following using directive:

using System.IO;
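
Because StreamWriter declares no parameterless constructor, the empty subclass above will not compile until it gains at least one constructor that calls a base constructor. A minimal sketch of what that might look like follows. The name CsvWriterSketch and the choice of these two overloads are assumptions for illustration, not the constructors from the downloadable source.

```csharp
using System.IO;

// Illustrative only: a StreamWriter subclass that forwards to two base constructors.
public class CsvWriterSketch : StreamWriter
{
    // Writes to a file at the given path (created or overwritten).
    public CsvWriterSketch(string path) : base(path) { }

    // Writes to an existing stream, such as a MemoryStream.
    public CsvWriterSketch(Stream stream) : base(stream) { }
}
```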

Configuration Fields

Usually CSV files, as their name suggests, contain fields separated by commas. However, it is sometimes useful to select a different delimiter character, so we will allow this using a property. Fields whose data includes special characters, such as the delimiter, quotes or line breaks, are usually wrapped in double quotes, but we might want to use apostrophes or some other character instead. We'll include a property for configuring this too.

To hold the values for this configuration, add the following two fields to the class:

char _delimiter, _quote;

We will repeatedly use the quote character in string operations, either as a single character or as two quotes when escaping a quote in a field's data. To make the code more readable in these circumstances let's add two string fields that will hold this information:

string _singleQuote, _doubleQuote;
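
To see why both strings are handy, here is a small sketch of the usual CSV escaping rule: every quote inside the data is doubled, then the whole field is wrapped in quotes. The QuoteSketch class and QuoteField method are hypothetical names used only for this illustration.

```csharp
public static class QuoteSketch
{
    // Doubles any embedded quote characters and wraps the value in quotes.
    public static string QuoteField(string value, char quote)
    {
        string singleQuote = quote.ToString();          // the quote as a string
        string doubleQuote = singleQuote + singleQuote; // the escaped form
        return singleQuote + value.Replace(singleQuote, doubleQuote) + singleQuote;
    }
}
```

With a double quote as the quote character, the value say "hi" becomes "say ""hi""". With an apostrophe as the quote character, it's becomes 'it''s'.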

Changing the delimiter or the quote character will affect how fields should be constructed. If a data element to be stored in a CSV file contains either character, or one of several other special characters, it will need to be quoted. Rather than hard-coding comparisons to each possible special character, we'll create a list that contains them and use this later in the code. The List<T> type lives in the System.Collections.Generic namespace, so make sure that a using directive for it is present. Add the following field to store the character list:

List<char> _quotableCharacters;
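
As a rough illustration of how such a list might be consulted, the sketch below tests a value against the list with String.IndexOfAny. The QuotableSketch class, the NeedsQuoting method and the hard-coded list contents are assumptions for this example only.

```csharp
using System.Collections.Generic;

public static class QuotableSketch
{
    // Returns true if the value contains any character from the quotable list.
    public static bool NeedsQuoting(string value, List<char> quotableCharacters)
    {
        return value.IndexOfAny(quotableCharacters.ToArray()) >= 0;
    }

    // Assumed defaults: comma delimiter, double quote, carriage return and line feed.
    public static List<char> DefaultCharacters()
    {
        return new List<char> { ',', '"', '\r', '\n' };
    }
}
```

Keeping the characters in a list means that a change of delimiter or quote character only requires the list to be rebuilt. The test itself stays the same.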

We need one further field to hold the token that will be applied at the start of any human-readable comment lines:

string _commentToken;
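
To illustrate what the comment token is for, the following sketch writes a comment line and a data row to a string, then reads the text back with TextFieldParser, which ignores lines that start with one of its CommentTokens. The CommentSketch class, the RoundTrip method and the sample text are assumptions. On the .NET framework the project also needs a reference to the Microsoft.VisualBasic assembly.

```csharp
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CommentSketch
{
    // Writes a comment line followed by one data row, then parses the text back.
    // Returns the fields of the first row that the parser reports.
    public static string[] RoundTrip(string commentToken)
    {
        var output = new StringWriter();
        output.WriteLine(commentToken + " Generated for illustration");
        output.WriteLine("Bob,42");

        using (var parser = new TextFieldParser(new StringReader(output.ToString())))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.CommentTokens = new[] { commentToken };
            return parser.ReadFields(); // the comment line is skipped
        }
    }
}
```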

We'll add the properties that control these fields later.

Special Characters

We've created a list to hold the characters that, if present, will cause the data in a field to be surrounded by quotes. By default, this list will hold the quote character, a comma and any other special characters that must be quoted for the CSV to be readable. The other special characters are those used to signify the end of a line of text. These are usually the carriage return and line feed characters. As the .NET framework includes a property that defines the characters for a new line, we should use that property when creating our list. That way, if the code runs on a platform that uses a different new line sequence (for example, a lone line feed on Unix-like systems rather than the carriage return and line feed pair used on Windows), the CsvWriter class should still operate correctly.

Let's create a private method that initialises the quotable characters list using the configuration fields and the Environment.NewLine property: