Autodetecting CSV Delimiters

Get-CSVDelimiter can analyze batches of CSV files and autodetect delimiters. CSV-files are not always comma-separated. CSV separators depends on your culture setting. Time to create a function that autodetects any CSV file delimiter!

The module CSVManager contains two functions: Get-CSVDelimiter takes one or more CSV files and returns the delimiter used in that file, along with other useful information such as the number of columns. Convert-CSV takes any CSV file and converts the delimiter found into any delimiter you choose.

Get-CSVDelimiter autodetects delimiter used in one or more CSV files. Convert-CSV takes autodetected delimiter of arbitrary CSV file, then replaces the delimiter with a different delimiter of your choice.

How Autodetection Works

To automatically detect the delimiter used by a CSV file, Get-CSVDelimiter uses a heuristic approach which proves to be very fast and dependable. The assumption is that a delimiter must be a character that occurs in each line of the file with the very same frequency. To speed up this process, the function also uses an exclude list with characters like plain letters or numbers that are generally not used as delimiter.

Typically, Get-CSVDelimiter can identify the delimiter after parsing only three to five lines of data. As a side effect, Get-CSVDelimiter now also knows how often the delimiter occured in a line, so it can also report the number of columns.

What You Can Do Now

Knowing the delimiter all by itself can be useful. You could use Get-ChildItem to find all of the CSV files on your machine, then pipe them to Get-CSVDelimiter to get a great statistic on which delimiters are in use (and possibly, which CSV files may be broken).

In addition, you can use Convert-CSV to actually change CSV file delimiters. So if you got a bunch of files from a colleague that used a different delimiter, you could batch-convert them all to the delimiter of your choice.

Note: we are not taking any resposibility for the quality and fitness of this code. This is simply a free proof-of-concept that you can freely use. Before you actually change potentially valuable CSV data, make sure the code does what you want it to. And in case you find areas of improvements, please share them with us and other readers. Deal?

Both functions are included in our module CSVManager. This is a zipped script module. Click here if you need assistance in unpacking, installing and using PowerShell modules.

Get-CSVDelimiter autodetects delimiter used in one or more CSV files. Convert-CSV takes autodetected delimiter of arbitrary CSV file, then replaces the delimiter with a different delimiter of your choice.