9.12. Extract CSV Fields from a Specific Column

Problem

You want to extract every field (record item) from the
third column of a CSV file.

Solution

The regular expressions from Recipe 9.11 can be reused here to iterate
over each field in a CSV subject string. With a bit of extra code, you
can count the number of fields from left to right in each row, or
record, and extract the fields at the position
you’re interested in.

The following regular expression (shown with and without the
free-spacing option) matches a single CSV field and its preceding
delimiter in two separate capturing groups. Since line breaks can appear
within double-quoted fields, it would not be accurate to simply search
from the beginning of each line in your CSV string. By matching and
stepping past fields one by one, you can easily determine which line
breaks appear outside of double-quoted fields and therefore start a new
record.
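
As a rough illustration of the counting approach described above, here is a minimal Python sketch. It is not this recipe's own code: the function name `extract_column`, the constant `FIELD_RE`, and the sample data are hypothetical, and the pattern is written out here on the assumption that it follows the two-group shape just described (delimiter in group 1, field in group 2). Like the recipe's regular expressions, it assumes valid CSV input.

```python
import re

# Assumed pattern: group 1 is the preceding delimiter (a comma, a line
# break, or the very start of the string); group 2 is the field itself,
# either unquoted or double-quoted, and is unset for an empty field.
FIELD_RE = re.compile(r'(,|\r?\n|^)([^",\r\n]+|"(?:[^"]|"")*")?')


def extract_column(csv_text, column_index):
    """Return the fields found at the zero-based column_index of each record."""
    values = []
    position = 0
    for match in FIELD_RE.finditer(csv_text):
        delimiter, field = match.group(1), match.group(2)
        # A comma moves one field to the right. Anything else (a line
        # break outside quotes, or the start of the string) begins a
        # new record, so the count restarts at zero.
        position = position + 1 if delimiter == "," else 0
        if position == column_index:
            if field is None:
                field = ""
            elif field.startswith('"'):
                # Drop the enclosing quotes and unescape doubled quotes.
                field = field[1:-1].replace('""', '"')
            values.append(field)
    return values


sample = (
    'id,name,note\r\n'
    '1,"Smith, J","line one\nline two"\r\n'
    '2,Lee,"said ""hi"""\r\n'
)
third_column = extract_column(sample, 2)
```

Because each match steps past a complete field, the line break inside the quoted field in the second record is consumed as part of that field and never mistaken for the start of a new record. Records with fewer fields than the requested position simply contribute nothing to the result.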

Tip

The regular expressions in this recipe are designed to work
correctly with valid CSV files only, according to the format rules
discussed in Comma-Separated Values (CSV).