For a breakdown of how that regex works, take a look at this answer. Mine is a slightly adapted version: I consolidated the single- and double-quote matching to match a single text delimiter and made the delimiter/separators dynamic. It does a great job of validating entries, but the line-splitting solution I added on top is pretty frail and breaks on the edge case I described above.

I'm just looking for a solution that walks the string extracting valid entries (to pass on to the entry parser), or fails on bad data with an error indicating the line where parsing failed.

You simply cannot do this using regex. You can create a regex that handles some, or even most, conditions, but there will always be some valid CSV that the regex won't handle.
–
James Anderson, May 2 '12 at 4:41

Regex is for detection, not parsing. As you scan across the text, if you have to "remember" anything besides the characters (e.g., "I'm inside a quoted literal"), you are parsing it. It's a subtle difference, which is why everyone wants to use regex to parse.
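To make the "remember state" point concrete, here is a minimal sketch of a stateful field splitter — not the poster's implementation, just an illustration. It assumes `"` as the text delimiter and `""` as the escaped quote:

```javascript
// Splitting a CSV row on commas requires one bit of state that a plain
// regex scan can't carry: whether we're inside a quoted literal.
function splitFields(row) {
  const fields = [];
  let field = '';
  let inQuotes = false; // the "remembered" state
  for (let i = 0; i < row.length; i++) {
    const c = row[i];
    if (c === '"') {
      if (inQuotes && row[i + 1] === '"') { // "" inside quotes = escaped quote
        field += '"';
        i++;
      } else {
        inQuotes = !inQuotes; // toggle quoted state
      }
    } else if (c === ',' && !inQuotes) {
      fields.push(field); // comma outside quotes ends the field
      field = '';
    } else {
      field += c;
    }
  }
  fields.push(field);
  return fields;
}

splitFields('a,"b,c",d'); // → ['a', 'b,c', 'd']
```

The comma inside the quoted field is treated as data, not a separator, precisely because the scanner remembers it is inside quotes.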
–
Jeff Meatball Yang, May 2 '12 at 5:14

You're right. An FSM is clearly the superior approach (and much easier to fine-tune for edge cases). I have provided a complete working implementation in the question update. BTW, your first link is dead.
–
Evan Plaice, Oct 16 '12 at 2:13

I understand that; that's the point of the question. I'm trying to manage the edge case. String.split(',') only works on trivial CSV, and this is a library dedicated to CSV, so I need a robust solution. Thanks for the response anyway.
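As a quick illustration of why `String.split(',')` only handles trivial CSV (the row below is a made-up example):

```javascript
// A row with a quoted field containing a comma.
const row = 'name,"Doe, John",42';

// Naive splitting tears the quoted field in two.
const naive = row.split(',');
// → ['name', '"Doe', ' John"', '42']
```

The quoted field `"Doe, John"` should come back as a single value, but the naive split produces four fields instead of three.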
–
Evan Plaice, May 3 '12 at 4:25

This is closer to the approach I'll probably use. Counting unescaped quotes seems to make the most sense, but I want to avoid the second pass (i.e., split, then parse) in favor of a single pass. Basically: walk the string until a valid newline is detected, extract that as an entry, parse it, then continue walking the string. A two-pass approach works, but it can have ugly performance/memory implications on large data sets.
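The single-pass walk described above might be sketched like this — a newline counts as a record boundary only when the scanner is outside a quoted literal. `parseEntry` is a hypothetical stand-in for the real entry parser:

```javascript
// One pass over the whole CSV string: extract each entry as soon as a
// valid (unquoted) newline is seen, and hand it to the entry parser with
// its line number. Assumes " as the text delimiter and "" as its escape.
function forEachEntry(csv, parseEntry) {
  let start = 0;       // start index of the current entry
  let inQuotes = false;
  let line = 1;
  for (let i = 0; i < csv.length; i++) {
    const c = csv[i];
    if (c === '"') {
      if (inQuotes && csv[i + 1] === '"') i++; // skip escaped quote
      else inQuotes = !inQuotes;
    } else if (c === '\n' && !inQuotes) {
      parseEntry(csv.slice(start, i), line); // valid newline: emit the entry
      start = i + 1;
      line++;
    }
    // newlines inside quotes fall through: they're part of the entry
  }
  if (inQuotes) {
    throw new Error('Unterminated quote in entry starting at line ' + line);
  }
  if (start < csv.length) parseEntry(csv.slice(start), line); // trailing entry
}
```

Because entries are emitted as they are found, the input never needs to be split into an intermediate array of lines, which is what makes this single-pass and keeps memory flat on large inputs.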
–
Evan Plaice, May 3 '12 at 4:46