Perhaps there's a bit of a language issue here, but I'm pretty sure the OP meant that the processing should continue if split found four comma- or space-delimited columns, and die otherwise, not that split itself should raise an exception, even though it may have literally read that way.

See my reply, below. I know it might not exactly hit the mark, as the specification was a little vague, but it should be easy to modify for different inputs or failure conditions.

This throws an exception from within the regex passed to split if the input string contains a pipe character. I wouldn't recommend bringing that to a code review, but given that none of the other solutions already provided seem to satisfy you, I am thinking that you'll only be happy when an exception is thrown as part of the split line. Despite the hackish nature of the code, it produces what you're requesting. Here's the output:

It would be a lot better to just follow the advice of bart's post, or Colonel_Panic's post, in this same thread. And if neither of those posts does what you need, rather than just repeating your question again, explain exactly how their code fails to meet your needs. I find it hard to believe that your requirement is for the exact line containing the split to throw an exception. It seems a lot more reasonable to just ensure that an exception is thrown once split fails to produce reasonable output, or possibly to pre-screen the line of text and throw before you split, if a pipe character is found.
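For what it's worth, here is a minimal sketch of that more reasonable approach: pre-screen for a pipe, split, then die if the field count is wrong. The sub name parse_line is hypothetical, and the four-field requirement and the [, ] delimiter are my reading of the original question, not a confirmed spec.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the four fields or dies with a message.
sub parse_line {
    my ($line) = @_;

    # Pre-screen: refuse any line containing a pipe before splitting.
    die "Pipe character found in: $line\n" if index($line, '|') >= 0;

    # A limit of -1 keeps trailing empty fields so they are counted too.
    my @fields = split /[, ]/, $line, -1;

    # Assumed requirement: exactly four columns.
    die "Expected 4 fields, got " . scalar(@fields) . " in: $line\n"
        unless @fields == 4;

    return @fields;
}
```

Wrap the call in eval { ... } (or use Try::Tiny) if you want to log the bad line and carry on instead of aborting.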

Update: Just for fun, an explanation of the regex:

(?(condition)true_regex|false_regex) creates a conditional. For our condition, we use a zero-width lookahead assertion, (?=^[^|]*\|), that detects whether a pipe character is found anywhere in the string. If that condition is satisfied, the "true_regex" gets tested. The "true_regex" that we use is a (?{code}) construct, which is used (or abused) to execute Perl code from within a regular expression. The code we execute is the die statement. For our "false_regex", we use an empty expression, which will not affect the rest of the split match. The remainder of the regex is just what we would normally pass to 'split'.
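Putting those pieces together, a split along these lines could be sketched as follows. This is reconstructed from the explanation, so it is not necessarily character-for-character what was posted earlier; the sub name split_or_die and the [, ] delimiter are assumptions, the empty false branch is simply left out, and I am assuming a reasonably recent perl where a die inside (?{ }) propagates like any other die.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the "die from inside the split regex" hack.
sub split_or_die {
    my ($line) = @_;
    return split
        /(?(?=^[^|]*\|)(?{ die "Pipe character found\n" }))[, ]/,
        $line;
}

# Usage: catch the exception with a block eval.
my @fields = eval { split_or_die('a,b c|d') };
print "split died: $@" if $@;
```

The lookahead is anchored with ^, so the condition can only be true on the first match attempt at the start of the string, which is where the die fires.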

Hmm, I think you are reading a CSV file, not just a text file. And unless it's for learning Perl, consider using Text::CSV_XS (or the slightly slower pure-Perl version Text::CSV) instead. Text::CSV_XS handles all of those ugly edge cases that a simple split can't handle: embedded quotes, embedded separator characters, and quoted values, to name just a few.
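A minimal sketch of what that looks like, assuming Text::CSV is installed from CPAN (it is not a core module). The sub name parse_csv_line and the sample line are made up for illustration.

```perl
use strict;
use warnings;
use Text::CSV;

# binary => 1 allows non-ASCII and embedded newlines inside quoted fields.
my $csv = Text::CSV->new({ binary => 1 })
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

# Hypothetical helper: parse one line or die with the parser's diagnostics.
sub parse_csv_line {
    my ($line) = @_;
    $csv->parse($line)
        or die "Parse failed: " . $csv->error_diag . "\n";
    return $csv->fields;
}

# The quoted comma and the doubled quotes would both trip up a plain split.
my @fields = parse_csv_line('one,"two, with comma","three ""quoted""",four');
print scalar(@fields), " fields\n";
```

For whole files, $csv->getline($fh) is usually the better choice, since it also copes with quoted fields that span lines.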

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Text::CSV uses Text::CSV_XS if it is available for the platform. It seems to me that using CSV_XS directly means the code won't work on a platform without XS capabilities whereas it would have worked if Text::CSV had been used. From reading the pod I don't see any advantage to using Text::CSV_XS directly. Have I missed something?

Probably not. I think it's just a problem of the timeline, or an old habit.

According to CPAN, Text::CSV 0.01 was released on 1997-Jul-31, followed by 1.00 on 2007-Nov-27, more than 10 years later. Text::CSV_XS 0.16 was released 1999-Feb-11, followed by several releases up to 0.23 released 2001-Oct-09. During that time, Text::CSV did not change at all. In 2007, both Text::CSV and Text::CSV_XS saw a maintainer change and have been updated since then. During that maintainer change, Text::CSV was "rewritten to make a wrapper to Text::CSV_XS and Text::CSV_PP".

I learned about Text::CSV_XS between 2001 and 2007. During that time, Text::CSV seemed to be an unmaintained and incompatible "first shot" version, and most other modules of that time, including DBD::CSV, used Text::CSV_XS. DBD::CSV still depends on Text::CSV_XS.

Installing Text::CSV should be sufficient, and works without requiring a compiler, but it is slower than the XS version. The Makefile.PL from Text::CSV hints that installing a sufficiently recent XS version makes Text::CSV faster, but it does not attempt to install the XS version, even if a C compiler is available.
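If you want to see which implementation you actually ended up with, something like this should do. It assumes a post-2007 Text::CSV (the wrapper version), which documents the backend, is_xs and is_pp methods.

```perl
use strict;
use warnings;
use Text::CSV;

# Reports Text::CSV_XS when the XS module is installed and recent enough,
# otherwise the bundled pure-Perl fallback, Text::CSV_PP.
my $backend = Text::CSV->backend;
print "Text::CSV is using: $backend\n";
print "XS speedup available\n" if Text::CSV->is_xs;
```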

Text::CSV_XS, on the other hand, does not depend on Text::CSV, and does not require it to be installed. It requires a working C compiler, but then, it is faster than Text::CSV.

It would be nice if Text::CSV would attempt to install the XS module if that is possible. This way, there would be no need to install Text::CSV_XS manually.

Alexander
