I am a Perl beginner with a working script that needs a small modification. I am having trouble writing a regular expression that works. The attempts below seem to work somewhat, but not well enough. Basically they are crap and I need help...

I have tried the following:

"m/^[\(\s]*S[UBJECT]\s*1/i"
"m/^[\[(*\s*]S[UBJECT][*\)*\s*\d*\s*\)*\s*]:*\s/"
"m/\(*\s*S[UBJECT]*\)*\s*\d*\s*\)*\s*:*\s/ "
"m/\(*\s*S[UBJECT]*\)*\s*\d*\s*\)*\s*:*\s/"

As you can see I know almost nothing about this so any help is greatly appreciated.

I am trying to find any of the following forms of the identifier "SUBJECT" at the beginning of any paragraph (this is not every possible variant, but you get the idea):

"S "
"S: "
S#:
SUBJECT:
SUBJECT #:
(S)
(S#)
(SUBJECT)
(SUBJECT):
(SUBJECT #):
"SUBJECT "

"S" is always capitalized, and when only the "S" is used it is followed by either a space or a colon and a space. If spelled out entirely, the string will always be capitalized: "SUBJECT". The "#" could represent any number.

I do not want to hit on these occurrences if they appear later in the paragraph: "subject" "Subject"

I believe the "^" will prevent false positives from occurring.
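One caveat on "^", shown as a minimal sketch: by default it anchors only at the very start of the string, so if several paragraphs are held in one string, the /m modifier is needed for it to match after each newline as well. The sample text and variable names here are made up for illustration, and the pattern is a simplified stand-in, not the one from the script.

```perl
use strict;
use warnings;

# Hypothetical sample: two paragraphs held in a single string.
my $text = "S: first paragraph, subject appears later.\n"
         . "SUBJECT: second paragraph.";

# Without /m, ^ matches only at the very start of the string.
my @without_m = $text =~ /^(S(?:UBJECT)?)\b/g;

# With /m, ^ also matches immediately after each newline.
my @with_m = $text =~ /^(S(?:UBJECT)?)\b/mg;

print scalar(@without_m), " match(es) without /m\n";
print scalar(@with_m),    " match(es) with /m\n";
```

If each paragraph is already read into its own scalar, plain "^" is enough and /m is not required.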

Example:

SUBJECT: Improve ashamed married subject expense bed her comfort pursuit mrs. Four time took ye your as fail lady. Up greatest am exertion or marianne. Subject occasional terminated insensible and inhabiting. So know do fond to half on. Provided so as doubtful on striking required.

S: Improve ashamed married subject expense bed her comfort pursuit mrs. Four time took ye your as fail lady. Up greatest am exertion or marianne. Subject occasional terminated insensible and inhabiting. So know do fond to half on. Provided so as doubtful on striking required.

S This is a subject. Subject's here should not be caught.
S: This is a subject. Subject's here should not be caught.
S2: This is a subject. Subject's here should not be caught.
SUBJECT: This is a subject. Subject's here should not be caught.
SUBJECT 3: This is a subject. Subject's here should not be caught.
(S) This is a subject. Subject's here should not be caught.
(S7) This is a subject. Subject's here should not be caught.
(SUBJECT) This is a subject. Subject's here should not be caught.
(SUBJECT): This is a subject. Subject's here should not be caught.
(SUBJECT 2): This is a subject. Subject's here should not be caught.
SUBJECT This is a subject. Subject's here should not be caught.
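For reference, here is one possible pattern checked against the sample paragraphs above. It is only a sketch built from the rules as stated in the question, not the pattern from any reply in this thread; the variable names and the three negative test strings are made up for illustration. It assumes each paragraph is held in its own string and that whitespace always follows the identifier.

```perl
use strict;
use warnings;

# Sketch only: optional "(", then "S" or "SUBJECT", an optional space and
# digits, optional ")", optional ":", then mandatory whitespace.
# No /i, so "subject" and "Subject" are not treated as identifiers.
my $subject_re = qr/^\(?S(?:UBJECT)?\s?\d*\)?:?\s/;

my @should_match = (
    'S This is a subject.',
    'S: This is a subject.',
    'S2: This is a subject.',
    'SUBJECT: This is a subject.',
    'SUBJECT 3: This is a subject.',
    '(S) This is a subject.',
    '(S7) This is a subject.',
    '(SUBJECT) This is a subject.',
    '(SUBJECT): This is a subject.',
    '(SUBJECT 2): This is a subject.',
    'SUBJECT This is a subject.',
);

my @should_not_match = (
    'Subject occasional terminated insensible.',
    'subject expense bed her comfort.',
    'So know do fond to half on.',
);

# grep in scalar context returns the number of elements that pass the test.
my $hits   = grep { $_ =~ $subject_re } @should_match;
my $misses = grep { $_ !~ $subject_re } @should_not_match;

printf "%d of %d identifiers matched\n", $hits, scalar @should_match;
printf "%d of %d non-identifiers rejected\n", $misses, scalar @should_not_match;
```

The sketch does not check that parentheses are balanced, so a paragraph starting with "(S: " would also match. Whether that matters depends on the real data.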

So the regular expression pattern I gave you seems to be matching every single one of the nine items in my @c array, which presumably reflects exactly the requirement you posted. Now, it could be that you added somewhere a space that I cannot see since you did not use code tags, or changed something. Or maybe it is matching too much. But then please explain exactly what you think does not work.

That is absolutely correct. I just noticed that the regex was looking for the quotes; I had put them in only to show the whitespace. Great catch and spot on.

I am testing the rest and will report back with what your expression yielded. I wasn't trying to be vague, I simply didn't know how to determine what it was hitting on because it was returning everything in the source.
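One way to see what a pattern is actually hitting on is to wrap it in a capture group and print the captured text for each paragraph. The sketch below uses made-up paragraphs and an illustrative pattern (not the one from the replies above); the second paragraph shows that an identifier wrapped in literal quotes does not match a pattern that starts directly at the "S".

```perl
use strict;
use warnings;

# Hypothetical paragraphs; the second wraps the identifier in literal quotes.
my @paragraphs = (
    'S2: This is a subject.',
    '"S " This identifier is wrapped in quotes.',
    'Subject should not be caught.',
);

# Illustrative pattern with the whole identifier inside capture group 1.
my $subject_re = qr/^(\(?S(?:UBJECT)?\s?\d*\)?:?\s)/;

my @report;
for my $paragraph (@paragraphs) {
    if ($paragraph =~ $subject_re) {
        # $1 holds exactly the text the pattern consumed.
        push @report, "matched [$1]";
    }
    else {
        push @report, "no match";
    }
}

print "$_\n" for @report;
```

Putting brackets around the captured text makes leading and trailing whitespace in the match visible in the output.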

I thank each of you for your help and your tolerance of my ignorance with this.