
Writing an accurate email pattern is more complex than you think. Since I have not yet reached the 10 post mark, I cannot post a URL (after all, I am a spam bot).

In Google, do a search for 'iamcal + email parser' (without the quotes). The first or second entry is the iamcal site. When you go to that site, you will be greeted with a breakdown of the parser this guy wrote. Scroll down to the bottom of the page for a 'simplified' version of the parser. If this is not good enough, there is a download link at the very bottom which leads to a page offering different parsers (you will want to click on the 'RFC 3696 Parser' link). This leads to the mother lode of email parsers.

But truth be told, for my stuff, I don't bother with the nitty-gritty parsers. I prefer a much more flexible system that is relaxed enough to let even some oddball addresses through, yet strict enough that you can't get away with just anything.

This system isn't bulletproof (nor is it meant to be). The point is that it may be in your better interest to either a) use a loose-fisted system like that one, or b) if you do want to go the opposite way and use a tight-fisted system, go with one that is already built to be suitably good (like the one on the iamcal - publish site). Otherwise, your pattern may disallow some legit email formats that you are not aware of.
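For what it's worth, here is a minimal sketch of what a loose-fisted check could look like. The pattern and the names LOOSE_EMAIL and looks_like_email are my own assumptions for illustration, not the system mentioned above, and it is written with Python's re module rather than PHP's preg functions:

```python
import re

# Hypothetical loose pattern (an assumption, not the poster's own):
# some characters, one @, some characters, a literal dot, some characters.
# Whitespace and additional @ signs are rejected everywhere.
LOOSE_EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_email(address):
    """Return True if the whole string passes the loose check."""
    return LOOSE_EMAIL.fullmatch(address) is not None


print(looks_like_email("o'brien+tag@mail.example.co.uk"))  # True
print(looks_like_email("two@@example.com"))                # False
```

It deliberately lets oddball addresses through and only stops the obviously broken ones, so it is not a substitute for a full RFC parser.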

It might help to write it out in English first and then look up the operators in the docs:
starts with one or more of [these] chars | followed by one and one only @ | followed by one or more of [these] characters | ends with period followed by two or three of [these] characters

Starts with an upper or lowercase letter or number ^a-zA-Z0-9
May also contain -_.
Must contain at least 1 period .
Must contain only 1 @
Must end with upper or lower case letters $a-zA-Z
All between 8 and 50 characters.

So:

'/^[a-zA-Z0-9-_.] | @ | [a-zA-Z0-9-_.] | . | [a-zA-Z]{8,50}$/'

That doesn't work exactly as expected either; it allows more than one instance of @.

There's the + that specifies 1 or more; you'd think there would be some syntax for just 1, regardless of its position.
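One way to get "just one @" is to write a single @ with no quantifier, keep @ out of every character class, and anchor both ends, so a second @ has nowhere to match. A rough sketch of the English rules above, using Python's re module rather than PHP; the lookahead for the overall length and the reading of "at least 1 period" as "a period after the @" are my assumptions:

```python
import re

# Hypothetical translation of the rules listed above.
EMAIL = re.compile(
    r"^(?=.{8,50}$)"    # lookahead: 8 to 50 characters overall
    r"[a-zA-Z0-9]"      # starts with a letter or number
    r"[a-zA-Z0-9._-]*"  # rest of the part before the @ (no @ in the class)
    r"@"                # exactly one @: no quantifier needed
    r"[a-zA-Z0-9._-]+"  # part after the @ (again, no @ in the class)
    r"\.[a-zA-Z]+$"     # a literal period, then letters up to the end
)


def is_valid(address):
    """Return True if the whole string satisfies the rules above."""
    return EMAIL.fullmatch(address) is not None


print(is_valid("dave.smith@example.com"))  # True
print(is_valid("dave@ex@ample.com"))       # False
```

Note there are no spaces or | characters between the parts: inside a pattern, a space is a literal space and | means "or".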

What I gather is that it means any letter, number, ., _ or - then a @ then any letter, number, _ or -, then a ., then any letter and a period, and those last bunch of characters can only be between 2 and 5 characters in length.

So if I understand correctly, the plus symbol ONLY means 1 or more and has nothing to do with concatenation or addition?

And there is no need to separate the individual parts of the pattern with any characters?

What if I wanted the first part before the @ to be only 10 characters long, would it simply be:

^[a-zA-Z0-9._-]{10}

as the first part? (That is, if I wanted to match EXACTLY 10 characters.)
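To check that reading of + and {10}, here is a quick sketch in Python's re module (not PHP); the function names are made up for the example, and a trailing @ is added so the end of the first part is pinned down:

```python
import re

# {10} means "exactly ten of the preceding item"; + means "one or more".
# Neither joins anything: pattern parts are simply written side by side.
EXACTLY_TEN = re.compile(r"^[a-zA-Z0-9._-]{10}@")
ONE_OR_MORE = re.compile(r"^[a-zA-Z0-9._-]+@")


def local_part_is_ten(address):
    """True if exactly ten allowed characters come before the @."""
    return EXACTLY_TEN.match(address) is not None


def local_part_any_length(address):
    """True if one or more allowed characters come before the @."""
    return ONE_OR_MORE.match(address) is not None


print(local_part_is_ten("abcdefghij@example.com"))     # True
print(local_part_is_ten("abcdefghi@example.com"))      # False
print(local_part_any_length("abcdefghi@example.com"))  # True
```

Without something after the {10} (here the @), the pattern would also match longer strings, because it would only be checking the first ten characters.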

Means:
'My name is ' followed by multiple occurrences of (any letter or a space) followed by ONE or ZERO full stops.

In other words, it matches 'My name is Dave Smith' and 'My name is Dave Smith.' as if they were the same thing.

This is not correct. If the string is, say, 'My name is Dave Smith', it will only match: 'My name is Da'

There are inaccuracies in this description. What this pattern is actually saying is:

match: 'My name is ', followed by any uppercase letters or any whitespace characters (which could include a tab, carriage return, space or newline, for example) one or more times consecutively, followed by the dot wildcard (which matches any single character other than a newline by default), which is optional (zero or one time).

I suspect what you intended is:

Code:

/My name is ([a-z\s]+)\.?/i

Note the escaped dot (which now means look for a literal dot) and the i modifier after the closing delimiter (which makes any letters within the pattern case insensitive).
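A small sketch comparing the two behaviours, in Python's re module rather than PHP. The first pattern is reconstructed from the description above (the original is not quoted here, so treat it as an assumption), and re.IGNORECASE stands in for the i modifier:

```python
import re

# Assumed reconstruction of the earlier pattern: uppercase letters or
# whitespace only, followed by an unescaped (wildcard) dot.
ORIGINAL = re.compile(r"My name is [A-Z\s]+.?")

# The suggested fix: escaped dot plus case-insensitive matching.
FIXED = re.compile(r"My name is ([a-z\s]+)\.?", re.IGNORECASE)


def matched_text(pattern, subject):
    """Return the text the pattern matched, or None if there is no match."""
    m = pattern.search(subject)
    return m.group(0) if m else None


print(matched_text(ORIGINAL, "My name is Dave Smith"))  # My name is Da
print(matched_text(FIXED, "My name is Dave Smith"))     # My name is Dave Smith
print(matched_text(FIXED, "My name is Dave Smith."))    # My name is Dave Smith.
```

In the first case the class stops after the capital D, and the wildcard dot then swallows the lowercase a, which is where 'My name is Da' comes from.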