I'm wondering how far people should take the validation of e-mail addresses. My field is primarily web development, but this applies anywhere.

I've seen a few approaches:

simply checking if there is an "@" present, which is dead simple but of course not that reliable.

a more complex regex test for standard e-mail formats

a full regex against RFC 2822 - the problem with this is that often an e-mail address might be valid but it is probably not what the user meant

DNS validation

SMTP validation

As many people might know (but many don't), e-mail addresses can have a lot of strange variations that most people don't usually consider (see RFC 2822 3.4.1), but you have to think about the goals of your validation: are you simply trying to ensure that an e-mail can be sent to the address, or that the address is what the user probably meant to put in (which is unlikely in a lot of the more obscure cases of otherwise 'valid' addresses)?

An option I've considered is simply giving a warning with a more esoteric address but still allowing the request to go through, but this does add more complexity to a form and most users are likely to be confused.

While DNS validation / SMTP validation seem like no-brainers, I foresee problems where the DNS server/SMTP server is temporarily down and a user is unable to register somewhere, or the user's SMTP server doesn't support the required features.

How might some experienced developers out here handle this? Are there any other approaches than the ones I've listed?

Edit: I completely forgot the most obvious of all, sending a confirmation e-mail! Thanks to answerers for pointing that one out. Yes, this one is pretty foolproof, but it does require extra hassle on the part of everyone involved. The user has to fetch some e-mail, and the developer needs to remember user data before they're even confirmed as valid.


One suggestion: don't reject addresses with a + in them. It's annoyingly common to reject them, but it's a valid character, and gmail users can use address+label@gmail.com to label and sort incoming mail more easily.

In your post it seems that when you say "SMTP validation" you mean connecting to the server and trying a RCPT TO to see if it's accepted. Since you differentiate it from actually sending a confirmation email, I assume you want to do it inline with the user actions. In addition to problems like network issues, DNS failures, etc., greylisting can wreak havoc with this method. Methods vary, but essentially greylisting always defers the first delivery attempt to a recipient from each connecting IP. As I said, this can vary: some hosts might reject invalid addresses at the first attempt and only defer valid addresses, but there's no reliable way to sort out the different implementations programmatically.

The only way you'll ever be sure that an address is valid and is submitted by its owner who really does want it used for your application is to send a verification email. Well, as long as it doesn't get spam filtered I guess =).

You're best off just checking for simple things like @ and . in JavaScript, and then actually sending a verification message to the address. If they verify their account, you have yourself a valid email address. That way you know for sure you have a working address, and you don't have to be too bossy in the form.
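For illustration, here is a minimal sketch of that kind of "simple things" check, written in Python rather than JavaScript. The function name looks_like_email is my own invention, and the exact rules (an "@" with something before it, and a dot inside the domain) are just one reasonable reading of "checking for @ and .", not a standard.

```python
def looks_like_email(address: str) -> bool:
    """Cheap sanity check only; it does not prove the address exists."""
    # Split on the last "@" so everything after it is treated as the domain.
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local:
        # No "@" at all, or nothing in front of it.
        return False
    # Require a dot in the domain, but not at either edge.
    return "." in domain and not domain.startswith(".") and not domain.endswith(".")
```

Anything that passes this should still get the verification message; the check is only there to catch obviously broken input before the form is submitted.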

I'll try to keep this page up-to-date as people enhance their validators. Thanks to Cal, Dave and Phil for their help and co-operation in compiling these tests and constructive criticism of my own validator.

People should be aware of the errata against RFC 3696 in particular. Three of the canonical examples are in fact invalid addresses. And the maximum length of an address is 254 or 256 characters, not 320.

On consideration from the answers (since I completely forgot about confirmation e-mails) it seems to me like a suitable compromise for a low-friction solution would be to:

Use regex to check that the e-mail address looks valid, and give a warning if it is more obscure but avoid rejecting outright.

Use SMTP validation to ensure that the e-mail address is valid.

If SMTP validation fails then -- and only then -- use a confirmation e-mail as a last resort. Confirmation e-mails seem to require too much interaction outside of your application for them to be considered low friction, but they are a perfect fallback.

I've worked at 4 different companies where someone at the help desk got yelled at by someone named O'Malley or O'Brien, or by someone else whose e-mail address contains an apostrophe. As suggested previously, not all regexes will catch everything, but save yourself some hassle and accept an apostrophe without generating a warning.

If you want to verify an e-mail (i.e. ensure that the user owns the e-mail address), a confirmation e-mail is the only thing you can do. Then again, many people have dedicated spam addresses or use services like OneWayMail, and if they don't want to give you their actual e-mail address, they won't. So basically you're creating user obstruction.

When it comes to validation, ensuring that the user doesn't unintentionally input a wrong e-mail address is definitely the right motivation. However, at least for HTML forms (which are by far the most common way of gathering e-mail addresses), it is hardly the right instrument.

For one, you will not be able to recognize typos in the actual "words" of the email address. You have no way of finding out that back2dso@example.com is wrong, solely based on the format.
But more importantly, from a user's point of view, there's only one (or a handful) of email addresses you could possibly want to enter. And you've probably already entered it.
So rather than trying to validate an address, you should focus on ensuring that all browsers recognize the email field and thereby eliminate the need to type in the email address in the first place. Of course this doesn't apply if you're building the kind of site likely to be hit by users who never entered their e-mail address into their browser before. But I suppose very few of us are in such a position.

I think it depends on what context you're using the email for. More serious projects require stricter validation, but I think for most things sending an email to the provided address with a confirmation link will ensure the email address is valid.

@Mike - I think that part of the reason why confirmation emails are sent is not only to ensure that the email address is valid, but that it is accessible by the user who submitted it. A person could easily put in a one-letter typo in the email address that would lead to a different, valid email address, but that would still be an error as it would be the wrong address.

The most complete and accurate regex I've ever encountered for email validation is the one documented here. It is not for the faint of heart; it's complicated enough that it's broken into parts to make it easier for humans to parse (sample code is in Java). But in cases where going all the way with validation is merited, I don't think it gets much better.

In any case, I would suggest that you use unit testing to confirm that your expression covers the cases that you feel are important. That way, as you dink around with it, you can be sure that you haven't broken some case that worked before.

Whatever you choose, I think you need to err on the side of believing that 99% of the time, the user does actually know what their email address is. As someone from Australia, I still find very occasionally an oh-so-clever email validation that tells me that I can't possibly have a .com.au domain. It used to happen a lot more in the early days of the internet mind you.

Sending a confirmation email these days is acceptable to users, and is also useful in terms of opt-in as well as validating their supplied address.

On some sites developed at places I have worked at, we have always used confirmation emails. However, it was surprisingly common for the users to mistype their email address in ways that could not possibly have worked, and then keep waiting for the confirmation email which would not come. Adding ad-hoc code (or, for the domain name part, DNS verification) to warn the user in these cases could be a good idea.

The common cases I have seen:

Dropping a letter in the middle of the domain name, or several other simple typo variants.

TLD confusion (for instance, adding a .br to a .com domain, or dropping the .br from a .com.br domain).

Adding a www. at the beginning of the local part of an email address (I am not making this up; I saw several email addresses of the form www.username@example.com).

There were even more bizarre cases; things like a complete domain name as the local part, addresses with two @ (something like username@domain.tld@example.com), and so on.

Of course, most of them were still valid RFC-822 addresses, so technically you could just let the MTA deal with them. However, warning the user that the email address entered is quite possibly bogus can be helpful, especially if your target audience is not very computer literate.
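To make the "warn, don't reject" idea concrete, here is a rough Python sketch of ad-hoc checks for the cases listed above. The function name typo_warnings and the KNOWN_DOMAIN_TYPOS table are hypothetical; the table entries are made-up examples, and a real list would have to come from your own logs.

```python
from typing import List

# Hypothetical table of misspelled domains and their likely corrections.
KNOWN_DOMAIN_TYPOS = {"gmal.com": "gmail.com", "hotmial.com": "hotmail.com"}


def typo_warnings(address: str) -> List[str]:
    """Return human-readable warnings; an empty list means nothing looked odd."""
    warnings = []
    if address.count("@") > 1:
        warnings.append("more than one @")
    # Treat everything after the last "@" as the domain.
    local, _, domain = address.rpartition("@")
    if local.lower().startswith("www."):
        warnings.append("local part starts with www.")
    suggestion = KNOWN_DOMAIN_TYPOS.get(domain.lower())
    if suggestion:
        warnings.append("did you mean " + suggestion + "?")
    return warnings
```

The result is meant to be shown next to the field while still letting the user submit, since every one of these addresses could in principle be genuine.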

Depends on the goal. If you are an ISP and you need to validate that users are creating valid email addresses, go for the Regex that validates against everything possible. If you just want to catch user errors, how about the following pattern:

[All Characters, no spaces] @ [letters and numbers] (.[letters and numbers])
where the final group appears at least one time.
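As a sketch, that pattern could be written as a Python regex like this. The names SIMPLE_EMAIL and matches_simple_pattern are mine. Note that, taken literally, "letters and numbers" rejects hyphenated domains such as my-domain.com, so you may want to widen the character class.

```python
import re

# [no spaces] @ [letters and numbers] followed by one or more (.[letters and numbers]) groups
SIMPLE_EMAIL = re.compile(r"\S+@[A-Za-z0-9]+(\.[A-Za-z0-9]+)+")


def matches_simple_pattern(address: str) -> bool:
    # fullmatch requires the whole string to fit the pattern, not just a prefix.
    return SIMPLE_EMAIL.fullmatch(address) is not None
```

Because the local part is "anything without spaces", addresses with a + or an apostrophe pass, which is the behaviour asked for elsewhere in this thread.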

I think that part of the reason why confirmation emails are sent is not only to ensure that the email address is valid, but that it is accessible by the user who submitted it. A person could easily put in a one-letter typo in the email address that would lead to a different, valid email address, but that would still be an error as it would be the wrong address.

I concur, but I'm not sure it's worth it. We also have confirmation fields for that purpose (repeating your e-mail address again). Another situation where the type of site might warrant different approaches.

Additionally, sending a confirmation e-mail gives the original user no indication that the address they entered was wrong. After not receiving the confirmation e-mail they may just assume your app/site is faulty; at least by allowing the user to immediately start using their account they could correct their e-mail address, particularly if it is displayed in a suitably obvious place.

All those are valid, complete email verification systems in and of themselves, and for a given website one will be more appropriate (or as good as is warranted) than the others. In many cases several steps of verification may be useful.

If you're developing a website for a bank, you're going to want snail mail or phone verification on top of all these.

If you're developing a website for a contest you might not want any of them - verify the emails in post processing and if one fails it's too bad for the person who entered it - you might value server performance given a huge crush of people (TV contest, for instance) over making sure that everyone gets validated correctly inline.

How far should one take email verification?

As far as necessary and warranted.

I've seen sites that also guard against people using temporary throwaway spam bucket sites like Mailinator or MyTrashMail, which get around the confirmation e-mail thing. I'm not saying you should be filtering those out, I'm just saying.

Regex validation of email addresses can, at best, verify that the address is syntactically correct and relatively plausible. It also has the hazard (as already mentioned many times) of possibly rejecting actual, deliverable addresses if the regex isn't quite correct.

SMTP verification can determine that the address is deliverable, subject to the limitations imposed by greylisting or servers which are configured to give out as little information as possible about their users. You have no way of knowing whether the MTA has only claimed to accept mail for a bogus address, then just dropped it on the floor as part of an anti-spam strategy.

Sending a confirmation message, though, is the only way to verify that an address belongs to the user who entered it. If I'm filling out your form, I can quite easily tell you that my email address is president@whitehouse.gov. A regex will tell you it's syntactically valid, an SMTP RCPT TO will tell you it's a deliverable address, but it sure as hell ain't my address.
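For what it's worth, here is a hedged Python sketch of the RCPT TO probe using the standard smtplib module. The function names, the probe@example.com sender and the "accepted / deferred / rejected / unknown" labels are my own assumptions; mx_host is whatever mail host you have already looked up for the domain. The important part is that a 4xx reply (typical of greylisting) is treated as "try again later", not as an invalid address.

```python
import smtplib


def classify_rcpt_reply(code: int) -> str:
    """Map an SMTP reply code for RCPT TO onto a coarse outcome."""
    if 200 <= code < 300:
        return "accepted"   # may still be silently dropped by the server later
    if 400 <= code < 500:
        return "deferred"   # temporary failure, e.g. greylisting
    if 500 <= code < 600:
        return "rejected"   # permanent failure
    return "unknown"


def probe_recipient(mx_host: str, address: str,
                    sender: str = "probe@example.com",  # placeholder sender
                    timeout: float = 10.0) -> str:
    """Ask mx_host whether it would accept mail for address, without sending any."""
    try:
        with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
            smtp.helo()
            code, _ = smtp.mail(sender)
            if not 200 <= code < 300:
                return "unknown"  # server refused our sender, so RCPT tells us nothing
            code, _ = smtp.rcpt(address)
    except (OSError, smtplib.SMTPException):
        return "unknown"  # network trouble or protocol error: do not blame the user
    return classify_rcpt_reply(code)
```

Only "rejected" is a reasonable basis for a warning, and even "accepted" says nothing about who owns the mailbox.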

With the coming of HTML5 at least one new approach is added to the possibilities: the use of the input of type 'email', which allows validation on the client side. Current versions of Firefox, Chrome, Safari and Opera support this (and other browsers just treat it like type=text, so it can be used without problems, although then you don't have validation of course).

It can never (as is pointed out a few times) guarantee a workable address, but it may be very beneficial (and eventually replace the server side check) in places where you just need to catch likely user errors.

1) regular expression check of the email address format

2) email domain check against MX records to see if the domain name has an email service

3) sending a confirmation email with a confirmation link or code

Level 1:

In Visual Studio, you can use the "Regular Expression Validator". And in the "ValidationExpression" property you can click on the "..." button that has a wizard to add in the regular expression format for email addresses.

Level 2:

Here is my C# code below to use nslookup to verify whether an email domain has valid MX records. Runs quick and ok on Win 2008 R2 and Win 7.

Another option is to use the Arsofttools NuGet package, but in my experience it may be slow on Windows Server 2008 R2, although it runs quickly on Win 7.

Level 3:

For email confirmation, you can generate an email-specific hex code (using encryption functions) and embed it in a URL such as http://domain.com/validateEmail?code=abcd1234; the email address is validated when the user clicks on it. There is no need to store this URL in memory.
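One way to get a code that does not need to be stored is to derive it from the address with a keyed hash. The sketch below is in Python, not C#, and uses HMAC-SHA256, which is my choice and not necessarily what "encryption functions" meant above. SECRET_KEY and both function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice load it from configuration.
SECRET_KEY = b"replace-with-a-long-random-secret"


def confirmation_code(email: str, secret: bytes = SECRET_KEY) -> str:
    """Derive a hex code from the address; the same address always yields the same code."""
    return hmac.new(secret, email.strip().encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid_code(email: str, code: str, secret: bytes = SECRET_KEY) -> bool:
    """Recompute the code and compare in constant time, so nothing has to be stored."""
    expected = confirmation_code(email, secret)
    return hmac.compare_digest(expected.encode("utf-8"), code.encode("utf-8"))
```

The server still has to know which address a code belongs to when the link is clicked, for example from the pending account record or from an extra URL parameter.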