If I'm designing a server infrastructure to support an email server, should the reception system support fail tolerance? Given the type of transactions it will handle (heavy on processing and filtering, but light on disk, since there's another system for storage), I think that would be very expensive to do. But I don't know.

Also, when I send an email and it doesn't arrive at the recipient, the send system automatically retries. So I think a cluster for availability is necessary, but not necessarily fail tolerance (storing the state).

Any advice?

EDIT

When I talk about the "reception system" I mean the servers that the MX records point to. I'm talking about 20,000 users. It's an academic institution, so mail is very important.

There are almost 1,000,000 new emails per day, BUT only 30,000 of them are useful (the rest is spam or malware).
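
As a rough back-of-the-envelope check on these figures, here is a minimal sketch. It assumes the 30,000 refers to legitimate (non-spam) mail and that traffic is spread evenly over the day, which real mail traffic is not, so treat the rate as an average rather than a peak.

```python
# Figures taken from the question; reading the 30,000 as "legitimate mail"
# is an assumption.
TOTAL_PER_DAY = 1_000_000
LEGITIMATE_PER_DAY = 30_000
USERS = 20_000
SECONDS_PER_DAY = 24 * 60 * 60


def daily_load_summary(total, legitimate, users):
    """Return average rates derived from the daily totals."""
    return {
        "avg_msgs_per_second": total / SECONDS_PER_DAY,
        "legitimate_fraction": legitimate / total,
        "legitimate_per_user": legitimate / users,
    }


summary = daily_load_summary(TOTAL_PER_DAY, LEGITIMATE_PER_DAY, USERS)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

On these numbers the front end sees an average of roughly a dozen messages per second, of which about 3% are worth keeping.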

I'm asking specifically whether it's necessary to provide fail tolerance by saving the state of incoming transactions, or whether it's enough to have a load balancer that, in case of a failure, redirects the load to another (functioning) server.

My apologies if this is completely off base: your question smacks of a manager who doesn't understand how e-mail systems work and has just received a recommendation/proposal to implement fault (not "fail") tolerance without an ROI analysis or other justification. If that's the case, go back and ask for the justification. Fault tolerance isn't 'mandatory' in any e-mail system; its presence or lack thereof has both real costs and opportunity costs, however.
– Chris S♦ Feb 21 '11 at 17:58

No need for apologies; I'm sure it's an honest mistake. If system failure seems like it creates problems for your situation, then definitely have the person who is in charge of the e-mail system figure out the ROI. If they can't figure it out, there should be consultants around who can. If it doesn't seem like it's even close to an issue for you (i.e., you've never had any problems with it), then the ROI is likely not there.
– Chris S♦ Feb 21 '11 at 18:10

6 Answers

It depends on how timely you need your email to be. SMTP is designed to work around occasional mailer outages, but end users aren't. If a message takes more than 5 minutes to get where it's going, I know I get calls. Ultimately, I believe this is a business decision based on what the business needs and what service expectations the users of this system have.
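
To make the timeliness point concrete, here is a minimal sketch of how long a message might be delayed by an outage. The retry schedule is hypothetical: each sending server chooses its own intervals, and you have no control over them. It also assumes the first delivery attempt happens just as the outage starts.

```python
def delivery_delay(outage_minutes, retry_schedule_minutes):
    """Return the minute of the first retry that lands after the outage ends.

    retry_schedule_minutes holds cumulative times, in minutes after the first
    failed attempt, at which the sender retries. Returns None if the sender
    gives up before the receiving server is back.
    """
    for retry_at in retry_schedule_minutes:
        if retry_at >= outage_minutes:
            return retry_at
    return None


# Hypothetical sender schedule: retries at 15 min, 30 min, 1 h, 2 h and 4 h,
# then the sender gives up. Real senders differ.
HYPOTHETICAL_SCHEDULE = [15, 30, 60, 120, 240]

for outage in (2, 20, 90, 600):
    print(outage, delivery_delay(outage, HYPOTHETICAL_SCHEDULE))
```

With this made-up schedule, even a 2-minute outage delays the affected message by 15 minutes, which is already past the 5-minute mark where the calls start.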

The question is vague - there is no such protocol as 'email', so telling us you're setting up an email server is not a lot of help. There are lots of protocols used for handling email: POP, IMAP, SMTP, SSMTP, X.400....

when I send an email and it doesn't arrive at the recipient, the send system automatically retries.

That presupposes that the 'send system' (WTF?) was working when you tried to send the email. And yes, the MUA will save unsent messages.

Without knowing what the content is, it's impossible to advise whether or not you need fault tolerance - if you're talking about your home PC with an ADSL connection, then it's probably not worth running a transactionally secure multi-node cluster with offsite failover.

Also, you've not mentioned what software / OS you intend to use - if it's commercial software, then licence costs will likely be significantly more than the hardware costs.

Setting up a secondary MX is a no-brainer - and if the storage system is already separate, you've already got half the problems of contention on mailboxes licked.

I tend to use an email or web host to give smaller clients a 'cache' or holding point that keeps all their SMTP email for, typically, 3 days. This is handy in case their server's internet link stays down for more than 24 hours.

Another solution is to use your own email server as a fall-back for your clients' email server.
Just look into setting up multiple MX records for a domain (it's part of the DNS settings).
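
To illustrate how a multiple-MX setup gives you a fall-back, here is a minimal sketch of the order in which a sending server tries the MX targets: lowest preference number first, with equal preferences tried in random order. The hostnames and the `is_up` callback are hypothetical, and no real DNS lookup is performed.

```python
import random


def delivery_order(mx_records, rng=random):
    """Order MX targets the way a sender tries them.

    mx_records is a list of (preference, hostname) pairs. Lower preference
    values are tried first; hosts sharing a preference are shuffled.
    """
    by_preference = {}
    for preference, host in mx_records:
        by_preference.setdefault(preference, []).append(host)

    ordered = []
    for preference in sorted(by_preference):
        hosts = by_preference[preference][:]
        rng.shuffle(hosts)
        ordered.extend(hosts)
    return ordered


def first_reachable(mx_records, is_up):
    """Return the first host that accepts a connection, or None."""
    for host in delivery_order(mx_records):
        if is_up(host):
            return host
    return None  # the sender keeps the message queued and retries later


# Hypothetical records: two primaries and one fall-back at another site.
records = [
    (10, "mx1.example.edu"),
    (10, "mx2.example.edu"),
    (20, "backup.example.net"),
]

# Simulate both primaries being down.
print(first_reachable(records, lambda host: host == "backup.example.net"))
```

The fall-back only sees traffic when every lower-numbered host is unreachable, so it can be a modest machine that simply queues mail until the primaries return.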

Bigger clients will have their own fail-over setup, spread out over office locations or data-centres.
As an example, run:

Any IT system should support the level of availability that can be justified for the amount of money spent on it. If your company depends on email, and will lose money if it fails, then it should probably be fault-tolerant, unless you can have a plan for restoration that will be fast enough.

Those are very vague answers for your very vague question - IT and the business need to put specific details in their discussion.

Your 2nd and 3rd sentences hurt my head. I think your second sentence is asking about MX records (SMTP servers for receiving email from external parties) but I really have no idea what you're asking.

Your second paragraph is also frightfully vague; it's so confusing that I don't know if you have some misconceptions about email, or are just leaving out some details.

Mail delivery has some robustness built into the SMTP protocol itself. So, if your server is unavailable to receive mail, the sending server will periodically try to redeliver it for up to several days, depending on its configuration. If this is not enough, you can always subscribe to a secondary mail service from any of a multitude of service providers, which will accept mail destined for you and keep it in its queue while your server is down. So my answer is no - you don't need high availability for your mail server, but I recommend keeping a decent backup and tools for bare-metal recovery just in case.
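
The robustness described above can be sketched as follows. This is a simplified simulation, not a real SMTP implementation: the class and function names are made up for illustration, and it assumes the receiving server replies 250 only after the message has been handed to storage. Under that assumption a crash in the middle of a transaction loses nothing, because the sender still holds the message.

```python
class ServerCrashed(Exception):
    """Raised when a simulated receiving server dies mid-transaction."""


class ReceivingServer:
    # Hypothetical stand-in for one MX host.
    def __init__(self, name, crash_before_ack=False):
        self.name = name
        self.crash_before_ack = crash_before_ack
        self.stored = []

    def accept(self, message):
        if self.crash_before_ack:
            raise ServerCrashed(self.name)
        self.stored.append(message)  # hand off to storage first
        return 250                   # acknowledge only afterwards


def send(message, servers, queue):
    """Try each server in turn; dequeue the message only after a 250."""
    queue.append(message)
    for server in servers:
        try:
            if server.accept(message) == 250:
                queue.remove(message)
                return server.name
        except ServerCrashed:
            continue  # the sender still holds the message
    return None  # still queued; a real sender would retry later


mx1 = ReceivingServer("mx1", crash_before_ack=True)
mx2 = ReceivingServer("mx2")
sender_queue = []

print(send("hello", [mx1, mx2], sender_queue))
print(mx2.stored, sender_queue)
```

The window where mail can actually be lost is after the 250 has been sent, so that is the part of the pipeline where backups or redundancy matter.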