We use MySQL and are generally very happy with it. It seems to work well together with all the other systems. I don't know the specifics of the current outage so I can't say much more about that, but I do believe MySQL is the right choice for FM at this time. Every database has its own quirks... just because there are problems with it every once in a while is no reason to change over to a completely different system.

The reason why many outages seem database related is that the databases are what drive the system. If something else breaks for a short time it's not always immediately noticeable (not to mention that other applications aren't quite pushed as hard by FM).

I don't have any experience with databases for HUGE systems like a mail provider. I use MySQL myself, and for low-end applications (polls, and simple stuff like that) it's great. But obviously it's not holding out as well as it should under these stresses...

(Dan, I am suffering from a bad case of insomnia... instead of going to sleep I get hyperactive in the night.. )

Nope, just a case of DSPS. (that page makes it sound more severe than it is... I don't actually have a problem with that sleep pattern at all, esp. considering the fact that I work shifts).

Quote:

Does FastMail also provide feedback to the MySQL developers?

I'm not aware of any specific occurrences, but when we bump into a bug we try to fix it ourselves and also speak to the developers about the problem we are having. I don't believe FM is in such a unique situation that we have a special connection with the MySQL team though.

It has not been clear from the reports whether the "rebuilding the database" has been the result of disk failure or of the database "just rotting" (that is, software errors).

I do not know whether all the disks are mirrored. Whereas wide-area duplication remains very expensive, there is no excuse for a system of this nature not to mirror all its disks. And if the disks were mirrored, there should have been no outage for a disk failure. But as I said, I don't know whether the failure is hardware or software.

It also has not been clear whether the excessive recovery times have been due to intrinsic flaws in the software or due to improper configuration and operation. (I recall that SpamCop has had some database problems that took about ten times as long as expected to rebuild from.)

It is not clear whether the MySQL developers pay as much attention to recovery as they should. Relational databases are in fashion now, and when I see a group following fashion, I have to wonder if they have blinders on.

What is clear is that full recovery under production conditions has not been adequately tested.

This kind of outage is NOT necessary. This IS a professional opinion. I'm familiar with a financial institution that handles millions of transactions per day. About the only time they go down is to upgrade the OS or DB software. But hey, this is mainframe stuff that's been under development for over 30 years. It has industrial strength reliability. This is difficult to attain in free software due to the extreme stress testing requirements. Developers don't like testing, and I haven't seen many testing experts gravitating toward the free software movement.

Yes, it costs more to use mirrored disk and industrial strength software. It costs more to fully test database recovery. It costs more to do the frequent online backups that are necessary for fast recovery.

But frankly, $40/year for reliable email is cheap to those of us in the first world. If it takes a higher price to make it reliable -- and retain all the other advantages that fastmail.fm offers -- then it's worth it to me. For those in parts of the world where this is already expensive, I would hope that the revenue from yearly paid accounts could carry along the free and lifetime accounts.

MySQL would not have been my choice for a system with the ambition level of FM. Very few large systems have been built to use it. That said, I doubt if I would change that decision now. MySQL is improving and, correctly used, can probably provide what FM needs.

My question is: why should a database problem bring down everybody? A large email system is a classic example of a case where segmented databases make sense. Then, in the event of a problem, only a subset of the client base is impacted and recovery is much faster.

As at least one other poster mentioned, it is technically feasible to create databases that virtually never fail. Some of the methods needed are not, however, supported yet by MySQL. I think the solution is multiple databases, any of which can be restored if necessary in less than 30 minutes. While none of us like downtime, an occasional half hour outage is something that I think most of us could live with.
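To make the segmentation idea concrete, here is a minimal sketch of how users might be mapped onto several smaller databases. Everything in it is my own assumption for illustration (the shard count, the function names, hashing on the lowercased user name); I have no idea how FM actually lays out its data.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical number of separate databases


def shard_for(user: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user name to a shard index with a stable hash.

    A cryptographic hash is used only because it is stable across
    processes and machines, unlike Python's built-in hash().
    """
    digest = hashlib.sha256(user.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def affected_users(users, failed_shard, num_shards=NUM_SHARDS):
    """Return only the users whose data lives on the failed shard."""
    return [u for u in users if shard_for(u, num_shards) == failed_shard]


if __name__ == "__main__":
    users = [f"user{i}@example.com" for i in range(1000)]
    down = affected_users(users, failed_shard=3)
    print(f"{len(down)} of {len(users)} users affected by losing one shard")
```

With a layout like this, losing one database takes out only the users hashed to it, and the rebuild only has to cover that shard's share of the data. A real system would also need a way to move users between shards, which this sketch ignores.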

Originally posted by Onno:

We use MySQL and are generally very happy with it. It seems to work well together with all the other systems. I don't know the specifics of the current outage so I can't say much more about that, but I do believe MySQL is the right choice for FM at this time.

I am a DBA, looking for a job. And do you know why? Because people don't grok data nowadays. So even very competent programmers think MySQL is OK, they don't need a DBA and don't need an expert to do data models. Well, it isn't like that. MySQL leaves too much to be done by the application, and just isn't solid enough; its creators aren't data people, but programmers who never really got the ACID and other basic principles of database management.

Nowadays, the choice in free DBMSs is PostgreSQL. It can do much more in the database itself, and is much more solid and proven, because its developers really understand DBs and their requirements. There are also FireBird and SAP DB, but I don't think they quite match the maturity and standards-compliance of PostgreSQL.

Failing that, the relational model again shows promise, but its only full-scale implementation is proprietary; perhaps a well-made case for a high-profile implementation such as FastMail.fm could shift Nathan Allan, Alphora's proprietary, to free Dataphor or at least the relevant parts. Problem is, it's .Net...

If I'm not mistaken, several popular commercial or open source email software providers (I'm thinking of Calacode's @mail, Mintersoft, and Horde/IMP) specify MySQL.

Some of these certainly have been used by installed services with userbases larger than fastmail.fm. I think that Mintersoft's VisualOffice software has an installed base of over 50 million accounts to date (of course, spread over many services).

I don't think that MySQL per se is the problem. Besides, the licensing fees for commercial DBs are prohibitive for a small company. (That's why Oracle DBAs make so much $$$)

Originally posted by paleolith:

What is clear is that full recovery under production conditions has not been adequately tested.

This kind of outage is NOT necessary.

I completely agree. Without knowing much about the details, I suspect this was preventable or could have been a much smaller issue.

Quote:

But frankly, $40/year for reliable email is cheap to those of us in the first world. If it takes a higher price to make it reliable -- and retain all the other advantages that fastmail.fm offers -- then it's worth it to me.

Despite my frustration, I agree here, too. I would gladly pay more per year for a rock-solid system. By the same token, I could (and still may) take my chances at self-hosting and pay nothing (aside from broadband fees and my own time). Not rock-solid, but cheap. It's one or the other. The middle doesn't seem to work...

I am also not really sure how many people who use MySQL actually use transactions. MySQL with the standard MyISAM tables only supports table locks, as far as I know.

Not using transactions for a complex system with thousands or tens of thousands of users would be a sin, though :-)

And this is where what I have heard matches what leandro says, that MySQL is not as mature. But this is only what I hear as the general opinion. Whether there is any truth in it is a different story, because in the open source community many people just seem to echo what they have heard without doing much investigation themselves.

I myself have only really read about how far ACID support goes in MySQL, but not any comparisons of how advanced or stable it is compared to others.
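For anyone wondering what transactions buy you over plain table locks, here is a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in so it runs anywhere; it is not MySQL, and the table and function names are made up for the example. The point is that a two-step change (remove a message from one folder, add it to another) either happens completely or not at all.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical folder/message table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE folder_messages ("
        " folder TEXT NOT NULL,"
        " msg_id INTEGER NOT NULL,"
        " PRIMARY KEY (folder, msg_id))"
    )
    conn.executemany(
        "INSERT INTO folder_messages (folder, msg_id) VALUES (?, ?)",
        [("Inbox", 1), ("Inbox", 2), ("Archive", 2)],
    )
    conn.commit()
    return conn


def move_message(conn: sqlite3.Connection, msg_id: int, src: str, dst: str) -> None:
    """Move a message between folders as one atomic unit of work."""
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        cur = conn.execute(
            "DELETE FROM folder_messages WHERE folder = ? AND msg_id = ?",
            (src, msg_id),
        )
        if cur.rowcount != 1:
            raise LookupError(f"message {msg_id} not found in {src}")
        # If this insert fails (here: duplicate key), the delete is undone too.
        conn.execute(
            "INSERT INTO folder_messages (folder, msg_id) VALUES (?, ?)",
            (dst, msg_id),
        )


def folder_contents(conn: sqlite3.Connection, folder: str) -> list:
    """Return the sorted message ids stored in a folder."""
    rows = conn.execute(
        "SELECT msg_id FROM folder_messages WHERE folder = ? ORDER BY msg_id",
        (folder,),
    )
    return [r[0] for r in rows]
```

Without the transaction, a failure between the delete and the insert would leave the message in neither folder, and the application would have to notice and repair that by itself.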

Originally posted by Onno:

Nope, just a case of DSPS. (that page makes it sound more severe than it is... I don't actually have a problem with that sleep pattern at all, esp. considering the fact that I work shifts).

Don't nearly all programmers have that disorder? :-)

I love programming through the night (when I have time): it's so quiet, no one bothers you, you get so much work done... :-) The net also used to be so much faster :-)

I can adjust back to "normal" pretty well when I am working, like now. It usually takes me a day.

But in the past, when I was a student and had a few weeks off, I would typically start doing something on the PC or start reading a book at around 10 in the evening and keep at it till the birds started chirping because the sun was just beginning to rise, then lie down till lunch time :-)