Posted by samzenpus on Wednesday October 08, 2008 @07:35PM
from the take-this-job-and-shove-it dept.

An anonymous reader writes "From Kay Arno's blog we see that David Axmark, MySQL's Co-Founder, has resigned. This comes on top of the maybe, maybe not, resignation of Monty. We saw earlier this year that Brian Aker, the Director of Architecture, has forked the server to create a web-focused database from MySQL called Drizzle. The MySQL server has been 'RC' now for a year with hundreds of bugs still listed as being active in the 5.1 version.
What is going on with MySQL?"

Theoretically, yes you can fork the code. But there are broader issues than the legal ability to fork.

This has put a huge question mark over MySQL's long-term viability. For a fork to be viable, you need a critical mass of developers. But we've seen two key (founding) developers leave, and Oracle buy InnoDB.

If Sun bought MySQL to further the project, then where is the evidence that this is happening?

If Oracle bought InnoDB to further the project, then where is the evidence that this is happening?

Of course you could argue that neither company is obliged to do anything. But alternatively you could argue that both companies have behaved in an explicitly anti-competitive way. This in itself is of course no surprise to anyone other than the US Justice Department.

Sun doesn't, but if you live in the Java world have you looked at Derby [apache.org] recently? We started out using it as an authentication database embedded in an app, and are now making more and more use of it. It supports transactions and hundreds of simultaneous connections, has very flexible configuration, and supports up to about 50Gbytes of storage. The last alone makes it more useful in many applications than the free versions of MS SQL Server. There are many applications currently running on MySQL which (in my opinion) would benefit from migrating to a tightly coupled all-Java solution. The Derby footprint is tiny, database backup and failover is now supported, and you can work with anything from the command line tool to the usual studio type applications. It has taken me 4 years to become a convert, after 8 years of MySQL, but now in the latest release I love it.

Sun have offered support for PostgreSQL for a few years now. The version of PostgreSQL they ship has a number of Solaris-specific tweaks that integrate with their other buzzwords.

With regard to MySQL, forking is difficult. MySQL is GPL'd, including the client library. This means that any application that uses it (by linking against the client library) must be GPL'd, or you must buy a proprietary license (previously from MySQL AB, now from Sun). Postgres, on the other hand, is BSD licensed, meaning you can

I guess it's the natural progression of things. Products die, OSS projects die. If there's a gap; a need for software that doesn't exist and is very important to the community, it will be created or an old project will be resurrected.

There is money in OSS. A lot of the big or important OSS projects have been able to bring in a good deal of money (I mean, look at MySQL, those guys made a small fortune from it) so I have no doubt that if MySQL dies out and there's no equivalent alternative (but there is

Your post seems to think that Oracle and Sun teamed up to kill MySQL. You want proof that Sun has done something to help MySQL along since it bought it.

My "proof" comes from the perspective of a Java developer, one that uses almost all Sun tools (Netbeans, Glassfish, EJB3, WebServices, and would consider Sun hardware if we could). The freaking day Sun "bought" the company that sold support for MySQL they started to push it as "their" default database of choice. Within a few days they had tutorials up on how to int

In other words, Sun Inc. still has no clue as to precisely what it wants to do with its assets. And no clue what to do with other markets where it has accidentally become the leader. Current strategy seems to be "play dead."

And the leakage of talented people from Sun doesn't really give it a good mark as an employer.

I'm about to start a new web project and I get to choose the DB. I'm concerned over the lack of stored procedures though. My last big project used SPs for everything and honestly, while initial coding was a pain, in the long run it was a huge benefit.

I need a lean and mean webDB, so, if not Drizzle, does anyone have other recommendations?

We are still working on the first version of Drizzle. While folks are using it, I don't really recommend it at this point. When we feel like it is ready for adoption we will publicly start recommending it.

That advice is only appropriate if the expected query results are small. On large tables, using stored procedures can significantly reduce the load on the DB by not requiring it to handle open connections while a large amount of data is streamed to the remote client.
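A rough sketch of the result-size point, using Python's built-in sqlite3 module as a stand-in for a real client/server database. SQLite runs in-process and has no stored procedures, so the "server-side" work here is only an aggregate query; the `orders` table and its columns are made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for a remote database server (an assumption for
# illustration: there is no network hop and no stored procedure here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("c%d" % (i % 10), i) for i in range(10000)],
)


def total_client_side(conn):
    """Pull every row to the client and add the amounts up there."""
    rows = conn.execute("SELECT amount FROM orders").fetchall()
    return sum(amount for (amount,) in rows), len(rows)


def total_server_side(conn):
    """Let the database aggregate and hand back a single row."""
    rows = conn.execute("SELECT SUM(amount) FROM orders").fetchall()
    return rows[0][0], len(rows)


client_total, client_rows = total_client_side(conn)
server_total, server_rows = total_server_side(conn)
print(client_total == server_total, client_rows, server_rows)
```

Both functions compute the same total, but the first one moves 10000 rows to the caller and the second moves one. On a real networked database that difference is what keeps a connection busy while data streams out.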

No matter what DB you choose you are going to be tied in, whether you use SPs or "dynamic" SQL.

If you go for SPs you have the benefit of only having to port the database, not the programs. The reason? Calling a stored procedure works much the same way on most databases, so you could in theory just port the database and the application layer would be none the wiser.

The database server is designed to handle data efficiently. Most large DBs (Oracle, SQL Server, Sybase, DB2, etc.) have years of experience in doing one thing: handling large amounts of data efficiently. Application servers are multifunction tools. While they are used to handle data retrieval, it's not their primary purpose; they have other things to do and can do them more efficiently.

While I think it's great to be able to throw ad-hoc queries down to the DB from the app server, I found that I

Don't use stored procedures. They concentrate computation in the database server which is harder to scale than the application servers.

Ahh, sadly this is what "MySQL database thinking" has wrought.

The mystical grail of the enterprise "scaling." Like many things, conventional wisdom often evaporates when confronted with facts. Can stored procedures load the server? Sure, if you are doing something bad. For the most part stored procedures reduce server load.

(1) Stored procedures can be "pre-compiled" SQL which saves CPU time in the planner. (In databases with such an architecture).

(2) Stored procedures allow data selection beyond mere SQL and can lead to the reduction of data transferred from server to application.

(3) In PostgreSQL, for instance, one can create an index based on a function (like a stored procedure), so:

create index myindex on mytable (foobar(mycol));

select * from mytable where foobar('froboz') = foobar(mycol) ;

Generates a query plan that uses an index and doesn't do a full table scan.

(4) Computers today are seriously fast, so much faster than the data storage systems that CPU capability is almost infinite with regards to I/O. Any CPU work that can be done at the server to reduce I/O load will probably improve general scalability.
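Point (3) above can be tried out without a PostgreSQL install. The sketch below uses Python's built-in sqlite3 module, which also supports indexes on expressions (SQLite 3.9.0 or newer is assumed). The built-in `lower()` stands in for the hypothetical `foobar()` function. In PostgreSQL itself, a user-defined function has to be declared IMMUTABLE before it can be used in an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, mycol TEXT)")
conn.executemany(
    "INSERT INTO mytable (mycol) VALUES (?)",
    [("Froboz",), ("Xyzzy",), ("Plugh",)],
)

# Index on an expression: lower() plays the role of foobar() from the comment.
conn.execute("CREATE INDEX myindex ON mytable (lower(mycol))")

query = "SELECT * FROM mytable WHERE lower(mycol) = lower(?)"


def plan_for(conn, sql, params):
    """Return the human-readable steps of SQLite's query plan."""
    # The description text is the last column of each EXPLAIN QUERY PLAN row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


plan = plan_for(conn, query, ("FROBOZ",))
matches = conn.execute(query, ("FROBOZ",)).fetchall()
print(plan)
print(matches)
```

The printed plan should mention `myindex` rather than a scan of `mytable`, which is the behaviour the comment describes for PostgreSQL.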

Basically, stored procedures and functions would not exist, i.e. no one would have created them, if they did not help. Saying "don't use stored procedures because they load the server" follows the same logic as "don't use power tools because they are dangerous." Yes, if you are ignorant, power tools present a huge danger; however, neglecting them means a lot more work. It is better to educate oneself and use the more powerful tools. Those tools would not exist if they did not provide a positive contribution.

Horses for courses, mate. There are good arguments in either direction. Personally I tend to avoid stored procedures not for performance reasons but for pragmatic ones. For one, it's easier sometimes to get a change approved in an application than it is to talk someone into approving a change -- any change -- in the database schema, no matter how trivial, and for another it's easier to migrate or replicate a database to another platform's database (say, Oracle to DB2 for example) when you're only worried about transferring tables and views, not logic. And it is true that the simpler it is, the easier it is to scale. Databases tend to scale by lock-managed clustering, applications by horizontal means (sometimes simply adding another apps server). One tends to be easier than the other.

Sucking data out in bulk can be a good idea too, for safety reasons -- I've seen bank OLTP databases frozen because someone thought it would be safer to set a read-only lock on a report scan, not realising they were using the wrong consistency setting across the entire database & thus forcing the rest of the users (thousands of them) to operate off the DB's log file, then killing the job mid-way after a few hours only to discover he had to face a few hours rollback....

That's a very narrow-minded statement. The application I maintain has an Oracle 10g backend, Pro*C middleware, and a Java fat client. The standard process for an action in the application is to ask the middleware to run a certain stored procedure in an Oracle package.

Given that this application is huge (I'm talking 1000+ tables, some with up to a million rows) and there are at least 1000 concurrent users, it's very convenient to have the logic on the server-side. Any code change to the client requires an outage (to replace the jar file), which is BAD if it's an emergency fix. By putting all the logic (and access to a vast amount of data) server-side, it reduces network traffic, allows easy rollbacks, and allows the support team to apply a fix without an outage.

Another great thing about our setup is that Oracle packages and triggers support networking. We have a publish/subscribe system tied to triggers such that when one user makes a change, it's instantly reflected on every other user's screen.
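The poster's Oracle setup is not shown, so the following is only a loose sketch of the trigger-driven idea, again with Python's built-in sqlite3 module. It swaps in a simpler technique: a trigger appends to a change-log table and subscribers poll it, rather than the database pushing over the network. The `accounts` and `changes` tables and the `poll` helper are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER);

    -- Hypothetical change-log table that subscribers read from.
    CREATE TABLE changes (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        account_id INTEGER,
        new_balance INTEGER
    );

    -- "Publish" step: every update is recorded by the database itself.
    CREATE TRIGGER accounts_publish AFTER UPDATE ON accounts
    BEGIN
        INSERT INTO changes (account_id, new_balance)
        VALUES (NEW.id, NEW.balance);
    END;
    """
)
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")


def poll(conn, last_seen):
    """'Subscribe' step: return change events newer than last_seen, oldest first."""
    return conn.execute(
        "SELECT seq, account_id, new_balance FROM changes "
        "WHERE seq > ? ORDER BY seq",
        (last_seen,),
    ).fetchall()


conn.execute("UPDATE accounts SET balance = 80 WHERE id = 1")
conn.execute("UPDATE accounts SET balance = 50 WHERE id = 1")

events = poll(conn, 0)              # everything since the beginning
later = poll(conn, events[-1][0])   # nothing new since the last event seen
print(events, later)
```

Because the trigger lives in the database, no client code has to remember to publish a change; that is the part of the design the comment is praising.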

Obviously this solution isn't best for all situations, but it fits our needs very well. YMMV

Stored procs do have their place, but now say for instance your company needs to reduce some costs this year. You know, shrinking economy and such. So Oracle comes a-knocking for their annual (or is that anal?) bend-over-the-IT-budget-and-ram-it-home support contract costs. Or perhaps your customers are not really very passionate about having to drop a million for a single database instance to run your software. Yes, now you are in between a very big rock and a hard place: all the logic is now embedded in a propr

You go talk to EnterpriseDB, who've been working on Oracle compatibility for Postgres. I can't speak for it personally, but you might be able to get away with the cost of a conversion project to a similar database (read: close, but not equal). At that point, it's pure savings.

Oracle has features PG doesn't have, sure, but ask yourself: how many of them are you actually using? Of the ones you're using, can they be done differently -- maybe by tying in a different FOSS project or home-growing a solution? If

Your definition of huge is funny, I have tables that are growing a million rows a month and that's for a small S&P 500 company. I have a friend who does joins on multiple tables with 300+M rows every day. I'm not bragging because my DB is huge (it's not) but more commenting on the fact that so many slashdotters seem to lack a perspective on what a truly large DB is.

It seems my estimate of 1M was rather off. I just did some SELECT COUNT(*) from some of the frequently used tables and got about 20M per table. That's "used per day", not including historical records. I don't lack a perspective; in this case "huge" means "larger than most private clients". When it comes to "large databases", there's a difference between "huge" and "behemoth" (ie. Google).

When I said 1000+ tables, I was identifying the entire system. The components/subsystems ARE separated and DO have loose coupling.
It's not "horribly designed", it's just 15 years old and has had 15 years worth of enhancement requests.

Sure, but the scalability arguments of the grandparent poster hold. Scripting languages like PL/SQL are not the best performers by definition, and have had me in a performance fix more than once, I must admit. At the same time, object-relational mapping technology (Hibernate and all) has advanced to such a state (even for .NET, it seems) that it feels completely wrong to me to put too much application logic in the database. Database independence comes rather cheap these days.

Thanks slashdotters for being passionate about all topics FOSS and MySQL!

David's departure is in all ways amicable, and he will continue to be an ambassador for MySQL and for free and open source software in general. For some time already, David was working only part-time for MySQL. After about 25 years of working on MySQL and the projects that preceded MySQL, he very much deserves to do whatever he pleases.

I think if you ask people who know me, they will say that I stand for transparency and truthfulness.

If the departure had not been amicable, I guess I would not have commented on it at all, or I would have focused my commentary on whatever other positive aspect I could find.

But the best may be to ask David directly. I don't want to publish his email address here, but it is not difficult to guess. Most early employees of MySQL AB, like myself, use firstname at mysql dot com.

Marten

P.S. Generally I am somewhat perplexed by the attention this topic is getting. The beauty of open source is that you can be actively contributing and participating in your favourite project whether you are employed by a certain company or not. So what's the big deal about David choosing not to be employed? He is not abandoning MySQL. With the enormous payout from the acquisition, the founders can now allow themselves to pursue whatever interests and daily routines they like. Good for them, and I think we should all just be happy that open source can provide not just software freedom but also financial freedom. Just my 2c.

Generally I am somewhat perplexed by the attention this topic is getting. The beauty of open source is that you can be actively contributing and participating in your favourite project whether you are employed by a certain company or not. So what's the big deal about David choosing not to be employed? He is not abandoning MySQL. (..) Just my 2c.

Your cents are worth it:-)

Who has contributed or donated to the MySQL project while actively using MySQL in a production environment?

Lots of press about a not too large event. I have been working less with MySQL over the past several years (as the company has grown). And when we got acquired we got too big for me (I like to know everyone in a company).

A huge part of my work has been spreading FreeSoftware/OpenSource and I will continue to do that, and to tell the MySQL story many times more, hoping to inspire others to try to start FLOSS businesses.

And I hope to meet many of the people who made MySQL such a success many times over the coming years.

/David (who posts so seldom he does not remember his slash login/password..)

What's the big deal? One developer leaves the MySQL team, after MySQL has been bought by Sun. Ok. The two may or may not be related. And it may or may not indicate that something is making the developers unhappy.

MySQL has hundreds of bugs open against the upcoming release. Ok. Is that a lot? It does sound like it. On the other hand, it's hard to say what this means for quality. It means all these bugs have been _found_, which is good. Now they just need to be fixed.

including mine. unbelievable amounts of client sites use mysql, all web development clients use mysql, there is a HUGE market for php/mysql out there (bigger than anything similar i assure you), hell, even a goodly percentage of web runs on mysql....

rest assured if anything happens to mysql, i'll come there with a thick stick in my hand.

Yeah, I had to snigger at that. The project I'm responsible for has a database that's gotten up to tens of gigabytes in size. MySQL was chosen before I came along, and knowing what I know now, I'd definitely consider alternatives, but for the most part, it serves our purposes.

I have about 2.5 TB in MySQL right now, it does alright, but it's also not critical data. While MySQL improved technically quite a bit in the last couple of years, their overriding philosophy seems to have always been "speed over data integrity", which makes it a hard choice to make (and it only has that performance advantage in a relatively narrow set of circumstances, at that). So it's usually relegated to the role of "granular file system for non-critical data".

there's no need to start dicksizing about the type of databases you manage. no one is claiming that MySQL is the best database management system out there, or that it can handle any kind of application. but for a certain range of applications it's a very capable and well designed database server.

not everyone needs a multi-terabyte database. and the utility of a RDBMS is not defined by database sizes it can handle. MySQL is so popular precisely because most sub-enterprise businesses don't need anything as robust as Oracle. so MySQL is therefore a much more cost-effective solution.

"and the utility of a RDBMS is not defined by database sizes it can handle"

Actually there is some relevance.

If you needed a database gigabytes in size a few _years_ ago, MySQL would have been a really bad choice (it still is crap, just less so IMO).

For MyISAM: You would have to configure it to get tables bigger than the default 4GB limit (there's a row-count limit and a table-size limit). Hope you don't make the new setting too small so you're still working in the place when those run out too ;).

For InnoDB: Before the single file per table, if you're moving about gigabytes of stuff, you end up with one huge multigigabyte InnoDB table.

For both: Adding an index was the same as "alter table" and involved making a copy of the table.

So let's say you have a 40GB table and 40GB of space free. No index add for you :). Keep in mind that even if you have plenty of space free, making a copy of a 40GB table does take time.
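The arithmetic behind those two complaints can be sketched in a few lines of Python. The model is an assumption, not something stated in the comment: it treats the old 4GB default as the reach of a 4-byte row pointer, and treats an index add as needing room for a full copy of the table plus the new index. The function names and the 2 GiB index size are hypothetical.

```python
GIB = 1024 ** 3


def max_data_file_bytes(pointer_bytes):
    """Bytes addressable by a row pointer of the given width (assumed model)."""
    return 256 ** pointer_bytes


def copy_fits(table_bytes, free_bytes, new_index_bytes=0):
    """True if a full copy of the table plus the new index fits in free space."""
    return table_bytes + new_index_bytes <= free_bytes


# A 4-byte pointer reaches 4 GiB; a 6-byte pointer reaches 256 TiB.
limit_4_byte_gib = max_data_file_bytes(4) // GIB
limit_6_byte_gib = max_data_file_bytes(6) // GIB

# 40GB table, 40GB free: a bare copy just fits, but not once the new
# index (2 GiB here, a made-up figure) is added on top.
bare_copy_ok = copy_fits(40 * GIB, 40 * GIB)
copy_with_index_ok = copy_fits(40 * GIB, 40 * GIB, new_index_bytes=2 * GIB)

print(limit_4_byte_gib, limit_6_byte_gib, bare_copy_ok, copy_with_index_ok)
```

In MySQL the MyISAM limit is raised per table with the MAX_ROWS and AVG_ROW_LENGTH table options, which is the "new setting" the comment warns about choosing too small.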

BTW concurrent inserts to an innodb table with an auto increment field were slow till only recently (well allegedly they've fixed that).

For cost, for robustness, for functionality, MySQL is a far poorer choice than PostgreSQL.

I've used lots and lots of databases, relational and otherwise - MSSQL, Oracle, DB2, Informix, Unidata, etc. etc.. MySQL looks great to people who haven't got much experience with other databases, and it looks like a chunk of shit to those of us who have. I'm not even talking about database size. I'm talking about functionality level stuff - views, useful subselects, a single reliable table type that supports transactional data writing (and for that matter, a transactional layer that isn't shitty). Features that are always coming in a future version, but are already available in other products - ones that can be had for free.

There's no compelling business case for MySQL over another product, except that you might need to make use of a crappy open source project that's tied to it.

LiveJournal couldn't deal with the load balancing and disk latency issues with MyISAM just flat-out _not_ scaling. Hence, their need for the creation of memcached. Of the others listed, who else [wikipedia.org] is using memcached?

I have used MySQL for nearly 7 years now.... 30 databases... many servers and operating systems from MS to Linux.... as small as 200k to one as large as 900MB.....I have never had a single issue with any of them in all that time, ever.

Sounds like somebody got a program working right and, instead of tweaking it some more and breaking it again, quit.

After decades of information technology it's ABOUT TIME that happened.

Really? How about the "bad connection" issue where the database server due to no reason obvious to the developer will count to ten and then just refuse new connections? How about when MySQL trips over itself and locks its own tempfile? How about the admin gui that pretends to let you change parameters but really doesn't? How about MySQL's abysmal speed once it has to deal with larger tables? How about introducing new keywords that are common words like 'release' and thus making a DB upgrade much more painful than it needs to be? Overall I like MySQL, grew up with it even, but there is no use in pretending like there aren't any problems...

I am perplexed by MySQL. I installed it on a server 9 months ago for a new project I was tinkering with. Paying jobs kept me from doing much more than getting MySQL up and running. Last month I noticed my server was straining under a heavy load. I figured I had finally pushed the server beyond its limits doing multi-site hosting with Apache and Postgres, multi-site mail serving with Postfix and Courier POP/IMAP [backended into Postgres], running video security software, and being a general collect all p

PostgreSQL does not support any of these; they are all add-ons. On top of that, none of them are viable for critical environments: some work by replicating through triggers, some work as a middle layer, none of them can guarantee your data in case of primary failure, and none of them has proper sub-second failover (except for Sequoia, which doesn't support triggers and procedures). Trust me, I've been researching this extensively and there are no FOSS databases that handle this.

"How about the "bad connection" issue where the database server due to no reason obvious to the developer will count to ten and then just refuse new connections? How about when MySQL trips over itself and locks its own tempfile? How about the admin gui that pretends to let you change parameters but really doesn't?"

I've developed, debugged, and administered MySQL databases for nearly a decade now, and I have never seen any of those issues you complain about.

"How about MySQL's abysmal speed once it has to deal with larger tables?"

The InnoDB storage engine uses clustered indexes and is actually pretty good with large tables. Combine that with the partitioned table support in MySQL 5.1 and large tables are quite manageable. I have one OLTP application with well over 300M rows, and the server runs fine even though it is on commodity hardware.

"but there is no use in pretending like there aren't any problems..."

Indeed, but they weren't what you mentioned here. I am looking for better CPU utilization on multicore systems, semi-synchronous replication, parallelized replication, better foreign key performance, and better join algorithms. Many of these features are planned of course but I want them now.

I have! MyISAM tables get corrupted by normal MythTV use on x86_64, which causes mysqld to crash. Pretty annoying to live with, until you realize you can change the engine to InnoDB and it seems to work.

That's the problem with MyISAM. It's only useful if you're not worried about losing data. Any breakage, whether it's a server crash, running out of disk space or the wrong phase of the moon, will totally lunch your DB. InnoDB on the other hand is a storage engine actually meant for real projects. That said, MySQL definitely has its limitations, but within them it's pretty good.

We store about 3 billion rows using compressed MyISAM tables. I dread to think what that'd be like using a transactional table type; MVCC generally bloats disk usage 2-4x, and compression is only supported in the very latest InnoDB.

Sure, it'd be nice to have everything in one table type, but at least some of these tradeoffs seem quite fundamental.

I'm switching to PGSQL also. I knew when Sun bought them it was the beginning of the end. The community is just not there any more. I would like to see MySQL survive, but they are so far behind when it comes to SP programming and such. Postgres was designed to be programmed, whereas MySQL was designed to be small and fast for little websites. I have multiple MySQL boxes with 5GB+ InnoDB tables and while it works, it does not make me comfortable..... I have a few pgsql out there but there's a lot of mi

Back when I was a student, one of the modules I had to take was databases. The coursework for this was meant to be done in MS Access (yes, depressing) but could be done on any other database supporting SQL, since we were not meant to use the GUI. I first tried doing it on MySQL. I got as far as question 2, which required foreign keys. Back then, MySQL wouldn't even parse these (SQLite now does, but ignores them, although you can implement them yourself using triggers). Considering such a basic feature