Vote NO for MySQL

Vote NO for MySQL - Election 2012: NoSQL. Researchers predict a dark future for MySQL. Significant market loss to come. Are things that bad, is MySQL falling behind? A look at NoSQL, an attempt to
...

Vote NO for MySQL - Election 2012: NoSQL. Researchers predict a dark future for MySQL. Significant market loss to come. Are things that bad, is MySQL falling behind? A look at NoSQL, an attempt to identify different kinds of NoSQL stores, their goals and how they compare to MySQL 5.6. Focus: Key Value Stores and Document Stores. MySQL versus NoSQL means looking behind the scenes, taking a step back and looking at the building blocks.

Transcript

1.
Ulf Wendel, MySQL/OracleVOTE NO for MySQL Election 2012: NoSQL

2.
The speaker says...451 Research predicts* MySQL usage of 80% in 2012, and: • Loss of 25% within five years • Down to 55% in 2017 • 30%** looked into or use already NoSQLHmm, I think, I am looking for a new job. Are you looking forUlf WendelTM? I am for sale!* http://de.slideshare.net/mattaslett/mysql-vs-nosql-and-newsql-survey-results-13073043** MySQL users of the 451 Research sample

6.
The speaker says...Time for voting: • Which systems are you using? • Since when?Please note, LevelDB itself is a Key Value Store but Riakembeds it. Riak supports many storage engines and, thus itcan be bended in many directions.

7.
A new era of databases Scaleable Elastic Explore the benefits Highly Available Easy To Use

8.
The speaker says...Not Only SQL databases aim to be scalable. From one nodeto one tousand nodes in a bit. And, back depending on bothquery load and data size! Sharding built-in! Chewing gumsof the cloud area!Master down? No problem. Some dont do lame primarycopy. Paxos and others ensure the database clustersurvives the failure of nodes – including primaries, if any.Hot on conferences: JavaScript, HTTP, JSON you name it -ingredients of todays web applications. No room for MySQL?

12.
The speaker says...How about this claim: document databases such asCouchDB and MongoDB have their roots in the 70th . Thehierarchical data model is older than the relational datamodel.Both databases manage documents in forests of trees. Theprimary operation is a simple scan on the tree. Additionallythe user can embed links to reference between trees.Relations are not an integral part of the data model.

15.
The speaker says...One or the other way, whether I have to search for a new jobor not, I need to take them serious. People use NoSQL!There must be something smart...For example, CouchDB: • Speaks HTTP – iPhone browser can connect directly • Returns JSON – iPhone browser can display it • Server-side JavaScript – app server built-in • Replication with conflict detection – sync mobile device

17.
The speaker says...We identified four flavours of Not Only SQL databases.Which ones are relevant to me when searching for a newemployer?Key-Value-Stores are popular. Document Databases arebeing considered for web applications. Click or tap to zoom!Graph databases and Big Data are beyond the scope of thepresentation: specialist tools for special purposes. Swipe!

19.
The speaker says...A Key Value Store strikes for its simple data model which isthat of an associative array/hash. The data model ispoor at ad-hoc queries: loose the key and you lock your datain the treasure. But, it is fast. A need for speed has led tomany in-memory solutions in this class. A perfect model foruse as a cache. If used as a cache, persistence is oftensecondary. Generally speaking the data model is perfect forpartitioning/sharding. There are no operations coveringmultiple values, thus values can be distributed on multiplenodes to scale the system.Most operations are basic (CRUD). Redis stands out withcomplex data types and correspondig commands.

21.
The speaker says...Both Redis and Memcache are proven and popularsolutions. However, due to their late birth some featureshave limitations.For example, lack of multi-core CPU support. Whencomparing an operation on MySQL 5.0 and MySQL 5.6using only one concurrent client, dont be disappointed if ithas not become much faster. Focus is on scaling in the areaof 30...40 CPU threads.For example, persistence. Worst case recovery: WAL haschanges for the duration of the snapshot frequence. Replaytakes minutes, albeit no transactions to roll back...?Maybe we can bend MySQL?

23.
The speaker says...Lets make MySQL a happy dieter, or cut off some fat atleast! Then, support a proven, lightweight protocol with manyexisting language bindings for super fast access.Lets give MySQL users direct access to the inner workingsof MySQL. Inside MySQL we find a stable and CPU-efficientB-tree based storage called InnoDB. If needed, InnoDBautomatically builds a hash index behind the scenes – eversince. In-memory performance is great because it has to be.Estimated 20% use 64 – 256 GB RAM with MySQL.Persistence is a given on top of it. However, SQL parsingand interpretation is slow.

25.
The speaker says...Ever since MySQL has separated the SQL layer from thestorage layer through the so-called internal Handler API,however the feature has not been exposed to the user formany years. Popular storage engines include InnoDB,Memory, MyISAM and NDB (MySQL Cluster). Since MySQL5.1 the storage engines can be implemented as a serverplugin. The Memcache plugin is another kind of serverplugin. It integrates memcached with MySQL.The plugin can store data both in InnoDB and the defaultmemcached way. As a result users can now access InnoDBtables using either SQL or Memcache protocol.

27.
The speaker says...MySQL Cluster has reached 1.17 billion writes transactionsper minute on a 30 nodes setup in mid 2012. That is some3.9 million writes per second. You can have upto 48 nodesin a cluster. Clusters can be replicated over wide areanetworks, for example, to run them in different data centerson different locations. Just in case you worry about networklatency...You can access MySQL Cluster through a variety ofinterfaces. Among them are MySQL Server SQL nodes(ODBC, JDBC, .NET, ...), ClusterJ (JNI), LDAP, HTTP/REST(Apache mod-ndb) and Memcached. All of them internallyuse the NDB C++ API.

29.
The speaker says...Memcached support is one of the latest additions to the listof MySQL Cluster interface. Since MySQL Cluster 7.2 it ispossible to use MySQL Cluster as a storage engine for aMemcache server. This is quite similar to using MySQLCluster/NDB as a storage engine for MySQL. In both casesthe „frontends“ wrap the main MySQL Cluster API which iscalled NDB API. Both frontends inherit all MySQL Clustergoodies.You can choose whether to run the Memcached, the MySQLCluster data nodes and the application on one machine (lowlatency) or on different ones (fail safety). Note, that with theInnoDB Memcache plugin you have Memcached andMySQL running in the same process.

31.
The speaker says...The InnoDB Memcache Plugin is certainly a step forward.MySQL is putting pressure on itself to modularize the serverallowing users to slim MySQL, to strip off features notneeded to get a certain job done.Users get more choices. If you want to combine a fast andlean client protocol with simple and fast access operationsbut cannot accept compromises on persistence orscalability, here you go.Cluster has been a speed monster ever since...

33.
The speaker says...PECL/mysqlnd_memcache is another free and open sourceplugin for the PHP mysqlnd library. Mysqlnd is the compiletime default C library used for all PHP MySQL APIs (mysqli,PDO_MySQL and mysql).Like other plugins it adds new features to all the APIs. Basedon a configurable regular expression the plugin turns a SQLaccess into a Memcache access. Due to the lightweightprotocol and direct access the Memcache access to MySQLis faster. No matter what protocol used by the library, theuser gets a standard result set in return. Simple to use.However, note that no meta data is available if a key accesshas been performed.

34.
Ulfs take...Not a bad attempt at all... Go try! Go ask for more!A significant number of MySQL users is using Memcached • Deploy only one data store instead of two • Dual interface: can we skip a caching layer in our apps?A good first step, but looking for more • Persistence for Memcache - more of a topic for Redis? • No issues with warm-up or stampeding/slamming • KVS is about performance, where is the proof @ 5.6...? • Data model is about distribution/sharding, MySQL Cluster only?

35.
The speaker says...Inside MySQL is some fine and stable technology... Onceinternals are exposed in a user friendly manner MySQL willbe kicking.

37.
The speaker says...Document Databases use a data model that seems asappealing as that of a Key Value Store. Think of a Key ValueStore that holds nothing but arbitrary documents.Documents are schema-free thus, you can code withoutthinking first... Nested documents are great for storingobjects of a programming language.Take CouchDB/MongoDB. All the ingredients of a modernweb database are there! MongoDB: Sharding, automatic HAfailover. CouchDB: Lazy primary copy, conflict detection,consistent hashing for sharding/clustering, ACIDtransactions.

39.
The speaker says...Search deserves a dedicated slide, more below. On paperthe high availability approach of both MongoDB andCouchDB looks good. Like with Key Value Stores their datamodel is great for sharding. This is a write-scale outapproach they share with MySQL Cluster. Scalability –young systems, again. Nobody wants to waste disk space,manually compact files, limited to a single file on disk or relyon operating system cache managers to suit databaseneeds. Over the years MySQL got disk-efficient to makereads fast and learned partitioning to fine tune datadistribution on disk arrays. Mongo M/R jobs on multi-coresingle-threaded: https://jira.mongodb.org/browse/SERVER-4258 ?

41.
The speaker says...The standardized query language (SQL) is a rich querylanguage, the MongoDB query language is not. Please see,http://blog.ulf-wendel.de/2012/searching-data-in-notonlymysql-databases-a-rich-query-language/ for details.Imagine you ever want to swap the database system. UsingMongoDB syntax means 100% vendor lock-in and significantefforts porting the application. As a developer, once youlearned SQL you are qualified for any job using any RDBMS.Same true for NoSQL...? Map&Reduce? Great batchprocessing for distributed systems! But, you write it. Youdefine the data access path. SQL: you say what, DB takescare of finding best possible physical access path.

43.
The speaker says...The relational data model may be an academic one. Thismay be annoying at times. But, do not forget about its basicsuch as relations (qualified: 1:n, n:m) and the normal form(data duplication, update anomalies, …). Also recall thegoodies SQL has to offer: blue-prints, types, validation ofdata, validation of relations. As a PHP MySQL guy I vividlyremember STRICT_MODE and see PHP users asking fortypes to prevent errors...Data duplication has a serious side effect: disk spacerequirements. You need more disks ($) and you get slowerdata access because more data has to be read.What if MySQL schema changes would be cheap...?

45.
The speaker says...Online DDL speaks for itself. The improved ALTER TABLEgives you higher concurrency, lower disk and CPU usageand less purging of the buffer pool all resulting inperformance dips in the past. Please note, the overallruntime of ALTER table can increase. Recording concurrentDML changes and applying them at the end of an onlineDDL may reduce raw performance. However, concurrency isbetter.As a side-effect loading huge dumps has become faster.You can now create the table, load the data and addsecondary indexes later. Clients can start to access the tablewhile the secondary index is still being created.

47.
The speaker says...A new LOCK clause of the ALTER TABLE statement allowsyou to block DDL operations that reduce concurrency.Setting LOCK=NONE permits the execution of any ALTERTABLE statement that either blocks read or write operations.LOCK=SHARED blocks an ALTER TABLE statement whichprevents concurrent read operations.In sum: MySQL is not schema fee but has become morefriendly towards schema changes if need be.Next topics: High Availability and Replication

49.
The speaker says...A global transaction identifier is a cluster-wide uniquetransaction identifier. MySQL 5.6 can generate themautomatically and for older versions of MySQL you can usea client-side emulation provided by, for example, MySQLProxy and PECL/mysqlnd_ms.A GTID helps with failover in case of a master/primaryoutage. The most up-to-date slave/secondary should become the new master. Which one is the most current slavecan be checked comparing GTIDs. Unfortunatly, this is noautomatic failover. For automatic failover useDRBD/Pacemaker or other 3rd party tools. Dont miss thecommandline mysqlfailover utility of MySQL Workbench!

51.
The speaker says...MySQL 5.6 makes replication faster. Binary log writes canbe grouped together to significantly improve theperformance of the replication master. Less writes, lesswaits for the disks – in tests we saw improvements between2x – 4x.Slaves may catch up faster to the master in 5.6 as well.Transactions from different databases can be applied inparallel. The slave SQL thread acts as a coordinator for upto 1024 workers. Note, that transactions which do notoverlap, may be recorded in a different order in the slavesbinary log than in the master log.

53.
The speaker says...By the way, we have released yet another free and opensource PHP mysqlnd plugin to make using clusters easier.Of course, I would love to see it used by everybody,because I believe in my own work. However, I do not expectthis to happen as cluster support is something most of youhave solved, one or the other way, years ago. If so, you stillmay want to read the documentation. You may getinspiration for improving your current solution.For example, how about the idea of quality-of-service ortransparent cache support, if eventual consistency is goodenough?

57.
Ulfs take...Young contenders, beat MySQL on a single machine...Accept new data models • Dont get emotional on the ER model: people love nested data!Push on (elastic) scalability and high availability • Where is built-in automatic failover not just command line? • Where is built-in write scaleout with MySQL Replication?Open up for innovation • Developing server plugins must become easier • Adding HTTP and other interfaces must become easier

58.
The speaker says...The document data model must look as if it has been madefor certain applications – so easy to dump arbitrary data thatrarely changes! This comes at a price (normalization,storage requirements, validation, ...).Limiting the discussion to a single machine (transactions,persistence, CPU scalability, memory efficiency, disk layout,query language, ...) MySQL is competitive. It takes manyyears of development fine tuning basic algorithms.But, NoSQL is also about massively distributed systems.

59.
What can MySQL users do?Drive MySQL 5.6 to its limit before you worry...Ask yourself what you give up in case of a switch • NoSQL is not SQL vs. something – it is far more!Use MySQL and the ideas of the new generation • Fast and lightweight client protocols • Create independent data units for scaling over many machines • Prebuild aggregates • Batch processing in addition to ad-hoc querying

60.
The speaker says...No big surprises here: ideas from the 80th, or has it been the70th? Who cares... Some things in NoSQL are new, othersare old favourites.Try to break things down to the basic concepts. That is hardin a world of temptations with each vendor focussing onpromoting his strengths. However, it really helps to take astep back and search for the basic concepts.Maybe, you have a senior in your company that can helpyou...