In this session we will look at different tuning aspects of MySQL Cluster.

As well as going through performance tuning basics in MySQL Cluster, we will look closely at the new parameters and status variables of MySQL Cluster 7.2 to diagnose issues with, e.g., disk data performance and query (join) performance.

This was the last session I attended, and it is always a great pleasure for me to be at Johan's presentations, for many reasons:

- he is probably the best MySQL Cluster expert (in service delivery)

- he has a depth of detail and insight that no one else has

MySQL/Oracle recently released a new version of MySQL Cluster, and I had the opportunity to test it a little, though not as deeply as I would have liked.

But one of the aspects I was looking into was how the "condition pushdown" mechanism was modified.

Johan confirmed by tests (empirical evidence is always more trustworthy than commercial announcements) the way it works now: NDB Cluster returns the full result set to MySQL, not a subset requiring additional requests, so there are fewer round trips, and the relations by ID are resolved locally at the data node level.

It is also interesting that data can be partitioned by key, associating it with a specific data node; that increases locality of reference, even if it does not implement real partition pruning.

The only negative note is that we still have the annoying problem with the MySQL SQL node connection pool: each connection in the pool still takes one slot out of the 253 available. This means that if I have 4 MySQL SQL nodes, each allocating 10 connections in the pool, I will take 40 connections out of the 253, instead of 4.

This may not sound like a problem, but having done several MySQL Cluster implementations, I can say that when MySQL Cluster is used in earnest it needs a large number of data nodes and SQL nodes, resulting in a very high number of connection slots being used.
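To make the arithmetic concrete, here is a minimal sketch of the slot accounting. The node counts are hypothetical examples; the only given is that every pooled connection consumes one of the roughly 253 API slots:

```python
# Each connection in a SQL node's connection pool consumes one API node
# slot cluster-wide, not one slot per mysqld process.
TOTAL_SLOTS = 253

def slots_used(sql_nodes: int, pool_size: int) -> int:
    """Slots consumed when every SQL node opens `pool_size` cluster connections."""
    return sql_nodes * pool_size

used = slots_used(sql_nodes=4, pool_size=10)
print(used)                # 40 slots consumed, not 4
print(TOTAL_SLOTS - used)  # 213 slots left for data nodes, management, etc.
```

With a larger deployment, say 20 SQL nodes with a pool of 10, 200 of the 253 slots are gone before counting anything else, which is exactly the pressure described above.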

Slave lag is the bane of master/slave replication. This talk will explain why slave lag occurs and show you three important ways that Tungsten Replicator can banish it for MySQL slaves. Parallel apply uses multiple threads to execute slave transactions. Prefetch uses parallel threads to read ahead of the slave position and fetch pages that will be needed by the slave. Batching uses CSV files to load row updates in extremely large transactions that bypass SQL completely. We will explain each technique, show you how to use it, and provide performance numbers that illustrate the gain you can expect. We will round the talk out with a discussion of non-Tungsten tools that offer similar benefits. With these techniques in hand, you'll be well-prepared to attack any replication performance problem.

The talk, given by Robert Hodges with Stephane Giron, was as expected very interesting, and gave the audience good insight into how to implement the Replicator efficiently.

I also think that at the moment Continuent's Replicator is the only production-ready solution that can be used for:

- Parallel replication by schema (or combination of them)

- Multi-master, single-slave topologies

- In-process filtering and data processing

- Oracle to MySQL replication

- MySQL to Oracle replication.

I enjoyed it, and I have immediate plans to use the solution to solve current customer needs.

I am particularly interested in the FILTER option, and in seeing how helpful it can really become when talking about data processing.

Finally, a small note: some time ago I published a blog post describing a dream about replication, my dream, and Rob was the only one of the mega-experts in the field who honored me with an answer and a good explanation of why it could not work, at the current state of the art.

What interested me most during the second day was, again, synchronous replication and the replication solutions provided by Continuent.

The first talk I attended that day was the Galera one, given by Henrik and Alexey.

The presentation set out to cover:

"We will present results from benchmarking a MySQL Galera cluster under various workloads and also compare them to how other MySQL high-availability approaches perform. We will also go through the different ways you can setup Galera, some of its architectures are unique among MySQL clustering solutions.

* MySQL Galera

** Synchronous multi-master clustering, what does it mean?

** Load balancing and other options

** WAN replication

** How split brain is handled

** How split brain is handled in WAN replication

* How does it perform?

** In memory workload

** Scale-out for writes - how is it possible?

** Disk bound workload

** WAN replication

** Parallel slave threads

** Allowing slave to replicate (commit) out-of-order

"

I know how passionate Henrik is when talking about Galera, and I partially share the feeling. I say partially because I am still not fully convinced by the numbers, but I am working on that.

Anyhow, aside from the part related to benchmarking, I found the combination of building blocks and elements for the HA solution interesting.

Including a redundant load balancer and using the MySQL JDBC driver with Galera is a simple but efficient way to provide HA.

Also important: we can finally drop the DRBD solution, which has for too long been the only synchronous solution for MySQL. DRBD forced you to have one PRIMARY (read-write) node and one SECONDARY node that was completely useless.

I also appreciated Alexey's honesty about scaling.

Galera will not scale to infinity, as some fools claim, but it can handle a decent number of nodes.

The limit obviously has to be discovered and calibrated against the real load pushed at the nodes; it cannot be defined as an absolute, abstract number.

Also interesting is how the Galera team manages server synchronization by "quorum". In short, with 3 nodes, if one node cannot reach the other two but still accepts writes (split brain), at the moment of reunion the other two will take over by "quorum".

The obvious and immediate problem arises with 3 data centers, one with 6 nodes and the others with 2 nodes each. If the DC with 6 nodes gets disconnected, the valid data will be in the 2 remaining data centers, but at the moment of reunion the DC with 6 nodes will take the majority, and the valid data set will be declared invalid.

Alexey is working on a way to calculate the "weight" by proximity to fix this issue.
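To illustrate the failure mode and the weighted fix, here is a hypothetical sketch; the function, node names, and weight values are my own invention for illustration, not Galera's actual API or algorithm:

```python
def has_quorum(partition_nodes, all_nodes, weights=None):
    """True if this partition holds a strict majority of the total weight.
    With no weights, every node counts as 1 (plain node-count quorum)."""
    w = weights or {n: 1 for n in all_nodes}
    partition_weight = sum(w[n] for n in partition_nodes)
    total_weight = sum(w.values())
    return partition_weight * 2 > total_weight

# 3 data centers: DC1 has 6 nodes, DC2 and DC3 have 2 each.
dc1 = [f"dc1-{i}" for i in range(6)]
dc2 = ["dc2-0", "dc2-1"]
dc3 = ["dc3-0", "dc3-1"]
cluster = dc1 + dc2 + dc3

# Plain node count: the disconnected DC1 alone still wins (6 of 10 nodes),
# so its stale data set would override the surviving data centers.
print(has_quorum(dc1, cluster))            # True

# Down-weighting the big DC keeps the majority with the surviving DCs.
weights = {n: (1 if n in dc1 else 2) for n in cluster}
print(has_quorum(dc1, cluster, weights))   # False
```

The point of the "weight by proximity" idea is exactly this: make quorum a function of something smarter than raw node count, so a single oversized data center cannot outvote the rest of the cluster after a split.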

Honestly, I am not sure that Galera is production-ready, but it is surely the most interesting and easiest solution for simple write scaling.

Keynotes

It was simply amazing for me, as an ex-MySQL AB employee, to be at the conference today. I was really moved seeing so many people, most of whom I know, all together again. The spirit was again the right one, with the will to say WE ARE HERE! Impressive, and I am happy to say once more, "I was there". I am not going to comment on the keynote speeches, but I do want to share Baron's message: we are here to share, and to help each other get better, to help each other go beyond our current limits. The spirit was the right one, and the people there are probably the smartest in the field, so why not? I have only one word: AMAZING.

Topic: Measuring Scalability and Performance With TCP

What if you had all the data you needed to measure system performance and scalability at any tier, discover performance and stability problems before they happen, and plan for capacity and performance by modeling the system's behavior at greater load than you currently have? Now it is as easy as running tcpdump and processing the result with a tool. In this two-part talk you will first learn how to do black-box performance analysis to discover hidden problems in your systems. In the second part you will learn about mathematical performance and scalability models, how the inputs can be computed from TCP packet headers, and how to derive and interpret the results with free tools from Percona Toolkit.

Speaker: Baron

Comment:

Good talk, as we have come to expect from Baron. Although the topic was touched on at the Percona Live event, Baron had reviewed and refined the slides, which are now much clearer. The proposed method for quick performance analysis using a TCP dump is simple and efficient. Honestly, we already use it, but Baron adds the scientific notation that makes an empirical measurement more objective, especially with regard to immediate issue identification and the concurrency calculation. On this specific topic I still need to digest and elaborate, in particular the formula for the concurrency.
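As a reminder of the underlying idea (this is the standard Little's Law relation one can apply to packet timestamps, not necessarily Baron's exact formulation), average concurrency over an observation window is the total time spent servicing requests divided by the length of the window:

```python
def concurrency(observations):
    """Average concurrency from (arrival_time, response_time) pairs, as
    could be extracted from TCP packet timestamps in a tcpdump capture.
    N = total busy time / observation window, i.e. Little's Law N = X * R."""
    window = max(a + r for a, r in observations) - min(a for a, _ in observations)
    busy_time = sum(r for _, r in observations)
    return busy_time / window

# Three 1-second queries, partially overlapping, over a 2-second capture.
reqs = [(0.0, 1.0), (0.5, 1.0), (1.0, 1.0)]
print(round(concurrency(reqs), 2))  # 1.5 -> on average 1.5 queries in flight
```

This is the kind of number that drops out of a capture with no server instrumentation at all, which is what makes the black-box approach attractive.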

GOOD reading: Neil J. Gunther's book, Guerrilla Capacity Planning

Topic: Hibernate and Connector/J Tuning

Many Java developers using MySQL as a data backend rely on Hibernate to bridge their OO designs with the relational database world. This talk will review Hibernate and some of its related projects, with a focus on performance. We will also cover performance-related considerations about Connector/J, discussing settings and usage scenarios that will be useful even for Java developers not using Hibernate.

Comment:

I attended this talk hoping for something more, and something less. More focus on the Hibernate problems we find every day, because customers have no idea how to use Hibernate; less because it went into too much detail on a few SELECTs and was too fast in describing the solutions. Anyhow, given my long background in programming, I was not really enlightened by the information, but I was able to follow the flow of it, which is based on good sense in using the standard features and definitions in Hibernate, regarding lazy loading of collections and the way SELECT … JOIN(s) need to be done. Finally, a good review of what the MySQL JDBC driver can really do, which is not widely known, given that most users just…

Tired of the intricacies of circular replication? Dreaming of real multiple-master solutions for MySQL replication? Dream no more. Tungsten Replicator, a free and open source replacement for MySQL replication, can build clusters of asynchronous nodes in a matter of minutes. This workshop will explain the basics of Tungsten Replicator, and it will show how to start your multiple-master cluster in a few minutes. There will be examples of several topologies: from the simplest bi-directional replication to the ambitious all-to-all (every node is a master), fan-in (multiple masters to a single slave), and the star (a central hub connected to several bi-directional masters).

Comment:

MySQL 5.6 is going to be GA soon, probably at the end of September; by then most products that use customized replication solutions will be obsolete, but not the Replicator. Continuent has developed a good solution for the multi-master and multi-master single-slave scenarios that will remain valid over time. The Replicator also offers MySQL -> Oracle replication, and Oracle -> MySQL replication. It is going to be the perfect solution for many customers that need a scalable replication solution, and/or a relationship with Oracle databases. Parallel replication is, and will remain, by schema; also, there is no real mechanism to guarantee data integrity between masters and slaves, given that the checksum is calculated on the statement and not on the data. Installation has been facilitated a lot by the Replicator installer. Last but not least, the product already supports "FILTERS" developed in Java or JavaScript; this allows data transformation at the replication level, which is a very important factor. I was already discussing how this solution could solve several issues for some of our customers.

Topic: MySQL Optimizer Standoff, MySQL 5.6 and MariaDB 5.3

Both MySQL 5.6 and MariaDB 5.3 introduced advanced, game-changing optimizer features. In this presentation we will look in detail at and compare these changes, as well as perform benchmarks to show which version is able to handle complex queries better. If you're working with applications using complex queries with MySQL, this presentation is for you.

Comment:

This talk was somehow a little bit strange: on one side, Peter presenting his results on the evolution of the optimizer; on the other, a few developers from the MariaDB team discussing most of the results.

The most important point is, and remains, that the MySQL optimizer, one of the most important elements of the MySQL platform, is finally being revisited on both sides, MariaDB and Oracle MySQL. The optimizer had been fully revisited for inclusion in MySQL 6.0; that version was never released, and as a consequence all the improvements done on the optimizer were forgotten and left out of the release deliveries.

The optimizer is the core of any database platform: it decides how to physically access the data, reading the SQL statement and translating it into an action plan against indexes and table reads. The optimizer goes for the lowest cost, not the shortest execution time, and to do so it uses statistics; if a DBA does not collect accurate statistics, the optimizer will not be able to identify an efficient action plan. It is a fact that changes to the optimizer are always scary, given that they can turn successful SQL statements into very bad ones.

The MySQL optimizer still has serious limitations, like not using prepared statements to execute queries, such that every query will invoke the optimizer.

Thanks to the many improvements done on both sides, MariaDB and MySQL 5.6 are very much more efficient than 5.5 (on the order of 900 sec on 5.5 versus 180 sec on 5.6 and MariaDB). Moreover, in many cases MariaDB is more efficient than MySQL 5.6.

Speaker: Peter Zaitsev

This is my 2nd year at Pythian, and my first in Santa Clara as part of the Pythian company, but I will not be alone.

This year Pythian will have a good number of MySQL members. We will wait for you at Pedro's restaurant; if you are not registered yet, please do so NOW!!! Register yourself and join us.

The official announcement from the company:

"Pythian is organizing an event that by now may be considered a tradition: the MySQL community dinner at Pedro's! This dinner is open to all MySQL community members, as many of you will be in town for the MySQL Conference that week."