Tag Archives: mysql

London, United Kingdom – 31 March 2014 – The MariaDB Foundation, an independent body which promotes the popular open source database MariaDB, today announced the much-anticipated general availability of MariaDB 10, providing today’s generation of application developers with enhanced performance and functionality.

Since its launch in 2009, MariaDB has built an active and vibrant open source community and has led the way in database innovation. In 2013, Wikimedia Foundation, the people behind Wikipedia, announced the migration of many of its production MySQL systems over to MariaDB and Google confirmed it aims to move its thousand-plus MySQL instances over to MariaDB technology.

Engineered by some of the world’s leading database experts, including MySQL founder, Michael ‘Monty’ Widenius, MariaDB has been enthusiastically adopted by a fast-growing eco-system including influential Linux distributions Red Hat, Fedora, Suse and Debian. MariaDB 10 includes numerous innovations developed with and for web-scale players like Google, Fusion-IO and Taobao such as:

Replication

MariaDB 10 sets a new standard in performance. It is many times faster than previous generations of MariaDB and especially legacy database MySQL thanks to new features including parallel replication and a further advanced group commit. Also, the replication slaves are now crash-safe.

It’s also now possible to replicate data from multiple master servers giving a complete view of the distributed data across big datasets for real-time analytical purposes via the multi-source replication feature.

NoSQL Capabilities

The CONNECT engine enables dynamic access to diverse data sources dynamically, including unstructured files such as log files in a folder, or any ODBC database, from within MariaDB 10. Great for ETL (Extraction, Transformation and Load) and Real-Time analysis.

Dynamic Columns store disparate labelled data objects in each row of a table in much the same way as NoSQL technologies.

MariaDB 10 includes built-in sharding in the form of the SPIDER engine, allowing big database tables to be split across multiple servers, for performance and scale. MariaDB combines sharding with the new replication features to provide true high availability.

Michael ‘Monty’ Widenius, CTO MariaDB Foundation said: “MariaDB has led the way as a cutting-edge, open source database innovation. MariaDB 10 continues MariaDB’s reputation for delivering new capabilities ahead of the curve, building advanced features to address the needs of the most demanding users.

“We’d like to thank the tireless support of our principal contributors SkySQL and community members like Google and Fusion-IO whose continuous feedback is essential in driving innovation in MariaDB. We are also very thankful for all community developers that have contributed to the success of MariaDB.” concludes Widenius.

“We are delighted to see our collaboration with the MariaDB community come to fruition in the MariaDB 10.0 solution,” said Madhu Matta, Fusion-io Vice President of Hyperscale. “Databases continue to showcase how flash technologies like Fusion ioMemory drive application acceleration and solve critical customer problems. The unique combination of MariaDB 10 innovations and our flash aware interfaces like NVM Compression and Atomic Writes will help drive the next chapter of database technologies.”

Patrik Sallner, CEO SkySQL said: “MariaDB 10 is a landmark open source innovation and an evolution of the fundamental SQL technology which runs the world’s largest databases. Developers can now achieve state-of-the-art application performance at scale with a SQL database they know and trust. If you want fast and secure applications, MariaDB is the right choice.”

For application developers looking to build serious, revenue-generating applications using MariaDB, SkySQL will be soon releasing its next generation of MariaDB Enterprise that provides all the tools and professional expertise needed to run MariaDB in large, complex production environments.

About MariaDB Foundation
The MariaDB Foundation promotes, protects, and advances the MariaDB codebase, community, and ecosystem. It’s members are companies like SkySQL (who also employ many core MariaDB code contributors), Booking.com, Parallels and Automattic. In addition it receives sponsor support from a range of vendor and user companies. Visit www.mariadb.org to learn more.

On Thursday MySQL technology saw a huge boost. It’s hard for anyone now to argue that MySQL isn’t in the game of extreme scalability and performance, which some NoSQL vendors have been using as a tagline for the last years. To see four of the largest MySQL and MariaDB users come together to bootstrap a branch of MySQL for extreme scaling needs is simply fantastic. The improvements done inside these companies will now be available to the rest of the community. In all fairness Facebook and Twitter, in particular, have been making their improvements publicly available also before. Google has also made some improvements available publicly over the years and have lately been active in the MariaDB project with code reviews, bug fixes and other patches. But broadening the public contributions further and combining it all, is new.

Engineering of MySQL technology happens in many places. Aside from Oracle and the companies behind WebScaleSQL, there are two other entities that have made significant contributions to improving MySQL technology in recent years – Percona and MariaDB. Percona has made many performance-related functionality and tooling improvements. The MariaDB project has made significant engineering efforts by bringing out many new features to MySQL technology and has also become the project for community contributions. Take a look at the list of bigger contributions that have made it into MariaDB 10.0.

MariaDB 5.5 is constantly being merged with MySQL 5.5 community edition. Every time there is a new minor release of MySQL 5.5 a new release of MariaDB 5.5 with exactly the same number comes out shortly afterwards. In MariaDB 10, this dependency is lighter, which the numbering also indicates. MariaDB 10 includes a lot of merged code from MySQL 5.6, but it also includes a big amount of MariaDB specific code and code merged from the wider community. WebScaleSQL will be another important source for merges in the future. Without knowing all the details of WebScaleSQL, it should be safe to say that there are two sorts of patches: the ones that improve MySQL technology in general and the ones that would be specific for the purposes of running MySQL at extreme scale with integration into technologies not commonly used in more normal setups of MySQL or MariaDB.

MariaDB is all about improving and keeping the world’s most installed database, MySQL, available to the masses independent of whether they are private persons with the need for a database for their blog or if the target is a mission critical enterprise application. MariaDB therefore needs to provide all the components needed from database drivers (connectors) to integrated high-availability solutions like MariaDB Galera Cluster.

In addition, the majority of the users and organizations using MariaDB or MySQL don’t have the in-house skills to build and make changes to MariaDB or MySQL. This is why MariaDB has to be supported on a wide variety of platforms and binaries provided for all of them. WebScaleSQL is currently only compatible with GNU/Linux x86_64 platforms and no binaries are produced.

WebScaleSQL confirms the power of community driven development in open source and is a very nice addition to the branches and forks of MySQL!

Background

Eventual consistency is a consistency model used in many large distributed databases which requires that all changes to a replicated piece of data eventually reach all affected replicas; conflict resolution is not handled and responsibility is pushed up to the application author in the event of conflicting updates [13].

Eventual consistency is a specific form of weak consistency; the storage system guarantees that if no new updates are made to the object, eventually all accesses will return the last updated value [14]. If no failures occur, the maximum size of the inconsistency window can be determined based on factors such as communication delays, the load on the system, and the number of replicas involved in the replication scheme [3].

A few examples of eventually consistent systems:

DNS

Asynchronous master/slave replication on an RDBMS e.g. MariaDB

memcached in front of MariaDB, caching reads

Eventual consistency states that in an updatable replicated database, eventually all copies of each data item converge to the same value. The origin of eventual consistency can be traced back to Thomas’ majority consensus algorithm [12]. The term was coined by Terry et al. [11] and later popularized by Amazon in their Dynamo system, which supported only eventual consistency [7].

The CAP theorem, also called Brewer’s theorem by its author Dr. Erik A. Brewer, was introduced at PODC 2000 [4, 5]. The theorem was formally shown by Gilbert and Lynch [8]. Brewer introduced consistency, availability and partition tolerance as three desired properties of any shared-data system and made the conjuncture that maximally two of them can be guaranteed at one time [6].

In general, this theorem perfectly matches the needs of today’s Internet systems. Ideally we expect a service to be available during the whole time period of network connection by which the service is connected to the network/Internet [1]. Thus if a network connection is available the service should be available as well [9,10]. To achieve good performance, requests need to be processed by a distributed system. If the number of servers are increased the probability of server failure or network failure is also increased. A system therefore needs to take this into account and be designed in such a way that these failures are transparent for the client and the impact of such failure is minimized [2]. The abbreviation of the theorem comes from these three properties:

Consistency: This property requires that each operation executed within a distributed system where data is spread among many servers ends with the same result as if executed on one server with all data.

Availability: This property of a distributed system requires that sending a request to any functional node should be enough for a requester to get the response. By complying with this property a system is tolerant to failure of any nodes caused, for instance, by network throughput issues.

Partition Tolerance: A distributed system consists of many servers interconnected by a network. A frequent requirement is distributing the system across more data centers to eliminate the failure of one of them. During network communication, failures are frequent. Hence, a system needs to be fail-proof against an arbitrary number of failed messages among servers. Temporary communication interruption among a server set must not cause the whole system to respond incorrectly [9].

Eventual consistency is defined as follows:

Definition 1: Eventual consistency.

Eventual delivery: An update executed at one node eventually executes at all nodes.

Termination: All update executions terminate.

Convergence: Nodes that have executed the same updates eventually reach an equivalent state (and stay).

Example 1: Consider a case where data item
R=0 is stored on all three nodes. Assume that we have the following sequence of writes and commits:
W(R=3) C W(R=5) C W(R=7) C on node0. Now reads from node1 could return
R=3 and reads from node2 could return
R=5 . This is eventually consistent as long as reads from all nodes eventually return the same value. Note that this final value could be
R=3. Eventual consistency does not restrict the order in which the writes must be executed.

MariaDB Demonstration

As already stated, normal master slave setup on MariaDB is eventually consistent. In this article we are interested in a situation where we have a multiple masters setup. We will use MariaDB 10.0. There are several possible topologies that could be considered here but we have selected a ring topology (see Figure 1).

Figure 1: MariaDB ring topology.

In this topology Node0 is master and Node1 is slave for Node0. Similarly, Node2 is slave for Node1. Let’s start configuration of the nodes with Node0:

1

2

3

4

5

6

7

8

9

10

11

12

[mysqld]

port = 4000

socket=/home/jan/mysql/mm/node0/mysql.4000.sock

datadir=/home/jan/mysql/mm/node0

basedir=/usr/local/mysql

log_error=/home/jan/mysql/mm/node0/error.log

binlog_format=ROW

log-bin

log-slave-updates=1

gtid-strict-mode=1

server_id=1

gtid-domain-id=1

Similarly Node1:

1

2

3

4

5

6

7

8

9

10

11

12

[mysqld]

port = 5000

socket=/home/jan/mysql/mm/node1/mysql.5000.sock

datadir=/home/jan/mysql/mm/node1

basedir=/usr/local/mysql

log_error=/home/jan/mysql/mm/node1/error.log

binlog_format=ROW

log-bin

log-slave-updates=1

gtid-strict-mode=1

server_id=2

gtid-domain-id=2

And finally Node3:

1

2

3

4

5

6

7

8

9

10

11

12

[mysqld]

port = 6000

socket=/home/jan/mysql/mm/node2/mysql.6000.sock

datadir=/home/jan/mysql/mm/node2

basedir=/usr/local/mysql

log_error=/home/jan/mysql/mm/node2/error.log

binlog_format=ROW

log-bin

log-slave-updates=1

gtid-strict-mode=1

server_id=3

gtid-domain-id=3

After this is done we can install the MariaDB databases and start the servers.

Now let’s create one table and add some data to it from the different nodes:

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

/usr/local/mysql/bin/mysql test-uroot\

-S/home/jan/mysql/mm/node0/mysql.4000.sock\

-e"create table a(a integer not null primary key) engine=innodb"

/usr/local/mysql/bin/mysql test-uroot\

-S/home/jan/mysql/mm/node0/mysql.4000.sock\

-e"insert into a values(1)"

/usr/local/mysql/bin/mysql test-uroot\

-S/home/jan/mysql/mm/node1/mysql.5000.sock\

-e"insert into a values(2)"

/usr/local/mysql/bin/mysql test-uroot\

-S/home/jan/mysql/mm/node2/mysql.6000.sock\

-e"insert into a values(3)"

After this all nodes are eventually consistent and return the same result set, for example:

1

2

3

4

5

6

7

8

9

MariaDB [test]> select * from a;

+---+

| a |

+---+

| 1 |

| 2 |

| 3 |

+---+

3 rows in set (0.00 sec)

From this we can conclude that MariaDB is eventually consistent also with a multiple masters setup when there are no conflicting operations done.

But what happens if there is a conflict? We can test this scenario by trying to insert a duplicate key to the table a. We try to insert a value 5 to both node0 and node2 so that the final commit commands are issued at about the same time.

1

2

3

4

5

6

7

8

MariaDB [test]> set autocommit = 0;

Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> insert into a values (5);

Query OK, 1 row affected (0.00 sec)

MariaDB [test]> commit;

Query OK, 0 rows affected (0.27 sec)

Because we used the InnoDB storage engine and
autocommit was off, there is no error message shown on both client connections at commit time. This is because MariaDB does not support deferred constraint checks and no error is possible in the following case:

You insert 5 on server at node0, it succeeds.

Before the insert is replicated to server at node2, you, on server node2, insert 5, that also is OK because this is asynchronous replication.

Then the second insert is replicated from node2 to node0, this causes a conflict due to the duplicate key value 5, the replication thread gets the error and rolls back.

Thus the result set is the following on all three nodes:

1

2

3

4

5

6

7

8

9

10

MariaDB [test]> select * from a;

+---+

| a |

+---+

| 1 |

| 2 |

| 3 |

| 5 |

+---+

4 rows in set (0.00 sec)

This is also eventually consistent because all servers return exactly the same value and they have executed exactly the same transactions. From the server log we can find out that:

problem, and restart the slave SQL thread with "SLAVE START". We stopped at log

'jan-GE70-0NC-0ND-bin.000001' position 897

And from, for example, node0 you can see it with:

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

MariaDB [test]> show slave status\G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 127.0.0.1

Master_User: root

Master_Port: 6000

Connect_Retry: 60

Master_Log_File: jan-GE70-0NC-0ND-bin.000001

Read_Master_Log_Pos: 1036

Relay_Log_File: jan-GE70-0NC-0ND-relay-bin.000002

Relay_Log_Pos: 824

Relay_Master_Log_File: jan-GE70-0NC-0ND-bin.000001

Slave_IO_Running: Yes

Slave_SQL_Running: No

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 1062

Last_Error: Could not execute Write_rows_v1 event on table

test.a; Duplicate entry '5' for key 'PRIMARY',

Error_code: 1062; handler error

HA_ERR_FOUND_DUPP_KEY; the event's master log

jan-GE70-0NC-0ND-bin.000001, end_log_pos 1009

Skip_Counter: 0

Exec_Master_Log_Pos: 897

Relay_Log_Space: 1271

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: NULL

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 1062

Last_SQL_Error: Could not execute Write_rows_v1 event on table

test.a; Duplicate entry '5' for key 'PRIMARY',

Error_code: 1062; handler error

HA_ERR_FOUND_DUPP_KEY; the event's master log

jan-GE70-0NC-0ND-bin.000001, end_log_pos 1009

Replicate_Ignore_Server_Ids:

Master_Server_Id: 3

Master_SSL_Crl:

Master_SSL_Crlpath:

Using_Gtid: No

Gtid_IO_Pos:

1 row in set (0.00 sec)

As seen from the logs, the problem is that replication between nodes has been stopped. However, there is a way to ignore replication errors caused by application errors by configuring with
--slave-skip-error=XXX , and
--slave_exec_mode=IDEMPOTENT. The
--slave_exec_mode option controls whether
IDEMPOTENT or
STRICT mode is used in replication conflict resolution and error checking. The
IDEMPOTENT mode causes suppression of duplicate-key and no-key-found errors. This mode is needed for multi-master replication and circular replication. Other valid errors caused by the application can be skipped using
--slave-skip-error .

To demonstrate, let’s set
--slave-skip-error=all and
--slave-exec-mode=IDEMPOTENT on all servers and restart them. We can now try to get the servers into different states (i.e. alternative futures). Execute the following on node0:

This situation is not eventually consistent and MariaDB can’t resolve the situation automatically. If application needs eventual consistency, it needs to resolve this conflict so that all databases again are in the same state that is correct by application rules.

Conclusions

Eventual consistency means that given enough time, over which no changes are performed, all successful updates will propagate through the system and all replicas will be synchronized. At any given time, there is no guarantee that the data accessed is consistent, therefore the conflicts have to be resolved. Using this definition MariaDB is eventually consistent if replication errors are not ignored even in cases when replication is stopped on replication errors and as long as replication is at some point of time (bounded time) continued and all servers return the same state. If replication errors are ignored, applications must correct the case where two or more servers are in different states.

Our original question was: Is MariaDB eventually consistent?

Answer:For most master slave(s) setups where all data is replicated to slaves MariaDB is eventually consistent. For multiple masters setups where only application handled error cases are ignored and where the application makes sure that servers can’t diverge to alternate futures, MariaDB is eventually consistent. However, there are replication configurations where MariaDB is not eventually consistent.