Google Summer of Code 2020


This year, again, we are participating in the Google Summer of Code. The MariaDB Foundation believes we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors (currently C, ODBC, Java) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. And we have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.

Histograms with equal-width bins in MariaDB

Histograms with equal-width bins are easy to construct using samples. It is enough to look through the given set of samples and, for each value, determine which bin the value belongs to. Each bin requires only one counter.
Let f be a column of a table with N rows, and let n be the number of samples by which the equal-width histogram of k bins for this column is constructed. Suppose that after looking through all sample rows the counters created for the histogram bins contain the numbers c[1],..,c[k]. Then m[i] = c[i]/n * 100 is the percentage of the rows whose values of f are expected to be in the interval

( min(f) + (max(f)-min(f))/k * (i-1), min(f) + (max(f)-min(f))/k * i ).

This means that if the sample rows have been chosen randomly, the expected number of rows with values of f from this interval can be approximated by m[i]/100 * N.
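A minimal C++ sketch of this counting pass (illustrative only, not the server's actual implementation):

  #include <algorithm>
  #include <vector>

  // Build an equal-width histogram of k bins over [lo, hi] from a vector of
  // sampled values (assumes lo <= v <= hi for every sample v).
  // Returns the percentages m[i] = c[i]/n * 100.
  std::vector<double> equal_width_histogram(const std::vector<double> &sample,
                                            double lo, double hi, size_t k)
  {
    std::vector<size_t> counters(k, 0);            // one counter per bin
    double width= (hi - lo) / k;
    for (double v : sample)
      counters[std::min((size_t) ((v - lo) / width), k - 1)]++;
    std::vector<double> m(k);
    for (size_t i= 0; i < k; i++)
      m[i]= 100.0 * counters[i] / sample.size();
    return m;
  }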

To collect such statistics it is suggested to use the following variant of the ANALYZE TABLE command:
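Reconstructed from the clauses described below (the exact grammar in the actual task may differ):

  ANALYZE TABLE tbl [ WITH n ROWS ] [ SAMPLING p PERCENTS ]
    PERSISTENT FOR COLUMNS (col1 [ IN RANGE r ] [ WITH k INTERVALS ], ...)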

'WITH n ROWS' provides an estimate for the number of rows in the table in the case when this estimate cannot be obtained from statistical data.

'SAMPLING p PERCENTS' provides the percentage of rows to sample when collecting statistics. If this is omitted the number is taken from the system variable samples_ratio.

'IN RANGE r' sets the range of equal-width bins of the histogram built for the column col1. If this is omitted and the min and max values for the column can be read from statistical data, then the histogram is built for the range [min(col1), max(col1)]. Otherwise the range [MIN_type(col1), MAX_type(col1)] is considered. The values beyond the given range, if any, are also taken into account in two additional bins.

'WITH k INTERVALS' says how many bins are included in the histogram. If it is omitted this value is taken from the system variable histogram_size.
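A hypothetical invocation (the table name, range, and numbers are made up for illustration):

  ANALYZE TABLE t1 WITH 100000 ROWS SAMPLING 5 PERCENTS
    PERSISTENT FOR COLUMNS (col1 IN RANGE (0, 1000) WITH 20 INTERVALS);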

MariaDB Server: others

Support for GTID in mysqlbinlog

The --start-position and --stop-position options should be able to take GTID positions; or maybe there should be new --start-gtid and --stop-gtid options, for example --start-gtid=0-1-100,1-2-200,2-1-1000.

A GTID position means the point just after that GTID. So starting from
GTID 0-1-100 and stopping at GTID 0-1-200, the first GTID output will
probably be 0-1-101 and the last one 0-1-200. Note that if some domain is
not specified in the start position, output starts from the beginning in
that domain; if it is not specified in the stop position, output stops
immediately in that domain.

Starting and stopping GTID should work both with local files, and with --read-from-remote-server. For the latter, there are a couple of extra things that need doing in the master-slave protocol, see get_master_version_and_clock() in sql/slave.cc.

At the end of the dump, put these statements, to reduce the risk of those session variables incorrectly spilling into subsequent statements run in the same session:
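The statements themselves are missing from this copy of the article; plausibly they restore the GTID-related session variables to their global values, something like:

  SET @@session.gtid_domain_id= @@global.gtid_domain_id;  -- assumed, not preserved in this copy
  SET @@session.server_id= @@global.server_id;            -- assumed, not preserved in this copy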

connection encryption plugin support

As a follow-on to MDEV-4691 we would like GSSAPI encryption (in addition to authentication) support in MariaDB. I am told that the current plan is to create a plugin interface and then we can build GSSAPI encryption on top of that, so here is a ticket for that.

From having written GSSAPI support for the internal interface, there are a couple of things I would like to see in the plugin encryption interface.

First, GSSAPI is weird in that it does authentication before encryption (TLS/SSL are the other way around, establishing an encrypted channel and then doing authentication over it). Of course support for this is needed, but more importantly, packets must be processed in a fully serialized fashion. This is because encrypted packets may be queued while one end of the connection is still finishing up processing the authentication handshake. One way to do this is registering "handle" callbacks with connection-specific state, but there are definitely others.
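A rough C++ sketch of what a per-connection, serialized interface could look like (entirely hypothetical, not an actual MariaDB API):

  #include <cstddef>

  struct enc_connection_state;   // opaque per-connection state, plugin-defined

  struct encryption_plugin
  {
    // Runs the handshake (for GSSAPI: authentication first); may itself
    // send and receive packets on the socket.
    int (*begin)(enc_connection_state **state, int socket_fd);
    // All packets pass through these strictly in order, because encrypted
    // packets may be queued while one end still finishes the handshake.
    int (*read_packet)(enc_connection_state *state, unsigned char *buf, size_t *len);
    int (*write_packet)(enc_connection_state *state, const unsigned char *buf, size_t len);
    void (*end)(enc_connection_state *state);
  };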

Additionally, for whatever conception there ends up being of authentication and encryption, it needs to be possible to share more data than just a socket between them. The same context will be used for authentication and encryption, much as an SSL context is (except of course we go from authentication to encryption and not the other way around).

This ties into an issue of dependency. If authentication plugins are separate entities from encryption plugins in the final architecture, it might make sense to do mix-and-match authentication with encryption. However, there are cases - and GSSAPI is one - where doing encryption requires a certain kind of authentication (or vice versa). You can't do GSSAPI encryption without first doing GSSAPI authentication. (Whether or not it's permitted to do GSSAPI auth->encryption all over a TLS channel, for instance, is not something I'm concerned about.)

Finally, encrypted messages are larger than their non-encrypted counterparts. The transport layer should cope with this so that plugins don't have to think about reassembly, keeping in mind that there may not be a way to get the size of a message when encrypted without first encrypting it.

It's unfortunately been a little while since I wrote that code, but I think those were the main things that we'll need for GSSAPI. Thanks!

sort out the compression library chaos

As MariaDB is getting more storage engines and as they're getting more features, MariaDB can optionally use more and more compression libraries for various purposes.

InnoDB, TokuDB, and RocksDB can all use different sets of compression libraries. Compiling them all in would result in a lot of run-time/rpm/deb dependencies, most of which would never be used by most users. Not compiling them in would result in requests to compile them in. While most users don't use all of these libraries, many users use some of them.

A solution could be to load these libraries on request, without creating a packaging dependency. There are different ways to do it:

hide all compression libraries behind a single unified compression API, either developing our own or using something like Squash. This would require changing all engines to use this API.

use the same approach as in server services: create a service per compression library, where a service implementation simply returns an error code for any function invocation if the corresponding library is not installed. This way, maybe, we could avoid modifying all affected storage engines. A sketch of this approach follows.
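A rough C++ sketch of the second option (all names hypothetical):

  #include <cstddef>

  enum { COMPRESS_OK= 0, COMPRESS_LIB_NOT_INSTALLED= 1 };

  struct compression_service_lz4
  {
    int (*compress)(const char *src, size_t src_len, char *dst, size_t *dst_len);
    int (*decompress)(const char *src, size_t src_len, char *dst, size_t *dst_len);
  };

  // Stubs used while liblz4 is not installed: every call returns an error.
  static int lz4_not_installed(const char *, size_t, char *, size_t *)
  {
    return COMPRESS_LIB_NOT_INSTALLED;
  }

  static compression_service_lz4 lz4_service= { lz4_not_installed, lz4_not_installed };
  // Loading the library on request (e.g. via dlopen()) would swap in the
  // real implementations without any packaging dependency.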

Control over memory allocated for SP/PS

SP/PS (Stored Procedures / Prepared Statements) allocate memory until the PS cache or the SP is destroyed. There is no way to see how much memory is allocated and whether it grows with each execution (the first two executions can lead to new memory allocation, but later ones should not).

Task minimum:

Status variables which count the memory used/allocated for SP/PS by thread and/or for the server.

Other ideas:

Automatically stop allocation in the debugging version after the second execution, and raise an exception on any further allocation attempt.

An Information Schema table listing threads and SP/PS with information about allocated and used memory.

Information can be collected in the MEM_ROOTs of SP/PS: by storing info about the status of the mem_root before execution and then checking it afterwards, newly allocated memory can be found.

MEM_ROOT can be changed to have a debug mode which makes it read-only and which can be switched on after the second execution.
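A minimal C++ sketch of the snapshot idea (this MEM_ROOT is a simplified stand-in for the real server structure):

  #include <cassert>
  #include <cstddef>

  // Simplified stand-in: the real MEM_ROOT tracks lists of allocated blocks.
  struct MEM_ROOT { size_t total_allocated; };

  struct mem_root_snapshot { size_t allocated; };

  mem_root_snapshot take_snapshot(const MEM_ROOT *root)
  {
    return { root->total_allocated };
  }

  void check_no_growth(const MEM_ROOT *root, mem_root_snapshot before,
                       unsigned execution_count)
  {
    // Only the first two executions are expected to allocate new memory.
    if (execution_count > 2)
      assert(root->total_allocated <= before.allocated);
  }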

Speed up RowGroup data access methods by reducing the level of indirection

RowGroup is the unit of data sent around the CS cluster. It is basically a set of records for fixed-size data types, plus an additional storage area for binary data, e.g. strings.
Right now there are lots of RowGroup methods that access RowGroup fixed-size data using an extra level of indirection. Here is an example:

return ((int64_t) &data[offsets[colIndex]]);

This expression uses an extra assembly instruction to calculate the effective address. We want to remove the 'offsets[colIndex]' part of this and similar expressions in the RowGroup code.
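A C++ sketch of the before/after shape of such an access (illustrative only):

  #include <cstdint>

  // Before: every access recomputes data + offsets[colIndex], costing an
  // extra load and address computation on each call.
  int64_t row_addr_before(const uint8_t *data, const uint32_t *offsets, int colIndex)
  {
    return (int64_t) &data[offsets[colIndex]];
  }

  // One possible direction: resolve the offset once per column outside the
  // per-row hot path and pass around the precomputed base pointer instead.
  int64_t row_addr_after(const uint8_t *col_base)  // col_base = data + offsets[colIndex]
  {
    return (int64_t) col_base;
  }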

Parallel sorting 2nd phase

Merge the presorted partial runs produced during the 1st phase.
Here is a more detailed explanation of how sorting works as of 1.4.1:

In the first phase, CS gets a portion of record data from the previous steps of query execution (an RGData instance from the ring buffer of RGData-s) and produces a sorting run out of it using the existing sorting class LimitedOrderBy. If the query contains LIMIT, it is applied at this phase, which allows CS to significantly reduce the data set cardinality. If the query contains LIMIT + OFFSET, CS builds a sorted run of records that is up to LIMIT+OFFSET in size. CS does this step in parallel, dividing the whole data set into k runs, where k is governed by a session variable, infinidb/columnstore_orderby_threads. At this phase CS tries to preallocate memory in QUEUE_RESERVE_SIZE batches.
In the second phase, CS merges and sorts the k presorted partial runs produced by the first phase in a single thread. If the query contains the DISTINCT keyword, CS rebuilds a hash map to preserve uniqueness.

We want to make the 2nd phase parallel as well, using range partitioning of the presorted runs produced by the 1st phase. After the 1st phase finishes we know the distribution of the sorting key values, so we can divide the key value range into regions (buckets). Every 2nd-phase thread takes the values belonging to its bucket from every 1st-phase sorted run, and all 2nd-phase threads then sort their runs in parallel. In the end we put the sorted regions into the output stream in ascending order of the key values.
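A C++ sketch of this bucketed merge (all names hypothetical; keys simplified to int64_t):

  #include <algorithm>
  #include <cstdint>
  #include <thread>
  #include <vector>

  // Split the key range into buckets using the distribution learned during
  // the 1st phase, then build each bucket from all presorted runs in parallel.
  void parallel_merge(const std::vector<std::vector<int64_t>> &sorted_runs,
                      const std::vector<int64_t> &bounds,  // bucket boundaries
                      std::vector<std::vector<int64_t>> &out)
  {
    size_t buckets= bounds.size() + 1;
    out.assign(buckets, {});
    std::vector<std::thread> workers;
    for (size_t b= 0; b < buckets; b++)
      workers.emplace_back([&, b]() {
        for (const auto &run : sorted_runs)
        {
          // Runs are presorted, so each bucket is a contiguous slice.
          auto first= b == 0 ? run.begin()
                             : std::lower_bound(run.begin(), run.end(), bounds[b - 1]);
          auto last= b == buckets - 1 ? run.end()
                                      : std::lower_bound(run.begin(), run.end(), bounds[b]);
          out[b].insert(out[b].end(), first, last);
        }
        std::sort(out[b].begin(), out[b].end());  // sort within the bucket
      });
    for (auto &w : workers)
      w.join();
    // Writing out[0], out[1], ... in order yields the fully sorted stream.
  }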

Add a bitmap for NULL/empty data

Engine independent statistics for Columnstore

CS now has very rudimentary query optimization capabilities and we want to improve the situation. We are considering using the Server's optimizer for this purpose, but the Server needs statistics, namely value distribution histograms and Number of Distinct Values (NDV) histograms.
There are different levels of complexity for the task:

implement a standalone segment file reader that in the end populates both mysql.column_stats and mysql.table_stats using an out-of-band MariaDB client connection

implement ANALYZE TABLE functionality for the ColumnStore engine

implement ANALYZE TABLE and histograms with equal-width bins for value distribution histograms (similar to MDEV-12313), together with NDV histograms, to decrease I/O

We expect to have both unit and regression tests, but this is optional.

Use JIT for aggregation functions

CS uses the Volcano processing approach, working on one value at a time. This is a very inefficient way to handle analytics workloads, which usually involve lots of aggregation functions in projections, filtering, or sorting.
We are interested in using JIT compilation for basic aggregation functions: sum, avg, count, min, max. The patch must compile and run a program that processes and returns the aggregation function result. We wrote this description with LLVM in mind, as it is widespread and has lots of examples in the wild.
I suggest starting by looking at RowAggregation::addRowGroup() in ./utils/rowgroup/rowaggregation.cpp to see what it takes to get the avg() function result.
Here is the link on how to quickly build a CS developer environment.
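To illustrate the gap (purely illustrative, not CS code), compare the Volcano-style per-value dispatch with the tight loop a JIT would effectively emit for sum():

  #include <cstddef>
  #include <cstdint>

  // Volcano-style processing: one virtual dispatch per value.
  struct Aggregator
  {
    virtual void add(int64_t v)= 0;
    virtual ~Aggregator() {}
  };
  struct SumAgg : Aggregator
  {
    int64_t sum= 0;
    void add(int64_t v) override { sum+= v; }
  };

  // What a JIT (e.g. LLVM) would effectively emit for sum(col): a
  // specialized tight loop, no per-value dispatch, open to vectorization.
  int64_t jitted_sum(const int64_t *col, size_t n)
  {
    int64_t sum= 0;
    for (size_t i= 0; i < n; i++)
      sum+= col[i];
    return sum;
  }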



