After having issues in building cassandra_v2 ans server in general on OS X,
I switched over development from OS X to Linux by installing a fresh copy of CentOS7.
I had MariaDB server build on it successfully and cassandra_v2 skeleton incantation as well.
The changes were pushed to github and the commits are found here

This week, I started working on my GSoC project. First, I did findings on how CQL differs from SQL and what query clauses work in MariaDB and not in Cassandra.
CQL is purposefully similar to SQL used in relational databases like MariaDB and MySQL. This similarity lowers the barrier of entry for users familiar with relational databases. Many queries are very similar between the two. In fact, a lot of basic things are even exactly the same. The following is both valid CQL and SQL

Switching databases

USE testatebase;

Creating Tables

CREATE TABLE IF NOT EXISTS testTable (id INT PRIMARY KEY);

Altering Tables

ALTER TABLE testTable ADD newField INT;

Creating Indexes

CREATE INDEX myIndex ON testTable (newField);

Inserting Data

INSERT INTO testTable (id, newField) VALUES (1, 7);

Selecting Data

SELECT * FROM testTable WHERE newField = 7;

Counting Data

SELECT COUNT(*) FROM myTable;

Deleting Data

DELETE FROM testTable WHERE newField = 7;

Differences

Cassandra is a non-relational database, and so uses different concepts to store and retrieve data. In simple terms, a Cassandra keyspace is a SQL database, and a Cassandra column family is a SQL table (CQL allows you to interchange the words “TABLE” and “COLUMNFAMILY” for convenience). This difference necessitates a different syntax for creating and manipulating data:-Creating databases:

Cassandra’s keyspaces require more specifications than a standard relational database. Note that the example above is the simplest form. In a production environment, likely spread across several data-centers, a keyspace would be created with a different strategy and replication factor.

-Organizing Data

None of the standard relational stuff is going to work in CQL. There is no support for things like JOIN, GROUP BY, or FOREIGN KEY. Leaving these features out is important because it makes writing and retrieving data from Cassandra much more efficient.But sets of data tend to have relationships with one another. So without the relational tools of SQL, we need another strategy to represent these relationships. The problem is that reads in Cassandra tend to be more expensive than you might be used to when dealing with a relational database. Also with CQL, any query you plan to do in the future should already be organized into a column family. You want to look up addresses that correspond to a person? Then make a column family that consists of people and addresses. You can still store them separately, but you should also store them together. Having a column family to represent every query makes reads much more efficient. You’re not just going to have to de-normalize, you’re going to have to forget about normalization all together.

-Inserts vs Updates

The concept of cheap writes extends to updating data. This is because, unlike in SQL, a read is not performed during the update. The syntax is the same in CQL and SQL.

/* Updating data */UPDATE testTable SET newField = 2 WHERE id = 6;

However, if the row does not exist, it will still get created. Similarly as unintuitive, an INSERT statement will actually replace data if it exists. Because again, CQL does not perform a read while inserting. Without a read, there is no way to know if the data being inserted is replacing an existing record. This means that both inserts and updates are extremely fast.

-Time to Live

CQL enables you to set a TTL on a row. Meaning that you can set a row to expire 24 hours from the time it gets created. This is accomplished with the USING TTL command (values are in seconds).

/** @brief
This is a bitmap of flags that indicates how the storage engine
implements indexes. The current index flags are documented in
handler.h. If you do not implement indexes, just return zero here.
@details
part is the key part to check. First key part is 0.
If all_parts is set, MySQL wants to know the flags for the combined
index, up to and including ‘part’.
*/
ulong index_flags(uint inx, uint part, bool all_parts) const
{
return 0;
}

/** @brief
unireg.cc will call this to make sure that the storage engine can handle
the data it is about to send. Return *real* limits of your storage engine
here; MySQL will do min(your_limits, MySQL_limits) automatically.
@details
There is no need to implement …_key_… methods if your engine doesn’t
support indexes.
*/
uint max_supported_key_length() const { return 16*1024; /* just to return something*/ }

This summer, I will be working with the MariaDB project to upgrade an existing storage engine to use a new Library released by Datastax. I will be mentored by Sergey Petrunia.

I will be working on the Cassandra storage engine which is used as an interface between MariaDB and Cassandra clusters. MariaDB has a pluggable architecture which allows the development of new storage engines that can be loaded at runtime without modifying the server code. This is achieved by implementing an instance of the handler interface.

I this project, I will base my work on MariaDB 10.1 which I build from source using steps outlined in KB.