MariaDB TX, proven in production and driven by the community, is a complete database solution for any and every enterprise — a modern database for modern applications.

As a software engineer, I've always enjoyed discussing technology with like-minded devs who could offer alternative perspectives. And, like many engineers, I have moments where I really want to rant to an audience about an infuriating encounter I just had with tech or a project I'm working on.

These experiences encouraged my co-founder and I to create devRant: A community specially crafted with the wants and needs of developers in mind.

In this article I'm going to explain how Neo4j allows us to rapidly develop our app, scale, and analyze at a rate we wouldn't be able to achieve with a relational database.

The Very Basic Structure of Our Data

Above is a very basic diagram of how our data is structured. Even though a number of properties are omitted, the diagram covers a good chunk of our functionality.

Users post "rants," which contain some text, a timestamp, an optional image property (e.g., if there’s an image attached to the rant — yay memes!), etc. Users can post comments on rants, and they can also upvote rants and comments.

Why We're Using Neo4j as Our Primary and Only Datastore

We are a two-person team. I do all the backend development and most of the app development, and our other co-founder does all the UX and design work.

We are strong believers in building highly functional and effective MVPs (minimum viable product), but also know how to speed up the process of getting a product to market. Also, as someone who has done a lot of data engineering, I take app performance very seriously and try to build infrastructure with future scalability in mind.

Neo4j provided us with pretty much everything we needed to rapidly build a solid infrastructure for our app while simultaneously being a datastore that allows us to quickly add features that our users request. Beyond user-facing features, we can also use it to gather interesting data points on user behavior.

Let's dive a little deeper.

Planning for Scale, Rapid Prototyping, and the Relational Nightmare

First, in my opinion, there are some glaring scalability issues. To store the data in a relational database, we would need a table for rants, a table for comments, and another table for users. Then, we'd need a bunch of JOIN tables: One for users owning rants, one for users owning comments, one for users upvoting/downvoting rants, and one for users upvoting/downvoting comments.

In terms of scalability, the pieces that worry me the most about this use case are votes on rants and votes on comments. Voting is our simplest action, and I expect it to be our highest-volume one too.

I would never want to have a case where we had "too many" votes stored to properly and efficiently query them and to continue to store all votes. I see this as a problem with a relational database because, with a nice amount of growth, the vote table becomes highly trafficked in both reads and writes (we're constantly getting info about which user has voted on a rant, vote counts, etc.) and ends up with many awkward queries targeting it for various use cases.

The data format of Neo4j lends extremely well to having only three basic types of nodes:

Rants.

Comments.

Users.

It can then express simple user actions — like votes — by just using relationships. I like this for scalability because relationships are so cheap (compared to an indexed JOIN table), and the data is easily queryable in an efficient way.

For ease of implementation and rapid development, I like to try to store as few counters as possible (and if scale dictates they are needed, then add them). So when getting rant information, we might do a series of Cypher queries that look something like this:

We might run a query like that 10 times in a batch to get info about a batch of rants. In a graph database, all of the data for that query can easily be obtained from the rant node. In a relational database like MySQL, a similar query would require three subqueries querying JOIN tables for counts. This seems much less scalable and less flexible.

Beyond efficiency and scalability, the flexibility of a graph database allows us to quickly add features. Even seemingly complex features don’t require a bunch of tables to be set up to create a usable feature.

For instance, being able to report a user or a rant is a pretty important functionality. We were able to add this with the simple addition of a relationship between users, as opposed to a new table and another table to join on.

Keeping up the Pace

Founding a startup with a tiny, bootstrapped team is inherently difficult. I believe that for many projects, and especially ones like ours, using a graph database provides an excellent foundation for the needs of a rapidly growing technical startup. Using a graph database is providing us a storage engine that is, as previously mentioned, scalable, very flexible for new features, and also queryable for some pretty interesting analytics.

As a very basic example, say we wanted to see how many of our users have voted on at least two rants and have commented on at least two rants. We could just run a Cypher query like this (DISTINCT is added for comments because users can comment more than once on a rant, but can’t vote more than once.):

In the future, using Neo4j will allow us to take our product to the next level by adding in more complex algorithms surrounding our data. For example, we think it would be useful to cater a user's content feed to their actual preferences, which we could implement with some collaborative filtering, the user's interests, etc.

As we continue to improve devRant, one of our main focuses will be growth which I think is the case for many startups our size. Using a flexible database for our backend will allow us to focus on the things we really need in order to build our company while our storage engine allows our technical platform to grow and flourish.

MariaDB AX is an open source database for modern analytics: distributed, columnar and easy to use.