Database Design For Storing Chat Messages

Storing chat messages and retrieving those messages at a very fast rate is much needed for an chat app. Developers spend days to decide which DBMS to choose and how to design tables for storing messages. Sometimes they endup with difficult and bad designs. Most bad table designs require a lot of sorting. After doing a lot of research and smashing my brain I finally came up with a awesome way of storing and retrieving messages in SQL tables. My method doesn’t require any sorting. And retrieving is very fast even if you have only one server with good configuration.

In this tutorial I will be using MySQL as my database management system. But this same design can me implemented using any SQL, No-SQL or any other kind of database management system.

Total Messages Table

In this table we keep the total number of messages between two users. The identifier column will have two usernames separated by a colon. Therefore usernames cannot have colons. The identifier column is the primary key so it has unique values and the column is indexed.

You need to construct the identifier while producing SQL statement. Format of the identifier is that, both the usernames are sorted in ascending order.

Messages table

In this table we store all the messages between all users. The identifier_message_number column has identifier appended with the message number between those two users. The identifier and message number is separated by a colon. Then we store the message in the message column.

identifier_message_number column is the primary key. Therefore it will have unique values and is indexed.

Inserting Messages

While inserting a message you need to increment the total_messages value in total_messages column. After increment you need to add the message to the messages table.

Suppose user1 sended a message to user2 then the SQL statements will be,

Here identifier_message_number column is indexed therefore selection will be faster.

Conclusion

This table designs and formats makes insertion and selection very fast. This design makes it possible to run all queries in indexed columns. You can add more features and expand this tables. For questions on this leave comments. Thanks for reading.

As much as it sound flexible i’m so not, it’s much applicable. thanks anyways you did a lovely job, can you consider posting how to implement a chatting system using node.js and socket.io and storing values in SQL that should be very useful.

Hello. I like your idea of total messages count and local message ID (last part of your combo ID). Even though it’s denormalization it should really help so thanks for useful post. But I think I will modify the scheme a little. First why not use compound indexes instead of weird IDs? They would have the same advantages but more flexibility and nice looking. Another thing is that it’s not possible (or I don’t understand how) to get all conversations for particular user. I had similar problem when designing friendship feature and here’s what I did (inspired by this post) :

made two fields for user IDs: user_id_1 and user_id_2, first always lower added compound index for these fields at this point I can get all row for a user using ‘WHERE user_id_1 = :id OR user_id_2 = :id’, the problem is that only first condition will use index so I added another index for the second id only and use UNION for select: SELECT * … WHERE user_id_1=:id UNION SELECT * … WHERE user_id_2=:id

this works quite well on synthetic test table, unfortunately can’t check on real DB yet

pros:

seems to work fast and flexible enough on test DB and according to info on the web

cons:

weird queries everywhere these queries sometimes have nested queries but they are not real big as friendship table only have several int fields relatively heavy inserts but I don’t care as friend status is requested much more often than changed

Hope it will also help someone. Will think of combining two tricks in my application

To create code blocks or other preformatted text, indent by four spaces:

This will be displayed in a monospaced font. The first four
spaces will be stripped off, but all other whitespace
will be preserved.
Markdown is turned off in code blocks:
[This is not a link](http://example.com)