Internet of Things (IoT) solutions have a lot of moving parts that need to work together to make something useful. Each architectural choice impacts performance, from the end-device hardware, to the protocol used to transmit data, to back-end infrastructure such as hosting and databases. Here, we'll look at the impact of choosing a database on an IoT solution, and some of the key questions that should be asked to help make the right choice.

Let's say you are the decision maker at your company who has the unenviable responsibility of coming up with the right database strategy for your IoT solution. Actually, finding answers is not that hard with all the information out there. What's difficult is asking the right questions. This article will focus on what questions an engineering director or manager should be asking in order to ensure that he or she has considered all options when selecting a database.

The first question is how much it costs, right? Wrong. This should actually be the last question. Although it is tempting to pin down cost first, that question is hard to answer until you have answered several others. Here's where I would start:

1. What type of database do you want?
There are SQL databases, NoSQL databases and, for IoT-specific workloads, time-series databases. It may help to understand the strengths and weaknesses of each of these, and to decide which major direction you want to take. The next seven questions will help direct this highest-level decision.

2. What type of workloads do you have?
Do you have transactional workloads, analytics workloads or a combination of both? The underlying storage engine differs considerably between workload types, so a database can be great at one but not the other. Take rowstore vs. columnstore, for example. Typically (but not always), rowstore-based databases are efficient at transactional queries that read or write whole records, but are not optimized for reading selected columns for analytical purposes. Columnstore-based databases are efficient at scanning large volumes of data for analytics, but are not so good when you have to perform a transactional update.
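To make the rowstore vs. columnstore distinction concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the two layouts, not how any particular database engine is implemented; the field names (device_id, ts, temp_c, battery) and the helper functions are hypothetical.

```python
# Row-oriented layout: all fields of one reading are stored together.
row_store = [
    {"device_id": "d1", "ts": 1, "temp_c": 20.0, "battery": 0.90},
    {"device_id": "d2", "ts": 1, "temp_c": 22.0, "battery": 0.80},
    {"device_id": "d3", "ts": 1, "temp_c": 24.0, "battery": 0.70},
]

# Column-oriented layout: all values of one field are stored together.
column_store = {
    "device_id": ["d1", "d2", "d3"],
    "ts": [1, 1, 1],
    "temp_c": [20.0, 22.0, 24.0],
    "battery": [0.90, 0.80, 0.70],
}


def update_row(rows, index, **changes):
    """Transactional-style update: one record, touched in one place."""
    rows[index].update(changes)


def update_columns(columns, index, **changes):
    """The same update in a column layout touches one array per changed field."""
    for name, value in changes.items():
        columns[name][index] = value


def mean_temp_rows(rows):
    """Analytical-style read: every full record is visited to get one field."""
    return sum(r["temp_c"] for r in rows) / len(rows)


def mean_temp_columns(columns):
    """The same aggregate only needs the single temp_c column."""
    temps = columns["temp_c"]
    return sum(temps) / len(temps)
```

Both layouts return the same answers. What differs is how much data each operation has to touch: the update is a single-record operation in the row layout, while the aggregate reads a single array in the column layout.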

3. How scalable does your solution need to be?
Are you just connecting up tens or even hundreds of devices, or are you dreaming in the millions? Most implementations start small, but it's important to understand reasonable expectations for the near future. Some databases may be great to start with (low cost, high performance), but will not necessarily scale easily beyond a certain capacity—that is, unless you make changes to your application code, which brings us to the next question.

4. How dependent should your application be on your database?
Each application can be tied to a particular database based on the way it's partitioned, scaled and so forth. Tying an application to a database makes certain things easier and more streamlined, but can require a lot of new code if a switch is needed down the line. Keeping the application decoupled from any particular database is usually desirable, so that you do not lock yourself out of other options. It is also in keeping with the principle of programming to an interface, not to a specific implementation.
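As a rough sketch of what programming to an interface can look like, the Python example below has the application code depend only on a small storage interface, with two interchangeable back ends behind it. The names ReadingStore, InMemoryStore, SqliteStore and record_and_report are hypothetical and chosen for illustration; the only library used is sqlite3 from the Python standard library.

```python
import sqlite3
from typing import Optional, Protocol


class ReadingStore(Protocol):
    """Hypothetical interface: the only thing the application depends on."""

    def save(self, device_id: str, ts: int, value: float) -> None: ...

    def latest(self, device_id: str) -> Optional[float]: ...


class InMemoryStore:
    """Simple back end that keeps the newest reading per device in a dict."""

    def __init__(self) -> None:
        self._data = {}

    def save(self, device_id: str, ts: int, value: float) -> None:
        current = self._data.get(device_id)
        if current is None or ts >= current[0]:
            self._data[device_id] = (ts, value)

    def latest(self, device_id: str) -> Optional[float]:
        current = self._data.get(device_id)
        return None if current is None else current[1]


class SqliteStore:
    """SQL back end with the same two methods, using the stdlib sqlite3 module."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS readings "
            "(device_id TEXT, ts INTEGER, value REAL)"
        )

    def save(self, device_id: str, ts: int, value: float) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT INTO readings VALUES (?, ?, ?)", (device_id, ts, value)
            )

    def latest(self, device_id: str) -> Optional[float]:
        row = self._conn.execute(
            "SELECT value FROM readings WHERE device_id = ? "
            "ORDER BY ts DESC LIMIT 1",
            (device_id,),
        ).fetchone()
        return None if row is None else row[0]


def record_and_report(store: ReadingStore, device_id: str, ts: int, value: float) -> str:
    """Application code: knows the interface, not which database is behind it."""
    store.save(device_id, ts, value)
    return f"{device_id}: {store.latest(device_id)}"
```

Because record_and_report only calls save and latest, switching back ends means writing one new class rather than revisiting the application code. The trade-off is that database-specific features stay hidden behind the interface unless you expose them deliberately.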