Sizing Guidelines

Evaluate the overall performance and capacity requirements and determine the hardware and other resources.

When you plan to deploy a Couchbase Server cluster, perhaps the most common (and important) question that comes up is: how many nodes do I need and what size do they need to be?
To learn more, contact Couchbase Support for sizing help.
You can also read about sizing in How many nodes.

With the introduction of Multi-Dimensional Scaling (MDS), sizing is becoming more challenging, and this guideline aims to help users better size their clusters.

General Considerations

The sizing of your Couchbase Server cluster is critical to its overall stability and performance.
While there are obviously many variables that go into this, the idea is to evaluate the overall performance and capacity requirements for your workload and dataset.
Then divide that into the hardware and resources you have available.
Your application wants the majority of reads coming out of the cache, and the I/O capacity to handle its writes.
There needs to be enough capacity in all areas to support everything the system is doing while maintaining the required level of performance.

This guideline will refer to five determining factors one should be aware of when sizing a Couchbase Server cluster: RAM, Disk (I/O and space), CPU and network bandwidth.

Sizing Couchbase Server Resources

There are several resources you need to consider when sizing a Couchbase Server node:

Number of CPUs

RAM

Disk space

Network

CPU

CPU refers to the number of cores required to run your workload.

RAM

RAM is frequently one of the most crucial areas to size correctly.
Cached documents allow the reads to be served at very low latency and consistently high throughput.

This resource represents the main memory allocated to Couchbase Server and should be determined by the following factors:

Full Text Search (FTS), where the minimum RAM allocation is 512MB, and recommended is 2048MB+ RAM.

Table 1. Minimum RAM quota for Couchbase Server components

Component

Minimum RAM

Data Service

1GB

Indexing Service powered by ForestDB

256MB

Indexing Service powered by Memory-Optimized Index

256MB minimum, 1GB recommended.

Full text search

512MB

Disk space

Requirements for your disk subsystem are:

Disk size, which refers to the amount of the disk storage space needed to hold your entire data set.

Disk I/O, which is a combination of your sustained read/write rate, the need for compacting the database files, and anything else that requires disk access.

To better support your Couchbase Server, keep in mind the following:

Disk space continues to grow if fragmentation ratio keeps climbing.
To mitigate that, add enough buffer in your disk space to store all the data.
Also, keep an eye on the fragmentation ratio in the UI and trigger compaction process when needed.

SSD is desired but not required.
SSD will bring much better performance than HDD regarding disk throughput and latency.

Table 2. Minimal and Recommended Hardware Resources

Resource

Minimum

Recommended

CPU

4

8 and above.
Both the number and performance of CPUs are important.

RAM

4GB

16GB and above.
More RAM means that more data can be served from memory and allow for better performance.

Disk

8GB

16GB and above

Network

Enough network bandwidth is vital to the performance of Couchbase Server.
A reliable high-speed network for intra-cluster and inter-cluster communications has a huge effect on overall performance and scalability of Couchbase Server.
Most people can do this with a 1GB interface, some need 10GB.

Data Service Nodes

Data Service nodes handle data service operation, such as create/read/update/delete (CRUD) operations.

It is important to keep use cases and application workloads in mind since different application workloads have different resource requirements.
For example, if your working set needs to be fully in memory, you might need large RAM size.
On the other hand, if your application requires only 5% of data in memory, you will need enough disk space to store all the data and fast enough disks for your read/write operations.

You can start sizing the Data Service nodes by asking the following questions:

Do you plan to use XDCR?
Please refer to Couchbase Server 4.0 documentation about different deployment topology.

What’s your working set size and what are your data operation throughput or latency requirements?

Answers to the above questions can help you better understand the capacity requirement of your cluster and provide a better estimation for sizing.

Inside Couchbase Server, we looked at performance data and customer use cases to provide sizing calculations for each of the areas: CPU, Memory, Disk, and Network.

A use case for RAM sizing was used as an example with the following data:

Table 3. Input variables for sizing RAM

Input Variable

value

documents_num

1,000,000

ID_size

100

value_size

10,000

number_of_replicas

1

working_set_percentage

20%

Table 4. Constants for sizing RAM

Constants

value

Type of Storage

SSD

overhead_percentage

25%

metadata_per_document

56 for 2.1 and higher, 64 for 2.0.x

high_water_mark

85%

Based on the provided data, a rough sizing guideline formula would be:

Table 5. Guideline Formula for Sizing a Cluster

Variable

Calculation

no_of_copies

1 + number_of_replicas

total_metadata

(documents_num) * (metadata_per_document + ID_size) * (no_of_copies)

total_dataset

(documents_num) * (value_size) * (no_of_copies)

working_set

total_dataset * (working_set_percentage)

Cluster RAM quota required

(total_metadata + working_set) * (1 + headroom) / (high_water_mark)

number of nodes

Cluster RAM quota required / per_node_ram_quota

Based on the above formula, these are the suggested sizing guidelines:

Table 6. Suggested Sizing Guideline

Variable

Calculation

no_of_copies

= 1 for original and 1 for replica

total_metadata

= 1,000,000 * (100 + 56) * (2) = 312,000,000

total_dataset

= 1,000,000 * (10,000) * (2) = 20,000,000,000

working_set

= 20,000,000,000 * (0.2) = 4,000,000,000

Cluster RAM quota required

= (312,000,000 + 4,000,000,000) * (1+0.25)/(0.85) = 6,341,176,470

This tells you that the RAM requirement for whole cluster is 7GB.
Note that this amount is in addition to the RAM requirements for the operating system and any other software that runs on the cluster nodes.

Index Service Nodes

A node running the Index Service must be sized properly to create and maintain secondary indexes and to perform index scan for N1QL queries.

Similarly to the nodes running the data service, there is a set of questions you need to answer to take care of your application needs:

What is the length of the document key?

Which fields need to be indexed?

Will you be using simple or compound indexes?

What is the minimum, maximum, or average value size of the index field?

How many indexes do you need?

How many documents need to be indexed?

How often do you want compaction to run?

Answers to the above questions can help you better understand the capacity requirement of your cluster and provide a better estimation for sizing.
At Couchbase, we looked at performance data and customer use cases to provide sizing calculations for each of the areas: CPU, Memory, Disk and Disk I/O.

Here is a disk size example:

Table 7. Disk Sizes

Input variable

Value

docID

20 bytes

Number of index fields

1

Secondary index

24 bytes

Number of documents to be indexed

20M

When we calculate disk usage for the above test cases, there are a few factors we need to keep in mind:

Compaction is disabled.
This case illustrates the worst-case scenario for disk usage.

Fragmentation will affect the disk usage.
The larger the fragmentation, the more disk you will need.

Based on our experiment, we noticed that the above index consumes 6GB of disk space.

Query Service Nodes

A node running the Query Service executes queries for your application needs.

Since the Query Service doesn’t need to persist data to disk, there are very minimal resource requirements for disk space and disk I/O.
You only need to consider CPU and memory.

There are a few questions that will help size the cluster:

What types of queries you need to run?

Do you need to run stale=ok or stale=false queries?

Are the queries simple or complex (requiring JOINs, for example)?

What are the throughput and latency requirements for your queries?

Different queries have different resource requirements.
A simple query might return results within milliseconds while a complex query may require several seconds.

At Couchbase, we looked at performance data and customer use cases to provide sizing calculations.

Let’s use CPU/Memory utilization for an example for a Query Service.
Assume that you have a user profile store, which stores user’s name, email and address.
You would like to query based on user’s email address, and you create a secondary index on email.
Now you would like to run a query that looks like this:

Select * from bucket where email = "foo@gmail.com"

By default, N1QL uses stale=ok for a consistency model.

To achieve max throughput of this query, we noticed that it utilized 24 cores completely to achieve an 80% latency of 5ms against a bucket of 20M documents.