Executive Summary

Highly distributed systems with large data stores in the form of NoSQL databases are becoming increasingly important to enterprises, not just to hyperscale organizations. NoSQL databases are being deployed for capturing patient sensor data in health care, smart meter analysis in utilities, customer sentiment analysis in retail, and various other use cases across industries. NoSQL database systems help organizations store, manage, and analyze huge amounts of data on a distributed system architecture. The sheer volume of data, and the distributed system design needed to manage it at a reasonable cost, necessitated a different category of database systems, leading to NoSQL databases. Oracle NoSQL Database is part of the NoSQL database family and is based on a distributed key-value architecture.

As enterprises evaluate and assess new technologies for enterprise-wide adoption, the Yahoo! Cloud Serving Benchmark (YCSB) has become the standard benchmark tool for such testing, and it is the tool used in this paper to evaluate Oracle NoSQL Database.

Analysis and discussion are provided for the throughput and latency results obtained with YCSB.

Fusion ioMemory SX300 PCIe Application Accelerators

Performance tests were conducted using a single Fusion ioMemory SX300-1600 PCIe device per server on a three-node Oracle NoSQL Database cluster. The Fusion ioMemory SX300 PCIe series is Generation 3 of the world-renowned Fusion ioMemory platform. Its intelligent controller supports all major operating systems and hypervisors, including Microsoft Windows, UNIX, Linux, VMware ESXi, and Microsoft Hyper-V. The Fusion ioMemory SX300 PCIe series is available in capacities ranging from 1.25 to 6.4 TB of addressable, persistent flash storage. It provides random read/write performance of up to 235K/375K IOPS, with 92-microsecond read latency and 15-microsecond write latency.

Oracle NoSQL Database
Oracle NoSQL Database uses a simple data model based on key-value pairs, with a paradigm of major and minor keys. Each key-value pair is hashed into one of multiple buckets, called partitions. These partitions are mapped to replication groups (shards), each acting as a logical grouping for a subset of the data. A set of replication nodes provides both high availability and read scalability for each replication group. Storage nodes host these replication nodes; in this setup, each storage node runs one master and two replica nodes. The NoSQL database driver intelligently routes each application or client request to the correct replication node. Oracle NoSQL Database also provides the flexibility to match customer requirements for consistency and durability. The figure below shows how the keys are hashed into partitions and grouped together.
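For illustration only, the sketch below uses the store's command-line shell to write and read one key-value pair. The key path and value are hypothetical, and the put kv / get kv commands are assumed to be available in the release used here; the major path (/users/1234) determines the partition, while the minor path (/email) groups related values under the same major key.

    # Hedged sketch; store name, host, and port follow the setup described later.
    java -jar $KVHOME/lib/kvstore.jar runadmin -host node1 -port 5000
    kv-> put kv -key /users/1234/-/email -value "user@example.com"
    kv-> get kv -key /users/1234/-/email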

Test Configuration Details

The benchmark test environment consists of a three-node Oracle NoSQL Database cluster built on three Lenovo System x3650 servers. Each server has 16 Intel Xeon E5-2690 processor cores and 128 GB of RAM. A 10 GbE network interconnect is used for communication between the Oracle NoSQL Database cluster nodes and the YCSB client.

Storage for the large Oracle NoSQL Database data store is initially configured with traditional spinning disks and later switched to Fusion ioMemory devices. An XFS file system is created to host the Oracle NoSQL Database data store.
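A minimal sketch of the file system preparation, assuming the Fusion ioMemory device is visible as /dev/fioa (the device name and mount point are illustrative, not taken from the test environment):

    # Create an XFS file system on the Fusion ioMemory device and mount it;
    # the two storage directories used by capacity=2 are created on it.
    mkfs.xfs /dev/fioa
    mkdir -p /nosql
    mount -t xfs /dev/fioa /nosql
    mkdir -p /nosql/data1 /nosql/data2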

The table below provides the complete hardware and software components used for this test environment.

Bootstrapping Storage Nodes to Host the NoSQL Instance
Bootstrapping Storage Nodes to Host the NoSQL Instance
Before hosting Oracle NoSQL Database instances, the storage node services must be bootstrapped. As discussed previously, capacity was defined as 2, which stores the Oracle NoSQL Database key-value data in two separate storage directories (data1 and data2) on each storage node. A Storage Node Agent (SNA) process runs on each storage node to facilitate communication with other SNA processes and with client applications. The SNA listens on the registry port specified with the -port parameter, as well as on the -admin and -harange ports for the administration and replication processes.
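The following is a hedged sketch of the bootstrap step on one storage node, assuming KVHOME points to the Oracle NoSQL Database installation and the data directories created above; the port numbers are illustrative.

    # Create the bootstrap configuration (capacity 2, two storage directories),
    # then start the Storage Node Agent (SNA) on this node.
    export KVHOME=/opt/oracle/kv        # assumed install location
    export KVROOT=/nosql/kvroot         # assumed root directory

    java -jar $KVHOME/lib/kvstore.jar makebootconfig \
         -root $KVROOT -host node1 \
         -port 5000 -admin 5001 -harange 5010,5025 \
         -capacity 2 \
         -storagedir /nosql/data1 -storagedir /nosql/data2

    # Start the SNA; it listens on the registry port (5000) specified above.
    nohup java -jar $KVHOME/lib/kvstore.jar start -root $KVROOT &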

Configuring Key-Value Stores
When all the storage nodes have been created and started, they need to be configured to work as one distributed key-value store cluster. This key-value store is built based on the initial planning of replication nodes and the number of shards participating in the cluster, using the following steps (an admin CLI sketch follows the list):

1. Name the KV store "kvstore".

2. Create and deploy the datacenter as "California" with a replication factor of 3.

3. Deploy a storage node and create a storage node pool, such as "Mypool".

4. Join the storage node to "Mypool".

5. Repeat the above steps on each server.

6. Join the storage nodes from the other servers to the "Mypool" pool created on the first server.

7. Create a topology with all storage nodes connected to "Mypool" and with 100 partitions.
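A hedged sketch of the corresponding admin CLI session, assuming host names node1 through node3 and the ports used above (topology and plan names are illustrative; on older releases the zone commands appear as plan deploy-datacenter):

    # Connect to the admin service on the first storage node.
    java -jar $KVHOME/lib/kvstore.jar runadmin -host node1 -port 5000

    kv-> configure -name kvstore
    kv-> plan deploy-zone -name "California" -rf 3 -wait
    kv-> plan deploy-sn -zn zn1 -host node1 -port 5000 -wait
    kv-> plan deploy-admin -sn sn1 -port 5001 -wait
    kv-> pool create -name Mypool
    kv-> pool join -name Mypool -sn sn1
    kv-> plan deploy-sn -zn zn1 -host node2 -port 5000 -wait
    kv-> pool join -name Mypool -sn sn2
    kv-> plan deploy-sn -zn zn1 -host node3 -port 5000 -wait
    kv-> pool join -name Mypool -sn sn3
    kv-> topology create -name 2shardTopo -pool Mypool -partitions 100
    kv-> plan deploy-topology -name 2shardTopo -wait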

NoSQL Cluster Verification
Oracle NoSQL Database provides both a command-line interface and an admin console for cluster deployment verification and management. The following commands provide cluster verification from the command line.
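For reference, a minimal verification sketch using the host and port values assumed above:

    # From the admin CLI: confirm all plans succeeded and the topology is healthy.
    kv-> verify configuration
    kv-> show topology

    # From the shell: ping the store to list every replication node and its status.
    java -jar $KVHOME/lib/kvstore.jar ping -host node1 -port 5000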

YCSB Test Configuration Setup

The figure below shows the YCSB configuration setup for the three-node Oracle NoSQL Database cluster benchmark. YCSB uses a workload parameter file that defines the type of workload to be executed, the read/write percentages, and the dataset size to be used. YCSB also accepts command-line parameters at execution time, such as the NoSQL database binding to use, the number of YCSB client threads, and so on.
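As an illustration, a core workload parameter file for Workload A (50% read / 50% update) might be created as follows; the record and operation counts shown are hypothetical values, not the exact counts used in this benchmark:

    # Write an illustrative Workload A parameter file (50% read / 50% update,
    # Zipfian request distribution, ~1 billion 1 KB records for the 1 TB case).
    cat > workloads/workloada-1tb <<'EOF'
    workload=com.yahoo.ycsb.workloads.CoreWorkload
    recordcount=1000000000
    operationcount=100000000
    readproportion=0.5
    updateproportion=0.5
    scanproportion=0
    insertproportion=0
    requestdistribution=zipfian
    EOF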

The testing configuration loads the Oracle NoSQL Database cluster with dataset sizes of 32 GB, 256 GB, 512 GB, and 1 TB. Workloads A, B, and C are then executed for both uniform and Zipfian distributions. To evaluate the benefits of Fusion ioMemory devices for Oracle NoSQL Database across small to large datasets, heavy writes to heavy reads, and uniform to Zipfian workload distributions, their performance was compared with that of standard spinning disks. A representative load and run invocation is sketched below.
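A hedged sketch of the load and run invocations, assuming the YCSB nosqldb binding and its storeName/helperHost properties (verify the exact property names against the binding's README for the YCSB version used; thread counts and output file names are illustrative):

    # Load phase: populate the store defined above.
    bin/ycsb load nosqldb -P workloads/workloada-1tb \
        -p storeName=kvstore -p helperHost=node1:5000 \
        -threads 64 -s > load-1tb.log

    # Run phase: execute Workload A against the loaded data.
    bin/ycsb run nosqldb -P workloads/workloada-1tb \
        -p storeName=kvstore -p helperHost=node1:5000 \
        -threads 64 -s > run-workloada-1tb.log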

All workload combinations resulted in 24 different tests for this benchmark. This performance evaluation provides organizations with sufficient data points to assess Oracle NoSQL Database against their own performance criteria, based on application workload demands.

YCSB Test Results and Analysis

This section analyzes the YCSB benchmark results for the Oracle NoSQL Database cluster on Fusion ioMemory storage compared with spinning disks.

Operation    Storage Type               Ops/Sec    Latency (ms)    Total Time (ms)
Insert       HDD                         71,083         6           14,405,597
Insert       Fusion ioMemory devices     84,004         5            9,650,287
Delete       HDD                         85,603         9           11,962,175
Delete       Fusion ioMemory devices    104,942         8            9,757,765

Table 2: Data loading and latency results summary for a 1 TB dataset (1 billion records)

Chart 1: Data loading and latency results summary for a 1 TB dataset

The first benchmarking step was to load the dataset into the Oracle NoSQL Database cluster, which was done using the YCSB load routine for all of the datasets. The largest dataset loaded into the Oracle NoSQL Database cluster, for both hard disks and Fusion ioMemory devices, is 1 TB; its test results are captured for analysis in Table 2 and Chart 1.

As shown in the chart above, a 1 TB dataset of 1 billion records on spinning disks is ingested in 14,405 seconds, at 71,083 ops/sec with 6 ms latency. For the same 1 TB dataset on Fusion ioMemory devices, data ingestion completes in 9,650 seconds. This is 33% faster, at 84,004 ops/sec with 5 ms latency. As also shown in Table 2, delete operations complete 18% faster. This result is important for key-value data stores that capture session data for web applications, or data from devices such as patient sensors, and load it into the Oracle NoSQL Database data store.

Workload A: 50% Read / 50% Write

This test evaluates the performance benefits of Fusion ioMemory devices for the Oracle NoSQL Database cluster with a balanced workload. Throughput test results for Fusion ioMemory devices vs. HDDs are shown in Table 3 and plotted in Chart 2, and latency results are shown in Table 4 and Chart 3. The 32 GB dataset test involves mostly in-memory operations, and hence both HDD and Fusion ioMemory results are in the same range: Fusion ioMemory provides only a marginal 1.4x performance benefit over hard disk drives for the 32 GB test case.

In that case, the data (keys) required for this mixed workload are served from server memory, with minimal I/O operations to disk.

Storage Type       Workload Distribution Type    Throughput (Ops/Sec)
                                                 32 GB      256 GB     512 GB     1 TB
HDD                Uniform                        90,600     13,073     10,468      8,009
HDD                Zipfian                        80,798     26,359     20,765     16,598
Fusion ioMemory    Uniform                       127,125     96,045     94,470     88,118
Fusion ioMemory    Zipfian                        90,341     92,829     88,376     84,760

Table 3: Throughput results summary for Workload A

Chart 2: Workload A (Zipfian) throughput test results

As the dataset grows from 32 GB to 256 GB, 512 GB, and 1 TB, Fusion ioMemory storage provides a significant performance benefit, with only a minimal drop in throughput and a marginal increase in latency.

Storage Type       Workload Distribution Type    Latency (ms)
                                                 32 GB    256 GB    512 GB    1 TB
HDD                Uniform                            9       229       243     288
HDD                Zipfian                            6       159       186     194
Fusion ioMemory    Uniform                            9        14        14      14
Fusion ioMemory    Zipfian                            9         8         8      10

Table 4: Latency results summary for Workload A

Chart 3: Workload A (Zipfian) latency test results

Fusion ioMemory storage provides a 3x to 11x throughput benefit, with a 16x to 23x latency reduction compared to HDD operations. For larger datasets, HDD performance drops considerably: a 32 GB dataset on HDD delivered 80,798 ops/sec with 6 ms latency, but a 1 TB dataset delivered only 16,598 ops/sec with 194 ms latency. That is nearly a 5x drop in throughput and a 32x increase in latency. The performance advantage of Fusion ioMemory storage is useful for web applications using Oracle NoSQL Database, such as for the immediate capture of user session actions.

Workload B: 95% Read / 5% Write

Workload B is a read-intensive mixed workload with 95% read and 5% write operations. Throughput test results for Fusion ioMemory devices vs. HDDs are shown in Table 5 and plotted in Chart 4, and latency results are shown in Table 6 and Chart 5. In this workload, Fusion ioMemory storage provides a 19x to 48x improvement in throughput and a 16x to 45x reduction in latency, compared to hard disks, for datasets ranging from 256 GB to 1 TB. (For the 32 GB dataset, which resides completely in memory, the throughput and latency metrics remain essentially the same for both Fusion ioMemory and HDD storage.) This improved performance is beneficial for applications such as photo tagging, where adding a tag is a write operation but all other operations are reads.

Storage Type       Workload Distribution Type    Throughput (Ops/Sec)
                                                 32 GB       256 GB      512 GB      1 TB
HDD                Uniform                       313,733      6,714       4,895      4,695
HDD                Zipfian                       304,496      7,466       5,656      9,090
Fusion ioMemory    Uniform                       318,600    131,234     140,801    176,354
Fusion ioMemory    Zipfian                       312,345    255,217     275,292    283,243

Table 5: Throughput results summary for Workload B

Chart 4: Workload B (Zipfian) throughput test results

Storage Type       Workload Distribution Type    Latency (ms)
                                                 32 GB    256 GB    512 GB    1 TB
HDD                Uniform                            9       238       229     243
HDD                Zipfian                            6       165       159     186
Fusion ioMemory    Uniform                            9        10        14      14
Fusion ioMemory    Zipfian                            9         9         8       8

Table 6: Latency results summary for Workload B

Chart 5: Workload B (Zipfian) latency test results

Workload C: 100% Read / 0% Write

Workload C is a read-only workload with 100% read operations. This kind of workload is typically used for caching user profiles in web applications. Table 7 and Chart 6 provide the throughput numbers, and Table 8 and Chart 7 capture the latency metrics. For larger datasets, from 256 GB to 1 TB, Fusion ioMemory devices provided a 17x to 48x throughput advantage, with latency improvements from 30x to 100x, compared to the same operations on HDDs.

Storage Type       Workload Distribution Type    Throughput (Ops/Sec)
                                                 32 GB       256 GB      512 GB      1 TB
HDD                Uniform                       377,187      7,291       5,360      4,397
HDD                Zipfian                       378,075     14,723      10,080      8,286
Fusion ioMemory    Uniform                       378,889    131,948     157,830    215,180
Fusion ioMemory    Zipfian                       379,408    251,450     295,594    357,685

Table 7: Throughput results summary for Workload C

Chart 6: Workload C (Zipfian) throughput test results

Storage Type       Workload Distribution Type    Latency (ms)
                                                 32 GB    256 GB    512 GB    1 TB
HDD                Uniform                            1       216       229     243
HDD                Zipfian                            1       190       159     186
Fusion ioMemory    Uniform                            1         7         7       5
Fusion ioMemory    Zipfian                            1         6         6       4

Table 8: Latency results summary for Workload C

Chart 7: Workload C (Zipfian) latency test results

Scalability Tests

The maximum performance limit of this three-node Oracle NoSQL Database cluster with Fusion ioMemory storage was tested by increasing the topology from 2 shards to 3 shards. This configuration resulted in each server hosting one shard master and two replica nodes belonging to the other shards, as shown in the figure below.
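A hedged sketch of how the deployed store can be expanded from 2 to 3 shards from the admin CLI; the capacity value, topology name, and storage node identifiers are illustrative, and the change-parameters step (repeated for each storage node) simply makes room for a third replication node per server:

    # Raise each storage node's capacity so it can host a third replication node,
    # then clone and redistribute the topology and deploy it.
    kv-> plan change-parameters -service sn1 -wait -params capacity=3
    kv-> topology clone -current -name 3shardTopo
    kv-> topology redistribute -name 3shardTopo -pool Mypool
    kv-> plan deploy-topology -name 3shardTopo -wait
    kv-> verify configuration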

With this setup, the mixed Workload B (95% read / 5% write) shown previously was run again. As seen in Chart 8 below, 3 shards provide a 45% to 79% improvement in throughput and a 35% reduction in latency for dataset sizes from 256 GB to 1 TB. Additional shards participating in the mixed workload increase performance and reduce latency.

Chart 8: Oracle NoSQL Database, 2- and 3-shard scalability results

Conclusion

Enterprises are increasingly adopting NoSQL databases across industry verticals such as healthcare (patient sensors), automotive (auto sensors), and utilities (smart meter analysis). The YCSB benchmark is well known and widely used across the industry for evaluating distributed NoSQL databases. Oracle NoSQL Database is an excellent choice: it is easy to deploy, uses a simple key-value data model, and provides flexible consistency and durability settings based on organizational requirements. The performance analysis in this white paper demonstrates that Oracle NoSQL Database with Fusion ioMemory devices is a powerful combination, delivering both high performance and scalability benefits. For large dataset operations with mixed, write-intensive, or read-intensive workloads, Fusion ioMemory products deliver an 11x to 48x improvement in performance. This also translates into fewer servers with higher-performance storage for an optimized NoSQL cluster deployment.