DDS Hands On

Introduction

In 2009, I downloaded RTI Inc.’s Data Distribution Service (DDS) communication middleware software development kit along with a free 30-day developer license. RTI’s product supports a myriad of server and embedded operating systems (plus .NET and JVM) and languages, but I chose to experiment with the C++ API libraries for the Win32 platform. The purpose of this blog page is to recount my hands-on experience using the RTI DDS middleware – over a weekend and on my own time.

Getting Started

After downloading the package from the RTI site, I ran the Windows installation utility and chose to install all the files in the default installation folder at c:\program files\RTI\. In addition to the binary DDS libraries and several distributed application program samples in C++, C, C# and Java, the following documentation files are provided by RTI:
• Getting Started
• Release Notes
• Platform Notes
• What’s new
• User’s Manual
• C, C++, .NET, and Java API Reference Manuals (both HTML and pdf versions)

Note that unlike other messaging solutions such as CORBA and JMS, DDS is a truly distributed middleware communication solution. In other words, there are no brokers or proxies that need to be downloaded, installed, configured, and launched into the background before running applications. All the functions of a classical broker, including auto-discovery of participants, address/port handling and naming (identifying data types and topics), are handled in a fully distributed, symmetric, and reliable fashion behind the scenes – hidden from distributed application developers (but tunable either programmatically during runtime or by a statically loaded XML configuration file).

Running The Distributed Hello_Simple Application On One Machine

The simplest example application provided in the RTI distro is named Hello_Simple. I loaded the provided solution file into the MS Visual C++ 2008 Express IDE and built the 2 projects in the solution. At the completion of the build process, the following two 3MB executables were created: HelloPublisher.exe and HelloSubscriber.exe. The figure below shows the single machine test environment that I used to run the Hello_Simple application processes. For test purposes, I chose to statically link the DDS binary libraries with the application code.

I opened two command console windows and launched the HelloPublisher.exe app in one window and the HelloSubscriber.exe app in the other. The figure below shows the window outputs after I typed in the message: “hello Mr. DaSilva”.

For grins, I opened a third console window and launched a second copy of HelloSubscriber.exe. The result is shown below.

After running the Hello_Simple pub and sub components, I opened the source code in the Visual C++ Express editor and browsed through it. The Hellopublisher.cpp file was a mere 128 lines long and the Hellosubscriber.cpp file was 160 lines long. I wondered how big the source files would be for an equivalent CORBA-based program pair. In a nutshell, the publisher source code performs the following sequence of actions:

• Creates a DDS domain participant
• Creates the “Hello World” topic
• Creates a data writer for the topic
• Loops on user console input and invokes the data writer to publish each user input string as a sample of the “Hello World” topic

The subscriber source code performs the following sequence of actions:

• Creates a domain participant
• Creates the “Hello World” topic
• Creates a data reader
• Creates an asynchronous listener
• Binds the listener to the data reader
• Sleeps until the DDS middleware invokes a listener method to process the received sample of the “Hello World” topic
• Sleeps until the next DDS notification
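To make the two sequences above concrete, here is a toy, in-process sketch of the same flow in plain C++. It is not RTI's API: ToyTopic, attach and publish are names I made up for illustration, there is no domain participant or discovery, and "delivery" is just a function call instead of network traffic. The only point is the shape: the subscriber binds a listener to a named topic, and the publisher hands each input string to the topic as a sample.

```cpp
// Toy stand-in for the publish/subscribe flow; all names are hypothetical
// and nothing here is part of the RTI DDS API.
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

class ToyTopic {
public:
    using Listener = std::function<void(const std::string&)>;

    explicit ToyTopic(std::string name) : name_(std::move(name)) {
        assert(!name_.empty());  // a topic needs a name to be matched on
    }

    const std::string& name() const { return name_; }

    // Subscriber side: bind a listener that is invoked for every sample.
    void attach(Listener listener) { listeners_.push_back(std::move(listener)); }

    // Publisher side: hand one sample to every attached listener.
    // Returns the number of listeners that received it.
    std::size_t publish(const std::string& sample) const {
        for (const auto& listener : listeners_) {
            listener(sample);
        }
        return listeners_.size();
    }

private:
    std::string name_;
    std::vector<Listener> listeners_;
};
```

Attaching two listeners to one ToyTopic and calling publish once mirrors the second HelloSubscriber.exe experiment: both listeners see the same sample and the publishing code does not change.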

In addition to providing asynchronous notification of middleware events, DDS supports synchronous polled listening if the app requires it. Notice that neither the publisher nor the subscriber has to register with a broker or look up any service in a registry. DDS is a simple and sweet boon to maintenance/change because it also provides built-in auto-discovery of participants and topics.

Running The Distributed Hello_Dynamic Application On A Distributed Network

Running DDS-based pub-sub applications on one machine is fun, but the real rush occurs when you distribute the app component processes and run them on a network of processor nodes. The figure below shows the physical configuration of my home network.

In addition to Hello_Simple, RTI provides another distributed application example named Hello_Dynamic. This app consists of 780 SLOC in 8 source files. Hello_Dynamic implements a simple throughput test where the publisher continuously sends topic samples to the subscriber(s), which print out some basic statistics at one-second intervals. After building the lone executable that serves as either a publisher or subscriber (via command line argument entry), I manually distributed the app to each of the 3 networked machines. It was as simple as copying the 3 MB hello.exe file along with the 30-day license file to each of the 3 networked machines.

The figure below shows the console window output of the subscriber instance before the publisher was launched. It shows the list of metrics that the subscriber calculates on the fly during runtime at one-second intervals. Note the rightmost column that reads “Throughput Mbps”. Also note that the default Quality of Service (QoS) profile of “best effort delivery” was used for all subsequent testing.

The figure below shows snapshots of the scrolling publisher and subscriber windows during runtime. For this specific snapshot, a 1024-byte message size was used and both the pub and sub application components were running on the same machine; hence the large 200+ Mbps throughput values.
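For readers who want to sanity-check a throughput column like this one, here is a minimal sketch of the arithmetic. I'm assuming that Mbps means 1,000,000 bits per second and that only payload bytes are counted; I haven't checked the sample's own formula, and throughput_mbps is my own name, not something from the RTI code.

```cpp
#include <cassert>
#include <cstdint>

// Payload throughput in megabits per second for one reporting interval.
// Assumption: 1 Mbps = 1,000,000 bits/s and headers are not counted.
inline double throughput_mbps(std::uint64_t samples,
                              std::uint64_t bytes_per_sample,
                              double interval_seconds) {
    assert(interval_seconds > 0.0);
    const double bits = static_cast<double>(samples) *
                        static_cast<double>(bytes_per_sample) * 8.0;
    return bits / (interval_seconds * 1e6);
}
```

As a made-up illustration (not a number from my test runs), 25,000 samples of 1024 bytes received in one second works out to 204.8 Mbps, which is in the same ballpark as the 200+ Mbps single-machine readings.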

Network Throughput Test Results

With the distributed Hello.exe DDS app installed on each of the three networked machines, I ran some throughput tests on different system deployment configurations. The results are shown in the two figures below. The first drawing summarizes the baseline throughput that I measured on the single machine configuration. The second shows a three machine deployment with the publisher running on one node and a varying number of subscribers running on the other two nodes.

Note that, as expected, the larger 4KB message size yields higher throughput than the 2KB message (because of less overhead per message traversing the TCP/IP stack). Also note the factor of 20 decrease in throughput that comes with the tyranny of geographical distribution. As the rule of thumb goes: “Don’t distribute if you don’t have to”.
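The message-size effect can be illustrated with a deliberately crude model: every message pays a fixed per-message cost (stack traversal, system calls and so on) plus the time it takes to push its bits through the link. This is only a sketch under that assumption; the function name and the example numbers are hypothetical and were not measured on my network.

```cpp
#include <cassert>

// Modeled payload throughput in Mbps when each message costs a fixed
// overhead plus its wire time. 1 Mbps equals 1 bit per microsecond, so
// bits divided by microseconds gives Mbps directly.
inline double modeled_throughput_mbps(double payload_bytes,
                                      double fixed_cost_us,
                                      double link_mbps) {
    assert(payload_bytes > 0.0 && fixed_cost_us >= 0.0 && link_mbps > 0.0);
    const double payload_bits = payload_bytes * 8.0;
    const double wire_time_us = payload_bits / link_mbps;
    return payload_bits / (fixed_cost_us + wire_time_us);
}
```

With an assumed 50 microsecond fixed cost on an assumed 100 Mbps link, the model gives roughly 77 Mbps for 2048-byte messages and roughly 87 Mbps for 4096-byte messages. The fixed cost is amortized over more payload bits as the message grows, which is the direction of the effect I saw.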

During my exploratory testing, I tried to launch 2 redundant publishers in order to explore the built-in fault detection and auto-failover capability of DDS, but the simple Hello.exe example application wasn’t designed to leverage the capability. When I launched a second publisher while the first one was still transmitting messages, each subscriber detected the second publisher and gracefully exited with the following message:

“Detected multiple publishers, or the publisher was restarted. If you have multiple publishers on the network or you restart the publisher, the statistics produced won’t be accurate. Done.”

Playin’ With The Source Code

“Latency” is yet another distributed application example that comes with the DDS development kit. In the Latency app, the publisher process records the time of each message transmission just before it’s pushed into the middleware for external transmission. The Latency subscriber is designed to immediately echo each message back to the publisher, which timestamps the received echo and computes latency as half the value of the two way round trip time.
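The half-the-round-trip calculation described above is easy to sketch with the standard C++ clock facilities. This is my own minimal version, not the code from RTI's Latency example; one_way_latency is a hypothetical name, and it assumes the outbound and return paths take about the same time.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// One-way latency estimated as half of the measured round trip.
// 'sent' is taken just before the message is handed to the middleware,
// 'echoed' when the reflected copy arrives back at the publisher.
inline std::chrono::nanoseconds one_way_latency(Clock::time_point sent,
                                                Clock::time_point echoed) {
    assert(echoed >= sent);  // a monotonic clock never runs backwards
    return std::chrono::duration_cast<std::chrono::nanoseconds>(echoed - sent) / 2;
}
```

Because both timestamps come from the same clock on the publisher's machine, no clock synchronization between nodes is needed, which is the main appeal of the echo approach.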

I successfully built the RTI Latency publisher and subscriber components and then launched them each (publisher first, as per the instructions) in their own command window on the same Win32 box. They ran fine, but for some reason they didn’t sync up. The publisher kept waiting to detect a subscriber even though the subscriber was already running. After perusing the source code and seeing several calls to a proprietary RTI DDS API extension, I decided to give up trying to measure latency with the example application.

Next, I sketched out the design of a simple latency-measuring app that I planned to build and try out. The sequence diagram below shows what I was planning to code up.

I cloned the source code for the RTI Hello_Simple app and started coding up the design. I was able to successfully get the TrkMsgReflector to echo back each received message. I was also able to get the TrkMsgPublisher to receive and timestamp each echo. However, I couldn’t get the app to run consistently. On both sides, I kept getting lots of interlaced “unable to take data from the data reader, error 11” return codes from the DDS API. I don’t have much doubt that I could’ve gotten the app to run consistently and output some latency measurements, but, because I actually do like to do other things on weekends and in my spare time, I decided to call it quits.
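A possible explanation, offered as a guess rather than a diagnosis: in the OMG DDS specification's list of return codes, the value 11 is the "no data" code, which a take call returns when the reader simply has nothing queued. If that is what I was seeing, the fix would have been to treat it as a normal end-of-data condition instead of an error. The sketch below shows that handling pattern with a toy reader; ToyReader, ToyRetcode and drain are hypothetical names, and only the numeric values are meant to mirror the specification.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical return codes; values chosen to mirror the DDS specification.
enum class ToyRetcode : int { Ok = 0, Error = 1, Timeout = 10, NoData = 11 };

// Toy reader: take() pops one queued sample or reports NoData.
class ToyReader {
public:
    void deliver(std::string sample) { queue_.push_back(std::move(sample)); }

    ToyRetcode take(std::string& out) {
        if (queue_.empty()) return ToyRetcode::NoData;
        out = std::move(queue_.front());
        queue_.pop_front();
        return ToyRetcode::Ok;
    }

private:
    std::deque<std::string> queue_;
};

// Drains the reader, treating NoData as "nothing left", not as a failure.
inline std::size_t drain(ToyReader& reader, std::vector<std::string>& out) {
    std::size_t taken = 0;
    std::string sample;
    for (;;) {
        const ToyRetcode rc = reader.take(sample);
        if (rc == ToyRetcode::NoData) break;  // benign: queue is empty
        assert(rc == ToyRetcode::Ok);         // the toy reader has no other outcome
        out.push_back(sample);
        ++taken;
    }
    return taken;
}
```

Calling drain on an empty reader returns 0 without logging anything, which is the behavior I would want from a listener callback that fires after another callback has already consumed the samples.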

End Game

I had a blast playin’ with RTI’s powerful implementation of the OMG DDS middleware standard. Despite the OMG DDS standard’s sophistication, the RTI implementation is a high-quality and relatively easy-to-use product that comes with great programmer documentation. Even though it may be pricey in the short run, it sure seems to fit the bill for developing and (most importantly) maintaining long-lived, data-centric, loosely-coupled, high performance, scalable, distributed applications. As the saying goes, “you get what you pay for”.

I surely don’t know everything about distributed communication middleware solutions, but from what little I do know, there’s probably not a better performing or easier to use product on the market today. Hats off to RTI Inc. for a job well done.

BTW, there are at least two other implementations of the DDS standard available for experimentation, benchmarking, and learning. One is supported by OCI and it’s named OpenDDS. The other is offered by PrismTech and it’s named OpenSplice. Both are open source and freely downloadable from the web. If you’re interested in distributed software system development, give one of these products a whirl – and report back your experience to me – please.

RTI engineer here: Just came across your blog, and how cool! I’m glad you liked our product and found it useful – that’s probably as much of a rush as getting data to move across the network the first time…