Profiling Distributed Applications with Perf

I, like many developers, have been in situations where I needed to take an existing application and make it faster, basically by removing slow code and replacing it with fast code. I have since learned to follow one simple rule when it comes to optimizing code:

Whatever code I think is slowing down the application, is where I should look last.

Profiling is a trade that makes you come to terms with the limitations of your intuition very quickly. I realized early on I needed cold, hard measurements to tell me which parts of my code needed optimizing. Fortunately there is a plethora of profiling tools available that can measure just about anything related to how your code is running.

Tools, however, do not necessarily make profiling easy. Interpreting measurements can be tricky, and variables need to be tightly controlled when conducting experiments. In particular, multi-threaded and distributed applications are hard to profile.

Anyone who has ever had to debug a race condition will be familiar with how time-sensitive the behavior of multi-threaded applications is. Profiling multi-threaded applications has similar challenges, as timing becomes a significant factor in the measurement.

Profilers like callgrind slow down your program significantly, and therefore impact timing. An example that shows a limitation of such profilers is mutex contention. Your application may run slowly because a mutex is being heavily used, causing your code to spend a lot of time in lock functions. A tool like callgrind would not reveal this, as it counts instructions, not time.

There is another class of profilers which do “statistical profiling.” These profilers allow you to run your program like you normally would, while taking periodic snapshots of where the application is spending its time. These profilers need to run for some time to produce accurate results, but can do so with minimal impact on timing. That makes them a great fit for profiling multithreaded and/or distributed applications!

I wanted to share a profiling workflow using the Linux perf tool, which I found especially useful because it lets me quickly identify performance “hotspots.” I will use the c/hello_dynamic example from RTI Connext 5.3.0 as a target for measuring performance.

First, make sure that perf is installed on your Linux machine. On Ubuntu, I had to run this command to install perf on my machine:
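On Ubuntu, perf ships in the linux-tools packages. The sketch below assumes those package names (they can differ per release and kernel) and only prints the install command when perf is not already on your PATH, so it is safe to run as a check:

```shell
# Assumption: perf is packaged in Ubuntu's linux-tools-* packages.
# The kernel-specific package has to match the running kernel (uname -r).
PERF_PACKAGES="linux-tools-common linux-tools-generic linux-tools-$(uname -r)"

if command -v perf >/dev/null 2>&1; then
    echo "perf is already installed: $(perf --version)"
else
    echo "perf not found; on Ubuntu, install it with:"
    echo "  sudo apt-get install $PERF_PACKAGES"
fi
```

Other distributions package perf under different names, so treat the package list as a starting point.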

Next, you need to download a GitHub project that can convert the output from perf to what is called a “FlameGraph,” which is a visual representation of the collected profiling data. Run this command from a location that is convenient to access (like your home directory):
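The project in question is Brendan Gregg's FlameGraph repository on GitHub. A minimal sketch, assuming you want it in ~/FlameGraph (the location the later commands in this post expect) and that a shallow clone is enough:

```shell
# Target directory; the commands later in this post refer to ~/FlameGraph.
FLAMEGRAPH_DIR="$HOME/FlameGraph"

# Clone only once. --depth 1 skips the history, which the scripts do not need.
if [ ! -d "$FLAMEGRAPH_DIR" ]; then
    git clone --depth 1 https://github.com/brendangregg/FlameGraph "$FLAMEGRAPH_DIR" \
        || echo "clone failed; check that git is installed and the network is reachable"
fi
```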

Now navigate to the hello_dynamic example in the rti_workspace/examples/c folder. Build the code with these commands (make sure NDDSHOME is set to the RTI Connext installation):

export DEBUG=1
make -f makefile_Hello_x64Linux3gcc4.8.2

The platform in the name of the makefile might be different from your platform. Note how we set the DEBUG environment variable. We do this so that the binary has debugging symbols, which will allow us to see the names of functions in the callstacks that perf outputs.
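With the example built, the publisher needs to run under perf so that call stacks are sampled while it sends data. The sketch below is an assumption about how that step looks: the binary path (objs/<platform>/Hello) and the pub/sub arguments are hypothetical, so check your build output and the example's README for the real ones. Start a subscriber in a second terminal first, so the publisher has someone to send to.

```shell
# Hypothetical binary path; adjust it to match your platform and build output.
HELLO_BIN="./objs/x64Linux3gcc4.8.2/Hello"

# In a second terminal, start a subscriber first (hypothetical argument):
#   ./objs/x64Linux3gcc4.8.2/Hello sub

if [ -x "$HELLO_BIN" ] && command -v perf >/dev/null 2>&1; then
    # -g records a call stack with every sample; "--" separates perf's own
    # options from the command being profiled.
    perf record -g -- "$HELLO_BIN" pub
else
    echo "Build the example and install perf before recording."
fi
```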

After some time, hit Ctrl-C to exit the publisher. Perf will have written its samples to a file, which is called “perf.data” unless you pass a different name with -o. We now need to translate this file into something the FlameGraph tool understands, using a script from the FlameGraph repository:

perf script -f | ~/FlameGraph/stackcollapse-perf.pl > out.perf-folded

From here, we can generate the FlameGraph image:

~/FlameGraph/flamegraph.pl out.perf-folded > perf.svg

When you open the perf.svg file in a web browser, it should look something like this:

The width of each bar represents the share of samples, and thus approximately the time, spent in a particular function and everything it calls, whereas the stacked bars represent the call stack of your application. You can click on each bar to zoom in to that particular stack.
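The same information is available as text in the folded file, where each line is a semicolon-separated call stack followed by a sample count. As a rough sketch, the helper below (top_leaves is a name made up for this post, and the sample stacks are invented so that it runs without a real recording) sums the samples per function at the top of the stack:

```shell
# Made-up folded stacks; a real run would use out.perf-folded instead.
cat > sample.perf-folded <<'EOF'
main;publish;write_sample 60
main;publish;serialize 30
main;wait 10
EOF

# Sum the sample counts per leaf frame (the function that was executing),
# then sort with the most expensive function first.
top_leaves() {
    awk '{
             c = $NF                        # sample count is the last field
             sub(/ [0-9]+$/, "")            # strip the count, keep the stack
             n = split($0, frames, ";")
             count[frames[n]] += c
         }
         END { for (f in count) print count[f], f }' "$1" | sort -rn
}

top_leaves sample.perf-folded
# prints:
# 60 write_sample
# 30 serialize
# 10 wait
```

Running top_leaves on out.perf-folded gives a quick textual ranking to compare against the widest bars at the top of the flame graph.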

Try re-running the publisher, but without a subscriber. You will notice that the right part of the flamegraph will disappear, as DDS does not send out any data when there are no subscribers!

The perf tool can do a lot more than what this blog describes. If you know of other settings or tools that have made your profiling life easier, let us know!