The motivation for this work harks back to a Guerrilla forum in 2014 that essentially centered on the same topic as the title of our paper. It was clear from that discussion that commenters were talking at cross purposes because of misunderstandings on many levels. I had already written a much earlier blog post on the key queue-theoretic concept, viz., holding the $N/Z$ ratio constant as the load $N$ is increased, but I was incapable of describing how that concept should be implemented in a real load-testing environment.

On the other hand, I knew that Jim Brady had presented a similar implementation in his 2012 CMG paper, based on a statistical analysis of the load-generation traffic. There were a few details that I couldn't quite reconcile in Jim's paper but, at the CMG 2015 conference in San Antonio, I suggested that we should combine our separate approaches and aim at a definitive work on the subject. After nine months' gestation (ugh!), this 30-page paper is the result.

Although our paper doesn't contain any new invention, per se, the novelty lies in how we needed to bring together so many disparate and subtle concepts in precisely the correct way to reach a complete and consistent methodology. The complexity of this task was far greater than either of us had imagined at the outset. The hyperlinked Glossary should help with the terminology, but because there are so many interrelated parts, I've put together the following crib notes in an effort to help performance engineers get through it (since they're the ones that most stand to benefit).

Standard load testing tools have a finite number of virtual users

Web traffic is characterized by an indeterminate number of users

Attention is usually focused on the performance of the SUT (system under test)

We focus on the DVR (driver) side performance for web traffic

Examine distribution of arriving requests and their mean rate

Web traffic should be a Poisson process (just like A.K. Erlang used in 1909)

Check the traffic is indeed Poisson by measuring the coefficient of variation ($CoV$)

Must have $CoV = 1$ for a Poisson process (Principle B in the paper)
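As a rough illustration of the $CoV$ check in the notes above, here is a minimal Python sketch. It is not the procedure from the paper: the function names `interarrival_cov` and `simulate_driver` are hypothetical, and the toy driver assumes exponentially distributed think times with mean $Z$ and a response time of zero, so that the mean request rate is simply $N/Z$. In a real test you would feed `interarrival_cov` the request timestamps logged on the DVR side instead.

```python
import random
import statistics


def interarrival_cov(timestamps):
    """Coefficient of variation (std dev / mean) of the gaps between
    consecutive request timestamps."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps")
    mean_gap = statistics.mean(gaps)
    if mean_gap == 0:
        raise ValueError("all timestamps are identical")
    return statistics.pstdev(gaps) / mean_gap


def simulate_driver(n_users, mean_think_time, duration, seed=1):
    """Toy driver (an assumption, not a real tool): each virtual user
    repeatedly waits an exponential think time, then issues a request.
    Response time is taken as zero."""
    rng = random.Random(seed)
    timestamps = []
    for _ in range(n_users):
        t = rng.expovariate(1.0 / mean_think_time)
        while t < duration:
            timestamps.append(t)
            t += rng.expovariate(1.0 / mean_think_time)
    return timestamps


if __name__ == "__main__":
    # N = 100 users with Z = 10 s gives a nominal rate of N/Z = 10 requests/s.
    stamps = simulate_driver(n_users=100, mean_think_time=10.0, duration=200.0)
    print("mean rate:", len(stamps) / 200.0)
    print("CoV:", interarrival_cov(stamps))
```

A $CoV$ close to 1 is consistent with Poisson arrivals, whereas evenly paced requests would drive it toward 0. Doubling both `n_users` and `mean_think_time` in the sketch keeps $N/Z$, and hence the nominal rate, unchanged.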

Originally, I assumed the paper would be no more than a third of its current length but, try as we might, that was not to be. My only defense is: it's all there, you just need to read it. Apologies in advance, but hopefully, the crib notes will help.

For emulating IoT traffic, I believe most are using non-standard tooling to emulate those devices. For example, a cluster to run containers that have device logic in them. So, they may indeed test up to the actual number of emulated devices vs. simulating with a combination of virtual users and think time. So this may be a difference in approach.

[*] http://perfdynamics.blogspot.com/2015/03/hadoop-scalability-challenges.html. Big Data is usually part of an IoT solution.

With this type of complexity, it seems critical to have the telemetry needed to do performance and scalability analysis.

1. Ideally you would capture telemetry for each message, with timestamps at key points in the data flow, including a correlation id for end-to-end visibility.
2. Or, if that is not possible (because development has not been done, or throughput is too high), just capture the metrics that are needed for performance and scalability analysis. At important points along the message data flow capture: time, count, rate.

What metrics would you recommend capturing in a messaging system like this? For example, in (#2), the stream ingestion code, this may be a cluster of N VMs mapped 1:1 to a partition in IoT Hub, each processing messages in batches of X messages. Then sending for further processing to (#3) Stream Analytics or for storage, analytics to (#4) Data