What is Real-User Monitoring?

How ExtraHop's real-user monitoring with wire data works to provide some of the most valuable data to businesses in every field.

A Little History

I'll admit it: when I started at ExtraHop I really didn't know much about Real-User Monitoring (RUM), nor did I care to. You see, I'm a bit of a privacy nut, and the idea of any code built into any website or application to track my experience, performance, or whatever is just a big NOPE for me.

You should see the lengths I go to, be it extensions or modified settings in my browser, to minimize what information unauthorized sites can collect about me. It's a tad ridiculous. To give you a quick example, when people come to me for a demonstration or to see something new I'm working on, I generally respond with, "Can we do this at your computer?" The number of extra clicks I perform to make things work for me quickly causes others to run away. For me, it is normal; for them, I am crazy or paranoid or both.

I tell you all of this so you can get a sense of my initial apprehension when I was asked to develop a RUM solution for ExtraHop in early 2015. Having said that, I've become a complete convert to Real-User Monitoring. In fact, I am just going to go ahead and say it: YOU NEED THIS. Actually, I'll go one further -- if you don't have some form of RUM built into your site, you might be the crazy one. You are literally leaving some of the most valuable data about your application, and how your users are interacting with it, on the table.

Skeptical? I don't blame you.

In their recent report "Use Data- and Analytics-Centric Processes With a Focus on Wire Data to Future-Proof Availability and Performance Management," Vivek Bhalla and Will Cappelli write:

Focus availability and performance processes on assessing the impact of IT system events and behaviors on the end user of applications and IT services, both internal and external to the enterprise. There is no other consistently available end-to-end perspective other than that provided by the end user. Note that the wire data source and intracode instrumentation data source play a crucial role in providing insight into end-user experience.*

For the last year, we at ExtraHop have been extolling the value of Real-User Monitoring, especially when combined with the power of wire data.

Further, of the four factors Gartner cites as driving the importance of wire data, the fourth is:

A focus on end-user experience will become even more definitive of availability and performance management over the next five years than it is at present. While both log and API data sources can deliver insights into end-user experience, the data they provide is almost always taken from server-located files. These are remote from the points where end users are actually accessing applications or services. Therefore, the insights almost always presuppose an extensive chain of inferences. The wire data source, by contrast, can almost always deliver direct observations of end-user experience, because this experience is reflected in the packet header information.*

Wire data is the only data source that is truly unbiased. Combine that with the valuable end-user experience information observed via Real-User Monitoring, and you have the only real end-to-end insight into how your applications are performing. Let's face it: if you are only monitoring your infrastructure, you are missing the entire black box that we call the Internet and how it affects the ways in which your users interact with your applications. Client systems have different OSes, different browsers, and different connection bandwidth. All of this can affect how pages are rendered, which is what the user experiences.

As a confirmed privacy nut, I feel good that I can honestly say that the Real-User Monitoring solution that we at ExtraHop have developed delivers real insight without being creepy.

What is Real-User Monitoring?

If you don't know already, you have to be asking: what is Real-User Monitoring? In a nutshell, it is a way to measure how users perceive the performance of your site or application. If I were to ask you the following questions, could you answer them from a user's perspective?

How long is your website taking to become usable? What about fully loading?

Is everything on your website actually loading?

How long is each user spending on each page?

How many of your users are on a mobile device? Which browser? Is performance different depending on the OS or browser they are using?

How are the network hops between your infrastructure and your user affecting their perception of your application's performance?

How is the overall health of your infrastructure affecting your application's performance?

Are people abandoning your website because things are too slow?

Think about that last question; are you potentially losing revenue because someone got tired of waiting for your website to load? These are important questions, and they can only be answered if you have Real-User Monitoring.
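To make a couple of those questions concrete, here is a minimal sketch of how per-page-load measurements could be rolled up by browser to compare load times and abandonment. It is only an illustration: the `PageLoadSample` shape, its field names, and the `summarizeByBrowser` helper are assumptions made for this example, not part of ExtraHop's product or of any RUM library.

```typescript
// Hypothetical shape of one page-load measurement. The field names are
// illustrative only and do not come from any particular RUM tool.
interface PageLoadSample {
  browser: string;
  loadTimeMs: number; // time until the page finished loading
  cancelled: boolean; // true if the user abandoned the page load
}

interface BrowserSummary {
  count: number;
  medianLoadMs: number;
  cancelledRate: number; // fraction of page loads abandoned, between 0 and 1
}

// Median of a non-empty list of numbers.
function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Group samples by browser, then compute the sample count, the median load
// time, and the share of cancelled page loads for each browser.
function summarizeByBrowser(
  samples: PageLoadSample[]
): Record<string, BrowserSummary> {
  const groups: Record<string, PageLoadSample[]> = {};
  for (const sample of samples) {
    if (!groups[sample.browser]) {
      groups[sample.browser] = [];
    }
    groups[sample.browser].push(sample);
  }

  const result: Record<string, BrowserSummary> = {};
  for (const browser of Object.keys(groups)) {
    const group = groups[browser];
    result[browser] = {
      count: group.length,
      medianLoadMs: median(group.map((s) => s.loadTimeMs)),
      cancelledRate: group.filter((s) => s.cancelled).length / group.length,
    };
  }
  return result;
}
```

A noticeably higher median or cancelled rate for one browser than for the others is the kind of signal that is invisible if you only watch your own infrastructure.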

Real-User Monitoring from ExtraHop uses a piece of open source software called Boomerang. Boomerang is client-side JavaScript that is loaded as part of your website or application. It collects the timing and performance information we are interested in and sends it back to you in the form of a beacon that can be observed by ExtraHop.
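To give a rough idea of what a beacon is, the sketch below encodes a few measurements into the query string of an ordinary HTTP request and then reads them back, the way anything observing that request could. This is a simplified illustration under stated assumptions: the `/beacon` endpoint and the `page`, `load`, and `tcp` field names are made up for this example and are not Boomerang's actual beacon format.

```typescript
// Hypothetical beacon payload. These field names are illustrative and are
// NOT taken from Boomerang's real beacon format.
interface BeaconData {
  page: string;
  loadTimeMs: number;
  tcpConnectMs: number;
}

// Build the URL a client-side script could request to report its measurements.
function buildBeaconUrl(endpoint: string, data: BeaconData): string {
  const pairs: string[] = [
    "page=" + encodeURIComponent(data.page),
    "load=" + encodeURIComponent(String(data.loadTimeMs)),
    "tcp=" + encodeURIComponent(String(data.tcpConnectMs)),
  ];
  return endpoint + "?" + pairs.join("&");
}

// Recover the reported fields from a beacon URL, as an observer of the
// request could.
function parseBeaconUrl(url: string): Record<string, string> {
  const fields: Record<string, string> = {};
  const queryStart = url.indexOf("?");
  if (queryStart === -1) {
    return fields;
  }
  for (const pair of url.slice(queryStart + 1).split("&")) {
    const eq = pair.indexOf("=");
    if (eq === -1) {
      continue; // skip malformed pairs that have no value
    }
    const key = decodeURIComponent(pair.slice(0, eq));
    fields[key] = decodeURIComponent(pair.slice(eq + 1));
  }
  return fields;
}

// Example with a hypothetical endpoint and page:
const exampleBeaconUrl = buildBeaconUrl("/beacon", {
  page: "/cart?step=2",
  loadTimeMs: 3520,
  tcpConnectMs: 40,
});
const exampleBeaconFields = parseBeaconUrl(exampleBeaconUrl);
```

Because the measurements travel in a normal web request, nothing extra has to be installed on the user's machine to collect them.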

What does Real-User Monitoring look like?

What people really want to know is what RUM looks like, how to use it, and how quickly you can diagnose a problem with it. The best way to show this is to just dive in. First, I'll show you the top of our Real-User Monitoring dashboard, which shows us the general state of things over a certain time period (the last 30 minutes, in this case):

Nothing terribly exciting going on so far. The Longest Load Time widget shows us that the longest time anyone has had to wait for any page in the last 30 minutes has been 3.52 seconds -- not too bad and within our norm. We can see that our server farm was operating normally via the Server Processing Time widget. There were a couple of Dropped Segments In and RTOs In being reported on our network, but again nothing too scary so far. There is a small spike around 23:09 that looks interesting, though. Also notice that the Page Load Cancelled widget is not zero. We never want to see that. Let's investigate further.

Hmmm, well that is obvious. See that large, green bar standing out from the rest? Looks like one of our users had a hard time connecting to a part of our infrastructure around that time. A TCP Connect is one of the steps in the life cycle of a web page load and because Boomerang can interrogate the user's browser using the W3C Navigation Timing API, we can see how long each step takes.

Also notice how that single problem affected the Client Processing Time for the same request. In the bottom chart, you can see that there was a delay between the web page becoming Interactive and Complete.
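The phases discussed above can be derived by subtracting pairs of timestamps that the W3C Navigation Timing API exposes. The sketch below is an illustration rather than Boomerang's own code: the attribute names in `NavTimingSubset` are real Navigation Timing attributes, while the `PhaseDurations` shape, the `computePhases` helper, and the sample numbers are assumptions made for this example.

```typescript
// Subset of the attributes defined by the W3C Navigation Timing API.
// Each value is a timestamp in milliseconds.
interface NavTimingSubset {
  navigationStart: number;
  connectStart: number;
  connectEnd: number;
  requestStart: number;
  responseStart: number;
  responseEnd: number;
  domInteractive: number;
  domComplete: number;
}

// Hypothetical result shape, named for this example only.
interface PhaseDurations {
  tcpConnectMs: number; // time spent establishing the connection
  timeToFirstByteMs: number; // request sent until the first response byte
  downloadMs: number; // first response byte until the last
  interactiveMs: number; // navigation start until the page is Interactive
  completeMs: number; // navigation start until the page is Complete
  interactiveToCompleteMs: number; // gap between Interactive and Complete
}

// Turn raw timestamps into per-phase durations by simple subtraction.
function computePhases(t: NavTimingSubset): PhaseDurations {
  return {
    tcpConnectMs: t.connectEnd - t.connectStart,
    timeToFirstByteMs: t.responseStart - t.requestStart,
    downloadMs: t.responseEnd - t.responseStart,
    interactiveMs: t.domInteractive - t.navigationStart,
    completeMs: t.domComplete - t.navigationStart,
    interactiveToCompleteMs: t.domComplete - t.domInteractive,
  };
}

// Made-up timestamps for a page load dominated by a slow TCP connect.
const slowConnectExample = computePhases({
  navigationStart: 0,
  connectStart: 10,
  connectEnd: 2510,
  requestStart: 2512,
  responseStart: 2600,
  responseEnd: 2650,
  domInteractive: 2900,
  domComplete: 3520,
});
```

In a browser, an object such as `performance.timing` carries these attributes, so a client-side script can read them once the page has loaded and report the resulting durations.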

Even more insight with ExtraHop Explore Appliance

If you also have the ExtraHop Explore Appliance, records for each and every transaction are automatically being stored and you can see exactly which page the user had a problem with.

As you can see, this wasn't the slowest page to load in this user's session, but a slow TCP connection causes the browser to wait and appear to do nothing. In this instance, the user decided not to wait and hit Stop in their browser. Luckily, it was for a non-revenue-generating part of our application, but next time it may not be. Because there weren't any corresponding network or server issues reported in our infrastructure at the same time, we can feel safe saying that this was a blip outside of our control, somewhere on the Internet.

What this exercise showed is that sometimes random slowdowns do happen, but more often than not, performance problems have a cascading effect. In a lot of cases, a small code change to your application can have unintended consequences. For instance, every browser executes JavaScript slightly differently. When you roll out new functionality in your application, wouldn't you like to know that it isn't adversely affecting your customers who are using an iPhone? You can't know that without using a solution like Real-User Monitoring.

When your IT Operations team decides to upgrade your database server farm that is the backend of your web-based application, wouldn't you like to know that the upgrade didn't break functionality, drastically slow things down, or cause your web servers to start throwing errors? You can't know that without a combination of Real-User Monitoring and wire data analytics.

How do I deploy Real-User Monitoring?

If you're an ExtraHop user, you can download the RUM Bundle from our bundles page, and get plenty of info on how to use it from our forums. Not a customer yet? Check out our interactive online demo and see the power of wire data analytics for yourself.

Conclusion

Wire data analytics is only getting more important for IT Operations Analytics and IT Operations Management, and for businesses with applications that are closely tied to revenue streams, Real-User Monitoring is an important part of the picture. The original release of our RUM solution was a huge step forward, and the newest version is both a great place to start for newcomers and a great upgrade for those already using RUM in their environments.