Days in the life of a professional packet shepherd.

998 Extra Eyes on Your Applications

Last September, one of the vendors presenting at Networking Field Day 6 was ThousandEyes, a San Francisco-based company, founded in 2010, with a thriving startup vibe.

ThousandEyes is one of several cloud-based monitoring solutions out there, but what I found unique is that they have something for everyone. Is your company using cloud-based SaaS (Software as a Service) applications? ThousandEyes can let you monitor the performance and reachability of your SaaS applications from your corporate office, giving you continuous metrics on the services your company relies on. Is your company the purveyor of cloud-based apps? In that case, ThousandEyes can provide a view into your application hosting environment, including network path and application-level metrics, from a multitude of test points around the world.

The ThousandEyes presentation at NFD6 was impressive. It was clear that they had some extremely smart brains working on their product. ThousandEyes researchers are using various mathematical models to analyze BGP behavior and infer network conditions. They are working with new network testing concepts such as Paris Traceroute to further improve their metrics and understanding of network paths. ThousandEyes blends much of this data into something they call “X-Layer” which is intended to pinpoint application performance issues no matter where in the network or application stack they may lie.
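To make the Paris Traceroute idea a little more concrete, here is a small simulation, not anything ThousandEyes showed. The general idea behind Paris Traceroute is to keep the flow-identifying header fields constant across probes so that per-flow load balancers send every probe down the same path. The hash function, router names, addresses, and ports below are all assumptions made up for illustration. A real implementation also needs some way to tell its probes apart without touching those fields.

```python
import hashlib

def ecmp_next_hop(flow, next_hops):
    """Pick a next hop by hashing the flow tuple, as a per-flow load balancer might.

    The SHA-256 based hash is an assumption for this sketch; real routers use
    their own (usually vendor-specific) hash functions.
    """
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]

def classic_probes(src, dst, count, base_port=33434):
    """Classic UDP traceroute style: the destination port changes with every probe."""
    return [(src, dst, "udp", 50000, base_port + i) for i in range(count)]

def paris_probes(src, dst, count, dst_port=33434):
    """Paris Traceroute style: the flow-identifying fields stay fixed for every probe."""
    return [(src, dst, "udp", 50000, dst_port) for _ in range(count)]

# Hypothetical equal-cost next hops and documentation-range addresses.
NEXT_HOPS = ["router-a", "router-b", "router-c"]
SRC, DST = "192.0.2.1", "198.51.100.7"

classic_seen = {ecmp_next_hop(p, NEXT_HOPS) for p in classic_probes(SRC, DST, 16)}
paris_seen = {ecmp_next_hop(p, NEXT_HOPS) for p in paris_probes(SRC, DST, 16)}

print("classic probes hit:", sorted(classic_seen))
print("paris probes hit:  ", sorted(paris_seen))
```

Because every Paris-style probe carries the same flow tuple, the simulated load balancer always maps it to a single next hop, while the classic-style probes are free to scatter across several. That scattering is what makes classic traceroute output misleading on multipath networks.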

ThousandEyes employs hosted probes located in data centers around the world to run network and application reachability and performance tests and to watch BGP routing data. In addition, ThousandEyes gives an enterprise the option to run its own private agents (a Linux package or virtual appliance), which can be used in a variety of ways. These software agents require few resources. When I asked about the possibility of running the ThousandEyes package on Linux-based network appliances, I was told this should be possible (probably assuming some CPU architecture requirements).

Following my introduction to ThousandEyes, I showed one of my consulting clients a bit about the product. After a free trial, they ended up becoming a ThousandEyes customer, which has given me some opportunity to get a bit of real-world stick time with the service.

One of my favorite features is the ability for a ThousandEyes customer to send test results or a problem report to someone who is not a ThousandEyes customer. The receiver can click a link and actually access the ThousandEyes interface in order to view details and interact with the system. This is a huge improvement over just emailing some static bit of data to someone, as it gives the recipient the opportunity to drill in, change parameters, and use the ThousandEyes data and UI to assist in their analysis of the reported event.

As a good example, just this morning I got an email from the ThousandEyes system that looked like this:

A shared test email notice

I clicked the link and was presented with a browser view of a BGP event at this client’s site from early this morning. The selected spot on the timeline shows the BGP path into this client’s AS just before the route change event. You can see the peak shortly afterward, indicating a number of path changes during an upcoming sample period:

By moving the sliders toward the middle of the screen I was able to “explode” the summarized nodes shown in the BGP diagram, resulting in a more detailed view from the perspective of the next few hops out from my customer’s AS:

Next, I zoomed in on the timeline and selected the spot in the middle of the routing convergence event:

The view dynamically changed to show what happened next. Clearly, the path into my customer’s AS moved from AS 7922/33287 (Comcast) to AS 209 (Qwest). I believe the red shading on the probes indicates a loss of reachability to the target prefix from those TE probe points. Also, in this next screenshot, notice the little “chat bubble” on the left edge of the screen:

Clicking the chat bubble shows me the text that my client had added to the problem view (which was also in the body of the email alert I received). In the next screenshot, you can see I selected the “updates” metric (number of BGP updates about this prefix). I used this info, plus the log messages the customer provided me in the “chat” notes, to provide a likely diagnosis of the event.
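For readers who want a feel for the kind of analysis these views support, here is a minimal sketch of two of the ideas: spotting which vantage points saw the upstream AS change, and counting updates per time bucket. It is not ThousandEyes code or their API. AS 64500 stands in for my customer’s AS, and the monitor names, transit ASNs, timestamps, and 15-minute bucket size are all invented for illustration. Only AS 33287 and AS 209 come from the event described above.

```python
from collections import Counter

# Hypothetical stand-in for the customer's AS (a documentation-range ASN).
ORIGIN_AS = 64500

def upstream_of(as_path, origin_as=ORIGIN_AS):
    """Return the AS adjacent to the origin in an AS path, or None if unreachable."""
    if not as_path or origin_as not in as_path:
        return None
    idx = as_path.index(origin_as)
    return as_path[idx - 1] if idx > 0 else None

def path_changes(before, after):
    """Map each monitor to (old_upstream, new_upstream) where the upstream changed."""
    changes = {}
    for monitor, old_path in before.items():
        old_up = upstream_of(old_path)
        new_up = upstream_of(after.get(monitor))
        if old_up != new_up:
            changes[monitor] = (old_up, new_up)
    return changes

def updates_per_bucket(update_times, bucket_seconds=900):
    """Count BGP updates per fixed-width time bucket (times in seconds)."""
    return Counter(t - t % bucket_seconds for t in update_times)

# Invented AS paths as seen from three made-up monitors, before and after the event.
before = {
    "monitor-1": [64496, 7922, 33287, 64500],
    "monitor-2": [64497, 7922, 33287, 64500],
    "monitor-3": [64498, 7922, 33287, 64500],
}
after = {
    "monitor-1": [64496, 209, 64500],
    "monitor-2": [64497, 209, 64500],
    "monitor-3": None,  # no path to the prefix from this monitor
}

changes = path_changes(before, after)
buckets = updates_per_bucket([0, 120, 880, 900, 950, 1000, 1790, 2700])

for monitor, (old_up, new_up) in sorted(changes.items()):
    print(monitor, old_up, "->", new_up if new_up is not None else "unreachable")
for start, count in sorted(buckets.items()):
    print("bucket starting at", start, "s:", count, "updates")
```

A monitor whose new upstream comes back as `None` corresponds roughly to what I took the red shading to mean, and a bucket with an unusually high count is the kind of peak visible on the timeline.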

Unfortunately, after clicking the “Comment” button, I got an error that I didn’t have permissions to add a comment, so something might be wrong there. I just emailed the update to my client contact and will follow up with them about the commenting issue. (UPDATE: Just after posting this, I got an email from Ricardo Oliveira, Co-Founder and CTO of ThousandEyes. He let me know that blocking a comment from a non-TE user is actually by design right now, but they will be adding a knob in the future to let customers choose whether or not to let non-TE users add comments to shares. Thanks Ricardo!)

The ThousandEyes sharing feature let me, a third party, use ThousandEyes to manipulate the available data, and helped me give my customer a rapid response. The ThousandEyes customer can set parameters around the sharing, including how much “context” data is included before or after the event, how long the shared data remains publicly viewable, etc. At NFD6, ThousandEyes even indicated that customers can use the built-in messaging system to ask ThousandEyes reps to help them diagnose an issue. Of course, ThousandEyes can also generate reports, trigger alerts based on an event or threshold, and provide a “big board” dashboard view.

This specific example shows just a BGP routing event, but in a future post I’ll also show ThousandEyes’ excellent web-application performance monitoring and explain how my client, which provides a web-based app to their customers, uses it. The truth is, the various metrics I’ve shown above and the application stats/tests that ThousandEyes also provides can mostly be found through other means or other services. I think ThousandEyes’ key strengths lie in two things: effectively tying it all together (everything from network path through application-level transaction processing time) in a common view that a user can drill in and out of to find the data they need, and providing a fast, clean, slick, web-based interface for that analysis. Further, the excellent sharing capabilities provide a level of collaboration that I’ve not seen in other web-based performance monitoring tools.

As you might imagine, the product is evolving and there are still challenges to be overcome. At the NFD6 presentation Ivan Pepelnjak, Ed Horley, and I gave the head brains at ThousandEyes a good grilling on issues such as multipathing, IPv6, and various protocol considerations. Despite this, I came away impressed with the products, as well as the ThousandEyes leadership team’s understanding of the issues we raised.

ThousandEyes was a sponsor of Networking Field Day 6. In addition to a presentation, ThousandEyes provided me a branded tee shirt and a reusable water bottle. At no time did they ask for, nor were they promised, any kind of consideration in the writing of this review. The opinions and analysis provided within are my own and any errors or omissions are mine and mine alone.

3 thoughts on “998 Extra Eyes on Your Applications”

Very cool! I don’t understand everything, but I realize that the details are there for those who need them, namely my trustworthy CCIE consultant. I adore the fact that this could help me, an SMB who relies on cloud-based services for all of my company’s core IT functions!

Hey Jim, thanks for reading! I’ve actually been meaning to mention this service to you since you’ve become so cloud-dependent. You’d actually asked me at one point about performance monitoring for your cloud services and I think ThousandEyes is definitely worth a look for you.