Infrastructure monitoring lets you see system level stats like CPU, Memory, Disk, etc. Combined with the DripStat APM, this gets you the most detailed 360 degree view of your systems.

The new 'Servers' tab in the UI lets you access it.

Slice and Dice across Dynamic Infrastructure

We described this in detail when we announced the preview a few months ago. Essentially, DripStat EXPLORE technology allows you to Slice n Dice through the data of multiple hosts in real time with a single interface.

This makes it easy to compare hosts and pinpoint the root cause of your issues across your entire infrastructure.

Detailed Per Process Stats

We have since also added extremely detailed 'per process' stats, which allow you to see per process:

CPU

Memory

Thread Count

Instance Count

Disk I/O

File Descriptors

Pricing

DripStat Infrastructure monitoring will be priced separately from DripStat APM. However, it is completely FREE through all of August. We will be putting up a separate pricing page shortly.

What's Next

We will be adding a ton of features very rapidly to DripStat Infra in the next few weeks. Expect lots of updates to the UI. However, the Infrastructure Monitoring Agent is extremely stable and has been tested heavily, so go ahead and use it in production.

Keep Dripping!

http://blog.dripstat.com/introducing-dripstat-infrastructure-monitoring/

Thu, 13 Apr 2017 20:41:23 GMT

While Application Monitoring (APM) does a great job of showing you code level stats, one still needs to see system level stats like CPU, Memory, Disk, etc. We are launching DripStat Infrastructure Monitoring to allow you to get the most detailed 360 degree view of your systems.

Slice and Dice, with a unified dynamic interface

DripStat EXPLORE technology allows you to Slice n Dice through data in real time. We designed the entire Infrastructure Monitoring experience with this in mind.

Infrastructure Monitoring exposes a single unified interface that dynamically adapts itself as you view details of single or multiple servers. The result is an experience that is extremely intuitive and fluid.

How it works

1 - Select the category of data you want to view.

2 - Select the hosts you want to view the data for.

1 - Single Host View

If you select a single host, a set of highly detailed graphs for that host is presented.

2 - Multiple Host View

If you select multiple hosts, the graphs change to allow you to easily compare the stats across multiple hosts.

Another example is the 'Disks' tab. Selecting a single host shows data on a per disk basis for that host.

Selecting multiple hosts, however, changes the stats shown to an aggregate of all the disks connected to each host. This allows you to easily compare various disk stats across those hosts.
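To make the difference between the two views concrete, here is a minimal Java sketch of the idea. It is only an illustration: the `DiskStat` type, the method names, the host names and the choice of summing used bytes are all assumptions, not DripStat's actual data model or API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these types are part of any DripStat API.
class DiskViewSketch {

    // One sample for one disk on one host.
    static final class DiskStat {
        final String host;
        final String disk;
        final long usedBytes;

        DiskStat(String host, String disk, long usedBytes) {
            this.host = host;
            this.disk = disk;
            this.usedBytes = usedBytes;
        }
    }

    // Single-host view: one entry per disk of the chosen host.
    static Map<String, Long> perDisk(List<DiskStat> stats, String host) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (DiskStat s : stats) {
            if (s.host.equals(host)) {
                result.merge(s.disk, s.usedBytes, Long::sum);
            }
        }
        return result;
    }

    // Multi-host view: all disks of each selected host are summed into one value per host.
    static Map<String, Long> perHost(List<DiskStat> stats, List<String> hosts) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (DiskStat s : stats) {
            if (hosts.contains(s.host)) {
                result.merge(s.host, s.usedBytes, Long::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<DiskStat> stats = Arrays.asList(
            new DiskStat("web-1", "sda", 40L),
            new DiskStat("web-1", "sdb", 10L),
            new DiskStat("web-2", "sda", 25L));

        System.out.println(perDisk(stats, "web-1"));                         // {sda=40, sdb=10}
        System.out.println(perHost(stats, Arrays.asList("web-1", "web-2"))); // {web-1=50, web-2=25}
    }
}
```

The same input data feeds both views; only the grouping key changes, from disk to host.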

Plugins

We will soon have plugins for all kinds of individual infrastructure components, e.g. Redis, Kafka, SQL, etc., to allow you to view detailed system level stats for each of your components. These will be maintained and supported directly by us.

Join the Preview

Today we are announcing the Preview of Infrastructure Monitoring. Infrastructure Monitoring will be priced and sold separately from the DripStat APM product. The final release should be around the end of this month.

http://blog.dripstat.com/improved-scalability-report/

Wed, 21 Dec 2016 20:25:58 GMT

Scalability Report graphs now have a checkbox to 'Hide Outliers', which is enabled by default. This allows a much better visualization of the scalability of a metric since the outliers are filtered out from the graph.

With Outliers Visible

With Outliers Hidden

http://blog.dripstat.com/introducing-percentiles/

Mon, 28 Nov 2016 03:50:37 GMT

Percentiles have been the number 1 most requested feature for DripStat. Today we are glad to finally bring it to you! Percentiles are extremely helpful in letting you understand the 'worst case' performance of your application.

There is now a 'Percentiles' tab in the Transactions and Application Overview dashboard.

Accessing it shows you the Median, 90th, 95th and 99th percentile response times.
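For readers unfamiliar with these figures, the following Java sketch shows one common way to compute them from a set of response times, the nearest-rank method. This is an assumption made for illustration: the post does not say how DripStat computes percentiles, and the class name and sample values are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration of nearest-rank percentiles; not DripStat's implementation.
class PercentileSketch {

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static double percentile(double[] sortedAscending, double p) {
        if (sortedAscending.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        int rank = (int) Math.ceil(p * sortedAscending.length / 100.0);
        return sortedAscending[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up response times in milliseconds, with one slow outlier.
        double[] ms = {12, 15, 11, 14, 13, 16, 12, 18, 900, 17};
        Arrays.sort(ms);

        System.out.println("median=" + percentile(ms, 50)); // median=14.0
        System.out.println("p90=" + percentile(ms, 90));    // p90=18.0
        System.out.println("p95=" + percentile(ms, 95));    // p95=900.0
        System.out.println("p99=" + percentile(ms, 99));    // p99=900.0
    }
}
```

In this made-up data the median looks healthy, while the 95th and 99th percentiles expose the slow request, which is the 'worst case' behaviour percentiles are meant to surface.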

Make sure you upgrade to DripStat Agent 11 to enable Percentiles for your application.

http://blog.dripstat.com/introducing-azure-documentdb-monitoring-support/

Thu, 03 Nov 2016 16:05:06 GMT

Azure DocumentDB is now a first class citizen of DripStat. We now support monitoring calls to DocumentDB starting with Agent 10.0.3.

DocumentDB will show up as a database in DripStat. Stats on individual DocumentDB operations can be viewed. They show the collection name too.

DocumentDB will appear as a layer in Application Overview and Transaction Stats.

All the features of DripStat available to all other databases are now available for DocumentDB too.

If you are using DocumentDB for your planet-scale applications, start Dripping!

Today we are introducing the JVM Profiler in DripStat.

You can now run our sampling-based Profiler on your JVMs from within DripStat. This is really useful if you see that a lot of time is being spent in 'Java Code' and you want to know exactly which method is responsible.

How to Profile

Navigate to the 'Profiler' tab in any application and select a JVM to start profiling. The longer the duration, the more samples are collected.

Happy Dripping!

http://blog.dripstat.com/mongodb-vs-rethinkdb-why-we-had-to-choose-mongodb/

Thu, 06 Oct 2016 03:27:25 GMT

We keep a small portion of our data in MongoDB. We recently evaluated RethinkDB as a replacement for our Mongo cluster, after reading about how much more robust its storage engine is. Here is our technical analysis of why we had to stick with Mongo:

1 - No POJO library for Java

With Mongo, we use the Spring Data MongoDB library. It takes care of converting plain Java objects to/from the JSON format that Mongo drivers need. It results in type-safe, high level code and we never have to interact with the low level Mongo driver.

With RethinkDB, we found no such thing. The only way to interact with it was through its Java driver, which forced us to manually convert our objects to/from Maps. This essentially wiped out all the ease of using a document database in the first place. It meant everything that takes a single line with Spring and Mongo would take many, many lines with RethinkDB.
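To give a sense of the boilerplate involved, here is a minimal Java sketch of hand-written object-to-Map conversion. The `AlertRule` class and its fields are hypothetical, and the sketch deliberately makes no driver calls: it only shows the mapping code that has to be written and maintained per class when a driver accepts only generic maps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain class, used only to illustrate manual mapping.
class AlertRule {
    final String id;
    final String metric;
    final double threshold;

    AlertRule(String id, String metric, double threshold) {
        this.id = id;
        this.metric = metric;
        this.threshold = threshold;
    }

    // Object -> Map: every field has to be copied by hand.
    Map<String, Object> toMap() {
        Map<String, Object> m = new HashMap<>();
        m.put("id", id);
        m.put("metric", metric);
        m.put("threshold", threshold);
        return m;
    }

    // Map -> object: every field has to be read back and cast by hand,
    // so a typo in a key or a wrong cast only shows up at runtime.
    static AlertRule fromMap(Map<String, Object> m) {
        return new AlertRule(
            (String) m.get("id"),
            (String) m.get("metric"),
            ((Number) m.get("threshold")).doubleValue());
    }

    public static void main(String[] args) {
        AlertRule original = new AlertRule("rule-1", "cpu", 0.9);
        AlertRule copy = AlertRule.fromMap(original.toMap());
        System.out.println(copy.metric + " > " + copy.threshold); // cpu > 0.9
    }
}
```

With an object-mapping library, both methods disappear and the class definition alone drives the conversion; without one, this pair of methods is repeated for every persisted class.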

2 - MongoDB Cloud Manager

MongoDB's Cloud Manager service takes care of handling all the administrative tasks of MongoDB. Specifically:

a - Installation

Mongo Cloud installs MongoDB on our private servers. It's the best of both worlds, since we get Mongo inside our virtual network but it's managed like a PAAS. We don't have to care about the exact installation steps. While RethinkDB might be easy to install, nothing beats clicking a button in a GUI.

b - Upgrades

For production, the story doesn't end at installation. With RethinkDB, we would have to write custom scripts to upgrade the database whenever a new version was released. We would have to put in effort to understand the upgrade procedure and notes of each release to ensure the update completed with minimal to no downtime.

With Mongo Cloud, upgrades are fully automated and require just a few clicks.

c - Backup

A database without a backup is just a cache.

Mongo Cloud takes full care of backups and restore without us needing to do anything manual.

With RethinkDB, we would have to write custom scripts to run its command line backup tool and ensure the script was always working.

Conclusion

RethinkDB might have been a better database engine. However, choosing a database product doesn't stop at the engine. We realized that both our development and administrative burden would increase by an order of magnitude if we switched to RethinkDB. Thus, for now we continue to stick with Mongo.

http://blog.dripstat.com/agent-update-required-for-java-6-users/

Wed, 05 Oct 2016 17:49:41 GMT

This weekend we updated our backend to enhance security. If you use Java 6 (or a very early version of Java 7), your JVM has not been able to connect to DripStat servers since then.

Remedy:

Please upgrade to Agent 9.0.7 and all will be good.

Starting from Agent 9.0.7, you no longer need to specify an extra system property for Java 6 JVMs to use TLS.

http://blog.dripstat.com/one-year-on-google-cloud-whats-great-whats-not/

Tue, 27 Sep 2016 00:54:21 GMT

We spent a year running DripStat on Google Cloud. This is a purely technical view of our experience with it.

The good

1 - No Reboots. Ever!

In a whole year, not a single VM ever rebooted! Google's Live Migration tech does a super good job. We were spoilt to the point of sometimes wondering whether we really needed those replicas.

2 - Flexible VM sizes

We don't use Docker, so having the ability to create a VM exactly the size we needed was very welcome.

3 - Extremely fast VM creation

New VMs were spun up in seconds, compared to minutes for other cloud providers.

The Bad

1 - Cloud SQL

We found Google Cloud SQL to be much inferior to AWS RDS. We also found it to be very slow. While 2nd Gen Cloud SQL claims to be much better, it was still in Beta at that time. The fact that it has no option for PostgreSQL eventually meant we had to use RDS.

2 - Subpar PAAS Services

Google Cloud simply doesn't have the breadth of services that AWS has. The ones that do exist felt nowhere near as mature as their AWS counterparts. We tried using Google Cloud DataStore, but its APIs were in terrible shape and the network latency was too high. We eventually ended up not using any of Google Cloud's PAAS services.

3 - Network issues

While the VMs were stable, we frequently encountered network issues, even inside the Virtual Network. Time and again we would see the network latencies among VMs spike up for a few minutes. Once we had a network issue with the load balancer that took many, many hours to resolve.

4 - No Connection Draining APIs

Connection Draining allows you to detach your VMs from the load balancer, while ensuring that the 'requests in flight' complete properly. We use this every time we deploy our code.

The AWS load balancer has explicit APIs to start/stop draining of instances attached to a load balancer. Google Cloud has no such thing. It relies on an extremely crude method: a custom health check URL returns errors to put a VM in draining state. This meant we had to write some custom code to get GCloud Load Balancers to drain our VMs while they were not actually unhealthy.
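The kind of custom code this implies can be sketched as follows. This is a minimal Java illustration under stated assumptions, not the code we actually run: the class and method names are hypothetical, and it assumes the load balancer stops sending new requests to an instance once its health check URL starts returning a non-200 status such as 503.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of health-check-driven draining; names are illustrative.
class DrainableHealthCheck {
    private final AtomicBoolean draining = new AtomicBoolean(false);
    private final AtomicInteger inFlight = new AtomicInteger(0);

    // Called by the request handling layer around every request.
    void requestStarted()  { inFlight.incrementAndGet(); }
    void requestFinished() { inFlight.decrementAndGet(); }

    // Called by the deploy process before the application is stopped.
    void startDraining() { draining.set(true); }

    // Status the health check URL would return. While draining it reports an
    // error on purpose, even though the application itself is perfectly healthy.
    int healthStatus() { return draining.get() ? 503 : 200; }

    // The application can be stopped once draining has begun and
    // every request that was already in flight has completed.
    boolean safeToStop() { return draining.get() && inFlight.get() == 0; }

    public static void main(String[] args) {
        DrainableHealthCheck check = new DrainableHealthCheck();
        check.requestStarted();
        System.out.println(check.healthStatus()); // 200
        check.startDraining();
        System.out.println(check.healthStatus()); // 503
        System.out.println(check.safeToStop());   // false
        check.requestFinished();
        System.out.println(check.safeToStop());   // true
    }
}
```

A deploy script built on this idea would call `startDraining()`, wait for the load balancer to notice the failing health check, poll `safeToStop()`, and only then restart the process.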

5 - Ubuntu Repositories

Apparently Ubuntu maintains a separate copy of its 'apt' repo for Google Cloud. The Ubuntu images on Google Cloud point to this repo. However, once the Ubuntu version in question goes out of its maintenance period, the repo is completely obliterated, to the point where even apt-get update won't work. The only workaround then is to modify the Ubuntu apt config to point to the central 'apt' repo. We thus learnt (the hard way) to always use the Ubuntu LTS versions. This, though, is more an issue on Ubuntu's side than Google's.

Conclusion

If you simply want to use VMs and don't care about the rest of the PAAS stuff, Google Cloud is absolutely amazing. We haven't seen this kind of VM stability on any other cloud provider.

http://blog.dripstat.com/kotlin-in-production-the-good-the-bad-and-the-ugly-2/

Sat, 24 Sep 2016 17:13:14 GMT

We have been using Kotlin in the DripStat backend since Kotlin's 1.0 release, using the Kotlin IntelliJ plugin. Here is a summary of our experiences as of Kotlin 1.0.3.

The Good

At the language level, Kotlin seems to be excellent.

1 - Seamless Java interop

Kotlin nails the Java interop at the language level. It is completely seamless. This was the reason why we were comfortable introducing Kotlin in our codebase in the first place.

2 - Less verbosity

Kotlin code is much less verbose than Java code. This makes it more pleasing to both read and write.

3 - Null checks

Kotlin enforces null checks at the language level. When you interact with Java code, it even extends that to the runtime level. This has resulted in catching some bugs pretty early and also more robust code.

What can be improved

1 - Lack of parallelStream()

Kotlin's collection API has no equivalent to Java's parallelStream(). This is dearly missed.
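For reference, this is the Java API in question. The sketch below is plain Java using `Collection.parallelStream()` from the standard library; the class name and the sum-of-squares workload are just illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative use of Java's Collection.parallelStream().
class ParallelStreamSketch {

    // Squares every element and sums the results, letting the stream
    // framework split the work across threads of the common fork/join pool.
    static long sumOfSquares(List<Integer> values) {
        return values.parallelStream()
                     .mapToLong(v -> (long) v * v)
                     .sum();
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            values.add(i);
        }
        System.out.println(sumOfSquares(values)); // 333833500
    }
}
```

Switching a pipeline between sequential and parallel execution is a one-word change (`stream()` versus `parallelStream()`), which is the convenience being missed here.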

2 - Cannot subclass Data classes

This is something that we feel the need for more and more as our codebase grows. It seems to be planned for Kotlin 1.1 but doesn't exist as of today.

3 - Type inference on method return values

While type inference everywhere else results in more concise code, on method return values, it results in an actual loss of information. You cannot tell the type of the variable unless you look at the called method's signature, and the IDE plugin currently doesn't show the type. While Kotlin does allow specifying the variable type, it results in much more verbosity than Java.

What's Broken

We have found that the IntelliJ plugin for Kotlin is extremely buggy and very far behind the Java tooling.

1 - Editor crashes

This is such a huge issue since 1.0.3 that we cannot write any more Kotlin code. The editor frequently stops doing syntax highlighting, code completion etc. The only remedy when this happens is to restart the entire IDE.

The Call Hierarchy view frequently doesn't show all calls to a method if that method is used across both Java and Kotlin. This is an extremely serious bug. Entire technical decisions can be based on whether a piece of code is used in a certain location. The fact that the Call Hierarchy view shows incomplete information has a huge impact.

While it does have very basic move class, rename class, rename method refactorings, the vast majority of refactorings from Java are simply not present. Even the rename refactoring is extremely limited.

Conclusion

Kotlin seemed to promise a better Java, without compromises. It does achieve that at the language level. However, a big part of using Java is JetBrains' own excellent Java tooling. Here the Kotlin plugin has a lot of work to do to catch up with what JetBrains has built over 15 years for Java.

http://blog.dripstat.com/rethinking-routing-for-reactjs-applications/

Fri, 23 Sep 2016 20:44:32 GMT

Traditional routers like React-Router work well when your application has master-detail style 'navigation'. They fail when your application is composed of components that communicate with each other and whose state you want to save in the URL.

Where traditional routers work well

A scenario where traditional routers work well is when you have a hierarchy of master-detail style views and you want to navigate across them.

An example is the 'Alerts' UI of DripStat:

Here is how its hierarchy works:

You click on the top level 'Alerts' button.

It opens a page with a navbar.

You click on individual pages like 'Incidents' and 'Violations' to navigate to them

Where traditional routers fail

Where traditional routers fail is when you have a bunch of components that depend on each other's state.

Take for example, this typical DripStat dashboard:

It has 7 components with state of their own. The state of each of those components needs to be serialized in the URL so when the user refreshes the page, he sees it in the same state.

The traditional router will suggest putting these states in a hierarchy with a url like this:

/state1/state2/state3/..../state7

It is forcing a hierarchy on our components when there is no absolute hierarchy to speak of. Now every component needs to know the index of the state of every other component in the url, both above and below it. This has many issues:

What if we change the layout of the components on the page, eg, put the Time Range selector below the JVM filter?

What if we introduce an extra component in between?

What if we remove one?

What if we want to reuse a portion of the UI on a different page?

What if we want to load multiple components in parallel, instead of one by one in a hierarchy?

All of the above modifications would require going through and updating the code of every single other component. Each of the component's code would also be full of complex routing code.

Our Solution

Allow changing the layout and number of the components on page, without needing to update routing code

Allow reuse of portions of UI in different pages.
Eg - 'Pinned Transactions' shows a lot of the same UI as 'Transactions' page.

Components should be able to load in parallel, independently of one another.

We make heavy use of the Flux pattern and Redux to solve this. The state of all the components is kept in a reducer. (Depending on your needs, you may decide to put them all in a single reducer or use one reducer per component. We assume a single reducer for this article.)

All the individual components just subscribe to this reducer to read/write their own state or the state of some other component.

We use the React-Router just to route to top level components, and then let our routing mechanism take care of it from there.

Now our components can communicate with each other, can be added/removed and not have to worry about the url. The top level component takes care of that.
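The core idea, state addressed by component name rather than by position in the URL path, is independent of React and Redux. The sketch below illustrates it in Java purely as a language-neutral example; it is not the code used in DripStat. The class name, the component keys, and the choice of a query string as the URL encoding are all assumptions made for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: component state keyed by name and serialized to a URL query string.
class UrlStateSketch {

    // Central store -> URL. Each component's state lives under its own key,
    // so no component needs to know the position of any other component.
    static String toQuery(Map<String, String> state) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : state.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    // URL -> central store, e.g. on page refresh.
    static Map<String, String> fromQuery(String query) {
        Map<String, String> state = new LinkedHashMap<>();
        if (query.isEmpty()) {
            return state;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // ignore malformed entries
            }
            state.put(decode(pair.substring(0, eq)), decode(pair.substring(eq + 1)));
        }
        return state;
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        Map<String, String> state = new LinkedHashMap<>();
        state.put("timeRange", "30m");
        state.put("jvm", "all");
        System.out.println(toQuery(state)); // timeRange=30m&jvm=all

        // Adding a component only adds a key; existing components are untouched.
        state.put("layer", "db");
        System.out.println(toQuery(state)); // timeRange=30m&jvm=all&layer=db
    }
}
```

Compared with a positional URL like /state1/state2/state3, reordering, adding or removing a component here never changes how any other component finds its own state.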

We have used this approach throughout DripStat, falling back to React-Router only when it really makes sense, eg on the 'Alerts' page. It has allowed us to rapidly iterate on our UI while keeping it bug-free from routing related issues.

http://blog.dripstat.com/live-data-across-your-entire-infrastructure/

Fri, 23 Sep 2016 16:58:41 GMT

Earlier this week, we introduced the Live dashboard to enable you to see a Live feed of your Application's data. It has already become the most popular dashboard inside DripStat.

Today, we are taking Live to the next level. Now you can see Live data for all your applications across your entire infrastructure, all at once!

Here is a description of the individual charts:

Percentage Time Per Layer

This shows the percentage of time your applications are spending in each layer, calculated across all your applications.

Per Application Stats

These graphs show for each individual application:

Response Time

Throughput

Error Rate

CPU usage

Filters

You can choose which apps are displayed in the graph by using the filter on top.

Next Steps..

How do I enable this?

Make sure you have DripStat Agent 9 installed. If you already have it installed from earlier this week, you don't have to do anything.

I want to see it now!

If you already have Agent 9 installed, just click the 'Live' button on the top bar. Otherwise, head to the Live Demo area for a taste.

http://blog.dripstat.com/introducing-live-dashboard-per-second-metrics-in-realtime/

Mon, 19 Sep 2016 01:25:30 GMT

Today we are setting a new milestone for APM. The 'Live' Dashboard allows you to see second-level metrics in realtime!

Metrics stream in at a 2 second granularity. No more hitting refresh to see the latest data.

The 'Live' Dashboard shows the following metrics:

Response Time - Overall for App, broken down by Layer

Throughput and Error Rate - Overall for App

Response Time - Per JVM

Throughput - Per JVM

Error Count - Per JVM

Used Heap Size - Per JVM

CPU Usage - Per JVM

GC Pause Time - Per JVM

The difference 'Per Second' granularity makes

Here is a Throughput chart at Minute level granularity. It looks like a smooth graph with even throughput.

However, looking at the same data at a Second level granularity, we see that the Throughput really spikes at the tick of the minute and is mostly pretty low throughout the minute.
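The effect is easy to reproduce with a few made-up numbers. The Java sketch below is only an illustration with hypothetical values and names: it models one minute in which a burst of requests arrives on the first second, then compares the per-minute average with the per-second peak.

```java
import java.util.Arrays;

// Hypothetical numbers showing how minute-level averaging hides per-second spikes.
class GranularitySketch {

    // Average requests per second over the whole window,
    // which is what a minute-level chart effectively plots.
    static double averagePerSecond(int[] perSecondCounts) {
        long total = 0;
        for (int c : perSecondCounts) {
            total += c;
        }
        return (double) total / perSecondCounts.length;
    }

    // Highest single-second request count in the window,
    // which only a second-level chart can reveal.
    static int peakPerSecond(int[] perSecondCounts) {
        int peak = 0;
        for (int c : perSecondCounts) {
            peak = Math.max(peak, c);
        }
        return peak;
    }

    public static void main(String[] args) {
        int[] minute = new int[60];
        Arrays.fill(minute, 5); // quiet for most of the minute
        minute[0] = 305;        // burst at the tick of the minute

        System.out.println(averagePerSecond(minute)); // 10.0
        System.out.println(peakPerSecond(minute));    // 305
    }
}
```

If every minute looks like this one, the minute-level chart is a flat line at 10 requests per second, while the actual load the application must absorb at the tick of the minute is roughly 30 times higher.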

Where to see Live Metrics in the UI?

Click on an Application, then click on the 'Live' tab.

How to enable Live metrics for my apps?

Upgrade your DripStat Agent to 9.0.
That's it.

Note that this will work on Java 7 and higher only. Read the documentation for full system requirements.

I want to see it like right now!

Head to the 'Live Demo' area to check it out.

OrientDB is now supported in DripStat.

Upgrade to Agent 8.1.10 or higher and you will be able to see time spent in OrientDB at both the Transaction and Application level.

OrientDB stats will also appear in the Database tab for both Application and Cross Application views.