Over the last six months, we have been quietly working away to complete the first official “1.0” release of Deedle, which we are delighted to announce today. There are a lot of changes and new features in the 1.0 release, including:

We are continuing to improve the RProvider, in particular by developing support for interacting with remote and out-of-process R sessions. We’ve also tagged a number of enhancements that are suitable for other enthusiastic souls to pick up and contribute. See up-for-grabs.

Yesterday we announced Deedle, our new Open Source library for exploratory data analysis in C# and F#. Deedle (almost) stands for “Dotnet Exploratory Data Library”. This is a library for .NET with similar capabilities to the widely respected Pandas library for Python.

Deedle was developed by Tomas Petricek, with assistance from Adam Klein and myself.

We are finding Deedle to be extremely powerful for research. We hope others will find it similarly useful and make improvements to make it an even better package.

The talk covers how we use Cassandra and the tuning that we’ve done to the JVM. Slides 2-20 show our use case; slides 22-41 show the performance tuning that we’ve done for our workload.

Jake Luciani and Carl Yeksigian will be presenting at the NYC* Tech Day 2013. They will be talking about how BlueMountain has harnessed Cassandra to deliver a scalable time series database. If this is an interesting topic for you, make sure to register!

Glance, a metrics dashboard

Glance is the dashboard that we’ve developed internally to take advantage of the metrics captured by Riemann and the metrics that are available through Graphite.

Currently, Glance can only look at the data that is exposed through Graphite. We are working on adding support for Riemann’s websocket protocol, which will also enable real-time metrics to be pushed out to the dashboard.
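To make the Graphite side concrete, here is a minimal sketch of how a browser-side dashboard might ask Graphite's render API for JSON data and pick out the most recent value. This is not Glance's actual code: the host name and the helper names renderUrl and latestValue are assumptions made for illustration. Only the /render endpoint, its target, from and format parameters, and the [value, timestamp] datapoint layout come from Graphite itself.

```javascript
// Hypothetical Graphite host; replace with your own server.
const GRAPHITE_URL = "http://graphite.example.com";

// Build a render API URL that requests JSON datapoints for one target.
// "from" uses Graphite's relative time syntax, e.g. "-1h" for the last hour.
function renderUrl(target, from = "-1h") {
  const params = new URLSearchParams({ target, from, format: "json" });
  return `${GRAPHITE_URL}/render?${params}`;
}

// Graphite's JSON output is a list of series, each with datapoints given
// as [value, timestamp] pairs. A value can be null when nothing was
// recorded for that interval, so walk backwards to the last real value.
function latestValue(series) {
  for (let i = series.datapoints.length - 1; i >= 0; i--) {
    const [value] = series.datapoints[i];
    if (value !== null) return value;
  }
  return null;
}
```

In a browser, the URL from renderUrl would be passed to fetch and each series in the parsed response handed to latestValue.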

Get your focus back

Glance hides the navigation after you have selected your dashboard. Mousing over the navigation tab will cause it to reappear.

This small feature leaves more screen space for the metrics. Once you have a dashboard selected, Glance gets out of the way so that you can focus on the metrics.

Search here, search there

The search box searches the metrics displayed on the page as well as those on the server. This allows quick filtering of the displayed metrics as well as finding metrics that are not yet on the page.

Easily define new dashboards

Glance uses a JavaScript API to define new pages, allowing for easy deployment and usage.

glance.page("cpu","CPU")
.find("*.cpu")
.asPercent(1.0)
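As a rough illustration of what a definition like this could mean, the sketch below matches metric names against a pattern such as "*.cpu" and scales a value to a percentage of a total. It assumes Graphite-style glob semantics, where * stays within one dot-separated segment; Glance's own matching rules may differ. The helpers globToRegExp, findMetrics and asPercent, and the sample metric names, are hypothetical and not part of Glance.

```javascript
// Turn a Graphite-style glob into a RegExp. "*" matches any run of
// characters inside one dot-separated segment, never across a dot.
function globToRegExp(glob) {
  const escaped = glob
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join("[^.]*");
  return new RegExp(`^${escaped}$`);
}

// Keep only the metric names that match the glob.
function findMetrics(names, glob) {
  const re = globToRegExp(glob);
  return names.filter((name) => re.test(name));
}

// Express a raw value as a percentage of a total (1.0 in the example above).
function asPercent(value, total) {
  return (value / total) * 100;
}

// Sample metric names, invented for this example.
const names = ["web01.cpu", "web02.cpu", "web01.memory", "db.node1.cpu"];
const cpu = findMetrics(names, "*.cpu");
```

Under these assumptions, "*.cpu" selects web01.cpu and web02.cpu but not db.node1.cpu, because the wildcard does not cross a dot.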

All in browser

Glance uses HTML5 technologies to push the logic of metric capture and dashboard creation to the user’s browser. This lets the server serve a static page, which loads quickly on the client side.

We’ve tried to use the best technologies that are suited for our use case. These technologies include:

Our infrastructure at BlueMountain includes Windows machines, Linux machines, and a lot of custom software. As we scale up the number of machines on our grid and integrate new software, including Cassandra, our monitoring requirements keep growing. Riemann fits our monitoring needs very well. Its push model allows us to be proactive instead of reactive; we can receive alerts before our users complain.

Because Riemann can feed directly into Graphite, we can have nice graphs for our historical data and a dashboard that alerts us to changing conditions. Below is a collection of graphs from our current Cassandra cluster.

Here at BlueMountain we like to perform statistical analysis of data. The stats package R is great for doing that. We also like to use the data retrieval and processing capabilities of F#. F#’s interactive environment lends itself pretty well to data exploration, and we can also easily access our existing .NET-based libraries. Once we are done, we can build and release production-supportable applications.

Nothing on the .NET platform competes with R for statistical functionality, so we set about bridging the gap between F# and R. F# 3.0 provides a nice innovative mechanism for doing this, through Type Providers.

Any of the calls above that begin with R. are actually evaluated inside the R engine.

This produces a lovely pair plot like this:

While we intend to continue to enhance the provider to meet our needs, we really hope others will do the same. If you use F# and work in the statistical/econometrics space, please try it out. If you use R and are looking for a robust environment in which to develop applications, also try it (and F#) out. If you have ideas for improvements, please feel free to share them with us. And if you develop enhancements/fixes, please submit a pull request!

The RProvider is built on the RDotNet project, which handles all the gnarly interop with unmanaged data structures used by R.DLL. The Type Provider provides an easy-to-use layer on top of that to use R from F#. Many thanks go to the RDotNet author, Kosei.

We are very pleased to launch this blog, through which the members of the BlueMountain Quantitative Strategy team will share opinions, provide useful information about technology and announce open-source software.