about turning data into insightful knowledge – for business and personal curiosity


Tag Archives: Shiny

Agglomerative hierarchical clustering is a simple, intuitive and well-understood method for clustering data points. I used it with good results in a project to estimate the true geographical position of objects based on measured estimates. In this tutorial I describe the basics of the method, how to implement it in R with hclust, and some ideas on how to decide where to cut the tree. This was also a great opportunity to compose another Shiny/D3.js app (GitHub for the code, shinyapps.io for the app), something I have wanted to do for a while now. At the end of the text I write a bit about what I learned in that regard.
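To make the idea concrete, here is a minimal sketch of the workflow in base R. The data is made up for illustration (two well-separated groups of 2-D points), and the linkage method, the number of clusters and the cut height are assumptions chosen for this toy case, not values from the project mentioned above.

```r
# Made-up example data: two well-separated groups of ten 2-D points each
set.seed(42)
points <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2)
)

# Pairwise Euclidean distances between all points
d <- dist(points, method = "euclidean")

# Agglomerative clustering: repeatedly merge the two closest clusters
# (complete linkage is an assumption here; other methods are available)
tree <- hclust(d, method = "complete")

# Cut the tree into a fixed number of clusters ...
by_k <- cutree(tree, k = 2)

# ... or at a fixed height, i.e. a distance threshold
by_h <- cutree(tree, h = 3)

# Cluster sizes for the cut by number of clusters
print(table(by_k))
```

Calling `plot(tree)` draws the dendrogram, which is often the easiest way to judge where a cut makes sense: long vertical branches indicate merges between clusters that are far apart.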

(This article refers to an initial proof-of-concept version of r-big-pivot.)

I have to admit that I very much enjoy pivoting through data using Excel. Its pivoting tool is great for getting quick insight into a data set’s structure and for discovering interesting anomalies (the sudden rise of deaths due to viral hepatitis serves as a nice example). Unfortunately, Excel itself is handicapped by several restrictions:

a maximum of 1,048,576 rows per worksheet (which is a crucial limit because the data needs to be in long format)