Deerwalk Blog

Traditional population health analytics vendors calculate utilization using ‘summary tables’. This approach aggregates information from medical claims about a visit, service, or procedure and stores it in a slimmed-down format (the summary table itself) that is digestible on lower-performing SQL servers. Each utilization type has its own summary table – an ER Visit summary table, an Admission summary table, a CT Scan summary table – and these summary tables, rather than[...]
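As a rough illustration of the summary-table idea, the sketch below aggregates raw claim lines into one summary row per utilization type. The record fields (`member_id`, `service_type`, `allowed_amount`) and the function name are hypothetical, not taken from any vendor's actual schema:

```python
from collections import defaultdict

# Hypothetical claim lines: one record per visit/service on a medical claim.
claims = [
    {"member_id": "M1", "service_type": "ER Visit", "allowed_amount": 1200.0},
    {"member_id": "M2", "service_type": "ER Visit", "allowed_amount": 800.0},
    {"member_id": "M1", "service_type": "CT Scan", "allowed_amount": 450.0},
]

def build_summary_tables(claims):
    """Collapse raw claim lines into one slim summary row per utilization
    type (visit count and total allowed amount), the kind of pre-aggregated
    table a lower-powered SQL server can query quickly."""
    summary = defaultdict(lambda: {"visits": 0, "total_allowed": 0.0})
    for claim in claims:
        row = summary[claim["service_type"]]
        row["visits"] += 1
        row["total_allowed"] += claim["allowed_amount"]
    return dict(summary)

tables = build_summary_tables(claims)
```

The trade-off sketched here is the one the post describes: the summary table is fast to query, but it discards the claim-level detail it was built from.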

In this blog post, I discuss a (very) high-level process for designing a Hadoop- and HBase-based system. Since SQL-based solutions are what people are most familiar with, I will start by discussing how things would be designed in a relational manner and then talk about how the NoSQL solution differs. This seems to be the norm when discussing NoSQL solutions.

In data analytics, incremental processing of aggregations is very important. To serve real-time data, we cannot rescan both the old data and the newly added data to recompute the overall aggregate on every update. This makes incremental processing the first priority for real-time data analytics, and it in turn requires processing over a structured dataset. In this article I will compare some MPP (Massively Parallel Processing) architectures in this light.
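The incremental idea can be sketched in a few lines: keep running state (count and sum) and fold each newly arrived record into it, rather than rescanning history. The class name and fields below are illustrative, not from any particular MPP system:

```python
class IncrementalAggregate:
    """Maintain count/sum so each new record updates the aggregate in O(1),
    without re-reading the historical data."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def add(self, value):
        # Fold one newly arrived value into the running state.
        self.count += 1
        self.total += value

    @property
    def mean(self):
        return self.total / self.count if self.count else 0.0

agg = IncrementalAggregate()
for v in [10.0, 20.0]:   # historical batch, processed once
    agg.add(v)
agg.add(30.0)            # a new record updates the aggregate incrementally
```

This is the property batch recomputation lacks: serving the latest aggregate costs one update per new record, not a full pass over the dataset.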