Sessions at Open Source Bridge 2009 about CouchDB on Wednesday 17th June

Apache CouchDB makes the perfect vehicle for extremely distributed applications. CouchDB can serve HTML and other static web assets while providing dynamic access to data. CouchDB's validation functions and rendering capabilities mean you can write your web app in pure JavaScript. Once it's done, it can spread through ad-hoc sharing. When users have full access to the source code, interesting things happen. The application models are different. I'll run you through some of the major differences. Learning how to program for the extremely distributed case will make you stronger at building more traditional scalable web services.
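To make the idea of a validation function concrete, here is a minimal sketch of the kind of function that can be stored in a design document's validate_doc_update field. The document fields (type, title, author) and the rules are illustrative assumptions, not taken from any particular application. The function is given a name here only so it can be run outside CouchDB.

```javascript
// Hypothetical validation rules for documents of an assumed type "post".
// CouchDB calls a function like this before accepting each write and
// rejects the write if the function throws.
function validateDocUpdate(newDoc, oldDoc, userCtx) {
  if (newDoc.type !== 'post') {
    return; // this sketch only checks posts
  }
  if (typeof newDoc.title !== 'string' || newDoc.title.length === 0) {
    // A thrown object with a "forbidden" key rejects the write.
    throw { forbidden: 'Posts must have a non-empty title.' };
  }
  // oldDoc is null when the document is being created.
  if (oldDoc && oldDoc.author !== userCtx.name) {
    throw { unauthorized: 'Only the original author may edit this post.' };
  }
}
```

Because the rules travel with the design document, anyone who replicates the application receives the same checks along with the data.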

I will introduce the theory and goals of clustering algorithms. The literature in statistical analysis is made up of dense mathematical equations, so I will translate the equations into pseudocode to make the topic more accessible to programmers.

I will expand on the theoretical discussion by demonstrating a simple example of a clustering problem: how to group volcanoes in Alaska by geographical proximity. I will then move on to algorithms with real-world applications, such as how to group users with similar tastes given a database of user ratings.
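As a rough illustration of grouping points by proximity, the following sketch runs a basic k-means loop over latitude/longitude pairs. It is not necessarily the algorithm used in the talk. The coordinates are made up rather than real volcano locations, degrees are treated as flat planar coordinates for simplicity, and the first k points serve as starting centroids so that the result is deterministic.

```javascript
// Squared Euclidean distance between two [lat, lon] pairs.
// Assumption: degrees are treated as planar coordinates; a real
// geographic clustering would use great-circle distance instead.
function squaredDistance(a, b) {
  const dLat = a[0] - b[0];
  const dLon = a[1] - b[1];
  return dLat * dLat + dLon * dLon;
}

// Index of the centroid closest to the given point.
function nearestCentroid(point, centroids) {
  let best = 0;
  for (let i = 1; i < centroids.length; i++) {
    if (squaredDistance(point, centroids[i]) < squaredDistance(point, centroids[best])) {
      best = i;
    }
  }
  return best;
}

// Basic k-means: returns one cluster index per input point.
function kMeans(points, k, maxIterations = 100) {
  // Start from the first k points (a simplification of the usual random start).
  let centroids = points.slice(0, k).map(p => p.slice());
  let assignments = new Array(points.length).fill(-1);
  for (let iter = 0; iter < maxIterations; iter++) {
    const next = points.map(p => nearestCentroid(p, centroids));
    const changed = next.some((c, i) => c !== assignments[i]);
    assignments = next;
    if (!changed) break; // converged: no point switched cluster
    // Move each centroid to the mean of its members.
    centroids = centroids.map((old, c) => {
      const members = points.filter((_, i) => assignments[i] === c);
      if (members.length === 0) return old; // keep an empty cluster's centroid
      const sum = members.reduce((acc, p) => [acc[0] + p[0], acc[1] + p[1]], [0, 0]);
      return [sum[0] / members.length, sum[1] / members.length];
    });
  }
  return assignments;
}

// Made-up coordinates forming two loose groups.
const samplePoints = [
  [52.0, -175.0],
  [60.0, -153.0],
  [52.4, -174.2],
  [60.5, -152.7],
  [51.9, -176.1],
  [59.8, -153.4],
];
const sampleClusters = kMeans(samplePoints, 2);
```

With these six points and k = 2, the sketch labels the points alternately as cluster 0 and cluster 1, matching the two groups visible in the data.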

I may touch on more advanced techniques to improve the accuracy of the resulting clusters. I will also discuss current limitations of statistical analysis. One example is Netflix's ongoing competition for an algorithm that can predict whether or not a user will like the movie Napoleon Dynamite.

The examples from the talk will be implemented using JavaScript and CouchDB. I hope that people from many different language and environment backgrounds will have some experience with JavaScript, and the data-processing capabilities of CouchDB are well suited to clustering algorithms.
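One way CouchDB's map/reduce views could support a clustering step is sketched below, under stated assumptions: documents carry hypothetical lat, lon and cluster fields, and a view sums the coordinates per cluster so that new centroids can be computed. The emit function and the grouping loop are local stand-ins so the sketch runs outside CouchDB. Inside CouchDB, emit is provided by the view server and grouping happens when the view is queried with grouping enabled.

```javascript
// Stand-in for the emit() that CouchDB's view server provides to map functions.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map: emit [lat, lon, 1] keyed by the document's current cluster label.
// The fields cluster, lat and lon are assumptions made for this sketch.
function mapFn(doc) {
  if (typeof doc.cluster === 'number') {
    emit(doc.cluster, [doc.lat, doc.lon, 1]);
  }
}

// Reduce: add up [latSum, lonSum, count] triples. The output has the same
// shape as the input values, so the same code also serves the rereduce case.
function reduceFn(keys, values, rereduce) {
  return values.reduce(
    (acc, v) => [acc[0] + v[0], acc[1] + v[1], acc[2] + v[2]],
    [0, 0, 0]
  );
}

// Local simulation of a grouped view query over made-up documents.
const docs = [
  { cluster: 0, lat: 52.0, lon: -175.0 },
  { cluster: 0, lat: 52.5, lon: -174.5 },
  { cluster: 1, lat: 60.0, lon: -153.0 },
];
docs.forEach(mapFn);

const grouped = {};
for (const row of rows) {
  (grouped[row.key] = grouped[row.key] || []).push(row.value);
}
const totals = {};
for (const key of Object.keys(grouped)) {
  totals[key] = reduceFn(null, grouped[key], false);
}
```

Dividing each latitude and longitude sum by its count gives the cluster's new centroid, which is the update step of a k-means iteration.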