Deutscher Hof

On April 24th this year I was accepted into the Google Summer of Code program. I will be working on the integration of text mining and topic modeling tools for the R Project for Statistical Computing.

Google Summer of Code, often abbreviated to GSoC, is an international program in which students develop free and open-source software during the summer. The program is open to university students aged over 18.

Over the next couple of months I will be writing here about the progress of my work.

Here is an abstract of my coding project:
The goal of this project is to create a user-friendly API for an integrated workflow to perform typical text mining, natural language processing, and topic modelling tasks. This would cover the complete process of topic modelling:

Loading data, including reading text files from a local filesystem as well as harvesting texts from the internet (via the stylo/tm packages)

In the first stage, I plan to integrate a few packages, as mentioned above. Future development would involve building a package that integrates more tools, in a similar fashion to what caret does for predictive modelling. Google Summer of Code is meant to be just the outset of a bigger project.