For those of you who use the R language to build models, I have written an equivalent R script that produces the benchmark model.

All the text is copied from the wonderful benchmark walkthrough provided by DrivenData; I did not add any feature engineering, insights, or analysis of any kind.
The only thing that has changed is the code itself, which has been rewritten in R.

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Dengue fever is bad. It's real bad. Dengue is a mosquito-borne disease that occurs in tropical and sub-tropical parts of the world. In mild cases, symptoms are similar to the flu: fever, rash and muscle and joint pain. But severe cases are dangerous, and dengue fever can cause severe bleeding, low blood pressure and even death.\n",
"\n",
"Because it is carried by mosquitoes, the transmission dynamics of dengue are related to climate variables such as temperature and precipitation. Although the relationship to climate is complex, a growing number of scientists argue that climate change is likely to produce distributional shifts that will have significant public health implications worldwide.\n",
"\n",
"We've [launched a competition](https://www.drivendata.org/competitions/44/) to use open data to predict the occurrence of dengue based on climatological data. Here's a first look at the data and how to get started!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As always, we begin with the sacred `import`s of data science:"
]
},