Why

To create a buzz in Melbourne around Data Science and reverse the brain drain

To solve a real world problem that could impact the lives of all Australians

To have fun!

How it works

Read this web page and sign up at the bottom

Attend one or more of the 3 events in order to sign the NDA (Non-Disclosure Agreement), get the data, form teams and start your analysis:

Thu 13th April – kick off

Sat 15th April – hackday 1

Sun 23rd April – hackday 2

Continue working with your team and submit a slide deck before 3pm, Wed 3rd May

Five top teams will be pre-selected to pitch their findings on the night of Fri 5th May

The Kaggle competition will continue for a further 4 weeks, with the winners revealed at our conference on the 2nd June

The top 5 teams in the Kaggle competition as of 12 noon on 28th May will also be offered 2 free tickets to the conference on the 2nd June

Internships will also be up for grabs to entrants of the Datathon (ANZ, iSelect, KPMG, EstimateOne, Billcap, Transurban, Tiberius Data Mining, Northraine and others). You must submit a presentation to be eligible to apply and performance in the Kaggle part will also be taken into consideration by the companies as they review your applications

A potential speaking spot at the WombatMeDaScIn conference during Melbourne Data Science Week

If you are a company and would like the opportunity to take on participants of the datathon as interns, or to help sponsor the prizes, then please fill out the form here.

FAQ

What is a Datathon? You work for an analytics consultancy that is pitching to a client for a major piece of work. The client collects data as a by-product of its operations and wants to see if any business value can be extracted from it. You have been given 3 weeks to demonstrate the potential usefulness of the data and put together findings to present to the client.

What is the data? Trust us that it is the best data set we could have hoped for. It is previously unseen and successful analytics could have a positive impact on the lives of all Australians. We’re keeping the exact content under wraps, so you’ll have to turn up to find out.

Do I need to be a data science rock star to enter? No, this is all about learning and knowledge transfer. Even if you’ve never done anything like this before, please come along. We offer tutorials and mentors on hack day to get you started.

What do I need to bring? You will need your laptop with your favourite tools installed. Bring lots of curiosity and energy. Don’t forget your power cord!

What software can I use? You can use whatever you like. We recommend you have a database set up to load the data into.

Do I need to already have a team? No, we are expecting most people will form teams on the hack day. The organisers will be around to facilitate this with a special event. Don’t worry if you don’t know anyone; lots of people won’t.

Can I enter as an individual? Yes, but the judging panel will favour teamwork for the pitching part. Each participant can only be part of one submission; you cannot be both on a team and an individual. The Kaggle part is considered separate and you do not have to be in a team – to increase your chance of getting an internship you should enter the Kaggle part as an individual.

Why do I need to sign a Non-Disclosure Agreement? This is real data from a real ‘client’. It is a condition of them releasing it to you that an NDA is first signed. It basically means you will not use the data for any other purpose, and that you will delete it at the end of the contest.

What if I need help? There will be a handful of very experienced ‘mentors’ floating around the room on hack day. The purpose of them being there is to give ‘training’ on tools and techniques to munge the data – please use them! We will also host a selection of tutorials.

What will be revealed about the data? Not much – it is your job to figure things out. On hack day 2, the data owners will be there to give a short presentation and answer any questions you have.

How ‘big’ is the data? In total it will be ~ 5GB unzipped and there will be about 50 million rows of data in total. It is split into several files of bite-sized chunks and each file can be worked on individually – you will not need to load in everything to start analysis.
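Because each file can be worked on individually, one way to get started is to load a single file into a local database. The sketch below is only an illustration, not the organisers' method. It assumes the files are delimited text with a header row (the actual format is not stated on this page), and the function name `load_delimited_file`, the table name and the batch size are all hypothetical. It uses SQLite through Python's standard library and reads rows in batches, so a large file never has to fit in memory.

```python
import csv
import sqlite3
from itertools import islice


def load_delimited_file(path, conn, table, delimiter=",", batch_size=50_000):
    """Load one delimited text file with a header row into a SQLite table.

    Assumption: every data row has the same number of fields as the header.
    All columns are stored as TEXT; cast them later once you know the types.
    Returns the number of data rows inserted.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader)  # first line holds the column names

        columns = ", ".join(f'"{name}" TEXT' for name in header)
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')

        placeholders = ", ".join("?" for _ in header)
        insert_sql = f'INSERT INTO "{table}" VALUES ({placeholders})'

        total = 0
        while True:
            # Pull at most batch_size rows at a time from the file.
            batch = list(islice(reader, batch_size))
            if not batch:
                break
            conn.executemany(insert_sql, batch)
            total += len(batch)
        conn.commit()
    return total
```

A call might look like `load_delimited_file("one_file.csv", sqlite3.connect("datathon.db"), "one_file")`, where both file names are placeholders. Loading one file first lets you explore the structure before committing to the full ~5GB.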

Can we use additional data? Totally – but it has to be publicly available.

Are there set tasks? No, we provide very little initial guidance. As a true ‘data explorer’, you will have to come up with your own questions for the data. We want the datathon to be just like a real data science consulting task. Ask yourself what the data provider might want to learn, and how you might go about presenting that.

What, no guidance? Well maybe this year as the data is so awesome and vast, we will give some suggestions as to the type of problems that need to be solved. Also don’t assume that we know anything about the data already, so things like data quality and sanity checking should be addressed.
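As a starting point for the data quality and sanity checking mentioned above, a first pass could simply count rows, malformed rows and empty values per column. This is a minimal sketch under assumptions: it treats a file as delimited text with a header row, and `profile_file` is a hypothetical helper name, not something supplied with the data.

```python
import csv
from collections import Counter


def profile_file(path, delimiter=","):
    """Return basic data-quality counts for one delimited file with a header row."""
    missing = Counter()
    row_count = 0
    ragged_rows = 0  # rows whose field count differs from the header
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader)
        for row in reader:
            row_count += 1
            if len(row) != len(header):
                ragged_rows += 1
                continue  # skip per-column checks for malformed rows
            for name, value in zip(header, row):
                if value.strip() == "":
                    missing[name] += 1
    return {
        "rows": row_count,
        "ragged_rows": ragged_rows,
        "missing_by_column": {name: missing[name] for name in header},
    }
```

Counts like these are worth reporting in your slide deck, since they tell the data provider how far the findings can be trusted.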

How will it be judged? The main focus of our panel will be on the team’s ability to translate their findings into meaningful, easily understandable, actionable and valuable insights. They have a hypothetical budget to allocate and you need to convince them it’s worth spending it on your analytics.

Is this like a Kaggle competition? There is a predictive component with separate prizes that will be run on Kaggle. This will run for an extra 4 weeks, with the winners being awarded the prizes at the Data Science Melbourne conference on June 2nd. You can enter as an individual or in a team and one member of the team must be at the presentation evening to be eligible for the prize.

How do we communicate and stay up to date? To ask questions, use the forum here or use the Kaggle forum if it’s about the prediction competition. Once you sign up you will be getting regular email updates closer to the event.

What are the rules? Each participant can only be part of one team in the pitching competition, and one team in the Kaggle competition. At least one team member should be present on the pitch night to be eligible for a prize. You can be in different teams for the pitching contest and the Kaggle competition, but we strongly encourage you to put in an entry for both the pitching contest and the Kaggle competition. You cannot pass on the data to anyone else – all participants must have signed the NDA and collected the data in person from one of the 3 events.

How do I apply for an internship? Instructions are on the read_me.pdf that is included with the data.

How do we submit our entries to the insights competition? Instructions are on the read_me.pdf that is included with the data.

WHEN

Day 1

13 Apr 2017

17:00 - 18:30

Zendesk, cnr Collins & Queen

Evening Launch

Come along after work to sign the non-disclosure agreement, get the data and hear a short presentation about proceedings. Attending the launch event is not mandatory, but will give you an early start. For those who are away over the Easter weekend it will be a chance to get the data. Don’t forget to bring your laptop if you want to get the data!

First Saturday – Hack Day I

Wondering what to do on the long Easter weekend? On Saturday, we will provide everything you need to work on your data investigation: food, drinks, a co-working space, wifi – and, of course, the dataset. If you are looking to join a team, this is a great opportunity to ask around and/or attend our special team formation event. We will host a couple of (optional) ‘master classes’ to demonstrate tools, techniques and skills to get you going.

Day 3

23 Apr 2017

10:00 - 16:00

RMIT Swanston Academic Building, 445 Swanston St, Melbourne

One Week In – Hackday II

On the 2nd Sunday you can reconvene with your team, with some experts on hand to help you out.

Day 4

05 May 2017

17:45 - 20:00

nab Arena, 700 Bourke, Docklands

Pitch Time

On the final night, we will decide which team takes home the honour of Melbourne Datathon champions! Five pre-selected teams will give their pitches before our professional panel. This session is also part of Melbourne Knowledge Week, and anyone can attend, whether or not they are a participant. After the pitches, join us across the road at Platform 28 for a post-datathon drink.

Catherine Lopes

Gregory Hill

Mike Da Gama

Sarah Pizzey

Mentors

There will be a few experienced people floating around and available to help you out with technical things. Please use them; it’s a good opportunity to get a one-on-one tutorial.

If anyone else wants to help, just turn up on the hack day.

Hackdays Detailed Schedule

Saturday 15th April - Telstra, 242 Exhibition Street

Morning

09:00

Arrival
Welcome to the 2017 Melbourne Datathon hackday number 1! If you have not yet signed an NDA, please do so upon signing in. If you are looking for a team, grab a name sticker and follow the instructions. After signing in, make your way to the data station to load up the dataset.

10:00-10:30

Forming Teams
Attend this event if you are looking for a team. We will have muffins and instructions waiting for you.

11:15-11:45

Getting Started - Phil Brierley
In this presentation we will give a short demo of loading the data in a couple of tools (example code for this will also be included with the data)

12:30

Lunch - sitting 1
A pizza lunch will be served in the kitchen area. It will be busy so grab a pizza and take it to share with your team. Gluten free pizzas will be available in this first batch only

1:15

Lunch - sitting 2
More pizza will arrive

Afternoon
Optional tutorials in the presentation area for those who want to join us.

2:00-2:30

An Initial Analysis - Shane Butler
We'll be challenging Shane, a data scientist at Telstra, to see what he has been able to come up with in the first morning.

3:30 - 4:00

Data Visualisation with Yellowfin - Edgar Kautzner
Edgar will show what he has discovered in the data using Yellowfin.

6:00pm

End of our time at Gurrowa. I'm sure we can find a local pub to continue.

Sunday 23rd April - RMIT, 445 Swanston St

Morning

09:15+

Arrival
Welcome to the 2017 Melbourne Datathon hackday number 2!
If you have not yet signed the NDA and loaded the data, then there is still an opportunity to participate by attending this event. For those who have already started, it is a chance to get back together with your team.

10:00-10:30

Forming Teams
For those who have not yet formed a team, there will be an opportunity to meet others at this session.

11:00-11:30

Data walkthrough, Q&A
So far we have told you little about the data.
In this presentation, our data sponsor will give a quick overview of the data and answer any questions you may have.

12:30

Lunch
Snacks will be available - for anything more substantial please bring your own.

Afternoon
Optional tutorials in the presentation area for those who want to join us.

Sign Up!

Registrations for the Melbourne Datathon 2017 are closed as of April 7. We are at absolute maximum capacity and we want to make sure we can put on a great event. If you missed out, join our Meetup to stay connected and keep an eye out for next year’s datathon!