Doors open at 18:00, talks start at 18:30, and at about 21:00 we move to a pub. We are ready to host 150 people in the room, so there should be plenty of people to discuss data science questions with!

Please remember to un-RSVP if you realize you can't make it - it helps our crew a lot.

And make sure you follow @pydatawarsaw for any updates and early announcements.

First Talk: Michał Jaślan (Onwelo) - HawqData

As you know, analyzing data is a really demanding task. Business domain knowledge, analytic techniques and algorithms, and tools are all areas that a successful analyst needs to cover. To make things a little simpler, I would like to share some experience with the HAWQ analytical database.

Second Talk: Bartosz Biskupski & Wojciech Walczak (Samsung) - How much meaning can you pack into a real-valued vector? Semantic similarity measuring using recursive auto-encoders

The presentation will start with a brief overview of AI research and development at Samsung R&D in Poland. We will then describe a solution, developed in one of our projects, that has won the Semantic Textual Similarity (STS) task within the SemEval 2016 research competition. The goal of this competition was to measure semantic similarity between two given sentences on a scale from 0 to 5. At the same time the solution should replicate human language understanding. The presented model is a novel hybrid of recursive auto-encoders (a deep learning technique) and a WordNet award-penalty system, enriched with a number of other similarity models and features used as input for Linear Support Vector Regression.

Third Talk

About the AAIA'16 Data Mining Challenge: the evaluation metric and the data, which tricks and ideas worked best, and more on feature extraction, model selection and blending.

Fourth Talk

Maciej Bryński (Innovation In IT) - Apache Airflow

Apache Airflow is a tool for managing data processing workflows. The presentation will describe the best things about this tool as well as its limits. Maciej Bryński will also show that Apache Airflow can replace Cron, Jenkins and Oozie.

PS1: the default language is English, but there may be some exceptions to the rule.

PS2: the presentation part is mainly, but not only, Python-focused. We expect to host a number of guests working with R, Scala and other languages.

direct contact: [masked], [masked]

See you!

Event Organizers

PyData Warsaw

This is a group for anyone interested in Machine Learning and Artificial Intelligence and their applications such as Big Data, predictive analytics, data science and robotics. Our main focus is the Python-based technology stack, but not only. All skill levels are welcome. We started this group because we want to meet other enthusiasts in the area. We're looking forward to exploring exciting new id...