Grant Sanderson - Concrete before Abstract

schedule 1 week ago

Sold Out!

45 Mins

Keynote

Intermediate

This talk outlines a principle of technical communication which seems simple at first but is devilishly difficult to abide by. It's a principle I try to keep in mind when creating videos aimed at making math and related fields more accessible, and it stands to benefit anyone who regularly needs to describe mathematical ideas in their work. Put simply, it's to resist the temptation to open a topic by describing a general result or definition, and instead let examples precede generality. More than that, it's about finding the type of example which guides the audience to rediscover the general results for themselves. We'll look, aptly enough, at examples of what I mean by this, why it's deceptively difficult to follow, and why this ordering matters.

schedule 2 months ago

Sold Out!

45 Mins

Keynote

Intermediate

Since we originally proposed the need for a first-class language, compiler and ecosystem for machine learning (ML), a view that is increasingly shared by many, there have been plenty of interesting developments in the field. Not only have the tradeoffs in existing systems, such as TensorFlow and PyTorch, not been resolved, but they are clearer than ever now that both frameworks contain distinct "static graph" and "eager execution" interfaces. Meanwhile, the idea of ML models fundamentally being differentiable algorithms (often called differentiable programming) has caught on.

Where current frameworks fall short, several exciting new projects have sprung up that dispense with graphs entirely, to bring differentiable programming to the mainstream. Myia, by the Theano team, differentiates and compiles a subset of Python to high-performance GPU code. Swift for TensorFlow extends Swift so that compatible functions can be compiled to TensorFlow graphs. And finally, the Flux ecosystem is extending Julia’s compiler with a number of ML-focused tools, including first-class gradients, just-in-time CUDA kernel compilation, automatic batching and support for new hardware such as TPUs.
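To make the idea of differentiating ordinary program code concrete, here is a minimal sketch in Python using forward-mode automatic differentiation with dual numbers. It is a toy illustration only and is not how Myia, Swift for TensorFlow or Flux compute gradients; the names `Dual`, `gradient` and `model` are hypothetical and chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def gradient(f, x):
    """Derivative of a one-argument function f at x."""
    return f(Dual(float(x), 1.0)).deriv


def model(x):
    # Ordinary control flow (a loop), no graph construction.
    # Expands to x**4 + x**2 + x + 1.
    y = x
    for _ in range(3):
        y = y * x + 1.0
    return y


print(gradient(model, 2.0))  # 4*2**3 + 2*2 + 1 = 37.0
```

The point of the sketch is that `model` is plain code with a loop, and the derivative falls out of running it on a different number type.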

This talk will demonstrate how Julia is increasingly becoming a natural language for machine learning, the kind of libraries and applications the Julia community is building, the contributions from India (there are many!), and our plans going forward.

schedule 3 weeks ago

Sold Out!

45 Mins

Case Study

Intermediate

Within Nokia Software Digital Experience, we build products that increase customer satisfaction and reduce churn through proactive identification of user problems, and that enable service providers to resolve problems faster. ML and DL techniques now contribute a great deal to these successes. However, there is usually a long journey between building a first model and delivering a field-proven product. Besides providing highlights on how machine and deep learning are used today to boost the broadband connection, this talk will reveal some of the challenges encountered and the best practices involved in reaching the expected quality level.

schedule 2 weeks ago

Sold Out!

45 Mins

Talk

Beginner

In recent years, there has been a lot of research in the area of sequence to sequence learning with neural network models. These models are widely used for applications such as language modeling, translation, part of speech tagging, and automatic speech recognition. In this talk, we will give an overview of sequence to sequence learning, starting with a description of recurrent neural networks (RNNs) for language modeling. We will then explain some of the drawbacks of RNNs, such as their inability to handle input and output sequences of different lengths, and describe how encoder-decoder networks and attention mechanisms solve these problems. We will close with some real-world examples, including how encoder-decoder networks are used at LinkedIn.
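As a rough illustration of the attention step mentioned above, the following Python sketch computes dot-product attention over a list of encoder states of any length. It assumes the decoder query and the encoder states are plain lists of floats with the same dimension; the function names are hypothetical and the sketch is not taken from the talk.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, encoder_states):
    """Return (attention weights, context vector).

    query: decoder state, a list of d floats (assumed input).
    encoder_states: list of T encoder states, each a list of d floats.
    T may differ from the output length, which is what lets the
    decoder handle inputs and outputs of different lengths.
    """
    scores = [dot(query, h) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([5.0, 0.0], states)
# The first and third states align with the query and receive most of the weight.
```

At each decoding step the decoder would recompute the weights with its current state as the query, so it can focus on different input positions for different output positions.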

schedule 1 week ago

Sold Out!

45 Mins

Talk

Intermediate

We will trace the journey of NLP over the past 50-odd years. We will cover, chronologically, Hidden Markov Models, Elman networks, Conditional Random Fields, LSTMs, Word2Vec, encoder-decoder models, attention models, transfer learning in text and, finally, transformer architectures. Our emphasis will be on how the models became simultaneously more powerful and simpler to implement. To demonstrate this, we take a few case studies solved at INSOFE with the primary goal of retaining accuracy while simplifying engineering. We will compare and contrast traditional methods with modern models and show how the latest models are actually becoming easier for businesses to implement. We also explain how this enhanced comfort with text data is paving the way for state-of-the-art inclusive architectures.

schedule 2 weeks ago

Sold Out!

45 Mins

Talk

Beginner

Wouldn't it be great to have ML models that can update their “learning” as and when they make a mistake and a correction is provided in real time? In this talk we look at a concrete business use case that warrants such a system. We will take a deep dive into the use case and into how we went about building a continuously learning system for text classification, covering the approaches we took and the results we got.

For most machine learning systems, the “train once, just predict thereafter” paradigm works well. However, there are scenarios in which this paradigm does not suffice and the model needs to be updated often. Two of the most common cases are:

1. When the distribution is non-stationary, i.e., the distribution of the data changes. This implies that with time the test data will have a very different distribution from the training data.

2. When the model needs to learn from its mistakes.

While (1) is often addressed by retraining the model, (2) is often addressed using batch updates. Batch updating requires collecting a sizeable number of feedback points. What if you have far fewer feedback points? You need a model that can learn continuously, as and when the model makes a mistake and feedback is provided. To the best of our knowledge, there is very limited literature on this.
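One simple way to picture learning from individual corrections is a mistake-driven online classifier. The Python sketch below uses a multiclass perceptron over hashed bag-of-words features; this is a stand-in technique chosen for illustration and is not the system described in the talk. The class name, method names and example labels are all hypothetical.

```python
import zlib


class OnlineTextClassifier:
    """Multiclass perceptron updated one feedback point at a time."""

    def __init__(self, labels, n_buckets=2 ** 16):
        self.n_buckets = n_buckets
        # One sparse weight vector per label.
        self.weights = {label: {} for label in labels}

    def _features(self, text):
        """Hashed bag-of-words counts (crc32 keeps hashing deterministic)."""
        counts = {}
        for token in text.lower().split():
            idx = zlib.crc32(token.encode("utf-8")) % self.n_buckets
            counts[idx] = counts.get(idx, 0) + 1
        return counts

    def _score(self, label, feats):
        w = self.weights[label]
        return sum(w.get(i, 0.0) * c for i, c in feats.items())

    def predict(self, text):
        feats = self._features(text)
        return max(self.weights, key=lambda lab: self._score(lab, feats))

    def feedback(self, text, correct_label):
        """Update only if the current prediction was wrong; return that prediction."""
        feats = self._features(text)
        predicted = max(self.weights, key=lambda lab: self._score(lab, feats))
        if predicted != correct_label:
            good, bad = self.weights[correct_label], self.weights[predicted]
            for i, c in feats.items():
                good[i] = good.get(i, 0.0) + c
                bad[i] = bad.get(i, 0.0) - c
        return predicted


clf = OnlineTextClassifier(["billing", "technical"])
clf.feedback("router keeps dropping connection", "technical")
print(clf.predict("router keeps dropping connection"))  # technical
```

Each correction changes the model immediately, with no need to accumulate a batch, which is the property the abstract asks for. A production system would also need to guard against noisy feedback, which this sketch ignores.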

schedule 4 months ago

Sold Out!

45 Mins

Talk

Intermediate

It is too tedious to keep on asking questions, seeking explanations or setting thresholds for trends or anomalies. Why not find problems before they happen, find explanations for the glitches and suggest the shortest paths to fixing them? Businesses are always changing along with their competitive environment and processes. No static model can handle that. Using dynamic models that find time-delayed interactions between multiple time series, we need to make proactive forecasts of anomalous trends of risks and opportunities in operations, sales, revenue and personnel, based on multiple factors influencing each other over time. We need to know how to set what is “normal” and determine when the business processes from six months ago no longer apply, or apply to only 35% of the cases today, while explaining the causes of risk and sources of opportunity, their relative directions and magnitude, in the context of the decision-making and transactional applications, using state-of-the-art techniques.

Real-world processes and businesses keep changing, with one moving part changing another over time. Can we capture these changing relationships? Can we use multiple variables to find risks on the key ones of interest? We will take a fun journey culminating in the most recent developments in the field. Which methods work well and which break? What can we use in practice?

For instance, we can show a CEO that they would miss their revenue target by over 6% for the quarter, and tell them why, i.e., in what ways their business has changed over the last year. Then we provide prioritized lists of the quickest, cheapest and least risky paths to help them turn the tide, with estimates of relative costs and expected probability of success.

schedule 1 month ago

Sold Out!

45 Mins

Talk

Advanced

Despite the increasing number of data scientists who are being asked to take on managerial and leadership roles as they grow in their careers, there are still few resources on how to manage data scientists and lead data science teams. There is also scant practical advice on how to serve as head of a data science practice: how to set a vision and craft a strategy for an organization to use data science.

In this talk, I will describe my experience as a data science leader both at a political party (the Democratic Party of the United States of America) and at a fintech startup (Even.com), share lessons learned from these experiences and conversations with other data science leaders, and offer a framework for how new data science leaders can better transition to both managing data scientists and heading a data science practice.

schedule 2 months ago

Sold Out!

45 Mins

Talk

Intermediate

Causal questions are ubiquitous in data science. For example, questions such as whether changing a feature on a website led to more traffic, or whether digital ad exposure led to incremental purchases, are deeply rooted in causality.

Randomized tests are considered the gold standard for estimating causal effects. However, experiments are in many cases infeasible or unethical. In such cases one has to rely on observational (non-experimental) data to derive causal insights. The crucial difference between randomized experiments and observational data is that in the former, test subjects (e.g. customers) are randomly assigned a treatment (e.g. digital advertisement exposure). This helps curb the possibility that user response (e.g. clicking on a link in the ad and purchasing the product) differs across the treated and non-treated groups owing to pre-existing differences in user characteristics (e.g. demographics, geo-location etc.). In essence, we can then attribute divergences observed post-treatment in key outcomes (e.g. purchase rate) to the causal impact of the treatment.

This treatment assignment mechanism that makes causal attribution possible via randomization is absent though when using observational data. Thankfully, there are scientific (statistical and beyond) techniques available to ensure that we are able to circumvent this shortcoming and get to causal reads.

The aim of this talk is to offer a practical overview of the above aspects of causal inference, a discipline that lies at the fascinating confluence of statistics, philosophy, computer science, psychology, economics, and medicine, among others. Topics include:

The fundamental tenets of causality and measuring causal effects.

Challenges involved in measuring causal effects in real world situations.

Distinguishing between randomized and observational approaches to measuring the same.

An introduction to measuring causal effects from observational data using matching and its extension, propensity score based matching, with a focus on a) the intuition and statistics behind it, b) tips from the trenches, based on the speaker's experience with these techniques, and c) practical limitations of such approaches.

Walk through an example of how matching was applied to get to causal insights regarding effectiveness of a digital product for a major retailer.

Finally, a conclusion on why having a nuanced understanding of causality is all the more important in the big data era we are in.
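The matching idea in the topics above can be sketched in a few lines of Python. The sketch assumes each unit already carries a propensity score (for example, estimated beforehand with a logistic regression on pre-treatment covariates) and performs one-to-one nearest-neighbour matching with replacement inside a caliper. The field names, the caliper value and the data are hypothetical, and this is not the speaker's implementation.

```python
def match_att(units, caliper=0.05):
    """Estimate the average treatment effect on the treated (ATT).

    units: list of dicts with keys
        "score"   - precomputed propensity score in [0, 1] (assumed given)
        "treated" - True if the unit received the treatment
        "outcome" - observed outcome, e.g. 1.0 for a purchase
    Returns (att_estimate, number_of_matched_treated_units).
    """
    treated = [u for u in units if u["treated"]]
    controls = [u for u in units if not u["treated"]]
    if not controls:
        raise ValueError("no control units to match against")

    diffs = []
    for t in treated:
        # Nearest control on the propensity score (matching with replacement).
        nearest = min(controls, key=lambda c: abs(c["score"] - t["score"]))
        # Treated units with no comparable control are dropped.
        if abs(nearest["score"] - t["score"]) <= caliper:
            diffs.append(t["outcome"] - nearest["outcome"])

    if not diffs:
        raise ValueError("no treated unit found a control within the caliper")
    return sum(diffs) / len(diffs), len(diffs)


units = [
    {"score": 0.80, "treated": True, "outcome": 1.0},
    {"score": 0.78, "treated": False, "outcome": 0.0},
    {"score": 0.30, "treated": True, "outcome": 1.0},
    {"score": 0.31, "treated": False, "outcome": 1.0},
    {"score": 0.95, "treated": True, "outcome": 1.0},  # no comparable control
]
att, n_matched = match_att(units)
print(att, n_matched)  # 0.5 2
```

The dropped third treated unit shows one practical limitation: the estimate only speaks for treated units that have comparable controls, and it is only as good as the covariates behind the propensity score.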

schedule 1 month ago

Sold Out!

90 Mins

Tutorial

Beginner

Cleaning, preparing, transforming, exploring and modeling data is what we hear about all the time in data science, and these steps may be the most important ones. But that is not all there is to data science. In this talk you will learn how the combination of Apache Spark, Optimus, the Python ecosystem and Data Operations can form a whole framework for data science that will allow you and your company to go further, beyond common sense and intuition, to solve complex business problems.

schedule 1 month ago

Sold Out!

45 Mins

Case Study

Intermediate

Ending poverty and zero hunger are the top two goals the United Nations aims to achieve by 2030 under its sustainable development program. Hunger and poverty are byproducts of multiple factors, and fighting them requires a multi-fold effort from all stakeholders. Artificial Intelligence and Machine Learning have transformed the way we live, work and interact. However, the economics of business have limited their application to a few segments of society. A much more conscious effort is needed to bring the power of AI to the benefit of those who need it the most: people below the poverty line. Here we present our thoughts on how deep learning and big data analytics can be combined to enable effective implementation of anti-poverty programs. Advancements in deep learning and micro diagnostics, combined with effective technology policy, are the right recipe for the progressive growth of a nation. Deep learning can help identify poverty zones across the globe based on night-time images, where the level of light correlates with economic growth. Once the areas of lower economic growth are identified, geographic and demographic data can be combined to establish micro-level diagnostics of these underdeveloped areas. The insights from the data can help plan an effective intervention program. Machine Learning can further be used to identify potential donors, investors and contributors across the globe based on their skill-set, interest, history, ethnicity, purchasing power and their native connection to the location of the proposed program. Even adequate resource allocation and efficient program design will not guarantee the success of a program unless project execution is supervised at the grass-roots level. Data Analytics can be used to monitor project progress and effectiveness, and to detect anomalies in case of any fraud or mismanagement of funds.

schedule 1 month ago

Sold Out!

20 Mins

Demonstration

Intermediate

In machine learning, hyperparameters are parameters that govern the training process itself. For example, the learning rate, the number of hidden layers and the number of nodes per layer are typical hyperparameters for neural networks. Hyperparameter tuning is the process of searching for the best hyperparameters with which to initialize the learning algorithm, thus improving training performance.
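As a minimal illustration of the search described above, the following Python sketch runs a random search over a learning rate and a number of hidden layers. It is generic and does not use or imitate any tuning framework's API; `toy_validation_accuracy` is a made-up stand-in for training a model and measuring validation accuracy, and all names are hypothetical.

```python
import math
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample n_trials configurations and keep the best-scoring one.

    objective: maps a dict of hyperparameters to a score (higher is better).
    space: maps each hyperparameter name to a sampler taking a random.Random.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: sample(rng) for name, sample in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_accuracy(params):
    """Made-up objective that peaks at learning_rate=0.01, hidden_layers=3."""
    lr_penalty = (math.log10(params["learning_rate"]) + 2.0) ** 2
    layer_penalty = (params["hidden_layers"] - 3) ** 2
    return 1.0 - 0.1 * lr_penalty - 0.05 * layer_penalty


space = {
    # Learning rates are usually searched on a log scale.
    "learning_rate": lambda rng: 10 ** rng.uniform(-4, 0),
    "hidden_layers": lambda rng: rng.randint(1, 6),
}

best_params, best_score = random_search(toy_validation_accuracy, space, n_trials=200)
```

In practice each trial is a full training run, which is why trials are typically run in parallel on a cluster rather than in a loop like this one.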

We present Katib, a scalable and general hyperparameter tuning framework based on Kubernetes which is ML framework agnostic (TensorFlow, PyTorch, MXNet, XGBoost, etc.). You will learn about Katib in Kubeflow, an open source ML toolkit for Kubernetes, as we demonstrate the advantages of hyperparameter optimization by running a sample classification problem. In addition, as we dive into the implementation details, you will learn how to contribute as we expand this platform to include AutoML tools.

schedule 2 months ago

Sold Out!

45 Mins

Talk

Intermediate

Anecdotally, only 2% of the models developed are productionized, i.e., used day to day to improve business outcomes. Part of the reason is the high cost and complexity of productionizing models, which is estimated to account for anywhere from 40% to 80% of the overall work.

In this talk, we will share Scribble Data’s insights into the productionization of ML, and how to reduce its cost and complexity in organizations. It is based on the last two years of work at Scribble developing and deploying a production ML feature engineering platform, and on a study of platforms from major organizations such as Uber. This talk expands on a previous talk given in January.

First, we discuss the complexity of production ML systems, and where the time and effort go. Second, we give an overview of feature engineering, which is an expensive ML task, and the associated challenges. Third, we suggest an architecture for a production feature engineering platform. Last, we discuss how you could go about building one for your organization.

schedule 4 weeks ago

Sold Out!

45 Mins

Talk

Intermediate

Various applications need a lower-dimensional representation of shapes. A midcurve is a one-dimensional (1D) representation of a two-dimensional (2D) planar shape. It is used in applications such as animation, shape matching, retrieval, finite element analysis, etc. Methods available to compute midcurves vary based on the type of the input shape (images, sketches, etc.) and on the processing approach, such as Thinning, Medial Axis Transform (MAT), Chordal Axis Transform (CAT), Straight Skeletons, etc., all of which are rule-based.

This presentation describes a novel method called MidcurveNN, which uses an encoder-decoder neural network to compute the midcurve from images of 2D thin polygons in a supervised learning manner. The dimension-reducing transformation from an input 2D thin polygon image to an output 1D midcurve image is learnt by the neural network, which can then be used to compute the midcurve of an unseen 2D thin polygonal shape.

schedule 1 month ago

Sold Out!

45 Mins

Talk

Intermediate

Beyond computer games and neural architecture search, practical applications of Deep Reinforcement Learning (D-RL) to improve classical classification or detection tasks are few and far between. In this talk, I will share a technique for, and our experiences of, applying D-RL to improve the distribution of input datasets and achieve state-of-the-art performance, specifically on object detection tasks.

Beyond open source datasets, when it comes to building neural networks for real-world problems, the dataset, which is often small and skewed, matters. The talk presents a few fresh perspectives on how to artificially increase the size of datasets while balancing the data distribution. We show that these ideas result in a 2% to 3% increase in accuracy on popular object detection tasks, whereas small and skewed datasets see up to a 22% increase in model accuracy.

schedule 1 month ago

Sold Out!

45 Mins

Talk

Beginner

Multi-label text classification is an interesting problem in which multiple tags or categories may have to be associated with a given text or document. It occurs in numerous real-world scenarios, for instance in news categorization and in bioinformatics (the gene classification problem, see [Zafer Barutcuoglu et al. 2006]). This Kaggle data set is representative of the problem: https://www.kaggle.com/jhoward/nb-svm-strong-linear-baseline/data.

Several other interesting problems in text analytics exist, such as abstractive summarization [Chen, Yen-Chun 2018], sentiment analysis, search and information retrieval, entity resolution, document categorization, document clustering, machine translation etc. Deep learning has been applied to solve many of the above problems. For instance, the paper [Rie Johnson et al. 2015] gives an early approach to applying a convolutional network to make effective use of word order in text categorization. Recurrent Neural Networks (RNNs) have been effective in various tasks in text analytics, as explained here. Significant progress has been achieved in language translation by modelling machine translation using an encoder-decoder approach with the encoder formed by a neural network [Dzmitry Bahdanau et al. 2014].

However, as shown in [Dan Rosa de Jesus et al. 2018], certain cases require modelling the hierarchical relationships in text data, which is difficult to achieve with traditional deep learning networks because linguistic knowledge may have to be incorporated in these networks to achieve high accuracy. Moreover, deep learning networks do not consider hierarchical relationships between local features, as the pooling operation of CNNs loses information about these relationships.

We show one industrial-scale use case of capsule networks, which we have implemented for our client in the realm of text analytics: news categorization. We explain how traditional deep learning methods may not be useful when only single-label data is available for training (as in many real-life cases) while the test data set is multi-labelled; this is the sweet spot for capsule networks. We also discuss the key challenges faced in the industrialization of capsule networks: starting from a scalable implementation of capsule networks in TensorFlow, we show how capsule networks can be industrialized by providing an implementation on top of Kubeflow, which helps in productionization.

1. History of impact of machine learning and deep learning on NLP.

2. Motivation for capsule networks and how they can be used in text analytics.

schedule 1 month ago

Sold Out!

45 Mins

Demonstration

Beginner

For United Airlines, running a Safe and Efficient airline is core to our business. And with such a complex operation, we need to constantly track key events that keep the airline running smoothly. While tracking these events can be time-intensive and laborious, we believe developments in deep learning and edge computing are going to help us simplify that process. Over the past few months, United’s Data Science team has been exploring how to leverage advances in computer vision to solve some of these problems. Our presentation will focus on solving one of these tasks: timing how long it takes for passengers to exit an aircraft. We’ll provide an overview of key concepts of video analytics, share how we leveraged open source technology to build a solution and provide a demonstration of our work.

schedule 4 months ago

Sold Out!

45 Mins

Tutorial

Intermediate

The field of Artificial Intelligence powered by Machine Learning and Deep Learning has gone through some phenomenal changes over the last decade. Starting off as a purely academic and research-oriented domain, it has seen widespread industry adoption across diverse domains including retail, technology, healthcare, science and many more. More often than not, the standard toolbox of machine learning, statistical or deep learning models remains the same. New models like Capsule Networks do come into existence, but their industry adoption usually takes several years. Hence, in industry, the main focus of data science or machine learning is more ‘applied’ than theoretical, and effective application of these models on the right data to solve complex real-world problems is of paramount importance.

A machine learning or deep learning model by itself consists of an algorithm which tries to learn latent patterns and relationships from data without hard-coding fixed rules. Hence, explaining how a model works to the business always poses its own set of challenges. There are some domains in industry, especially in the world of finance, like insurance or banking, where data scientists often end up having to use more traditional machine learning models (linear or tree-based). The reason is that model interpretability is very important for the business, which must explain each and every decision taken by the model. However, this often leads to a sacrifice in performance. Complex models like ensembles and neural networks typically give us better and more accurate performance (since true relationships are rarely linear in nature). With them, however, we end up being unable to properly interpret model decisions.

To address these gaps, I will take a conceptual yet hands-on approach in which we explore some of these challenges in explainable artificial intelligence (XAI) and human-interpretable machine learning in depth, and showcase some examples using state-of-the-art model interpretation frameworks in Python!

Govind Chada - Using 3D Convolutional Neural Networks with Visual Insights for Classification of Lung Nodules and Early Detection of Lung Cancer

schedule 1 month ago

Sold Out!

45 Mins

Case Study

Intermediate

Lung cancer is the leading cause of cancer death among both men and women in the U.S., with more than a hundred thousand deaths every year. The five-year survival rate is only 17%; however, early detection of malignant lung nodules significantly improves the chances of survival and prognosis.

This study aims to show that 3D Convolutional Neural Networks (CNNs) which use the full 3D nature of the input data perform better in classifying lung nodules compared to previously used 2D CNNs. It also demonstrates an approach to develop an optimized 3D CNN that performs with state-of-the-art classification accuracies. CNNs, like other deep neural networks, have been black boxes giving users no understanding of why they predict what they predict. This study, for the first time, demonstrates that Gradient-weighted Class Activation Mapping (Grad-CAM) techniques can provide visual explanations for model decisions in lung nodule classification by highlighting discriminative regions. Several CNN architectures using Keras and TensorFlow were implemented as part of this study. The publicly available LUNA16 dataset, comprising 888 CT scans with candidate nodules manually annotated by radiologists, was used to train and test the models. The models were optimized by varying the hyperparameters, to reach accuracies exceeding 90%. Grad-CAM techniques were applied to the optimized 3D CNN to generate images that provide quality visual insights into the model decision making. The results demonstrate the promise of 3D CNNs as highly accurate and trustworthy classifiers for early lung cancer detection, leading to improved chances of survival and prognosis.
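The core Grad-CAM computation is small enough to sketch in plain Python. The sketch assumes the activations of the last convolutional layer and the gradients of the class score with respect to those activations have already been extracted from the network, and that each feature map is flattened to a list of voxel values, so the same code applies to 2D or 3D maps. It is an illustration of the general technique, not the study's Keras/TensorFlow code, and the final rescaling to [0, 1] is a common visualization convention added here.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap.

    activations: K feature maps, each a flat list of N voxel values (assumed given).
    gradients: gradients of the class score w.r.t. those activations, same shape.
    Returns a list of N heat values, rescaled to [0, 1] when possible.
    """
    n_voxels = len(activations[0])
    # Channel importance: global average of the gradients per feature map.
    alphas = [sum(g) / len(g) for g in gradients]

    heatmap = []
    for v in range(n_voxels):
        weighted = sum(a * fmap[v] for a, fmap in zip(alphas, activations))
        # ReLU keeps only regions that push the class score up.
        heatmap.append(max(0.0, weighted))

    peak = max(heatmap)
    return [h / peak for h in heatmap] if peak > 0 else heatmap


activations = [[1.0, 0.0, 2.0, 0.0],
               [0.0, 3.0, 0.0, 1.0]]
gradients = [[0.5, 0.5, 0.5, 0.5],
             [-1.0, -1.0, -1.0, -1.0]]
print(grad_cam(activations, gradients))  # [0.5, 0.0, 1.0, 0.0]
```

The resulting map is usually upsampled to the input resolution and overlaid on the scan so that the highlighted regions can be compared with the annotated nodule.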

Anant Jain - Adversarial Attacks on Neural Networks

schedule 3 months ago

Sold Out!

45 Mins

Talk

Intermediate

Since 2014, adversarial examples in Deep Neural Networks have come a long way. This talk aims to be a comprehensive introduction to adversarial attacks, including various threat models (black box/white box) and approaches to creating adversarial examples, and will include demos. The talk will dive deep into the intuition behind why adversarial examples exhibit the properties they do, in particular transferability across models and training data, as well as high confidence of incorrect labels. Finally, we will go over various approaches to mitigate these attacks (Adversarial Training, Defensive Distillation, Gradient Masking, etc.) and discuss what seems to have worked best over the past year.
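As one concrete example of a white-box attack, the Python sketch below applies the fast gradient sign method (FGSM) to a tiny logistic regression model, where the gradient of the loss with respect to the input can be written by hand. The model weights and input are made up for illustration, and the sketch is not drawn from the talk's demos.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, epsilon):
    """Fast gradient sign method against logistic regression.

    For cross-entropy loss, d(loss)/dx = (p - y) * w, so the attacker
    needs the model parameters (white-box access).
    Each input coordinate moves by at most epsilon.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


w, b = [2.0, -1.0], 0.0          # hypothetical trained weights
x, y = [1.0, 0.5], 1             # correctly classified as class 1
x_adv = fgsm(w, b, x, y, epsilon=0.6)

print(predict_proba(w, b, x) > 0.5)      # True
print(predict_proba(w, b, x_adv) > 0.5)  # False
```

For deep networks the same step is taken with gradients obtained by backpropagation, and much smaller values of epsilon are often enough in high-dimensional inputs such as images.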