
Migrating Existing Open Source Machine Learning to Azure

Your data scientists have created predictive models using open-source tools, proprietary software, or some combination of both, and now you are interested in lifting and shifting those models to the cloud. In this talk, I'll describe how data scientists can transition their existing workflows — while using mostly the same tools and processes — to train and deploy machine learning models based on open source frameworks to Azure. I'll provide guidance on keeping connections to data sources up-to-date, evaluating and monitoring models, and deploying applications that make use of those models.

A typical data scientist's coding workflow looks something like this: you start small with a single Data Science Virtual Machine (DSVM) and perfect the code on just a subset of the data; don't worry about big data right away, so you can keep a good coding pace. Once you are satisfied the code works on a single machine, try to scale it up to larger VMs, and consider whether GPUs or HPC-type configurations can help. Finally, start working with your full dataset; by then you should have a good idea of what kind of configuration you may need, based on the single-VM and scale-up scenarios. From your DSVM desktop you can connect to remote Spark nodes, submit jobs to a Batch pool, or leverage a scale set that can autoscale. The bottom line: use the DSVM for the tools, and use different Azure services to help you scale up and scale out.

Azure Batch provides APIs for creating pools of resources, and then scheduling jobs and tasks to those resources. And the best part is that there is no charge for using Batch itself: you pay only for the compute and storage resources you consume.
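As a rough illustration, here is a minimal sketch of that pool/job/task flow using the azure-batch Python SDK. The account name, key, URL, pool/job/task IDs, and the train.py command line are all placeholders, and the exact client constructor arguments vary slightly across SDK versions.

    import azure.batch.models as batchmodels
    from azure.batch import BatchServiceClient
    from azure.batch.batch_auth import SharedKeyCredentials

    # Placeholder credentials -- substitute your own Batch account values.
    credentials = SharedKeyCredentials("mybatchaccount", "<account-key>")
    client = BatchServiceClient(
        credentials, "https://mybatchaccount.westus2.batch.azure.com")

    # Create a pool of Ubuntu VMs sized for the training workload.
    pool = batchmodels.PoolAddParameter(
        id="train-pool",
        vm_size="STANDARD_D2_V3",
        virtual_machine_configuration=batchmodels.VirtualMachineConfiguration(
            image_reference=batchmodels.ImageReference(
                publisher="canonical", offer="ubuntuserver",
                sku="18.04-lts", version="latest"),
            node_agent_sku_id="batch.node.ubuntu 18.04"),
        target_dedicated_nodes=2,
    )
    client.pool.add(pool)

    # Create a job bound to the pool, then queue a task. A task is just a
    # command line, so existing training scripts run unchanged.
    client.job.add(batchmodels.JobAddParameter(
        id="train-job",
        pool_info=batchmodels.PoolInformation(pool_id="train-pool")))
    client.task.add("train-job", batchmodels.TaskAddParameter(
        id="train-task-1",
        command_line="python train.py --epochs 10"))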

We understand that cost savings are paramount for customers.

Azure offers flexible consumption: mix and match low-priority VMs at discounted rates with on-demand VMs, along with per-minute billing, to fit your priorities and budget.
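Concretely, a single Batch pool can blend the two pricing tiers. A minimal sketch, where the pool ID, VM size, and node counts are illustrative and vm_config stands in for a VirtualMachineConfiguration like the one in the earlier sketch:

    import azure.batch.models as batchmodels

    pool = batchmodels.PoolAddParameter(
        id="mixed-pool",
        vm_size="STANDARD_D2_V3",
        virtual_machine_configuration=vm_config,  # e.g. the Ubuntu config above
        target_dedicated_nodes=2,      # on-demand VMs, billed at standard rates
        target_low_priority_nodes=8,   # preemptible VMs at a deep discount
    )
    client.pool.add(pool)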

Left: shared data stores, both cloud and on-premises. Center: DSVMs as development environments in the cloud. Right: trained models and code deployed from the DSVM to other production systems, or DSVMs used in production as well.


3.
Visual Studio [Code] Tools for AI
• VS & VS Code extensions to streamline computations in servers, Azure ML, Batch AI, …
• End-to-end development environment, from new project through training
• Support for remote training & job management
• On top of all of the goodness of VS (Python, Jupyter, Git, etc.)
See also THR3129, Getting Started with Visual Studio Tools for AI, Chris Lauren

10.
Azure Batch

Batch pools
• Configure and create VMs to cater for any scale: tens to thousands.
• Automatically scale the number of VMs to maximize utilization.
• Choose the VM size most suited to your application.

Batch jobs and tasks
• A task is a unit of execution; task = command-line application.
• Jobs are created and tasks are submitted to a pool; tasks are queued, then assigned to VMs.
• Any application, any execution time; run applications unchanged.
• Automatic detection and retry of frozen or failing tasks.
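The automatic scaling mentioned above is driven by an autoscale formula that Batch evaluates on an interval. A hedged sketch, reusing the client and pool ID from the earlier example; the formula follows the documented Azure Batch autoscale syntax, and the 70% sampling threshold and 100-node cap are illustrative choices:

    import datetime

    # Scale the pool toward the number of pending tasks (capped at 100 nodes),
    # and release nodes only after their current task completes.
    formula = """
    $samples = $PendingTasks.GetSamplePercent(TimeInterval_Minute * 5);
    $tasks = $samples < 70 ? max(0, $PendingTasks.GetSample(1)) :
             max($PendingTasks.GetSample(1),
                 avg($PendingTasks.GetSample(TimeInterval_Minute * 5)));
    $TargetDedicatedNodes = min(100, $tasks);
    $NodeDeallocationOption = taskcompletion;
    """

    client.pool.enable_auto_scale(
        "train-pool",
        auto_scale_formula=formula,
        auto_scale_evaluation_interval=datetime.timedelta(minutes=5))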

15.
Traditional / On-Premises Paradigm
• Traditionally, static-sized clusters were the standard, so compute and storage had to be collocated.
• A single cluster with all necessary applications installed would serve everyone (typically managed by YARN or something similar).
• The cluster was either over-utilized (jobs had to be queued due to lack of capacity) or under-utilized (idle cores burned costs).
• Teams of data scientists had to submit jobs against a single cluster; this meant the cluster had to be generic, preventing users from truly customizing their clusters specifically for their jobs.
[Diagram: shared DataStore]

16.
Modern / Cloud Paradigm
• With cloud computing, customers are no longer limited to static-sized clusters.
• Each job, or set of jobs, can have its own cluster, so a customer is charged only for the minutes the job runs.
• Each user can have their own cluster, so they don't have to compete for resources.
• Each user can have a custom cluster created specifically for their experience and workload, and can install exactly the software they need without polluting other users' environments.
• IT admins don't need to worry about running out of capacity or burning dollars on idle cores.
[Diagram: shared DataStore]