Machine Learning

KEY FEATURES
* An in-depth exploration of Julia's growing ecosystem of packages
* Work with the most powerful open-source libraries for deep learning, data wrangling, and data visualization
* Learn about deep learning using Mocha.jl and bring speed and high performance to data analysis on large data sets
BOOK DESCRIPTION
Julia is a fast, high-performing language that is well suited to data science, has a mature package ecosystem, and is now feature complete. It is a good tool for a data science practitioner. A famous Harvard Business Review article called data scientist the sexiest job of the 21st century (https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century).
This book will help you become familiar with Julia's rich ecosystem, which is continuously evolving, allowing you to stay on top of your game.
This book contains the essentials of data science and gives a high-level overview of advanced statistics and techniques. You will generate insights by performing inferential statistics and reveal hidden patterns and trends using data mining. The book offers practical coverage of statistics and machine learning. You will develop the knowledge to build statistical models and machine learning systems in Julia with attractive visualizations.
You will then delve into deep learning in Julia and learn the Mocha.jl framework, with which you can create artificial neural networks and implement deep learning.
This book addresses the challenges of real-world data science problems, including data cleaning, data preparation, inferential statistics, statistical modeling, building high-performance machine learning systems and creating effective visualizations using Julia.
WHAT YOU WILL LEARN
* Apply statistical models in Julia for data-driven decisions
* Understand the process of data munging and data preparation using Julia
* Explore techniques to visualize data using Julia and D3-based packages
* Use Julia to create self-learning systems with cutting-edge machine learning algorithms
* Create supervised and unsupervised machine learning systems using Julia, and explore ensemble models
* Build a recommendation engine in Julia
* Dive into Julia’s deep learning framework and build a system using Mocha.jl
ABOUT THE AUTHOR
Anshul Joshi is a data science professional with more than 2 years of experience, primarily in data munging, recommendation systems, predictive modeling, and distributed computing. He is a deep learning and AI enthusiast. Most of the time, he can be caught exploring GitHub or trying anything new he can get his hands on. He blogs at anshuljoshi.xyz.
TABLE OF CONTENTS
1. The Groundwork – Julia's Environment
2. Data Munging
3. Data Exploration
4. Deep Dive into Inferential Statistics
5. Making Sense of Data Using Visualization
6. Supervised Machine Learning
7. Unsupervised Machine Learning
8. Creating Ensemble Models
9. Time Series
10. Collaborative Filtering and Recommendation System
11. Introduction to Deep Learning

Key Features
Learn to use various data analysis tools and algorithms to classify, cluster, visualize, simulate, and forecast your data
Apply machine learning algorithms to different kinds of data such as social networks, time series, and images
A hands-on guide to understanding the nature of data and how to turn it into insight
Book Description
Beyond buzzwords like Big Data or Data Science, there are great opportunities to innovate in many businesses by using data analysis to build data-driven products. Data analysis involves asking many questions about data in order to discover insights and generate value for a product or a service. This book explains the basic data algorithms without the theoretical jargon, and you'll get hands-on experience turning data into insights using machine learning techniques. We will perform data-driven innovation processing for several types of data such as text, images, social network graphs, documents, and time series, showing you how to implement large-scale data processing with MongoDB and Apache Spark.
What you will learn
Acquire, format, and visualize your data
Build an image-similarity search engine
Generate meaningful visualizations anyone can understand
Get started with analyzing social network graphs
Find out how to implement sentiment text analysis
Install data analysis tools such as Pandas, MongoDB, and Apache Spark
Get to grips with Apache Spark
Implement machine learning algorithms such as classification or forecasting
About the Author
Hector Cuesta is founder and Chief Data Scientist at Dataxios, a machine intelligence research company. He holds a BA in Informatics and an M.Sc. in Computer Science. He provides consulting services for data-driven product design, with experience in a variety of industries including financial services, retail, fintech, e-learning, and human resources. He is a robotics enthusiast in his spare time.

About This Book
Perform data analysis and build predictive models on huge datasets that leverage Apache Spark
Learn to integrate data science algorithms and techniques with the fast and scalable computing features of Spark to address big data challenges
Work through practical examples on real-world problems with sample code snippets
Who This Book Is For
This book is for anyone who wants to leverage Apache Spark for data science and machine learning. If you are a technologist who wants to expand your knowledge to perform data science operations in Spark, or a data scientist who wants to understand how algorithms are implemented in Spark, or a newbie with minimal development experience who wants to learn about Big Data Analytics, this book is for you!
What You Will Learn
Consolidate, clean, and transform your data acquired from various data sources
Perform statistical analysis of data to find hidden insights
Explore graphical techniques to see what your data looks like
Use machine learning techniques to build predictive models
Build scalable data products and solutions
Start programming using the RDD, DataFrame and Dataset APIs
Become an expert by improving your data analytical skills
In Detail
This is the era of Big Data. The term ‘Big Data’ implies big innovation and enables a competitive advantage for businesses. Apache Spark was designed to perform Big Data analytics at scale, so Spark is equipped with the necessary algorithms and supports multiple programming languages. Whether you are a technologist, a data scientist, or a beginner to Big Data analytics, this book will provide you with all the skills necessary to perform statistical data analysis, data visualization, and predictive modeling, and to build scalable data products or solutions using Python, Scala, and R.

Using machine learning to gain deeper insights from data is a key skill required by modern application developers and analysts alike. Python is a wonderful language to develop machine learning applications. As a dynamic language, it allows for fast exploration and experimentation. With its excellent collection of open source machine learning libraries, you can focus on the task at hand while being able to quickly try out many ideas.
This book shows you exactly how to find patterns in your raw data. You will start by brushing up on your Python machine learning knowledge and getting introduced to the libraries. You'll quickly get to grips with serious, real-world projects on datasets, using modeling and creating recommendation systems. Later on, the book covers advanced topics such as topic modeling, basket analysis, and cloud computing. These will extend your abilities and enable you to create large, complex systems.
With this book, you gain the tools and understanding required to build your own systems, tailored to solve your real-world data analysis problems.

In the past few years, the generation of data and our capability to store and process it have grown exponentially. There is a need for scalable analytics frameworks and people with the right skills to get the information needed from this Big Data. Apache Mahout is one of the first and most prominent Big Data machine learning platforms. It implements machine learning algorithms on top of distributed processing platforms such as Hadoop and Spark.
Starting with the basics of Mahout and machine learning, you will explore prominent algorithms and their implementation in Mahout development. You will learn about Mahout building blocks, addressing feature extraction, reduction, and the curse of dimensionality, then delve into classification use cases with the random forest and Naïve Bayes classifiers and into item-based and user-based recommendation. You will then work with clustering in Mahout using the K-means algorithm and implement Mahout without MapReduce. Finish with a flourish by exploring end-to-end use cases on customer analytics and test analytics to gain real-life, practical know-how of analytics projects.
Who This Book Is For
If you are a Java developer and want to use Mahout and machine learning to solve Big Data Analytics use cases, then this book is for you. Familiarity with shell scripts is assumed but no prior experience is required.

Learn a simpler and more effective way to analyze data and predict outcomes with Python
Machine Learning in Python shows you how to successfully analyze data using only two core machine learning algorithms, and how to apply them using Python. By focusing on two algorithm families that effectively predict outcomes, this book is able to provide full descriptions of the mechanisms at work, along with examples that illustrate the machinery with specific, hackable code. The algorithms are explained in simple terms with no complex math and applied using Python, with guidance on algorithm selection, data preparation, and using the trained models in practice. You will learn a core set of Python programming techniques, various methods of building predictive models, and how to measure the performance of each model to ensure that the right one is used. The chapters on penalized linear regression and ensemble methods dive deep into each of the algorithms, and you can use the sample code in the book to develop your own data analysis solutions.
Machine learning algorithms are at the core of data analytics and visualization. In the past, these methods required a deep background in math and statistics, often in combination with the specialized R programming language. This book demonstrates how machine learning can be implemented using the more widely used and accessible Python programming language.

The R language is a powerful open source functional programming language. At its core, R is a statistical programming language that provides impressive tools to analyze data and create high-level graphics.
This book covers the basics of R by setting up a user-friendly programming environment and performing data ETL in R. Data exploration examples are provided that demonstrate how powerful data visualization and machine learning are in discovering hidden relationships. You will then dive into important machine learning topics, including data classification, regression, clustering, association rule mining, and dimension reduction.
Who This Book Is For
If you want to learn how to use R for machine learning and gain insights from your data, then this book is ideal for you. Regardless of your level of experience, this book covers the basics of applying R to machine learning through to advanced techniques. While it is helpful if you are familiar with basic programming or machine learning concepts, you do not require prior experience to benefit from this book.

Clojure for Machine Learning is an introduction to machine learning techniques and algorithms. This book demonstrates how you can apply these techniques to real-world problems using the Clojure programming language.
It explores many machine learning techniques and also describes how to use Clojure to build machine learning systems. This book starts off by introducing the simple machine learning problems of regression and classification. It also describes how you can implement these machine learning techniques in Clojure. The book also demonstrates several Clojure libraries, which can be useful in solving machine learning problems.

Apache Spark is a framework for distributed computing that is designed from the ground up to be optimized for low-latency tasks and in-memory data storage. It is one of the few frameworks for parallel computing that combines speed, scalability, in-memory processing, and fault tolerance with ease of programming and a flexible, expressive, and powerful API design.
This book guides you through the basics of Spark's API, used to load and process data and to prepare the data as input to the various machine learning models. There are detailed examples and real-world use cases for you to explore common machine learning models, including recommender systems, classification, regression, clustering, and dimensionality reduction. You will cover advanced topics such as working with large-scale text data, as well as methods for online machine learning and model evaluation using Spark Streaming.