Book Details

About This Book

Understand the different phases of data mining, along with the tools used at each stage

Explore the different data mining algorithms in depth

Become an expert in optimizing algorithms and situation-based modeling

Who This Book Is For

If you are a developer who is working on data mining for large companies and would like to enhance your knowledge of SQL Server Data Mining Suite, this book is for you. Whether you are brand new to data mining or are a seasoned expert, you will be able to master the skills needed to build a data mining solution.

Table of Contents

Chapter 1: Identifying, Staging, and Understanding Data

Data mining life cycle

Staging data

Understanding and cleansing data

Summary

Chapter 2: Data Model Preparation and Deployment

Preparing data models

Validating data models

Deploying data models

Summary

Chapter 3: Tools of the Trade

SQL Server BI Suite

References

Summary

Chapter 4: Preparing the Data

Listing of popular databases

Summary

Chapter 5: Classification Models

Input, output, and predicted columns

The feature selection

The Microsoft Decision Tree algorithm

The Microsoft Neural Network algorithm

The Microsoft Naïve Bayes algorithm

Summary

Chapter 6: Segmentation and Association Models

The Microsoft Clustering algorithm

The Microsoft Association algorithm

Summary

Chapter 7: Sequence and Regression Models

The Microsoft Sequence Clustering algorithm

The Microsoft Time Series algorithm

Summary

Chapter 8: Data Mining Using Excel and Big Data

Data mining using Microsoft Excel

Data mining using HDInsight and Microsoft Azure Machine Learning

Summary

Chapter 9: Tuning the Models

Getting the real-world data

Adding a clustering model to the data mining structure

Adding the Neural Network model to the data mining structure

Summary

Chapter 10: Troubleshooting

A fraction of rows get transferred into a SQL table

Error during changing of the data type of the table

Troubleshooting the data mining structure performance

Error during the deployment of a model

Summary

What You Will Learn

Get an overview of the data mining life cycle

Understand the intricacies of SQL Server BI Suite with the help of a practical example

Collate data from diverse data sources and build a data warehouse

Gain in-depth knowledge of the various data mining models, including classification, segmentation, association, sequence, and regression models

Perform data mining using Big Data and Excel add-ins

Work on real-world data and gain insights into it using various data mining algorithms

Fine-tune data mining models

Troubleshoot problems encountered during data mining activities performed in this book

In Detail

Whether you are new to data mining or are a seasoned expert, this book will provide you with the skills you need to successfully create, customize, and work with Microsoft Data Mining Suite. Starting with the basics, this book will cover how to clean the data, design the problem, and choose a data mining model that will give you the most accurate prediction.

Next, you will be taken through the various classification models, such as the decision tree, neural network, and Naïve Bayes models. Following this, you'll learn about the clustering and association algorithms, along with the sequencing and regression algorithms, and understand the data mining expressions associated with each algorithm. With ample screenshots that offer a step-by-step account of how to build a data mining solution, this book will ensure your success with this cutting-edge data mining system.

Authors

Amarpreet Singh Bassan

Amarpreet Singh Bassan is a Microsoft Data Platform engineer who works on SQL Server and its surrounding technologies. He is a subject matter expert in SQL Server Analysis Services and Reporting Services. Amarpreet is also part of Microsoft's HDInsight team.

Debarchan Sarkar

Debarchan Sarkar is a Microsoft Data Platform engineer. He specializes in the Microsoft SQL Server Business Intelligence stack. Debarchan is a subject matter expert in SQL Server Integration Services and delves deep into the open source world, specifically the Apache Hadoop framework. He is currently working on a technology called HDInsight, which is Microsoft's distribution of Hadoop on Windows. He has authored various books on SQL Server and Big Data, including Microsoft SQL Server 2012 with Hadoop (Packt Publishing) and Pro Microsoft HDInsight: Hadoop on Windows (Apress). His Twitter handle is @debarchans.
