Tag: Hadoop

5 Big Data Security Challenges

As interest in big data has exploded, organizations have rushed to grab competitive advantage by deploying analytics pipelines that exploit this newly available resource. Many projects have been set up in a “skunkworks” environment, often by data science teams. While this has accelerated the time to market for new features, it has created a potential […]

Polybase: Big (Data) Things in SQL Server 2016 – Part 2

Part Two: Transact-SQL Queries of Hadoop Data
Three tasks are necessary if we are to query Hadoop data with Polybase. We must define the connection information, we must define how the semi-structured Hadoop data is to be parsed, and we must specify the row-and-column format for the parsed data that we will use to write […]

Polybase: Big (Data) Things in SQL Server 2016 – Part 1

Installation and Configuration of SQL Server 2016 Polybase
Introduction
Among the many new features introduced in SQL Server 2016 is Polybase, a bridge between relational data stored in SQL Server and bulk data stored in the Hadoop ecosystem. Hadoop is an excellent store for what many folks call “semistructured” data. That is, data which cannot be […]

How to Use Power Query to Import Hadoop Data into Excel

When Power Query was first introduced early in 2013, it was known as Data Explorer. In some ways, Data Explorer was a better name. The primary job of Power Query is to enable Excel users to examine data, decide what values need to be imported into Excel, and then complete the import process. If […]

How to Build a Predictive Model Using Azure Machine Learning

In this article, we’ll use Microsoft’s Azure Machine Learning (ML) service to predict breast cancer diagnoses from test data. If you don’t have an Azure account, a free trial is available. Note that, at the time of writing, ML is in preview, so the details may change. However, the basic concepts should still apply. Obtaining […]