What's New (Analysis Services - Data Mining)

SQL Server 2008 R2

Note

For SQL Server 2008 R2, there have been no changes to the content that is listed in this topic.

This latest release of Microsoft SQL Server Analysis Services introduces new features and enhancements. For information about the new in-memory analysis features, and about Sharepoint integrated mode for Analysis Services, see PowerPivot for Sharepoint, see PowerPivot for SharePoint.

SQL Server 2008 R2 supports the creation, management, and use of data mining models from Microsoft Excel when you use the SQL Server 2008 Data Mining Add-ins for Office 2007. The version of this popular free add-in can be used to connect to instances of Analysis Services that are using either SQL Server 2008 R2 or SQL Server 2008. You cannot directly use the in-memory multidimensional data sets that are created by PowerPivot for Excel.

You can install the Data Mining Add-ins on the same computer as the PowerPivot for Excel add-in, and use them within the same Excel 2010 workbook. However, to use the Data Mining add-ins you must have installed a 32-bit version of Excel 2010. The PowerPivot client can run on either a 32-bit or 64-bit version of Excel 2010.

Books Online no longer includes SQL Server sample databases and sample applications. These sample databases and sample applications are now available on the SQL Server Samples Web site. This Web site makes it easier for users to find these samples, and provides additional new samples that are related to Microsoft SQL Server and Business Intelligence. On the SQL Server Samples Web site, you can do the following:

Browse through samples contributed by developers, users, and the Microsoft Most Valuable Professional (MVP) community.

Download both sample databases and code projects.

View or participate in a discussion area where you can report issues and ask questions about the samples for each technology area.

Additional samples that use the PowerPivot client and the new Sharepoint Integrated instance of Analysis Services can be found on the site, PowerPivot.com.

The R2 release of SQL Server 2008 supports the following features that were new in SQL Server 2008.

Creation of Holdout Test Sets

When you create a mining structure, you can now divide the data in the mining structure into training and testing sets. The definition of the partition is stored with the structure, so that you can reuse the training and testing sets with any mining models that are based on that structure.

Filtering on Model Cases

You can now attach filters to a mining model, and apply the filter during both training and testing. Applying a filter to the model lets you control the data that is used to train the model, and lets you more easily assess the performance of the model on subsets of the data.

Cross-Validation of Multiple Mining Models

Cross-validation is an established method of assessing the accuracy of data mining models. In cross-validation, you iteratively partition the mining structure data into subsets, build models on the subsets, and then measure the accuracy of the model for each partition. By reviewing the returned statistics, you can determine how reliable the mining model is, and more easily compare models that are based on the same structure.

Enhancements to the Microsoft Time Series Algorithm

To improve the accuracy and stability of some predictions in time series models, a new algorithm has been added to the Microsoft Time Series algorithm. Based on the well-known ARIMA algorithm, the new algorithm provides better long-term predictions than the ARTxp algorithm that Analysis Services has been using. (ARTxp is an auto-regressive tree algorithm that is optimized for either a single time slice or short-term predictions.)

Drillthrough to Structure Cases and Structure Columns

In SQL Server 2008, if you enable drillthrough on a mining structure, you can query the mining structure and return details about the cases used for both training and testing. You can create drillthrough queries on a structure by using Data Mining Extensions (DMX).

Querying the Data Mining Schema Rowsets

In SQL Server 2008. many of the existing OLE DB data mining schema rowsets have been exposed as a set of system tables that you can easily query by using DMX statements. This makes it easy to retrieve metadata related to models and structures, to extract details from the mining model content, or to monitor an Analysis Services instance or service.

In SQL Server 2008, the default configuration of the Microsoft clustering algorithm was changed to use z-score normalization by default. The intent of this change is to minimize the effect of attributes that might have large magnitudes and many outliers. Typically, z-score normalization improves clustering results. However, it might alter the clustering results on non-normal distributions. Also, customers who migrate solutions from an earlier version of Analysis Services to SQL Server 2008 Analysis Services might notice that clustering models now produce different results. For more information, see Microsoft Clustering Algorithm Technical Reference

If you use Analysis Services to create OLAP cubes that you also use for data mining, you might find it much easier to design dimensions and their related hierarchies and attributes. The Dimension Designer includes a new Attribute Relationship designer that helps you design attribute relationships and make sure that attribute relationships follow best practices.