A New Pill for Your SSIS ETL Headaches

If you’re working with SQL Server 2005 Integration Services (SSIS), you might already have discovered one of the problems with its ETL process: SSIS doesn’t provide an easy way to extract, transform, and load data from unstructured or semi-structured data sources such as Microsoft Excel spreadsheets and reports, raw text files, Oracle databases, and ODBC data sources. According to Vassil Kovatchev, chief technology officer of Interactive Edge, the current SSIS solution for bringing unstructured data into the data flow is to write hard-coded, custom scripts, which is a time-consuming manual process. Interactive Edge aims to provide a more satisfactory solution with its new Visual Studio plug-in component, DataDefractor. In a recent conversation with our editors, Kovatchev gave the example of a real-estate report in Excel that displayed time-period information in both columns and rows (years in columns, months in rows). Kovatchev explained, “Writing a script to transform a report like this would typically take about five days. With DataDefractor, the transformation takes about ten minutes.”

A time savings of that size is enough to make most database professionals sit up and take notice. The DataDefractor tool is a custom SSIS data flow source component that’s fact-oriented and rules-based. Its wizard-like interface lets you define dimensions and measures to quickly transform unstructured or semi-structured data into normalized, usable data. How did Microsoft overlook the need for this kind of component in SSIS? As Kovatchev explained, Microsoft is platform-oriented and is happy to rely on ISVs to fill in the gaps in the platforms it creates. Companies like Interactive Edge can then find opportunities to provide useful tools that make database pros’ lives easier. DataDefractor, which is currently in beta, is scheduled for official release on March 16.
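
To make the reshaping in Kovatchev’s real-estate example concrete, here is a minimal Python sketch of the underlying idea: taking a report with months in rows and years in columns and turning it into one normalized record per cell. It is not DataDefractor’s implementation and does not use SSIS. The function name `unpivot`, the report layout, and the figures are all assumptions invented for illustration.

```python
from typing import List, Tuple


def unpivot(report: List[List[str]]) -> List[Tuple[int, str, float]]:
    """Turn a crosstab (months down the rows, years across the columns)
    into one (year, month, value) record per non-blank cell."""
    header, *body = report
    # The first header cell is the corner label; the rest are year labels.
    years = [int(label) for label in header[1:]]
    records = []
    for row in body:
        month, cells = row[0], row[1:]
        for year, cell in zip(years, cells):
            if cell == "":
                # Skip blank cells rather than guessing a value.
                continue
            records.append((year, month, float(cell)))
    return records


# Hypothetical figures, invented for illustration only.
report = [
    ["Month", "2004", "2005"],
    ["Jan", "120", "135"],
    ["Feb", "98", ""],
]
rows = unpivot(report)
```

Each resulting tuple pairs two dimension values (year and month) with one measure, which is the shape a fact table expects. A real report would also need rules for titles, subtotals, and merged cells, which is where hand-written scripts tend to become time-consuming.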
