Analyzing Money

Most analysis solutions I work on contain other people's data, and I maintain a professional distance from the confidential details that data might re­veal. If you’re a DBA or developer, you’re probably in a similar situation. But you gain more insight about OLAP when you analyze data that means something to you personally. My first meaningful experience with OLAP came when a friend sent me a cube containing 13 years of college football data. The cube contained all the teams and their win and loss statistics in different playing situations. I spent a couple of hours intently pivoting, drill­ing down, charting, and creating custom measures, and I found some interesting trends in the data. For ex­ample, some college teams are more likely to win on the road than at their home stadiums.

One way you can set up a cube that has personal meaning is to put your Microsoft Money data into an OLAP cube to explore. If you don’t track your finances in Money—or if looking at your finances would be too depressing—you might still find this example valuable because it demonstrates all the steps involved in constructing a cube. You can download a trial version of Money from the Microsoft Web site and experiment with the Money sample file. I haven't tried to extract data from another application such as Intuit Quicken, but I’m sure if you use one, you can find a similar technique to extract data and build a cube. I think you'll find that being a user of one of your own cubes will help you create better analysis solutions. To build a cube, you need to get the data out of the application (in this case, Money) and into a table format (I chose Microsoft Access), then de­sign a cube in Analysis Manager and pro­cess the cube (load the data from the table format into the cube).

Extracting the Data

The first hurdle was to get the data out of Money. The task seemed trivial at first, but the process wasn't ob­vious. I use Money 2003 Deluxe, and the only export option I found created a Quicken Import Format (QIF) file containing the data from only one account. This data would be usable, but exporting the data from each of my accounts (e.g., check­ing, credit cards, savings, investment, IRA) individually would be a tedious process. Plus, QIF isn't particularly database friendly; every field is on a different line, and the fields are wrapped in punctuation marks. I'd need some time to develop a script or small application to mas­sage the QIF data into a for­mat that I could easily load into a database table. Then, I discovered that I could generate a report in Money and export the report data to a comma-delimited format (CSV) file. That was definitely the ticket because the applications I wanted to use—Microsoft Excel, Access, and Analysis Manager—can readily access CSV files.

My goal was to export all the individual transactions (such as a check or a credit-card charge) but ignore inter­account transfers (such as a payment from checking to my credit card). I didn’t want these transactions be­cause I wanted to limit my analysis to my true expenses and income. I discovered a way to achieve this limited focus by creating what Money calls a "Transactions by Category" report. After I generated this report, I chose the Customize option and changed a few settings. On the Rows & Columns tab, I checked the boxes to include all fields, I set subtotaling to None, and I selected the Show Splits box. These settings gave me the maximum amount of detailed transaction data. Next, I flipped to the Category tab and clicked Select All, then cleared the Include Transfers box to include all the data except the inter­account transfers. Next, I flipped to the Date tab and typed the date range of the transactions I wanted to export: 1/1/2002 to the current date. (If you’re using the Money sample file, you should choose 1/1/2000 to 1/1/2002.) Finally, I clicked OK and saw the report containing all the data I wanted to export.

When you’ve generated your report, you’ll see on the left side of the report under the heading Other Tasks the option Export to Microsoft Excel. This option generates a CSV file, launches Excel, and opens a spread­sheet that contains your exported data. Click the Export to Microsoft Excel op­tion, and type the name of the CSV file (I chose Sample .csv). After you select the filename and click OK, you’ll see your data in Excel.

Now comes the fun part. Any experienced DBA knows that data is never in the right format for a direct import. When you in­spect the Money data, you’ll see some problems. A blank line separates the transactions, the Memo field in investment transactions contains multiple pieces of information, the category field contains both the category and subcategory, and split transactions are a mess. You have to create a script or use an extraction, transformation, and loading (ETL) tool to fix all these problems. Every DBA has a favorite ETL tool, but I simply used Visual Basic for Applications (VBA) in Excel. I created a VBA macro that steps through Excel and reformats the data so that I can import it into Access and, from there, build my cube. For more information about the macro I cre­ated, see “Money Macro.”

If you have an OLE DB driver that reads Excel documents, you could go directly from this step to designing and processing the cube. I didn’t have the necessary driver, so I first loaded the Excel data from Sample.xls (the Excel version of Sample.csv) into an Ac­cess table. To do this, start Access and choose File, New, Blank Database. Pick a filename such as MSMoneySample.mdb. From the File menu, select Get External Data, Import. Select Sample.xls, and Access will show you the Import Spreadsheet Wiz­ard. In the wiz­ard, you need to make a few changes. On the wizard’s first page, select the First Row Contains Column Headings option. In the first column of the third page, change the field option called Indexed to No. Finally, click Finish, and close Access.

Creating the Cube

Now you have the data in a format that you can load into a cube. Launch Analysis Manager to start the cube-creation process. Table 1 shows the dimension structure I chose for the Money cube. In addition to the dimensions that Table 1 shows, I considered making a dimension out of the Reconciled column, but I couldn’t determine any value in analyz­ing reconciled transactions versus unreconciled transactions.

I won’t bore you with all the steps that Analysis Manager takes to create the cube, but I’ll give you a couple of hints that might help if you haven’t created a cube before. First, right-click the Analysis Server you want to use, and choose New Database to create a database. Next, right-click the Data Sources folder in your new database, and select New Data Source. In this step, you can choose the Microsoft Jet 4.0 OLE DB driver if you have the data in Access, as I did. Finally, right-click the Cubes folder, and select New Cube, Wizard to step through the process of creating your dimensions and measures.

After you’ve defined your dimensions and measures, I suggest making two changes from the defaults in the cube editor. First, format the Amount measure as currency, and second, enable drillthrough. You can change the display format of a measure by selecting the measure in the cube editor, displaying the properties, and flipping to the Advanced tab of the properties. You’ll see a property labeled Display Format that has a combo box of choices. You can also enable drillthrough from the cube editor. From the Tools menu, select Drillthrough Options. In the Cube Drillthrough Options dialog box, click Enable Drillthrough, and select the columns you want your OLAP front-end application to display when you drill through a cell of the cube. (I recommend select­ing all the columns.)

Drillthrough lets you see individual trans­actions that make up cell values in the cube. For example, say your money-management application says you spent $1543.21 on din­ing out in March, and you can’t believe it. Drill­through lets you see ex­actly which transactions you, your spouse, or someone else in the family entered in the Dining Out category for March.

If you’re like me, every chart or grid of numbers you see that’s based on the Money data will answer one question and prompt you to ask two more. If you have a good OLAP front-end tool, you can quickly navigate the data by pivoting in and out of di­mensions. Drill down or up, and you’ll learn things about your finances that you never knew before. You could learn these things by running reports in Money, but analyzing the re­ports would take you a long time, and you mightgiveup beforeyou got any information.

Using What You've Learned

The approach I used to build a cube from my Money data reveals problems you might en­counter when sourcing data for any anal­ysis application. For example, you need to be sure to explore the source system's reporting capabilities as an option for ex­tracting data. As I discovered in Money, you might find the results that the reporting capability re­turns to be closer to the data for­mat you want. Also, the technique I used in Excel to massage the data is an example of how to be creative in fixing data-quality or format problems when you're performing ETL of your source data.

Besides learning about your finances and becoming more financially responsible (yeah, right), you can reap another benefit when you browse data that’s meaningful to you: You learn what it’s like to be a user of your own analytic application. This experience will make you better at your job of creating analysis solutions for other people. For ex­ample, after you browse your own data for about 5 seconds, you’ll realize that drill­through isn't optional. You'll no longer take a number at face value; instead, you'll want to see what factors created that number and whether it's legitimate. I hope you enjoy going through this exercise as much as I enjoyed creating it!

From the Blogs

The quest for the Golden Record to achieve a single, accurate and complete version of a customer record is worth the pursuit to attain survivorship. Record matching and consolidation are only the beginning. Melissa Data takes a new approach. Learn how to apply intelligent rules based on reference data to make smarter and better decisions for data cleansing....More

On SQL Servers where Availability Groups (or Mirroring) isn’t in play, I typically recommend keeping a combination of on-box backups along with copying said backups off-box as well. Obviously, keeping databases AND backups on the SAME server is the metaphorical equivalent of putting all of your eggs in one basket – and therefore something you should avoid like the plague....More

One of the biggest strengths of AlwaysOn Availability Groups is that they allow DBAs to address both high availability and disaster recovery concerns from a single set of tooling or interfaces. But, this doesn’t mean that you won’t still need backups....More