Microsoft Analysis Services, MDX, DAX, Power Pivot, Power Query and Power BI

‘Events in Progress’ Part 3 – Average Age of Unpaid Invoices

Sorry for the relative quiet on here recently, I’ve been caught up with helping to organise SQLBits and other stuff… But anyway, this week I’m working with a company who asked me to tackle what must be a very common business problem: they have a fact table where one row represents an invoice they have generated, and they need to know at any point in time what the average age of all their unpaid invoices is, and monitor whether this value goes up or down. The point of this is, of course, is that if the average age of outstanding invoices starts to go up over time then it means that their customers are getting slower to pay them and they may potentially have a cashflow problem.

The solution is a variation on the ‘events in progress’ model that I blogged about recently but with an interesting twist that I thought was worth describing. First of all, we need to prepare some data so we can reproduce the scenario in Adventure Works – let’s pretend that the FactInternetSales table actually contains invoice data. We’ll need to massage the data a bit for two reasons:

In Adventure Works if you look at the OrderDate, ShipDate and DueDate columns there’s always exactly the same number of days between the dates, which doesn’t make for a great demo

In the real world some invoices will be unpaid when the fact table is loaded, others will be paid. We need to be able to handle both scenarios properly.

Once again, we need two fact tables. The first of these will contain all of our invoices, and for this I used the following named query in the Adventure Works DSV:

SELECT CustomerKey, OrderDateKey FROM dbo.FactInternetSales

The second will be a fact table that contains information on invoices that have been paid:

SELECT CustomerKey, OrderDateKey, CASE WHEN SalesTerritoryKey > 5 THEN ShipDateKey ELSE DueDateKey END AS PaidDateKey, CASE WHEN SalesTerritoryKey > 5 THEN 7 ELSE 12 END AS DaysToPayment FROM dbo.FactInternetSales WHERE (PromotionKey < 3)

A few things to notice in this second query:

I’ve filtered the fact table to remove a very small number of rows where PromotionKey is less than 3 – this will give us a few unpaid invoices to make the numbers more interesting.

I’ve created a new date key called PaidDateKey to make the invoices have different payment lengths.

Because these are paid invoices, we know how long they took to be paid and we can precalculate this and put it in the fact table in the DaysToPayment column.

We can now build the cube. It needs two measure groups, one based on each fact table, and we can add the Customer dimension too. Three measures need to be built: one count measure on each measure group and a sum measure on the DaysToPayment column. The Date dimension joins to the OrderDateKey on the All Invoices measure group, and to the PaidDateKey on the Paid Invoices measure group.

Now for the MDX. The first thing we need to do is find the total number of invoices generated to date and the total number of invoices that have been paid to date (and also to make sure that these totals aggregate up properly through the time dimension); we can then subtract the second from the first to find the total number of unpaid invoices:

This is the easy bit. In order to find out what the average of the number of days the unpaid invoices have actually been unpaid, we need to find the number of days that each unpaid invoice has been open for, sum that up, and divide it by the total number of unpaid invoices. And to find this sum of the number of days that all unpaid invoices are open, we need to do the following:

Find the total up to yesterday of the [Unpaid Invoices Count] measure. This gives the total number of days that every invoice has ever been open.

Then sum the total to date of the Days to Payment measure we created on the Paid Invoices fact table, to find the total number of days that all paid invoices were ever open

Then subtract this second figure from the first, which will give us the sum of the number of days all currently unpaid invoices have been open

Here’s the code:

CREATE MEMBER CURRENTCUBE.MEASURES.[Sum of Ages of Invoices To Date] AS NULL;
SCOPE(MEASURES.[Sum of Ages of Invoices To Date]);
SCOPE([Date].[Date].[Date].MEMBERS);
THIS = SUM(NULL:[Date].[Date].CURRENTMEMBER.PREVMEMBER
, MEASURES.[Unpaid Invoices Count]);
END SCOPE;
SCOPE([Date].[Date].[All Periods]);
THIS = (TAIL(EXISTING [Date].[Date].[Date].MEMBERS, 1).ITEM(0).ITEM(0)
, MEASURES.[Sum of Ages of Invoices To Date]);
END SCOPE;
END SCOPE;
CREATE MEMBER CURRENTCUBE.MEASURES.[Sum of Ages of Paid Invoices To Date] AS NULL;
SCOPE(MEASURES.[Sum of Ages of Paid Invoices To Date]);
SCOPE([Date].[Date].[Date].MEMBERS);
THIS = SUM(NULL:[Date].[Date].CURRENTMEMBER, [Measures].[Days To Payment]);
END SCOPE;
SCOPE([Date].[Date].[All Periods]);
THIS = (TAIL(EXISTING [Date].[Date].[Date].MEMBERS, 1).ITEM(0).ITEM(0)
, MEASURES.[Sum of Ages of Paid Invoices To Date]);
END SCOPE;
END SCOPE;
CREATE MEMBER CURRENTCUBE.MEASURES.[Difference of Sum of Ages] AS
MEASURES.[Sum of Ages of Invoices To Date] - MEASURES.[Sum of Ages of Paid Invoices To Date];

Hi Chris,
Nice solution, I can see a real business use for this. On a similar note, if you have access to the financial data, you can calculate the DSO ( days sales outstanding ). This is a measure used in finance to get at a similar number, though you would need a fact snapshot to calculate over time. The calculation, which you can google, is simpler however. It would be interesting to see how they compare.
This may not address other applications for your solution, so thank you for posting.
Mark

Hi Chris.
One question about mdx resolution for this part :
THIS = SUM(NULL:[Date].[Date].CURRENTMEMBER, [Measures].[All Invoices Count]);

When the currentmember is a day, does it means ssas will retrieve data for each individual day and make the sum after? or the query engin is smart enough to combine daily aggregation (for the month of the currentmember) and monthly aggregation (for previous monthes)
My concern is of y have 20 years of data and choose date in 2011 it can have a lot of cells to read.

If it use only daily aggregation, does this optimisation make sense (the goal is to force it to use monthly aggregations and to use daily aggregation only fors days in the current month):

Yes, it means SSAS will retrieve the data for each individual day and make the sum. SSAS does not take monthly, quarterly or yearly values but in actual fact there are two problems with doing this:
1) There could be calculations in place that mean resolving a date range to months, quarters and/or years would return incorrect results.
2) More importantly (and despite what you might think), in all recent versions of SSAS it’s almost always quicker to sum up a set of dates and not try to do the optimisation you’re thinking of. In the days of SSAS 2000 I did use this optimisation successfully (my second-ever blog post was on this subject: http://cwebbbi.wordpress.com/2004/12/31/tuning-ytd-style-calculations/) but today an expression like THIS = SUM(NULL:[Date].[Date].CURRENTMEMBER, [Measures].[All Invoices Count]); is going to be faster. I know this seems strange but you have to trust me :-)

Hi Chris and thanks for reply so quickly
yesterday i try to implement it on a support ticket tracking cube, the goal is to calculate backlog (works fine) and to calculate average delay to close tickets.
On the first year perf was good but if i some some date range far from the beginning response times are poor (like 15/20 sec on a core I7 + 8Go ram + ssd, it is ssas 2K8R2, aggregation are designed, only 3 dimensions and less than 50 000 rows for 6 years of data)
i implemented my workaround and go back under 1 sec whatever the date range i pick.
I will try to put mdx studio and check what happens on both queries and provide some materials.

Hi Chris, this was very helpful to a current project I am working on. I do have a question and I was hoping that you might have come accross a similar scenario. I have two dates, a StartDate and a CancellationDate. I need to find the total number of accounts and where the StartDate is less than a given date and the Cancellation Date is greater than the given date. I have tried several variation of the mdx and don’t get the numbers that I should have. Do you have any suggestions?

I can’t really make any suggestions without seeing your cube and MDX, I’m afraid – this sounds like a classic events-in-progress problem, so the MDX in this and earlier posts should work. What kind of problems are you seeing with the numbers?

Here is the SQL code I run that gets me the numbers I should have in the cube:
Start Date is 7/16/2012 and end date is 7/22/2012

Select COUNT(*) as Verified from ods.Account
where UtilityAcceptanceDate @EndDate
I get 1,047,594 when I run the query in SQL which is the correct number

When I look at the Cube and pull in the week I get 3,715 if I expand the week and add dates I get approximately 586,000 for each day. Not sure why this is working like it is I have a standard cube structure. I am temporarily pulling in the table twice and joining it to the date dimension one on UtilityAcceptanceDate and on for UtilityCancelDate. Below is my MDX:

CALCULATE;
CREATE MEMBER CURRENTCUBE.[Measures].[AcceptedCount]
AS null;
Scope(Measures.[AcceptedCount]);
Scope([Date].[Date].members);
This = sum(null:[Date].[Date].currentmember,[Measures].[AcceptanceCounts] );
End scope;
end scope;
CREATE MEMBER CURRENTCUBE.[Measures].[SalesCount]
AS null;
Scope(Measures.[SalesCount]);
Scope([Date].[Date].members);
This = sum(null:[Date].[Date].currentmember, [Measures].[SalesCounts]);
End scope;
end scope;

Have you got many missing dates, ie rows where one or both of the date values is null, or has a default value a long time in the past, or uses the unknown member? That could be one cause of this problem.

And are there a lot of these records? Could this be part of the explanation? Actually, I’m not sure it is, but it’s worth investigating. Do the AcceptedCount and SalesCounts measures return reasonable numbers?

I have a similar problem and I hope that you can help me. I have a fact table with flights from clients. Each line is a flight of a client. What I need is the average days between flights for all clients on a selected period. If the client only made one flight then it should be zero.

I think this is actually a different, easier problem. What I’d do here is, in the ETL, calculate the days since the customer’s last flight and store it in a measure in the fact table. You can then create a sum measure from it in the cube, and then to get the average create a second measure that counts the number of flights and divides the sum by the count.

Public MDX and SSAS Training Courses

I'm running several SSAS, MDX and PowerPivot-related training courses through Technitrain in 2015. Check out the Technitrain course catalogue for full details, and to see other upcoming courses from the likes of Allan Hirt, Jamie Thomson, Andy Leonard, Thomas Kejser and others.

Need some help?

As well as being a blogger, I'm an independent consultant specialising in Analysis Services, MDX, PowerPivot and DAX. I work with customers from all round the world solving cube design problems, performance tuning Analysis Services queries and calculations, writing complex MDX and delivering training courses, and I am happy to work on short-term engagements. For more details see http://www.crossjoin.co.uk