A few days ago I published the article FILTER vs CALCULATETABLE: optimization using cardinality estimation, where I try to explain why the sentence “CALCULATETABLE is better than FILTER” is not always true. In reality, CALCULATETABLE internally might use FILTER for every logical expression you use as a filter argument. What really matters is the cardinality of the table iterated by the FILTER, regardless of the fact it’s an explicit statement or an implicit one generated automatically by CALCULATETABLE.

In addition to the article, there is a digression related to the use of time intelligence functions, which returns a table and not a scalar values. These functions (such as DATESBETWEEN and LASTDATE) might seem better than FILTER, but this is not necessarily true.

For example, consider this statement:

CALCULATE (

SUM ( Movements[Quantity] ),

FILTER (

ALL ( 'Date'[Date] ),

'Date'[Date] <= MAX( 'Date'[Date] )

)

)

Can avoid the FILTER statement using DATESBETWEEN? Yes, we can replace the filter with the following expression:

CALCULATE (

SUM ( Movements[Quantity] ),

DATESBETWEEN (

'Date'[Date],

BLANK(),

MAX ( 'Date'[Date] )

)

)

Is this faster? No. DATESBETWEEN is executed by the formula engine, it’s not better than FILTER. But there is more. You might wonder why I’m using MAX instead of LASTDATE. Well, in the FILTER example there was a semantic reason, I would have obtained a different result. LASTDATE returns a table, not a scalar value, even if it is a table containing only one row, which can be converted into a scalar value. More important, LASTDATE performs a context transition, which would transform the row context produced by the FILTER iteration into a filter context, hiding the existing filter context that I wanted to consider in my original expression. Now, in DATESBETWEEN I don’t have this issue, so I can write it using LASTDATE obtaining the same result:

CALCULATE (

SUM ( Movements[Quantity] ),

DATESBETWEEN (

'Date'[Date],

BLANK(),

LASTDATE ( 'Date'[Date] )

)

)

But this is not for free. The LASTDATE function produces a more expensive execution plan in this case. Consider LASTDATE only as filter argument of CALCULATE/CALCULATETABLE, such as:

CALCULATE (

SUM ( Movements[Quantity] ),

LASTDATE ( 'Date'[Date] )

)

At the end of the day, a filter argument in a CALCULATE function has to be a table (of values in one column or of rows in a table), so using a table expression in a filter argument is fine, because in this case a table is expected and there are no context transitions. But think twice before using LASTDATE where a scalar value is expected, using MAX is a smarter choice.

Comment Notification

Comments

We have a measure in our Tabular model that uses the DATESBETWEEN function but unfortunately we are seeing the following error when a user selects a date for which there is no data held against that date:

"An invalid numeric representation of a date value was encountered"

Our measure is coded as follows:

Measure :=

CALCULATE (

SUM ( 'FactTable'[Amount] ),

DATESBETWEEN (

'Date'[Calendar Date],

IF (

MONTH ( FIRSTDATE ( 'Date'[Calendar Date] ) ) >= 7,

DATE ( YEAR ( FIRSTDATE ( 'Date'[Calendar date] ) ) - 1, 07, 01 ),

DATE ( YEAR ( FIRSTDATE ( 'Date'[Calendar date] ) ) - 2, 07, 01 )

),

IF (

MONTH ( FIRSTDATE ( 'Date'[Calendar Date] ) ) >= 7,

DATE ( YEAR ( FIRSTDATE ( 'Date'[Calendar date] ) ), 06, 30 ),

DATE ( YEAR ( FIRSTDATE ( 'Date'[Calendar date] ) ) - 1, 06, 30 )

)

),

'TableX',

'TableY',

'TableZ'

)

I can get this to run in SSMS by introducing an IF else BLANK() statement around the calculation, using the following code:

EVALUATE

(

SUMMARIZE (

'Date',

'Date'[Financial Year],

"Test", IF (

COUNTROWS ( 'FactTable' ) > 0,

CALCULATE (

SUM ( 'FactTable'[Amount] ),

DATESBETWEEN (

'Date'[Calendar Date],

IF (

MONTH ( FIRSTDATE ( 'Date'[Calendar Date] ) ) >= 7,

DATE ( YEAR ( FIRSTDATE ( 'Date'[Calendar date] ) ) - 1, 07, 01 ),

DATE ( YEAR ( FIRSTDATE ( 'Date'[Calendar date] ) ) - 2, 07, 01 )

),

IF (

MONTH ( FIRSTDATE ( 'Date'[Calendar Date] ) ) >= 7,

DATE ( YEAR ( FIRSTDATE ( 'Date'[Calendar date] ) ), 06, 30 ),

DATE ( YEAR ( FIRSTDATE ( 'Date'[Calendar date] ) ) - 1, 06, 30 )

)

),

'TableX',

'TableY',

'TableZ'

),

BLANK ()

)

)

)

However, I am unsure as to how to implement this fix into the measure code so that it can be evaluated in various different contexts. Can you advise?

I suppose the problem is different. If the DATE function receive a value that is not valid (e.g. the year is blank) then you get the error. I would check how to protect the DATE call in case the selection is not valid.

I took your DATESBETWEEN statement, and created two measures using FIRSTDATE and LASTDATE...

FIRSTDATE (

DATESBETWEEN (

'Date'[Date],

BLANK (),

LASTDATE ( 'Date'[Date] )

)

)

I wanted to see what date would be returned, based upon the filter context.

In the pivot, I put year and month on rows. I noticed that the results from LASTDATE respected the year/month filter context. But FIRSTDATE always returned the very first date in the 'Date'[Date] column, not respecting the year and month row context.

I was reading your book (Building Data Models with PowerPivot) and in Chap 12, you used DATESBETWEEN and LASTNONBLANK to update transaction balances with SUMX (page 361). The book is fantastic!!

I was curious about what dates would be returned by the DATESBETWEEN code snippet. So I tried the snippet in a measure with FISRTDATE (shown in my last post), and just put BalanceDate[Year} and BalanceDate[Month] on the rows of a pivot table.

The BalanceDate table in the exercise file only had dates from 2010, so I added the last two dates (Dec 30 and Dec 31) from 2009. Then I put the measure in the pivot and saw that on the row with 2010 as the year, the measure returned returned 12/30/2009. The filter context is year=2010, so I was surprised.

It seems using BLANK() inside the DATESBETWEEN function gives it the ability to ignore the filter context in a pivot table, when combined with FIRSTDATE or LASTDATE.

M1 = FIRSTDATE ( DATESBETWEEN ( Calendar[Date], BLANK (), BLANK () ))

M2 = LASTDATE ( DATESBETWEEN ( Calendar[Date], BLANK (), BLANK () ))

I put these two measures in a PivotTable, with Calendar[Year] on the row. Ironically, both measures do not respect the filter context of the pivot table cell, returning the first and last date provided by the DATESBETWEEN function, despite the year on the row.

the date returned by time intelligence functions only include *existing* dates in the date table - you will never see a date that is not stored in that table. So, first of all, check that all the days in a year exists.

Then, DATESBETWEEN overrides the existing filter context - when you apply a filter on a date column of a table marked as a date table, all the other columns are removed from the filter context. This behavior happens also when you apply a filter on the primary key of a column that defines a one-to-many relationship. Maybe this is the reason of the behavior your experienced.

I hope it helps,

Marco

May 26, 2015 4:59 AM

Leave a Comment

About Marco Russo (SQLBI)

Marco Russo is a consultant, writer and trainer specialized in Business Intelligence with Microsoft technologies. He runs the SQLBI.COM website, which is dedicated to distribute resources useful for BI developers, like Integration Services components, Analysis Services models, tools, technical information and so on. Marco is certified as MCT, MCDBA, MCSD.NET, MCSA, MCSE+I.