<h1>Microsoft OLAP by Mosha Pasumansky</h1>
<h2>Get most out of partition slices</h2>
<p>Setting the partition slice has always been an important optimization technique in Analysis Services. Every presentation talked about it and every whitepaper mentioned it; for example, the <a href="http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ansvcspg.mspx">Microsoft SQL Server 2000 Analysis Services Performance Guide</a> contains a chapter appropriately named “Define the Data Slice for Each Partition”. Here is a quote from it:</p> <blockquote> <p><em>“If the data slice value for a partition is set properly, Analysis Services can quickly eliminate irrelevant partitions from the query processing and significantly reduce the amount of physical I/O and processor time needed for many queries issued against MOLAP and HOLAP partitions &lt;skip&gt; </em></p> <p><em>Caution: Creating a partition without setting the data slice is not a good practice, and can result in considerable overhead being added to the system (artificially increasing response times). Without the data slice, Analysis Services cannot limit a query to the appropriate partitions and must scan each partition even if zero cells will be returned. </em></p> <p><em>The data slice enables Analysis Services to determine which partitions contain data relevant to the query”</em></p> </blockquote> <p>However, after AS2005 was released, a new tip made it into presentations: “For MOLAP partitions there is no need to specify the slice property, it is detected automatically”. Even though I briefly debunked it before <a href="http://sqlblog.com/blogs/mosha/archive/2007/06/02/notes-about-notes-from-microsoft-bi-conference.aspx">here</a>, this rumor just kept popping up, and it even made it as far as the official “<a href="http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/olapdbpssas2005.mspx">OLAP Design Best Practices for Analysis Services 2005</a>” document. Here is the quote:</p> <blockquote> <p><em>“For MOLAP partitions, you do not have to specify the slice because the server will be able to figure out the relevant partitions after the data has been processed”</em></p> </blockquote> <p>Even worse, the <a href="http://msdn.microsoft.com/en-us/library/ms174764(SQL.90).aspx">official documentation</a> on data slices goes as far as stating:</p> <blockquote> <p>“<em>Data slices are applicable only to objects that use the ROLAP storage mode”</em></p> </blockquote> <p>But this is completely untrue, and many people have already discovered it. This <a href="http://jesseorosz.spaces.live.com/blog/cns!E322FD91218E57CF!304.entry">blog entry</a> by Jesse Orosz gives a great description of four reasons why setting the partition slice in MOLAP is important.
In my blog I want to focus on #3 from his list – automatic slice detection is not perfect.</p> <p>Here is how automatic slice detection works: while building indexes for the partition, for every attribute the server determines the minimum and maximum data id values present in that attribute (for a more detailed explanation see the section “Partition Slice” in the <a href="http://www.amazon.com/Microsoft-Server-2005-Analysis-Services/dp/0672327821">Microsoft SQL Server 2005 Analysis Services book</a>, page 355). It is possible to discover exactly which values Analysis Services detected, either the hard way as described <a href="http://www.ssas-info.com/analysis-services-faq/29-mgmt/289-how-to-find-slice-values-that-analysis-services-assigned-partition-dimensions">here</a>, or the easy way as described <a href="http://cwebbbi.spaces.live.com/Blog/cns!7B84B0F2C239489A!1478.entry">here</a>. In either case, it is clear that since slice autodetection operates with ranges of data ids, it is entirely possible for a partition to contain only two members of an attribute – one with the lowest possible data id (2), and another one with the highest possible data id – in which case the range will cover all possible values, i.e. it will be useless.</p> <p>On the other hand, if for a certain attribute the data in the partition contains only a single member’s data id, then the range will be very effective – such a range is equivalent to a slice on a single member.</p> <p>So this must be the root of all the confusion in documentation and in whitepapers. Prior to AS2005, a partition slice could only be a single member. And specifying a partition slice of a single member really is not necessary in MOLAP, since it will always be discovered automatically by autoslice (well, except for the other three reasons that Jesse lists in his blog). But starting with AS2005, it is possible to specify MDX sets for the partition slice, and that is exactly the case where setting the partition slice manually makes sense.</p> <p>The Analysis Services Stored Procedures project on CodePlex features a <a href="http://www.codeplex.com/ASStoredProcedures/Wiki/View.aspx?title=PartitionHealthCheck">PartitionHealthCheck</a> function, which shows the overlaps in data id ranges between partitions, but it is not clear what action the user can take upon discovering that the autodetected ranges are not optimal – the user has no direct control over how data ids are assigned to dimension members. What this blog entry is going to explore is how to avoid relying on autoslice altogether, and put precise control over partition slices into the hands of the user.</p> <h3>Almost related attributes</h3> <p>The first scenario to consider is what I call “almost related attributes”, i.e. attributes which are not truly related to each other but, on the other hand, are not completely unrelated either. One example is Zip code to State – the relationship is almost many-to-one, but sometimes there are Zip codes which cross state boundaries. Another example is Week to Year: a year mostly contains 52 whole weeks, but the weeks at the beginning and end of a year can cross into the previous or next year.</p> <p>Let’s use Adventure Works 2008 as our sample database. We have ‘Calendar Year’ and ‘Calendar Week’ attributes. Let’s first query for a specific Calendar Year:</p> <pre class="code"><span style="color:blue;">SELECT
</span><span style="color:navy;">[Measures]</span>.<span style="color:navy;">[Reseller Sales Amount] </span><span style="color:blue;">ON </span><span style="color:silver;">0
</span><span style="color:blue;">FROM </span><span style="color:navy;">[Adventure Works]
</span><span style="color:blue;">WHERE
</span><span style="color:navy;">[Date]</span>.<span style="color:navy;">[Calendar Year]</span>.<span style="color:navy;">[Calendar Year]</span>.<span style="color:navy;">[CY 2004]</span>;</pre>
<a href="http://11011.net/software/vspaste"></a>
<p>Running this query under <a href="http://sqlblog.com/blogs/mosha/archive/2008/09/21/hierarchical-trace-profiling-with-mdx-studio.aspx">Hierarchical Profiler</a> of <a href="http://www.mdxstudio.com">MDX Studio</a> shows that only one partition – Reseller_Sales_2004 – was read for that query. This is because that partition contains data just for year 2004, and therefore the automatic slice detection worked just fine. But let’s query for a specific week in 2004:</p>
<pre class="code"><span style="color:blue;">SELECT
</span><span style="color:navy;">[Measures]</span>.<span style="color:navy;">[Reseller Sales Amount] </span><span style="color:blue;">ON </span><span style="color:silver;">0
</span><span style="color:blue;">FROM </span><span style="color:navy;">[Adventure Works]
</span><span style="color:blue;">WHERE
</span><span style="color:navy;">[Date]</span>.<span style="color:navy;">[Calendar Week]</span>.&amp;[17]&amp;[2004];</pre>
<p><a href="http://11011.net/software/vspaste"></a>Of course, we would like again to read just 2004 partition, but this time profiler shows that also partitions Reseller_Sales_2002 and Reseller_Sales_2003 were touched. This is pretty bad, this week 17 of 2004 is well inside year 2004 with no hope of crossing into 2003, let alone 2002 !</p>
<p>So why did automatic slice detection do such a poor job here? To answer this question, let’s look at the data id values for the Calendar Week attribute:</p>
<pre class="code"><span style="color:blue;">WITH MEMBER </span>[Measures].CalendarWeekDataID <span style="color:blue;">AS
</span>DataId(<span style="color:navy;">[Date]</span>.<span style="color:navy;">[Calendar Week]</span>.<span style="color:blue;">CurrentMember</span>)
,CAPTION = <span style="color:red;">'Data ID of Calendar Week'
</span><span style="color:blue;">SELECT </span><span style="color:navy;">[Measures]</span>.<span style="color:navy;">[CalendarWeekDataID] </span><span style="color:blue;">ON </span><span style="color:silver;">0
</span>, <span style="color:navy;">[Date]</span>.<span style="color:navy;">[Calendar Week]</span>.<span style="color:navy;">[Calendar Week] </span><span style="color:blue;">ON </span><span style="color:silver;">1
</span><span style="color:blue;">FROM </span><span style="color:navy;">[Adventure Works]</span>;</pre>
<p>This query shows the data id for every calendar week, and the problem is now obvious. The calendar weeks are not ordered nicely by year; instead, first we get Week 1 of years 2002, 2003 and 2004 (this week didn’t exist in 2001), then Week 2 for these years, etc. No wonder the range of data ids covering week 17 of 2004 also included weeks in 2003 and 2002. We could fix the ordering of the Calendar Week attribute, but we would still be at the mercy of Analysis Services with regard to how it decides to assign data ids. Instead, we can explicitly set the slice on every partition in the ‘Reseller Sales’ measure group.</p>
<p>And what better way to build this slice than with another MDX query!</p>
<pre class="code"><span style="color:blue;">WITH
SET </span>WeeksIn2004 <span style="color:blue;">AS
</span>Exists(
<span style="color:navy;">[Date]</span>.<span style="color:navy;">[Calendar Week]</span>.<span style="color:navy;">[Calendar Week]</span>.<span style="color:blue;">MEMBERS
</span>,<span style="color:navy;">[Date]</span>.<span style="color:navy;">[Calendar Year]</span>.<span style="color:navy;">[Calendar Year]</span>.<span style="color:navy;">[CY 2004]</span>)
<span style="color:blue;">MEMBER </span>[Measures].PartitionSlice <span style="color:blue;">AS
</span><span style="color:red;">&quot;{&quot; </span>+
Generate(
WeeksIn2004 <span style="color:blue;">AS </span>itr,
<span style="color:navy;">[Date]</span>.<span style="color:navy;">[Calendar Week]</span>.<span style="color:blue;">CurrentMember</span>.<span style="color:blue;">UniqueName
</span>+ IIF(itr.<span style="color:blue;">CurrentOrdinal </span>= WeeksIn2004.<span style="color:blue;">Count</span>, <span style="color:red;">&quot;&quot;</span>, <span style="color:red;">&quot;,&quot;</span>))
+ <span style="color:red;">&quot;}&quot;
</span><span style="color:blue;">SELECT </span><span style="color:navy;">[Measures]</span>.PartitionSlice <span style="color:blue;">ON </span><span style="color:silver;">0
</span><span style="color:blue;">FROM </span>[Adventure Works]</pre>
<a href="http://11011.net/software/vspaste"></a>
<p>This returns a long string representing the MDX set of all the weeks in 2004. We just need to make sure that the resulting string is XML-escaped, i.e. every ‘&amp;’ is replaced with ‘&amp;amp;’, if it is to be pasted directly into the XMLA ALTER script for the partition, and we are good to go.</p>
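<p>As an aside, the same string can be produced a little more compactly by using the third, delimiter argument of Generate, which removes the need for the CurrentOrdinal trick – a sketch of the same query:</p>
<pre class="code">-- Same slice string, built with Generate's built-in delimiter argument
WITH MEMBER [Measures].PartitionSlice AS
"{" +
Generate(
    Exists(
        [Date].[Calendar Week].[Calendar Week].MEMBERS
        ,[Date].[Calendar Year].[Calendar Year].[CY 2004])
    ,[Date].[Calendar Week].CurrentMember.UniqueName
    ,",")
+ "}"
SELECT [Measures].PartitionSlice ON 0
FROM [Adventure Works]</pre>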
<p>When we apply this procedure to every partition, rerunning the query</p>
<pre class="code"><span style="color:blue;">SELECT
</span><span style="color:navy;">[Measures]</span>.<span style="color:navy;">[Reseller Sales Amount] </span><span style="color:blue;">ON </span><span style="color:silver;">0
</span><span style="color:blue;">FROM </span><span style="color:navy;">[Adventure Works]
</span><span style="color:blue;">WHERE
</span><span style="color:navy;">[Date]</span>.<span style="color:navy;">[Calendar Week]</span>.&amp;[17]&amp;[2004];</pre>
<p><a href="http://11011.net/software/vspaste"></a>Now only touches 2004 partition.</p>
<h3>Related attributes</h3>
<p>So far we have considered “almost related attributes”; for truly related attributes one may think there is no problem. And in fact, in Adventure Works 2008, Calendar Week is defined as related to Calendar Year. So why, when we sliced on <span style="color:navy;">[Date]</span>.<span style="color:navy;">[Calendar Week]</span>.&amp;[17]&amp;[2004], didn’t it decode the year? If we run coordinate decoding in MDX Studio, it does properly show ‘CY 2004’ being decoded. The problem here lies in the fact that the FE does coordinate decoding lazily, only if it really needs to. This makes sense – otherwise a lot of time would be spent decoding attributes that nobody needs. So in our query, the FE noticed that there were no calculations looking at Calendar Year’s CurrentMember, and it left it undecoded, as shown by the verbose data in the MDX Studio profiler:</p>
<pre>Dimension 9 [Date] (0 0 0 0 0 0 0 0 0 0 52 0 0 0 0 0 0 0 0 0 0)
[Calendar Week]:[Week 17 CY 2004]
[Calendar Year]:0 </pre>
<p>But in this case, it looks like the SE could benefit from attribute decoding on the query subcube – this could be a useful performance optimization.</p>
<h3>What about multiple attributes</h3>
<p>We have now successfully dealt with the Calendar Week attribute, but the Date dimension contains plenty of other attributes. If their data ids are as messed up as Calendar Week’s, we need to do something about them too. Unfortunately, here we run into an unintentional, yet very unfortunate, limitation of Analysis Services: it is impossible to set a partition slice on multiple attributes if at least one of the slices is a set and not a single member (which is exactly the case that is useful for us!). This limitation stems from an unfortunate combination of a few different factors:</p>
<p>1. Sets used in the partition slice have to comply with the limitations of StrToSet(…, CONSTRAINED) (see the <a href="http://msdn.microsoft.com/en-us/library/microsoft.analysisservices.partition.slice(ide).aspx">documentation</a>). There is a good reason for that – partition slices are evaluated when the cube is not yet ready for querying, so allowing just any MDX set could trigger the FE to request data from the not-yet-ready cube, with unpredictable results.</p>
<p>2. StrToSet(…, CONSTRAINED) doesn’t allow a crossjoin of sets in any form – neither the CrossJoin function, nor the * operator, nor the tuple-like form – i.e. it is not possible to specify a crossjoin of multiple attributes.</p>
<p>3. While it is possible to specify a set of tuples for StrToSet(…, CONSTRAINED), that still doesn’t help, since the FE is not capable of building a subcube out of a set of tuples (instead it immediately resorts to the dreaded arbitrary shape form).</p>
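<p>The first limitation is easy to experiment with: a candidate slice string can be checked by running it through StrToSet with the CONSTRAINED flag (a sketch against Adventure Works – the commented-out line shows the kind of crossjoin that CONSTRAINED rejects with a parse error):</p>
<pre class="code">-- Passes CONSTRAINED validation: an explicit list of members of one attribute
SELECT
StrToSet("{[Date].[Calendar Week].&amp;[17]&amp;[2004],[Date].[Calendar Week].&amp;[18]&amp;[2004]}"
    ,CONSTRAINED) ON 0
FROM [Adventure Works]
-- Fails CONSTRAINED validation: the * operator is not allowed
-- StrToSet("{[Date].[Calendar Week].&amp;[17]&amp;[2004]} * {[Date].[Calendar Year].&amp;[2004]}", CONSTRAINED)</pre>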
<p>It would be useful if any one of these limitations were relaxed, because that would enable truly fine-grained control over partition slices in non-trivial cubes.</p>
<h2>Notes about Notes from Microsoft BI Conference</h2>
<P>I was reading the blog post "<A href="http://www.ssas-info.com/VidasMatelisBlog/?p=20">My notes from Microsoft BI Conference</A>" by Vidas Matelis. It is a collection of notes recorded at different presentations, chalk talks, etc. I agree with some of them, disagree with some, and have no opinion about others, so I felt it would be useful to write down my thoughts about these notes. Of course, when I don't agree with them it doesn't mean that I am right - I am merely expressing my opinion on a matter which is subjective. I also apologize upfront if any of my comments sound critical - my intent is not to critique, but to open a discussion; as an old saying goes, "The truth is born in dispute". I will be citing the selected notes from Vidas's blog and then adding my commentary.</P>
<UL>
<LI>Attribute relationship is important. I have heard that at least 10 times in different sessions.</LI></UL>
<P>Indeed. If I were asked for the single most important advice I could give to cube designers, it would be to define correct attribute relationships in the dimensions. Note the accent on the word 'correct'. Lately, way too often I stumble upon cubes where people have heard how important attribute relationships are, and so marked attributes as related even when there was no true 1-to-many relationship between them. This can lead to disastrous results. If attributes are not 100% 1-to-many, the numbers in the cube can and will be wrong. They can even be non-deterministic! Remember, it is very important to define relationships and convert hierarchies to be natural, but only if these relationships truly exist in the data.</P>
<UL>
<LI>For MOLAP partitions no need to specify slice property, it is detected automatically.</LI></UL>
<P>I don't agree with this statement. It is only true for single-member slices. I.e., if we have daily partitions, it is not necessary to tell Analysis Services which day a MOLAP partition was processed for, because it will detect it automatically. But if the slice consists of more than one member, Analysis Services might not pick it up with great precision. What AS will do is detect a range of Data IDs per attribute. Since the user has no control over how Data IDs are assigned to members, if a partition holds a slice of only two members, it may turn out that their Data IDs are the first and the last in the attribute, and the range will cover the entire space.</P>
<UL>
<LI>If your partition is at the day or week level, no aggregations&nbsp;are possible for month level. Aggregations cannot cross partitions.</LI></UL>
<P>It is true that aggregations cannot cross partitions. But, of course, it is not true that aggregations are not possible at higher levels. What this note probably meant to say was that it isn't worthwhile to design aggregations at levels above the one where the partition is sliced to a single member.</P>
<UL>
<LI>Each KPI formula is defined as hidden calculated measures, except when it is just reference to another measure. For better performance consider creating&nbsp;real calculated measures with properly designed calculations and then use these measures in KPI definitions.</LI></UL>
<P>Yes, KPIs may create hidden calculated measures, and not just for the KPI value, but for other properties as well - i.e. Goal, Trend, etc. I have written about it here - <A href="http://www.sqljunkies.com/WebLog/mosha/archive/2005/12/31/cube_init.aspx">www.sqljunkies.com/WebLog/mosha/archive/2005/12/31/cube_init.aspx</A>. What is not clear from this note is how creating real calculated measures is going to help performance. The truth is - it won't help by itself. What this note probably meant was that if the calculated measures are created manually in the MDX Script, the cube designer has more control. For example, it is possible to specify performance-related properties such as NON_EMPTY_BEHAVIOR, or non-performance properties such as FORMAT_STRING. It is also possible to use such a calculated measure inside a SCOPE statement, etc. So it gives flexibility both in functionality and in making performance optimizations.</P>
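<P>For illustration, here is a minimal sketch of this pattern in the MDX Script (the measure name and expression are made up; the KPI's Value expression would then simply reference the calculated measure):</P><PRE>-- Illustrative only: define the calculation once, with full control over its properties
CREATE MEMBER CURRENTCUBE.[Measures].[Sales YoY Growth] AS
    ([Measures].[Sales Amount]
     - ([Measures].[Sales Amount], ParallelPeriod([Date].[Calendar].[Calendar Year]))),
    FORMAT_STRING = 'Currency',
    NON_EMPTY_BEHAVIOR = { [Measures].[Sales Amount] },
    VISIBLE = 1;
-- The KPI Value expression is then just a reference to [Measures].[Sales YoY Growth]
</PRE>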
<UL>
<LI>In multiple sessions I have heard talking about&nbsp;”VisualTotals” problem. Basically best performance can be achieved for natural hierarchy totals - when children’s parent contains total value. When you use any&nbsp;filters, roles security based on username, subselect statements, etc, visual totals are calculated and performance will be affected.</LI></UL>
<P>There is some confusion going on inside this note. For example, what does "roles security based on username" have to do with Visual Totals? What's true is that Visual Totals come in different forms - query-based with subselects and the "Default Visual Mode" connection string property, session-based with named sets using the VisualTotals function and the CREATE SUBCUBE statement, and there are Visual Totals which can be defined in dimension security. But they are not inherently performance-destroying. Different forms of Visual Totals have different semantics and different effects on performance and on caching. A blanket statement that they are bad for performance just isn't right.</P>
<UL>
<LI>In pre SP2 release you were able to increase YTD calculation performance by using fancy logic (Mosha blog example). After SP2 best performance can be achieved by using simple YTD() function.</LI></UL>
<P>I want to clarify this. The comment about "Mosha blog example" probably refers to the following blog post: <A href="http://www.sqljunkies.com/WebLog/mosha/archive/2006/11/17/rsum_performance.aspx">www.sqljunkies.com/WebLog/mosha/archive/2006/11/17/rsum_performance.aspx</A>. But this is exactly where I describe what great performance SP2 delivers with the simple YTD() function! I never fancied computing running sums using fancy logic. I was probably confused with either Richard Tkachuk or Chris Webb, both of whom blogged about fancy ways to compute running sums. I intentionally don't link to their posts, because I believe that the simpler the MDX expression is - the better.</P>
<UL>
<LI>Excel 2007 uses subselects and that means visual totals. But you can run Excel in backward compatibility mode</LI></UL>
<P>More confusion about Visual Totals. Yes, Excel 2007 uses subselects. Yes, that means Visual Totals. But even when you run Excel in backward compatibility mode - you are still going to get Visual Totals! Not through subselects, but through the session property "Default Visual Mode". What this note probably meant to say is that when you run Excel in backward compatibility mode, you don't get subselects.</P>
<UL>
<LI>64-bit machines do not run faster than 32-bit. Many older 64-bit machines actually run slower than 32-bit. That was fixed recently. 64-bit machines just let you use more memory.</LI></UL>
<P>This is a controversial statement. I am sure that in the scenarios the presenter talked about this is true. However, there are also pieces of the engine code which are explicitly optimized for 64-bit, and where the performance of 64-bit is better than 32-bit, so it isn't just about memory. But YMMV...</P>
<UL>
<LI>In real life don’t just do a full process on the cube. Plan how you can improve processing. It is recommended that you submit ProcessData and ProcessIndexes statements in separate transactions instead of ProcessFull.</LI></UL>
<P>I wish there were more explanation behind this statement. Real life is complicated. Perhaps sometimes separating ProcessData and ProcessIndexes is the right thing, especially when the goal is to reduce the processing window. But running queries on top of data which doesn't have indexes yet can be disastrous for performance. So do the right thing for your setup. Don't just plan how to improve processing; plan how to improve overall system performance, and do what is most important in your scenario (for some people it is processing, for some it is queries, for some it is a mix).</P>
<UL>
<LI>Process partition clears cache for all partitions in affected measure group.</LI></UL>
<P>This statement makes it look like there are separate caches per partition. In reality, data caches are per measure group (for the SE) and per cube (for the FE).</P>
<UL>
<LI>If you are using crossjoin, put larger set always as first parameter of crossjoin</LI></UL>
<P>I was really puzzled by this one. There must be a justification for it, but I cannot think of one. However, I do agree that the order of sets in a crossjoin matters a lot for performance; it just isn't the sizes of the sets that are important, in my opinion - it is the complexity of the MDX expressions behind them. The reason is very simple: the CrossJoin function usually uses a nested loop join algorithm. Therefore the sets with more complex MDX expressions should be the outer sets, and the simpler ones should be the inner sets. This can be illustrated with an example from Adventure Works. The first query, where the complex Filter is the outer parameter of CrossJoin, returns instantaneously:</P><PRE>WITH MEMBER Measures.Test AS
Count(
Crossjoin(
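-- outer (first) input: the expensive Filter set drives the outer loop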
Filter([Customer].[Customer Geography].[Customer], LEFT([Customer].[Customer Geography].CurrentMember.Name,3) = "Mar")
,[Promotion].[Promotions].[Promotion]
)
)
SELECT Test ON 0
FROM [Adventure Works]
</PRE>
<P>While changing the order of the sets in CrossJoin and putting the complex Filter as the inner parameter, as in the following query, makes it execute for a long 6 seconds:</P><PRE>WITH MEMBER Measures.Test AS
Count(
Crossjoin(
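-- outer input is now the cheap Promotion set; the expensive Filter below
-- becomes the inner set of the nested loop join, which is why this is slow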
[Promotion].[Promotions].[Promotion]
,Filter([Customer].[Customer Geography].[Customer], LEFT([Customer].[Customer Geography].CurrentMember.Name,3) = "Mar")
)
)
SELECT Test ON 0
FROM [Adventure Works]
</PRE>
<UL>
<LI>Exist function works on dimensions, not cubes, so they are faster</LI></UL>
<P>First, the Exist<STRONG>s</STRONG> function can work on either dimensions or measure groups. Second, the argument about "faster" is not clear to me. What is faster than what?</P>
<UL>
<LI>Fastest filter example: Exists(Customer.Members, Gender.[Male])</LI></UL>
<P>Here is an even faster expression: Descendants( Gender_Customer_Hierarchy.Gender.[Male], Gender_Customer_Hierarchy.Customer, SELF ). I.e., if you have a natural user hierarchy from Gender to Customer (it is natural because a customer can have only one gender!), then using Descendants will always be better than Exists. Exists, however, is more universal, since it is usually used on top of attribute hierarchies and doesn't depend on user hierarchies. But the moral here is: if you did define user hierarchies - don't be afraid to use them, since they will give a boost in performance.</P>
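<P>To make this concrete against Adventure Works: the Exists form below runs on the cube as shipped, while the Descendants form assumes a hypothetical natural user hierarchy [Gender - Customer] (Gender above Customer) that would have to be added to the Customer dimension first:</P><PRE>WITH MEMBER Measures.MaleCustomers AS
-- Runnable as-is: filter the Customer attribute by the Gender attribute
Count(Exists([Customer].[Customer].[Customer].Members, [Customer].[Gender].&amp;[M]))
-- Hypothetical faster form, if the natural hierarchy [Gender - Customer] existed:
-- Count(Descendants([Customer].[Gender - Customer].[Gender].&amp;[M],
--                   [Customer].[Gender - Customer].[Customer], SELF))
SELECT Measures.MaleCustomers ON 0
FROM [Adventure Works]
</PRE>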
<UL>
<LI>If you have defined MeasureExpression, no aggregations are used.</LI></UL>
<P>That is, no aggregations are used to derive results at higher granularities, but aggregations can still be used if the granularity is the same. Measure expressions in this respect behave just like Distinct Count measures and Many-to-Many dimensions.</P>
<UL>
<LI>Kernel profiler could be used to show CPU usage per each Stored Procedure. This might help you find slowly performing Stored Procedures.</LI></UL>
<P>Another best practice is to integrate stored procedures with Trace through the Context.TraceEvent method. This gives a lot of insight into the performance of a stored procedure. More details are in my "<A href="http://www.sqljunkies.com/WebLog/mosha/archive/2007/04/19/stored_procs_best_practices.aspx">Best practices for server ADOMD.NET stored procedures</A>" blog post.</P>
<UL>
<LI>Best performance can be achieved using&nbsp;standard aggregation functions instead of rewriting them (Example LastChild). But some of them require Enterprise Edition of SQL Server.</LI></UL>
<P>LastChild is not such a good example here, because the performance difference between the semiadditive LastChild measure and carefully written MDX is not that big. A better example is LastNonEmptyChild: no matter how it is reimplemented in MDX, the built-in semiadditive measure will always be significantly better. But there is also a counterexample. It is possible to implement the AverageOfChildren functionality in such a way that it performs better than the built-in semiadditive function. More details are in this blog post: <A href="http://www.sqljunkies.com/WebLog/mosha/archive/2007/05/27/counting_days_mdx.aspx">www.sqljunkies.com/WebLog/mosha/archive/2007/05/27/counting_days_mdx.aspx</A> (it is not a pure MDX solution since it requires creating an auxiliary measure group, but still).</P>
<h2>Display Folders</h2>
<p><img style="margin:5px;" src="http://www.mosha.com/msolap/images/display_folders.png" align="right" />Display Folders is one of those little features of Analysis Services 2005 that get many people excited. While I personally don't share this excitement, I can understand how display folders can make information more accessible by organizing different metadata objects such as hierarchies, KPIs and measures into logical structures. So we are going to take a deeper look into how display folders work, and what it means when a <a href="http://www.mosha.com/msolap/util.htm">client tool</a> claims support for display folders.</p> <p>Let's start with the following example, showing measures organized into several display folders. There are a couple of interesting things to notice about this picture. First, display folders can form hierarchies. For example, both the Net Income and Statistical folders are children of the Financial folder (somewhat resembling a chart of accounts). Second, and perhaps more interesting, the same measure can show up under more than one display folder. For example, YTD Sales shows up both under the Favorites and the KPIs folders, and Profit appears three times - under the Favorites, KPIs and Net Income folders.</p> <p>To see how this is done, let's first understand what display folders mean to the AS server. As a matter of fact - they mean nothing to it. As far as the AS server is concerned, the DisplayFolder property of a metadata object such as a measure or KPI, or the DisplayFolder calculation property of a calculated member, is just a string. The AS server doesn't interpret this string in any way - it simply reports it back in the appropriate schema rowset, be it the MEASURE_DISPLAY_FOLDER column in MDSCHEMA_MEASURES for measures and calculated measures, or the KPI_DISPLAY_FOLDER column in the MDSCHEMA_KPIS schema rowset. It is really up to the client application to correctly interpret this string and produce a visual representation similar to the one captured in Excel 2007.</p> <p>So it seems there must be a common documented convention on the structure of this string in order for all client tools to organize objects into display folders consistently. Unfortunately, I wasn't able to find a single place in the documentation which coherently explains it.
For example, the <a href="http://msdn2.microsoft.com/es-es/library/ms126258.aspx">description for MDSCHEMA_KPIS</a> says:</p> <blockquote> <p><em>A forward-slash (/) delimited categorization structure. Client applications use this structure to determine hierarchical presentation of KPIs. This structure is not enforced.</em></p> </blockquote> <p>While the <a href="http://msdn2.microsoft.com/es-es/library/ms126250.aspx">description for MDSCHEMA_MEASURES</a> claims:</p> <blockquote> <p><em>The path to be used when displaying the measure in the user interface. Folder names will be separated by a semicolon. Nested folders are indicated by a backslash (\).</em></p> </blockquote> <p>It is interesting to note that one of them mentions a forward slash and the other a backslash. The &quot;this structure is not enforced&quot; comment probably refers to the fact that neither the server's metadata manager nor AMO does any validation on the string. The comment about the semicolon refers to the fact that the DisplayFolder property can contain more than one path - this is how the same object can appear under multiple folders. For example, the DisplayFolder property for the Profit measure in the example above is</p> <p>KPIs;Financial;Financial\Net Income</p> <p>So when a client application claims that it supports display folders, it is worth checking whether it supports the forward slash, the backslash and the semicolon in the display folder specification. Excel 2007 does, but even the built-in cube browser in BI Dev Studio has a little problem organizing measures: it pushes the very first display folder on the list under the measure group name for physical measures.</p>
<h2>What are the natural hierarchies and why they are a good thing</h2>
<p>Analysis Services 2005 Service Pack 2 is just around the corner. Microsoft released the <a href="http://download.microsoft.com/download/2/B/5/2B5E5D37-9B17-423D-BC8F-B11ECD4195B4/WhatsNewSQL2005SP2.htm">list of improvements made in SP2</a>. While many of them are self-explanatory, others could benefit from more explanation. I already wrote about two very important improvements - <a href="http://sqlblog.com/WebLog/mosha/archive/2006/11/01/slicer_axis_interaction.aspx">interaction between WHERE and axes</a>, and <a href="http://sqlblog.com/WebLog/mosha/archive/2006/11/07/visual_totals_dc.aspx">Visual Totals changes</a>. Today, <a href="http://www.sql.ru/forum/actualthread.aspx?tid=360352">this thread</a> raised a question about the following improvement:</p> <ul> <li><em>A warning message now appears when a user-defined hierarchy is not defined as a natural hierarchy.</em> </li> </ul> <p>The user was wondering what <em>natural hierarchies</em> are, how a user-defined hierarchy can be defined as natural or unnatural, and why having an unnatural user-defined hierarchy is so bad that a warning message has to appear. I realized that this was probably one of the vaguer items on the list and offered my explanation. This post is a short translation of the forum thread into English.</p> <p>First, let's find out what natural hierarchies are. While there have been many (justified!)
complaints about product documentation, the concept of natural hierarchies is in fact documented on MSDN. To find this information on MSDN, we can use <a href="http://search.live.com/results.aspx?FORM=IE7&amp;q=natural+hierarchy+site%3Amsdn.microsoft.com">Live Search</a> from Microsoft, restricting it to the msdn.microsoft.com domain (until recently, Live Search didn't index MSDN well, so another search engine worked better; but this issue is now resolved, and in my experience Live Search now returns more, better and more relevant results when searching MSDN than any other search engine). The very first link is the <a href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/IMDXSMSS05.asp">Introduction to MDX Scripting whitepaper</a>, which contains the following definition:</p> <blockquote> <p><em>A natural hierarchy is composed of attributes where each attribute is a member property of the attribute below. For example, the <b>Geography</b> hierarchy <b>Country</b>, <b>State</b>, <b>City</b> and <b>Name</b> is a natural hierarchy if <b>City</b> is a member property of <b>Name</b>; <b>State</b> is for <b>City</b>; and <b>Country</b> is for <b>State</b>. The hierarchy <b>Gender-Age</b> is not a natural hierarchy because <b>Gender</b> is not a member property of <b>Age</b>.</em></p> </blockquote> <p>So the concept of natural hierarchies is tightly related to the concept of related attributes. Obviously, all attribute hierarchies are already natural. With respect to the number of levels, an attribute hierarchy can have:</p> <ul> <li>A single level - when the attribute is marked as non-aggregatable. One-level hierarchies are always natural, because there is nothing to relate to! </li> <li>Two levels - when the attribute is marked as aggregatable. The first level is the 'All' level, and it is related to any attribute. </li> <li>A variable number of levels for the Parent attribute hierarchy in a Parent-Child dimension. This is an unusual case, where a single attribute unfolds into multiple levels; but since the Parent-Child hierarchy is built by respecting the relationships between members, this hierarchy is natural as well. </li> </ul> <p>Now that we know what natural hierarchies are, how can we make a user-defined hierarchy natural or not? Again, Live Search is our friend, and we get the following BOL article: &quot;<a href="http://msdn2.microsoft.com/en-us/library/ms166553.aspx">Specifying Attribute Relationships Between Attributes in a User-Defined Hierarchy</a>&quot;, which walks step by step through the process in the UI.</p> <p>Now, to the most important question of all - why are natural hierarchies so important? Why did SP2 bother to introduce a special warning if a user-defined hierarchy turns out to be unnatural? Well, there is plenty of material published on the importance of attribute relationships. It simply cannot be overstated, both from the performance point of view and, in the presence of calculations, from the correctness point of view. I will add two more reasons which are often missed, but which relate more closely to natural hierarchies specifically and less to the general subject of attribute relationships:</p> <p>1. Due to the fact that natural hierarchies always maintain a one-to-many relationship between levels, they can be used internally for decoding of attribute values. Therefore, natural hierarchies are always processed into <em>materialized hierarchies</em> in AS2005, i.e. the entire hierarchy is computed and optimized for querying. 
On the other hand, unnatural hierarchies are kept <em>non-materialized</em>, which means that any query to an unnatural hierarchy will involve an SE query (it can be spotted in Profiler as a Dimension Query Subcube event), which is much more costly than going to the prebuilt data structures of materialized natural hierarchies. </p> <p>2. Unnatural hierarchies tend to produce the infamous <em>arbitrary shape subcubes</em>. The problem with arbitrary shapes is that they cannot be used in all contexts (for example, SCOPEs cannot deal with them and raise an error), and in the contexts where they are accepted, there is a huge performance penalty associated with dealing with them. Besides the fact that the code which deals with arbitrary shape subcubes is much more complex (and therefore slower), there are more significant ramifications:</p> <ul> <li>Bitmap indexes cannot be utilized as efficiently as with normally shaped subcubes </li> <li>Most of the caching subsystem has difficulties finding arbitrary shaped subcubes in the cache. This point actually leads us to another improvement in SP2 which probably needs a little more explanation. The document says &quot;<em>MDX query performance has been improved with respect to ..., arbitrary shapes, ...</em>&quot;. This refers to the fact that the caching subsystem has been improved to detect arbitrary shaped caches within the execution context of a single MDX query - something that wasn't possible before SP2. However, such arbitrary shaped caches still cannot be easily detected and therefore reused between queries. </li> </ul> <p>It should be clear by now that, whenever possible, unnatural hierarchies should be avoided. But this doesn't mean that unnatural hierarchies are always bad! Any guidance should be considered within its reasoning. For example, in my article about <a href="http://sqlblog.com/WebLog/mosha/archive/2006/10/25/time_calculations_parallelperiod.aspx">Time Calculations in UDM</a>, I showed how using an unnatural hierarchy of the form Year -&gt; QuarterOfYear -&gt; MonthOfQuarter -&gt; DayOfMonth actually simplifies writing time-related calculations a lot. I guess my rule of thumb is the following:</p> <p>If there are two attributes in the dimension, attribute A with cardinality N and attribute B with cardinality M, then</p> <ul> <li>If the cardinality of CrossJoin(A,B) is equal to M, then A is related to B, the relationship should be defined, and the A -&gt; B hierarchy is natural </li> <li>If the cardinality of CrossJoin(A,B) is equal to (or very close to) N*M, then A and B are fully (or very close to fully) independent, and the A -&gt; B hierarchy is a &quot;good&quot; unnatural hierarchy </li> <li>If the cardinality of CrossJoin(A,B) is only a little greater than M, then A is almost related to B, and perhaps the key of B can be modified (for example, by making it a composite key that includes the key of A) to make them truly related. 
If this is not possible, then the A -&gt; B hierarchy is unnatural, and it is a &quot;bad&quot; unnatural hierarchy </li> </ul> <p>Again, this rule of thumb should not be taken literally, but always applied in the context of the actual scenario.</p>
<h2>Inside OLAP Engine: SE cache sharing</h2>
<p><em>After publishing the <a href="http://sqlblog.com/blogs/mosha/archive/2006/01/05/inside-olap-engine-cache-prefetching.aspx">excerpt</a> from my upcoming whitepaper on AS caching, I received a lot of feedback, both in blog comments and in email. I want to thank everybody who took the time to read and assess that piece. Honestly, I was a little bit afraid, because I know that I have a tendency to overcomplicate and confuse people when I explain something, but so far the feedback indicated that the technical level was OK and everybody felt comfortable with the content. Several people asked if they could get preview access to the whitepaper. I, of course, would have done it, but the biggest and most important part of it - namely, how to write cache-friendly calculations - is not yet written. Everything else is interesting knowledge, but that part contains the concrete steps people can take to improve performance by writing their calculations in such a way that they take most benefit of the caching system. While I have a good idea about what I should put there, it is somewhat more difficult material, and I struggle with writing it up. However, I thought that I could share another excerpt here. It seems that the readers of my blog are hardcore AS users, and releasing pieces in the blog should be mutually beneficial - readers may find this material useful and I will get free peer review. So in a sense, publishing this stuff in the blog is my public &quot;Beta&quot; of selected material from the whitepaper. One more note before I start. Some people expressed hope that the acronyms and special terms used in the excerpts would be explained. And indeed, the whitepaper will explain all of them. For the purpose of this (and the previous) piece, SE means Storage Engine - the component that deals with processed raw data, before calculations apply. FE means Formula Engine - the component which deals with all the calculations. </em></p> <p><strong>3. Cache sharing</strong></p> <p>Sharing the caches across multiple users is one of the most obvious benefits of doing all the computations centrally on the server. If the first user has to wait for the data in his report to be computed, all the other users who open the same or similar reports will experience much faster report load times. However, it is not always safe to share caches across users, especially in the presence of security. In this chapter we will carefully review the conditions under which caches can and cannot be shared.</p> <p><strong>3.1 SE cache sharing</strong></p> <p>The rule for SE cache sharing is very simple: SE caches are <b>always</b> shared across all users. This seems too good to be true, but it is actually true! We will show how it is possible to deal with security and still share the cache. 
First of all, cell security is not applicable to the SE, since it is applied as a post-operation - i.e. SE caches may contain cells which are cell-secured, but if such cells are needed by an MDX expression, they will be marked as errors. (For more details about the order of operations in the cube, see my other article <a href="http://sqlblog.com/blogs/mosha/archive/2005/12/31/default-members-mdx-scripts-security-kpis-and-perspectives.aspx">&quot;Default members, MDX Scripts, Security, KPIs and Perspectives&quot;</a>.) To understand how the SE can share caches in the presence of non-trivial dimension security, we need to consider the two effects of dimension security:</p> <ul> <li>Restricting the secured members </li> <li>Applying visual totals </li> </ul> <p><strong>3.1.1 SE caches and restricting secured members</strong></p> <p>This part is fairly simple. It is similar to the way datacache filtering (described in 2.1.2) works. The request to the datacache registry ignores dimension security restrictions, but then the attribute filtering is applied to the resulting datacache.</p> <p><strong>3.1.2 SE caches and dimension security visual totals</strong></p> <p>Making SE caching account for dimension security visual totals is a little more complex. Indeed, it seems almost impossible to share caches when for the same cell some users should see one value and other users another value. Yet it is possible - and we are going to see how. Since it is somewhat tricky, it is best illustrated by example, doing some more tracing in the profiler. We will again use a stripped-down version of Foodmart, with the Sales cube containing only the Customer dimension. There is one role with AttributePermission set on the [State Province] attribute, allowing only the WA and CA states, i.e. AllowedSet = { [Customer].[State Province].[WA], [Customer].[State Province].[CA] }, and we will enable VisualTotals=true on it. Now we send the following query:</p>
<pre>SELECT [Customer].[Country].[USA] ON 0 FROM Sales</pre>
<p>First we send it from the Administrator account, so that no security applies and we get the real total for USA; the second time we send it from an account which is a member of the restricted role, and which therefore gets the visual total value for USA (i.e. the total of WA and CA only). We are still most interested in the Query Subcube event, but to get the information which demonstrates visual totals handling, we need to use another variant of that event, called Query Subcube Verbose. Here is how the picture looks in the profiler:</p>
<p><img src="http://www.mosha.com/msolap/images/secache_dimsecvt.png" /> </p>
<p>The information in the Query Subcube Verbose event is a superset of what is available in the Query Subcube event, but requires more skill to understand. We will limit the discussion here to the forms applicable to this example. The first query (from connection 9) is the query by the Administrator. The interesting information is, of course, for the Customer dimension. Unlike with the Query Subcube event, we don’t see a bitmap of granularities but rather a symbolic representation of both <b>slices and granularities.</b> The string &quot;2 0 0 0 0&quot; shows the slice on every attribute. The number 2 is the internal <b>data id</b> corresponding to USA (in fact, when we look into individual attributes later, it translates data ids and shows [Country]:[USA]). A data id is really an internal thing, and should not be attributed any special meaning. You can even get the data ids of attribute members using the MDX function DataId, but this is an internal undocumented function, so again, you cannot really rely on it. What is important here is that Country has a slice of USA, and all the other attributes have neither slice nor granularity. If an attribute had granularity, but not a slice, it would show as * instead of 0. For example</p>
<pre>SELECT [Customer].[Customers].USA.Children ON 0 FROM Sales</pre>
<p>will show &quot;2 * 0 0 0&quot; to represent that the [State Province] attribute has granularity, but no slice. If the slice is not a single member but a set of members, it shows in the Query Subcube Verbose event as +, and it is not possible to see the members of the set. For example</p>
<pre>SELECT { [Customer].[City].[Redmond], [Customer].[City].[Seattle] } ON 0 FROM Sales</pre>
<p>will show &quot;0 0 + 0 0&quot; to indicate that the City attribute has granularity and the slice contains a set (the set {Redmond, Seattle}, that is).</p>
<p>Now let’s go back to our Administrator query. It gives &quot;2 0 0 0 0&quot; for the subcube, and the trace says it wasn’t answered from the cache, since the cache was initially empty. Now we connect as a user of the restricted role (for example, we can do it by specifying the connection string parameter &quot;EffectiveUserName=REDMOND\billg&quot; to emulate Bill Gates - and, of course, only a server administrator can do that). The query is the same, but let’s look more carefully at what we get for the subcube. It says &quot;2 – 0 0 0&quot;. This is pretty interesting. We see that the only difference from the Administrator query is that the [State Province] attribute shows a &quot;-&quot; sign instead of 0. In AS jargon, &quot;-&quot; means <b>slice below granularity</b>. Let’s explain slowly what this means, starting with the &quot;below granularity&quot; part. Obviously, our query has no granularity on [State Province]; this can easily be verified by looking at the bitmap in the Query Subcube event too. However, even though it has no granularity, this attribute does have a slice. The slice comes from the dimension security definition - it is the { WA, CA } set. Therefore we end up with a slice without granularity. Having a slice in the subcube causes the data to be filtered, and not having granularity causes the data to be aggregated. Therefore we end up with a value for USA which is the sum of only WA and CA - exactly what we need for the visual totals functionality. Note that this request too wasn’t served from the cache. And this is right - we get different data, and this data couldn’t have been derived from the real total for USA, therefore we needed to go to the storage. The way the cache system knows to distinguish between the real total for USA and the visual total for USA is by comparing the slice on the [State Province] attribute. Of course, there can be different slices on this attribute for different roles. All of them will show as &quot;-&quot; in the profiler, but internally AS knows how to distinguish between them. The concept of slice below granularity is an important one for AS. It is used not just for dimension security visual totals, but also for features such as subselects, sets in the WHERE clause, multiselect over distinct count, many-to-many dimensions, semiadditive measures and others. We will talk about this more in the chapter about cache lifetime. However, we need to emphasize again that such datacaches are inserted into the datacache registry and are shared across all users. As a proof, we can now start another Administrator session and send the following query:</p>
<pre>SELECT [Customer].[Country].[USA] ON 0 FROM
(SELECT { [Customer].[State Province].[WA], [Customer].[State Province].[CA] } ON 0 FROM Sales)</pre>
<p>We will see the following in the trace:</p>
<p><img src="http://www.mosha.com/msolap/images/secache_subselectvt.png" /> </p>
<p>I.e. this query also generated the slice-below-granularity subcube &quot;2 – 0 0 0&quot;, but this time it was found in the cache, because it was previously requested by dimension security visual totals and cached.</p>
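<p>To recap the notation seen in the Query Subcube Verbose events above, each attribute position can show one of the following:</p>
<pre>&lt;data id&gt;   granularity with a single-member slice (e.g. 2 = [Country]:[USA])
*           granularity, no slice
+           granularity, slice is a set of members (the members are not shown)
-           slice below granularity (a slice without granularity)
0           neither granularity nor slice</pre>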
<p>The conclusion is that when calculations are not involved and there is only SE data to worry about, the data will be shared across all users unconditionally. This is great news, since it has a very significant performance impact in heavy multiuser setups.</p>
<p>When there are calculations, the picture is more complicated, and the calculation designer has implicit control over how widely the caches will be shared. We are going to dive into the details in the next chapter about FE cache sharing, as well as in the section about cache lifetime.</p>
<font face="Arial" size="2"></font><img src="http://sqlblog.com/aggbug.aspx?PostID=7958" width="1" height="1">ssasInside OLAP Engine: Cache Prefetchinghttp://sqlblog.com/blogs/mosha/archive/2006/01/05/inside-olap-engine-cache-prefetching.aspxThu, 05 Jan 2006 23:02:00 GMT21093a07-8b3d-42db-8cbf-3350fcbf5496:7957mosha1http://sqlblog.com/blogs/mosha/comments/7957.aspxhttp://sqlblog.com/blogs/mosha/commentrss.aspx?PostID=7957<p><em>I am writing a whitepaper about Analysis Services 2005 Cache Manager, and it gets longer and longer. I already spent few days on it, and I am still not even halfway done. So I decided to publish an excerpt from this whitepaper (chapter 2.1.5) and gauge whether I am getting it at the right technical level or whether I am going too deep or too shallow. The timely comments will be appreciated.</em> </p> <p>As we learned by now, the FE query optimizer decomposes the MDX query into multiple single granularity subcubes. Depending on which cell or group of cells are being computed at the moment, FE will choose the appropriate subcube to send request for data to SE. However, since the query optimizer has the global picture of the entire query, it can tell which cells will be requested in the future. This knowledge is used by the Cache Manager to reorder the subcube requests in order to take advantage of datacache filtering and aggregation described in the previous chapter. This is called <i>cache prefetching</i> Cache prefetching is best illustrated by example. Let’s consider the following MDX query:</p> <pre>SELECT Descendants(USA, City, SELF_AND_BEFORE) ON 0 FROM Sales</pre>
<p>I.e. this query asks for USA, all the states in USA and all the cities in USA. If we were to send requests in the order in which cells appear in the query result, the first request would be for USA only - since there is nothing in the cache that can satisfy it, it goes to the storage (MOLAP or ROLAP). Next come the states in USA; again the cache cannot satisfy the request, so again it goes to the storage. And at the end it has to get all the cities in USA, which again misses the cache and hits the storage. Essentially, the pattern of cache/data access here looks the same as for the following three single-grain queries:</p>
<pre>SELECT Customers.USA ON 0 FROM Sales
SELECT Descendants(Customers.USA, Customers.[State Province], SELF) ON 0 FROM Sales
SELECT Descendants(Customers.USA, Customers.[City], SELF) ON 0 FROM Sales</pre>
<p>It is very useful to be able to trace cache access patterns, and it is possible to do so with Profiler. Below is a screenshot of the Profiler after executing the above three queries: <img src="http://www.mosha.com/msolap/images/noprefetch_profiler.png" /> </p>
<p>The interesting event to watch here is &quot;Query Subcube&quot; - this event is generated every time the FE needs to interact with the SE. The TextData column contains the bitmap which represents the subcube granularity (remember - the FE/SE interface operates on single-grain subcubes). When a bit is 1, it means that the corresponding attribute has granularity; when the bit is 0, it means that the corresponding attribute doesn’t have granularity. The order of attributes corresponds to the order of dimensions and attributes inside the measure group. In our simple cube there are only 5 attributes, in the order Country, [State Province], City, Customer and Gender. The EventSubclass column tells us whether or not we hit the cache. Now, when we execute</p>
<pre>SELECT Customers.USA ON 0 FROM Sales</pre>
<p>the subcube granularity is 10000 - i.e. we only have granularity on Country and nothing else. The cache is still empty, so obviously we don’t hit it. The next query</p>
<pre>SELECT Descendants(Customers.USA, Customers.[State Province], SELF) ON 0 FROM Sales</pre>
<p>has granularity on both Country and State - therefore the subcube grain is 11000 - but we still miss the cache. Finally, the last query</p>
<pre>SELECT Descendants(Customers.USA, Customers.[City], SELF) ON 0 FROM Sales</pre>
<p>has grain on all three attributes - Country, State and City - i.e. 11100, and we miss the cache again, because cities cannot be derived from states or the country. So in total: 3 cache misses and 0 cache hits.</p>
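<p>To summarize the grains of the three queries against the five attributes of our sample cube:</p>
<pre>Subcube bitmap   Granularity on                       Result
10000            Country                              cache miss
11000            Country, State Province              cache miss
11100            Country, State Province, City        cache miss</pre>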
<p>However, our original query contained all three levels of the hierarchy, and the query optimizer knows that three SE subcube requests will have to happen to fully satisfy the entire query. Therefore it decides to reorder the requests to take advantage of the caching system. Let’s take a look at the Profiler screenshot for this query: <img src="http://www.mosha.com/msolap/images/prefetch_profiler.png" /> </p>
<p>Here we see cache prefetching in action. The first &quot;Query Subcube&quot; doesn’t go for country granularity, even though USA is the first cell in the query result. The Cache Manager inserts a request at the country/state/city granularity instead – 11100. This first request misses the cache (the cube was reprocessed to force cache cleanup); obviously, we need to go to storage at least once. However, all the subsequent requests, for 10000 and 11000, benefit, since they can be aggregated from the 11100 cache – and the EventSubclass value for them shows that the cache was indeed hit.</p>
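<p>To summarize, here is a schematic side-by-side rendering of the two event sequences described above (not actual Profiler output):</p> <pre>-- Three separate single-grain queries (no prefetching):
-- 10000 : Query Subcube, cache miss
-- 11000 : Query Subcube, cache miss
-- 11100 : Query Subcube, cache miss

-- Single Descendants query (with prefetching):
-- 11100 : Query Subcube, cache miss (the one unavoidable trip to storage)
-- 10000 : Query Subcube, cache hit (aggregated from the 11100 cache)
-- 11000 : Query Subcube, cache hit (aggregated from the 11100 cache)</pre>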
<p>Watching Profiler carefully can often reveal important information about cache behavior and therefore about query performance.</p>ssasDefault members, MDX Scripts, Security, KPIs and Perspectiveshttp://sqlblog.com/blogs/mosha/archive/2005/12/31/default-members-mdx-scripts-security-kpis-and-perspectives.aspxSat, 31 Dec 2005 07:00:00 GMT21093a07-8b3d-42db-8cbf-3350fcbf5496:7955mosha3http://sqlblog.com/blogs/mosha/comments/7955.aspxhttp://sqlblog.com/blogs/mosha/commentrss.aspx?PostID=7955<p>What happens to the cube after processing is complete but before the cube is ready for querying? There are actually quite a few steps that must be taken to prepare the cube: creating calculated members and named sets, applying security, factoring in actions and KPIs, etc. In this article we will provide an overview of the most important steps and discuss issues that follow directly from how these steps are done and why their ordering is important.</p> <b> <p>Cube initialization sequence</p> </b> <p>At a very high level, the following events happen before the cube becomes available for querying:</p> <ul> <li>System determined default members are applied </li> <li>MDX expressions for user specified default members in DDL are evaluated </li> <li>User specified default members are applied </li> <li>MDX expressions for dimension security (allowed set, denied set etc.) are evaluated </li> <li>MDX expressions for dimension security default members are evaluated </li> <li>Dimension security is applied to all attributes </li> <li>Default members are adjusted if necessary (if a default member became secured) </li> <li>Dimension security visual totals are applied </li> <li>MDX Script is executed </li> <li>KPI driven calculated measures are created </li> <li>Actions are added </li> <li>Cell security is applied </li> <li>Perspective measure restrictions are applied </li> <li>Default measure is adjusted to the perspective </li> </ul> <div>We will now go over each step and briefly discuss it.</div> <b> <p>System determined default members are applied</p> </b> <p>This is a kind of bootstrapping process. We really need to set up an execution context before we can go ahead with the following steps. Many of the following steps require evaluation of various MDX expressions, and an MDX expression always relies on the current coordinates. These coordinates need to be initially set to something. You may ask why not use the default members from DDL directly. The answer is that these could themselves be MDX expressions, which need to be executed in some context, so we get a Catch-22. How does the system determine the &quot;system determined&quot; default members? Strictly speaking, this could be arbitrary, therefore no application should ever rely on the exact details. But currently the algorithm is as follows:</p> <ul> <li>If the attribute is aggregatable (i.e. IsAggregatable=true), then the All member will be chosen as the default member on that attribute </li> <li>If the attribute is not aggregatable, then some random member from this attribute will be chosen as the default member. Usually this random member is the first member seen during attribute processing, and since AS doesn’t issue an ORDER BY statement for attribute processing, there is no way to tell which member will be “first”. </li> <li>For the default measure, the first non-hidden measure is chosen. If all measures are hidden, then the first measure is chosen. 
“First” here refers to the first measure in the measures collection in AMO and DDL (but not in MDSCHEMA_MEASURES, which is always sorted alphabetically). </li> </ul> <p><b>MDX expressions for user specified default members in DDL are evaluated</b></p> <p>From the previous paragraph it is obvious that the user will normally want to override the [almost random] system selection of default members on non-aggregatable attributes – and perhaps the default measure as well (although this one is not as random). Default members can be MDX expressions, and they are evaluated in the context of the system determined default members established in the previous step. </p> <p>But things are not as simple as they look. Let’s consider a simple example. Have you tried to define the default member of Measures, or of one of the dimensions (dimension attributes, really), to be a calculated member? If you do it using the Analysis Services 2005 UI in BI Studio, you will get an error similar to this one:</p> <pre>DefaultMeasure: Member object ‘Profit’ not found when parsing [Measures].[Profit]</pre>
<p>What is going on? Is it a bug? It very well might be perceived as a bug, especially given the fact that in the Analysis Services 2000 UI, Analysis Manager, such an operation worked fine. But actually, amazingly enough, from the engine perspective this is not a bug – it is By Design behavior. And, of course, it is possible to set default members to be calculated members – it’s just that the UI for it is not obvious. I agree that, for the user, trying to change the default member through the property called &quot;Default Member&quot; is the most natural thing to do, and it looks like a bug when it doesn’t always work, but there is an explanation. What matters here is the order of applying default members vs. executing the MDX Script. The default members specified by the user in the UI get translated into the attribute property DefaultMember or the cube property DefaultMeasure in DDL, and they are applied <b>before</b> the MDX Script is executed. Since all calculated members are created inside the MDX Script, if a default member refers to a calculated member, the error is raised, because no calculated members exist yet. This isn’t the only scenario in which such an ordering is not desired. In my book, &quot;Fast Track to MDX&quot;, there is a chapter about default members; the example we give there sets the default member of the Time dimension to the latest day for which there is data in the fact table, and it works with any cube in which this dimension is used, even if different fact tables have different latest dates recorded. This is achieved by the following MDX expression in AS2000:</p>
<pre>Tail(Filter(Time.Day.MEMBERS, NOT IsEmpty(Measures.DefaultMember)),1).Item(0).Item(0)</pre>
<p>By the way, in AS2005 this expression could be simplified to</p>
<pre>Tail(NonEmpty(Time.Day.MEMBERS),1).Item(0)</pre>
<p>However, this won’t work in AS2005, for the same reason: since default members are evaluated before the MDX Script is executed, the CALCULATE statement hasn’t run yet, therefore most of the cells in the cube are still empty, the Filter (or NonEmpty) function returns an empty set, and a default member cannot be NULL. On the other hand, evaluating and applying DDL default members after MDX Script execution has its own drawbacks. For example, named sets and SCOPE subcubes need a context to be evaluated in, and the default members need to be factored into it. So there is really no &quot;right&quot; decision on whether to apply DDL default members before or after the MDX Script – either way, there will be scenarios where it is wrong. And sometimes it is desirable to set default members in the middle of the MDX Script, such that calculations before it affect it (like CALCULATE) and calculations after it are affected by it (named sets). Or even to set default members of different attributes at different points of the MDX Script. This is, of course, possible to do – from the engine point of view, pretty much the same way as it was done in AS2000. In AS2000, setting default members in the UI of Analysis Manager simply generated a new ALTER CUBE command in the cube commands collection. The same can be done in AS2005, only instead of a collection of commands we have the more convenient MDX Script itself. So what the user needs to do is insert the appropriate ALTER CUBE statement into the MDX Script at the right place (say, at the end of the MDX Script), as in the following example:</p>
<pre>ALTER CUBE CURRENTCUBE UPDATE DIMENSION Measures, DEFAULT_MEMBER=[Measures].[Profit]</pre>
<p>One nice touch about the ALTER CUBE statement is that it operates on the hierarchy, and therefore allows setting the default member on multiple attributes in one statement. (Of course, if all levels in the hierarchy have attribute relationships defined between their underlying attributes, this is a moot point, because changing the default member on one such attribute will affect the others anyway.) </p>
<p>The disadvantage of using the ALTER CUBE statement in the MDX Script is that it needs to be added to the MDX Script of every cube where the default member needs to be set. On the other hand, it can be argued that in this case the default member depends on the MDX Script anyway (either by being a calculated member or by deriving from the calculations) and therefore should probably be tailored to each cube separately. A sketch of such a script is shown below.</p>
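<p>Putting the pieces together, a minimal sketch of such an MDX Script might look like the following (the Sales and Cost measures are assumed here purely for illustration):</p> <pre>CALCULATE;

// Calculated member created by the script (hypothetical measures)
CREATE MEMBER CURRENTCUBE.[Measures].[Profit] AS [Measures].[Sales] - [Measures].[Cost];

// Placed after the CREATE, so the calculated member already exists
ALTER CUBE CURRENTCUBE UPDATE DIMENSION Measures, DEFAULT_MEMBER = [Measures].[Profit];</pre>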
<p>As explained above, the MDX expressions for attribute default members in DDL should be such that they can be evaluated without cube context. For example, if Year is a non-aggregatable attribute, and we want to set the default year to be the last year in the dimension, then the following expressions are fine:</p>
<pre>Tail(Time.Year.Year.MEMBERS,1).Item(0)</pre>
<pre>Time.Year.Year.MEMBERS.Item(Time.Year.Year.MEMBERS.Count-1)</pre>
<p>Or if we want to tie it to the current year, then</p>
<pre>StrToMember(&quot;[Time].[Year].[Year].[&quot;+CStr(DatePart(&quot;YYYY&quot;,Now()))+&quot;]&quot;)</pre>
<p>Or, even better, a call to a server stored procedure which returns the member object for the Year hierarchy, based on whatever logic the designer put into the stored procedure:</p>
<pre>GetDefaultYear()</pre>
<p>All these expressions don’t need to get anything from the cube, so they are fine.</p>
<b>
<p>User specified default members are applied</p>
</b>
<p>The default members evaluated in the previous step are applied to the context. It is important to separate these two steps, although it seems natural to combine them. Default members are evaluated in isolation from one another; therefore, none of them is applied until all of the expressions have been evaluated. </p>
<b>
<p>MDX expressions for dimension security are evaluated.</p>
</b>
<p>Dimension security is applied <b>before</b> the MDX Script is executed. Therefore all the static expressions inside the MDX Script (such as named sets and calculation SCOPE subcubes) will be resolved in the context of dimension security. In dimension security you can use arbitrary MDX expressions to specify allowed and denied sets. The UI normally generates simple member-enumeration sets, but using the Advanced tab you can define more complex set expressions. At this point the system has already determined all the roles to which the user belongs (called active roles), and it evaluates all the MDX expressions for all AttributePermission objects in all active roles. These expressions are resolved in the context of the current user, therefore the Username function resolves to that user. Moreover, if you use a stored procedure inside the MDX expressions for dimension security, and this stored procedure has ImpersonationLevel=ImpersonateCurrentUser, it will work as well. Both of these things are the foundation of dynamic dimension security techniques. More information can be found at <a href="http://www.mosha.com/msolap/security.htm">http://www.mosha.com/msolap/security.htm</a>.</p>
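<p>To make this concrete, here is a minimal sketch of a dynamic allowed-set expression of the kind one might enter on the Advanced tab. The [User].[Login] attribute and the [Has Access] measure are hypothetical; the assumption is a mapping measure group that relates Windows logins to the Customer members each login may see:</p> <pre>// Allowed set for the Customer attribute (hypothetical schema)
NonEmpty(
    [Customer].[Customer].[Customer].MEMBERS,
    ( StrToMember(&quot;[User].[Login].&amp;[&quot; + Username + &quot;]&quot;),
      [Measures].[Has Access] )
)</pre>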
<b>
<p>MDX expressions for dimension security default members are evaluated</p>
</b>
<p>Dimension security can specify its own default members. This feature is more about personalization than about security (as we will quickly see in the next paragraph). </p>
<p>Note that while the MDX expressions for dimension security were evaluated in the previous step, dimension security itself has not been applied yet. Therefore the expressions for dimension security default members are still evaluated in the unsecured context. </p>
<b>
<p>Dimension security is applied to all attributes</p>
</b>
<p>Now it is time to apply dimension security to all attributes. This means that the evaluated allowed and denied sets, in every attribute and every active role, are applied to form a virtual bitmask over the attribute members. It is a bitmask in the sense that it is a data structure which tells, for every attribute member, whether it is secured or not. It is virtual because the physical implementation is not necessarily a bitmap – often it is a more sophisticated data structure – but it is guaranteed never to require more memory than a single bit per attribute member, and usually it requires much less than that. Merging of active roles is simply a virtual OR operation between these virtual bitmasks, and it is very efficient. Merging security-implied default members is different, because in the end there can be only one default member; therefore, if the roles conflict in this respect, one of them overwrites the others, and it is not possible to control which default member will win. The process of merging active roles has other interesting rules about how visual totals settings are merged, etc.</p>
<b>
<p>Default members are adjusted</p>
</b>
<p>It is possible to set up definitions in such a way that a DDL-defined default member, or even a dimension-security-defined default member, ends up being a secured one. Obviously this would be a security hole; therefore, after dimension security is applied, AS goes over all the default members and makes sure that they are all allowed. If a previously defined default member is not allowed, it is changed using rules similar to those for system determined default members. </p>
<b>
<p>Dimension security visual totals are applied</p>
</b>
<p>Dimension security visual totals are now applied to the current context; therefore all calculations done from this point on will be done with visual totals on the attributes defined in dimension security.</p>
<b>
<p>MDX Script is executed</p>
</b>
<p>This is obviously the most important step, where all the calculations are created, etc. Since a discussion of MDX Scripts could easily occupy a few more blog articles, or even a book, we shall not say anything more here.</p>
<b>
<p>KPI driven calculated measures are created</p>
</b>
<p>KPIs are one of the most publicized features in AS2005, and they are very appealing to users. Yet their implementation is very simple and straightforward. When the user defines KPI object properties such as KPI value, KPI goal, KPI trend etc., the user specifies MDX expressions for these. What happens behind the scenes is that AS creates a hidden calculated measure for each of these properties and assigns to it the MDX expression specified in the property. Later, when the user queries KPIs, the KPI browser generates queries using the MDX functions KPIValue, KPIGoal, KPITrend etc., which return these hidden calculated measures. Now we are at the step when these hidden calculated measures are created. As you noticed, this step happens after the MDX Script was executed. One interesting consequence is that the calculated measures created at this step will be executed <b>after</b> all other calculations in the MDX Script. (Or, to put it in AS2000 terms, they will have a higher solve order and/or pass.) Is it a problem? It could be. Let’s suppose that our MDX Script looks like the following:</p>
<pre>CALCULATE;
CREATE Account.Variance = (Account.Budget - Account.Actual)/Account.Actual;</pre>
<p>I.e. we want to compute, as a percentage, how far the budget deviated from the actuals. Now let’s also suppose that we have a KPI called Profit, which has the following simple expression for its value:</p>
<pre>Measures.Sales - Measures.Cost</pre>
<p>A hidden calculated measure called Profit_Value will be created behind the scenes, and when the user wants to see the variance of the Profit, he will navigate to the intersection of the Variance and Profit_Value calculated members – and since Profit_Value is executed after Variance, the user will get a difference of ratios instead of a ratio of differences! So how can this problem be solved? Actually, very easily, if we know one additional piece of information. KPIs trigger creation of hidden calculated measures <b>only</b> if the MDX expression for the KPI property is not a simple reference to some measure (either calculated or physical). If the expression is a simple reference, such as [Measures].[Sales], then no hidden calculated measure will be created, and the KPIValue, KPIGoal, KPITrend etc. functions will simply return that measure. Therefore one can (perhaps even should) create all the calculations in the MDX Script (where they really belong in the first place) and resolve all the precedence rules there, using the power of MDX Scripts. In our example, the MDX Script would be rewritten as follows:</p>
<pre>CALCULATE;
CREATE HIDDEN Profit = Measures.Sales - Measures.Cost;
CREATE Account.Variance = (Account.Budget - Account.Actual)/Account.Actual;</pre>
<p>Then define the expression for the KPI Profit value simply as [Measures].[Profit] (there is no problem with a KPI having the same name as a calculated member – they belong to different namespaces). Since the calculated measure Profit was defined in the MDX Script before the calculation for Variance, it will also be executed before it, and the results will start to make sense.</p>
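<p>For completeness, here is a sketch of the kind of query a KPI browser would then generate (the cube name Sales is assumed here for illustration). Since the KPI value is now a simple reference, KPIValue returns the Profit calculated measure directly, with its place in the calculation ordering determined by the MDX Script:</p> <pre>SELECT { KPIValue(&quot;Profit&quot;) } ON 0
FROM [Sales]</pre>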
<b>
<p>Actions are added</p>
</b>
<p>Actions are really a no-op as far as calculations or the current context are concerned. All the MDX expressions used in actions are evaluated dynamically upon invocation of the MDSCHEMA_ACTIONS schema rowset.</p>
<b>
<p>Cell security is applied</p>
</b>
<p>Unlike dimension security, cell security is applied <b>after</b> the MDX Script is executed. Therefore all the static calculations (such as named sets and calculation SCOPE subcubes) are resolved without taking cell security into account. Of course, all the dynamic expressions (i.e. everything on the right-hand side of the assignment operator – expressions of calculated members, custom member formulas, etc.) will be executed with cell security in effect. Cell security is applied by applying a virtual OR operator on top of the cell security boolean expressions in all active roles. </p>
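<p>As an illustration, a cell security (read) permission is just a boolean MDX expression evaluated per cell; a hypothetical expression that hides the Salary measure from a role might look like this:</p> <pre>// Cells are visible only when the expression evaluates to true (hypothetical measure)
NOT ( Measures.CurrentMember IS Measures.[Salary] )</pre>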
<b>
<p>Perspective measure restrictions are applied</p>
</b>
<p>We are close to the end now. The last piece deals with perspectives. Perspectives are really almost the same as the cube. They are mostly about hiding some objects, i.e. MDSCHEMA_DIMENSIONS, MDSCHEMA_HIERARCHIES, MDSCHEMA_MEASURES, MDSCHEMA_SETS etc. return fewer rows. But MDX is (almost!) not affected by these changes. There is one important difference, however, and it deals with how measures (but not calculated members!) are hidden. When MEASURES.MEMBERS is used as an axis expression, only the measures and calculated measures included in the perspective are included in the set. But if the same set is used deeper inside an expression – for example, in a calculated member that computes MEASURES.MEMBERS.COUNT – the returned number will count all the measures, including the hidden ones. Such behavior should sound familiar to people who have worked with MDX subselects and CREATE SUBCUBE. And indeed, the way perspective restrictions on measures are implemented is by doing an internal</p>
<pre>CREATE SUBCUBE PerspectiveName FROM
(SELECT PerspectiveMeasures ON 0 FROM CubeName)</pre>
<p>This is pseudocode, because in AS2005 CREATE SUBCUBE cannot change the name of the cube on which it operates, while perspectives obviously have a different name from the cube. A pair of queries illustrating the measure-hiding behavior follows.</p>
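<p>Here is a sketch of the behavior described above (the perspective name SalesPerspective is made up for illustration; the two queries would be executed separately):</p> <pre>-- Axis expression: returns only the measures visible in the perspective
SELECT MEASURES.MEMBERS ON 0
FROM [SalesPerspective]

-- Inside a calculation: counts all measures of the underlying cube, hidden ones included
WITH MEMBER Measures.MeasureCount AS MEASURES.MEMBERS.COUNT
SELECT Measures.MeasureCount ON 0
FROM [SalesPerspective]</pre>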
<b>
<p>Default measure is adjusted to the perspective</p>
</b>
<p>Perspectives, of course, can override default measures, since they may exclude the cube’s default measure. And even if the default measure for a perspective is not explicitly specified, it still may change. Just as with any other subselect or CREATE SUBCUBE statement, once it is done, the default members are adjusted to fit into the space defined by the subcube. Since for perspectives the subcube restriction is on measures only, only the default measure is adjusted. </p>ssasUnderstanding changes in Yukon MDX Part Ihttp://sqlblog.com/blogs/mosha/archive/2005/02/16/understanding-changes-in-yukon-mdx-part-i.aspxThu, 17 Feb 2005 03:06:00 GMT21093a07-8b3d-42db-8cbf-3350fcbf5496:8996mosha0http://sqlblog.com/blogs/mosha/comments/8996.aspxhttp://sqlblog.com/blogs/mosha/commentrss.aspx?PostID=8996<p>When I did a review of the “<a href="http://sqljunkies.com/WebLog/mosha/archive/2005/01/30/7031.aspx">Analysis Services 2000 vs 2005</a>” presentation, I mentioned that changes to MDX were not covered in it at all, and I also noted that the changes to MDX in Yukon could easily fill several presentations by themselves. And they do. In July, Rob Zare delivered a Webcast called “Understanding changes in Yukon MDX Part I”. This Webcast was mostly based on the presentation that Rob and I prepared together for the Yukon Technical Preview. When Beta 2 shipped, this Webcast was supposed to become available, but while everybody I asked assured me that it is available somewhere (on Betaplace, or perhaps on the Beta 2 DVD), nobody was able to point out the exact location. Since we are getting a lot of questions in the newsgroup on this subject, I decided to at least make the slides available. While this is not as good as the recorded Webcast, it is still better than nothing :) Note that this is just Part I, and it was impossible to cover everything, so we had to choose what we thought would fit an “introduction” presentation. We are now working on the “advanced” presentation, so let me know if you want some particular content to be included there.</p> <ul> <li>MDX in UDM - cube space, autoexist, attributes vs. hierarchies, cell coordinates etc. </li> <li>MDX Scripts - declarative vs. procedural, centralized calculation management etc. </li> <li>Cube as a spreadsheet - the Yukon calculation model </li> <li>MDX Script statements - CALCULATE, SCOPE, assignments, conditional assignments, calculation properties assignments, FREEZE etc. </li> <li>Sets in WHERE clause </li> <li>MDX data types and conversion rules </li> <li>Handling of unrelated dimensions and attributes - IgnoreUnrelatedDimensions vs. ValidMeasure etc. </li> <li>Automatic recursion resolution - “infinite recursion occurred” can be resolved by the engine </li> <li>Miscellaneous - aggregation over distinct count, syntax simplifications, CASE operator etc. </li> </ul> <p>The slides can be downloaded from here:&#160; <a href="http://www.mosha.com/msolap/ppt/Yukon_MDX_I.zip">http://www.mosha.com/msolap/ppt/Yukon_MDX_I.zip</a> <br />Additional material about MDX: <a href="http://www.mosha.com/msolap/mdx.htm">http://www.mosha.com/msolap/mdx.htm</a> <br />Additional material about Yukon: <a href="http://www.mosha.com/msolap/yukon.htm">http://www.mosha.com/msolap/yukon.htm</a> <br />Additional material about Analysis Services: <a href="http://www.mosha.com/msolap">http://www.mosha.com/msolap</a></p>ssasAnalysis Services 2000 vs. 2005
http://sqlblog.com/blogs/mosha/archive/2005/01/30/analysis-services-2000-vs-2005.aspxMon, 31 Jan 2005 06:19:00 GMT21093a07-8b3d-42db-8cbf-3350fcbf5496:8869mosha0http://sqlblog.com/blogs/mosha/comments/8869.aspxhttp://sqlblog.com/blogs/mosha/commentrss.aspx?PostID=8869<div>I have run across a presentation called “<a href="http://www.nysia.org/events/past/2004/2004129dbprof.pdf">Analysis Services 2000 vs. 2005</a>” prepared by Jaimie Basilico and Mark Frawley (Jaimie works at Microsoft as a Senior Technology Specialist on the East Coast, and he is one of the best field people in Analysis Services that we have). This presentation is targeted at people who are familiar with Analysis Services 2000 and want to come up to speed with Analysis Services 2005. I found the presentation very useful, but not all subjects are covered in the same depth. Below are my comments:</div> <ul> <li>Slides 10-11: Good explanation of the concept of Data Source Views (DSVs). A little bit of overlap with the later slides on UDM, since building a cube over fact tables with different granularities is really a feature of UDM, not of DSV. </li> <li>Slide 12: The one-click-cube feature is great to get you started, but you still need to know your schema well. </li> <li>Slide 14: The real replacement for the Archive/Restore and CAB files of AS2K is the built-in engine support for Backup/Restore in AS2005. </li> <li>Slides 16-17: UDM overview. </li> <li>Slide 18: I think the mention of XMLA vs. non-XMLA clients and client-side caching is confusing here - especially in the context of the previous bullet points. Probably the best approach would be to remove the last bullet point and discuss XMLA and the thin-client architecture separately. </li> <li>Slides 19-29: Great overview of Proactive Caching. Definitely the best part of the presentation. Worth every slide. </li> <li>Slides 30-35: KPIs, Perspectives, Actions, Translations - good overview. I wish it were as detailed as the Proactive Caching part. </li> <li>Slides 36-44: Dimension architecture. This is a very important part of UDM - Role-playing, Fact, Many-to-many, and Reference dimensions are covered. Only one correction here - slide 39 - Fact dimensions actually don't do autoexist with other dimensions. </li> <li>Slides 45-47: MDX - this is the weakest part of the presentation; changes to MDX could easily fill dozens of slides, and the presentation doesn't cover any of them. </li> <li>Slides 48-49: Stored procedures and server ADOMD.NET. </li> </ul> <div>The full presentation can be found at <a href="http://www.nysia.org/events/past/2004/2004129dbprof.pdf">http://www.nysia.org/events/past/2004/2004129dbprof.pdf</a></div>ssasKerberos Authentication and Delegation in Analysis Services 2005http://sqlblog.com/blogs/mosha/archive/2005/01/25/kerberos-authentication-and-delegation-in-analysis-services-2005.aspxWed, 26 Jan 2005 02:14:00 GMT21093a07-8b3d-42db-8cbf-3350fcbf5496:8868mosha1http://sqlblog.com/blogs/mosha/comments/8868.aspxhttp://sqlblog.com/blogs/mosha/commentrss.aspx?PostID=8868<p>Anybody who has implemented a multi-tier application which needs to access Analysis Services and tried to use Windows integrated security from end to end has run into the issue of enabling delegation, which also means forcing Analysis Services to use Kerberos (since this is the only supported SSPI provider which implements delegation). One classic example is an Intranet Web app, where domain users connect with IE to ASP.NET, and ASP.NET connects to Analysis Services. 
This was possible to do in Analysis Services 2000 - the detailed steps are documented here: <a href="http://www.mosha.com/msolap/articles/enablingdelegation.htm">http://www.mosha.com/msolap/articles/enablingdelegation.htm</a>. In Analysis Services 2005 the idea is the same, but a few things have changed. First, Analysis Services 2005 will by default automatically choose Kerberos if it is available, unlike AS2K, which defaulted to NTLM (to be more precise, AS2005 by default uses the Negotiate SSPI package, which gives preference to Kerberos; it can be configured through Active Directory group policy to use only Kerberos etc.). Second, Analysis Services 2005 supports multiple instances, so SPNs must reflect the instance names. It is important to note that SPNs use instance names rather than port numbers in order to ensure secure mutual authentication of the named instance in case of a MITM attack with a hijacked redirector. Below are detailed instructions for setting up Kerberos and enabling delegation in AS2005, courtesy of Marius Dumitru, who also provided the original instructions for AS2K. The permanent copy of these instructions is kept here: <a href="http://www.mosha.com/msolap/articles/kerberos_delegation.htm">http://www.mosha.com/msolap/articles/kerberos_delegation.htm</a>.</p> <p>In order for Analysis Services 2005 to use Kerberos as an authentication protocol and enable delegation, the following steps are needed:</p> <ol> <li>The correct SPN should be registered in the Active Directory. The &quot;setspn.exe&quot; utility from the Windows 2000 Resource Kit can be used with the following syntax: <br /> <br />setspn.exe -A MSOLAPSvc.3/&lt;serverhostname&gt;.&lt;domainname&gt; &lt;serviceaccount&gt; <br />&#160;&#160;&#160;&#160;&#160;&#160;&#160; and/or <br />setspn.exe -A MSOLAPSvc.3/&lt;serverhostname&gt; &lt;serviceaccount&gt; <p>where &lt;serviceaccount&gt; is either the computer name (for LocalSystem/NetworkService) or the domain account under which the server is running.</p> </li> <li>The user account(s) you want to be delegated must have the &quot;Account Is Sensitive And Cannot Be Delegated&quot; option <b>cleared</b> (i.e. <b>not checked</b>). You'll find this property in &quot;Active Directory Users And Groups&quot;, under the &quot;Account&quot; property tab. </li> <li>All computers involved must be marked as trusted for delegation (except the first and last computers in the chain). <br />For example, if a user on machine A connects to IIS on machine B, which uses a COM component on machine C, which uses MSOLAP90 to connect to an Analysis Services server on machine D, then the machine accounts of B and C should have the &quot;Computer Is Trusted For Delegation&quot; option checked (enabled) in &quot;Active Directory Users And Computers&quot;-&gt;&quot;Computers&quot;-&gt;&quot;Computer&quot;-&gt;&quot;Properties&quot;. </li> <li>If you have other servers in the chain between the user and the Analysis Services machine, and those servers run under a service account other than &quot;LocalSystem&quot;, then those service accounts should have the &quot;Account Is Trusted For Delegation&quot; option enabled in &quot;Active Directory Users And Groups&quot;-&gt;&quot;User&quot;-&gt;&quot;Properties&quot;-&gt;&quot;Account&quot;-&gt;&quot;Account Options&quot;. </li> <li>All accounts (including machine accounts) must belong to the same Active Directory domain (or to trusted domains in the same forest). 
</li> <li>The server (datasource) name has to be either the full DNS name of the server (fully qualified domain name, e.g. myhost.mydomain.com) or a NetBIOS name (myhost). Specifying a numeric IP address will disable Kerberos. </li> </ol> <p>You can troubleshoot whether a server tries to use Kerberos or not by running &quot;setspn.exe -L &lt;serviceaccount&gt;&quot; and checking whether SPNs with the following format show up in the output:</p> <p>MSOLAPSvc.3/myhost.mydomain.com <br />MSOLAPSvc.3/myhost</p> <p>Also, the Windows Event Log will show success/failure events, but may need additional configuration to do so.</p> <p>For Yukon Analysis Services named instances the same steps apply, the only change being that the SPN formats to configure are:</p> <p>MSOLAPSvc.3/&lt;serverhostname&gt;.&lt;domainname&gt;:&lt;instancename&gt;</p> <p>MSOLAPSvc.3/&lt;serverhostname&gt;:&lt;instancename&gt;</p> <p>(i.e. the string &quot;:instancename&quot; gets appended to the regular SPN)</p>ssasLoading Aggregate Data with MDX Script in Analysis Services 2005http://sqlblog.com/blogs/mosha/archive/2005/01/13/loading-aggregate-data-with-mdx-script-in-analysis-services-2005.aspxFri, 14 Jan 2005 01:55:00 GMT21093a07-8b3d-42db-8cbf-3350fcbf5496:8867mosha0http://sqlblog.com/blogs/mosha/comments/8867.aspxhttp://sqlblog.com/blogs/mosha/commentrss.aspx?PostID=8867Richard Tkachuk has just posted a new whitepaper on the <a href="http://www.sqlserveranalysisservices.com">www.sqlserveranalysisservices.com</a> site called “<a href="http://www.sqlserveranalysisservices.com/OLAPPapers/Loading%20Aggregate%20Data%20in%20AS2005v1.htm">Loading Aggregate Data in Yukon</a>”. It discusses several scenarios where the data is already precomputed at the non-leaf cells and needs to be loaded there instead of being aggregated from the leaves. The modeling approach Richard took is similar to what was done in Analysis Services 2000 - i.e. using parent-child hierarchies with Data Members and then using MDX to copy values from the Data Member cells to the non-leaf cells. The MDX expressions themselves are simple, but what makes this whitepaper interesting is that it demonstrates some of the new Analysis Services 2005 MDX concepts, the most interesting of which is auto-recursion resolution - i.e. where Analysis Services 2000 would have raised the infamous “Infinite Recursion” error, Analysis Services 2005 knows how to resolve it. The whitepaper is conveniently labeled “v1”, which hints that this is just a first iteration. Indeed, the technique demonstrated in the current document can be further refined and optimized, and I also think we can expect additional scenarios to be covered.ssas