In my previous post, I stated that the InventSumLogTTS table is not updated after the related inventory transaction has been physically posted. I was totally wrong.

A while ago I had to do performance tuning for a couple of installations which suffered from bad inventory closing performance and from constant conflicts between inventory closing and regular user work. After looking closer at the problem, I found that in both cases the main reason for the performance issues was a huge number of updates to the inventSumLogTTS table coming from the inventory closing/recalculation process. Since both installations use the Weighted Average inventory model, a routine inventory recalculation was causing some 2-5 million updates to the InventTrans table and, respectively, the creation of the same number of records in inventSumLogTTS.

The worst part is that InventSumLogTTS is locked in pessimistic mode. This means that if your users run an order explosion update or try to schedule a production order, the system starts iterating over all InventSumLogTTS records for all items that somehow participate in the explosion. Since we now have several million such records, this takes a while. And since the whole explosion update runs as one big fat transaction, all updated records in inventSumLogTTS remain locked until the end of that transaction, so any other user trying to schedule a production order that happens to share any direct or indirect subcomponent with the order being rescheduled has to wait until the transaction ends. Moreover, some of the records for the scheduled items were most probably inserted by inventory closing in transactions that have not yet committed; every time scheduling bumps into one of these items, it has to wait for that transaction to commit. In short, any attempt to run inventory closing/recalculation effectively blocks most production-related functionality.

Everything I said above applies to DAX2009. In DAX2012, the designers made some provisions to support running inventory closing/recalculation in parallel with production operations. First, they enabled optimistic locking for the InventSumLogTTS table. Second, they invented 'soft deletion' of inventSumLogTTS records, so the reqTrans update function can mark records as processed without causing long locks for other processes (as an actual deletion would). I have not checked this algorithm in practical usage and I am not sure they addressed 100% of the locking issues around inventSumLogTTS, but I can believe that in version 2012 they addressed most of them.

Yet, I am not sure that the approach itself is correct. Let's look at the following statements.

Right now (at least as of versions 4.0-2012 R2), InventSumLogTTS is used only for the dynamic update of net requirements (reqTrans). I do not know how it may be used in the future, but currently this table has no other use.

MRP deals only with inventory transactions in, so to say, 'estimated' statuses. After an inventory transaction has been registered (for receipts) or picked (for issues), from the MRP point of view it becomes part of inventory on hand: one big net requirement which is processed by MRP as a whole and is based on inventSum data (not on inventory transactions).

If an inventory transaction is physically updated (becomes part of inventory on hand), the net requirements update logic applies this change ONLY on the first change to a physically updated status. If both the new and the old transaction status in an inventSumLogTTS record are at or below Picked/Registered, the record is not applied to net requirements and is simply dropped during the net requirement update process.

Inventory closing/recalculation deals ONLY with inventory transactions that are well past the 'Estimated' status: it processes transactions which have already been physically or financially updated.

Moreover, inventory closing/recalculation DOES NOT change any of the quantity/dimension/transaction status fields that are in any way relevant to MRP. Well, I remember that weighted average recalculation/closing creates dummy inventory transfers, but they are irrelevant for MRP.

So I can simply draw the conclusion that any logging to inventSumLogTTS from inventory closing/recalculation should be disabled. The very fact that this was not done in the standard version of Dynamics AX can be considered a design bug.

Now let's come to the more practical part. Dynamics AX does have an API for disabling logging to InventSumLogTTS: call appl.inventUpdateOnhandGlobal().inventUpdateOnhand().parmSkipInventSumLogTTS() to enable or disable it. I must mention that at the end of a transaction (commit or abort), logging is re-enabled automatically, so in most cases you just need to disable logging and then do nothing.

Practically speaking, to disable logging to inventSumLogTTS during inventory closing/recalculation, you need to insert a call to appl.inventUpdateOnhandGlobal().inventUpdateOnhand().parmSkipInventSumLogTTS(true); after every ttsbegin; statement (but not after connection.ttsbegin() calls) you find in the InventCostItemDim class. To disable logging for inventory closing/recalculation cancellation, insert the same call (again, after every ttsbegin; statement) into the class InventCostClosingCancel_WorkInvent.
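To illustrate, here is a minimal sketch of the pattern (the surrounding method body is illustrative, not the exact standard code):

```
// Somewhere in InventCostItemDim (and, for cancellation, in
// InventCostClosingCancel_WorkInvent), right after the transaction starts:
ttsbegin;

// Disable logging to inventSumLogTTS for the rest of this transaction.
// It is re-enabled automatically when the transaction commits or aborts.
appl.inventUpdateOnhandGlobal().inventUpdateOnhand().parmSkipInventSumLogTTS(true);

// ... original closing/recalculation logic of the method ...

ttscommit;
```

Because logging is re-enabled automatically at transaction end, there is no need for a matching "enable" call before ttscommit.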

I made this change for two installations and got quite positive results: inventory recalculation/closing time dropped from days to hours (about 15-20 times faster), and all locking between production operations and inventory closing/recalculation was gone.

Usual disclaimer: use this with care, try it in a TEST environment first; I am not sure about future compatibility, compatibility with your customizations, and so on. Anyway, even in the worst case, InventSumLogTTS is cleared out on every MRP regeneration of a dynamic plan, so skipping these updates is not a big deal. In the worst case, just imagine that you ran an unsuccessful session of MRP regeneration. 😉

Most Dynamics AX specialists (especially those from support) have heard about the inventSumLogTTS table. Everyone knows that this table is somehow related to MRP; everyone knows that if you do not run MRP, inventSumLogTTS grows bigger and bigger and eventually needs to be cleaned up somehow. But since detailed documentation on the static/dynamic plan update algorithms is absent, not many people understand how Axapta actually uses this table for plan updates, what the advantages and caveats of dynamic vs. static plans are, how order explosion differs from regeneration planning, and so on…

I wrote this text mostly from my experience with Dynamics AX 2009. In the new version (DAX2012) the approach remained the same from a functional point of view, only technical details changed, so this information can also be useful for people using DAX2012.

A bit of history

As I wrote in this blog before, the inventSumLogTTS table began its life (somewhere in the late releases of Axapta 2.5) merely as a tool to track the list of changes applied to the inventSum table in an independent DB transaction. Should the main transaction be rolled back, the system would iterate over the related records in inventSumLogTTS and programmatically roll back the changes to inventSum. Starting from Axapta 3.0, the table was also used to support dynamic updates of the MRP plan. In version 4.0, the whole Inventory MultiTransaction System was dropped from Axapta and replaced with a new approach; the inventSumLogTTS table still exists, but nowadays it is used only to support dynamic MRP plan updates. Today the name of the table is misleading, because it no longer logs InventSum updates: as of versions 4.0-2012, it only keeps track of inventTrans changes which have not yet been applied to the dynamic plan.

Recently, I was invited to a customer who had started to use the Inventory Closing functionality about a month before. They were using DAX2009 with the recent rollup 7. Every time they ran inventory closing, the system locked all (or at least most of) the inventTrans records, effectively preventing users from any regular work in the logistics module. Before I came to the customer, I had already heard rumors that the new non-financial transfer closing functionality can cause long-duration locks on inventory data; now I had a chance to witness it with my own eyes. The SQL statement that was the reason for the show-stopper problem in IC was very easy to find (I knew the place in the code before I went to the customer's site). It was the following query from the InventCostNonFinancialTransferHandler.updateInventTrans() method:

Looks impressive, huh? If you expand the nested views, you find out that this query joins 10 or 11 tables (I lost count, to be honest). After I started to look into the query, I understood that from the application logic point of view the query is OK. It must update all non-financial transfers for a given dimension group in one update. It should not lock users out of current operations (e.g. reserving stock for sales orders or posting a picking list), because it should only update financial transactions from the period being closed. I assumed that maybe the query was just causing lock escalation from record level to page level and then to table level, thus causing the total lockup. I tried to disable lock escalation for inventTrans with an alter table … set (lock_escalation=disable) statement (just for a test; I think it would be a bad idea in general). It did not help. Then I tried to analyze the lock information by selecting data from the sys.dm_tran_locks view. I found out that this statement placed a U-lock on seemingly every page of inventTrans (or, at least, on all pages with the given dataAreaId).

I had never had much experience with U-locks before. I saw them in a database from time to time, and I had read somewhere in SQL BOL that they are a kind of short-duration lock issued by an update statement to maintain consistency and prevent deadlocks. When I tried to find more information on update locks on the internet, I ran into the excellent article Read Committed and Updates by Craig Freedman. It turns out that an update statement in SQL (and an update_recordset statement in Axapta, respectively) places a U-lock on EVERY row it reads. What happens next? If the update statement has simple logic (like update table set nonindex_field=constant where another_field=constant), the system either updates a row (if it fits the where condition) and converts the U-lock to an X-lock, or it immediately releases the U-lock (if the row does not fit the condition). So in simple cases, a U-lock is held on a row for a really short time: from the reading of a page into memory until the end of that page's in-memory scan. But in complex queries, especially queries with a self-join, the U-lock is placed at the very beginning of statement execution (when the page is read for the first time) and kept until the query reaches its home-run stage, when all records to be updated have been determined and the update phase itself starts. Since our query in question joins 11 tables, even in the best-case scenario the system took a couple of minutes to evaluate all the necessary info and proceed to the update phase. In the worst case (and this worst case happened too often), the system chose an incorrect execution plan, and the U-locks persisted for 30-40 minutes. They covered all the pages in the table, because the system was simply reading all pages with the given dataAreaId (inventTrans has no more specific index suitable for the query conditions), thus leading to a total lockup for 30-40 minutes.

BTW, there is not much point in locking here, because these updates are only called from the starting part of Inventory Closing, which is executed in single-user mode. The chances that these inventTrans records would be updated by some other user activity (not IC) are pretty thin: there is not much you can do with already financially updated inventory transfer transactions. The only thing that comes to my mind is some crazy user changing the InventDimId of inventory transactions via Item Details->Functions->Edit dimensions, but I think we can ignore this possibility 😉 I can understand SQL Server's concern that the data might be modified by a different process, but from the application logic point of view it is highly improbable.

When I had identified the issue, I started to try different ways to fix it. The first thing I tried was to replace the update_recordset with an equivalent set of statements without it: while select table { select forupdate table2 where table2.RecId == table.RecId; table2.update(); }. Since the outermost select does not use the 'pessimisticlock' clause (and InventTrans has OCC enabled), it does not place any locks on the table it reads. The inner select uses the 'forupdate' clause, but because of OCC this is just syntactic sugar which does not generate any additional clauses or hints in the select statement sent to SQL Server. This approach brought some progress: although the query (now a select, not an update) still took 30-40 minutes to execute, it no longer placed those nasty U-locks over all records; IC was still slow, but at least it stopped locking users out. As the next stage, I tried to split the one select into two nested selects. To anyone who is into SQL statement tuning this sounds quite counter-natural, but since we are unable to directly influence the query plan in X++ with hints, we must somehow nudge SQL Server into selecting a more or less decent execution plan. I tried the following approach: on the outer level I placed a query which simply produced the list of itemIds that have non-closed transfers (any transfers, maybe even financial ones); in the nested while select I used a copy of the original select, only adding an 'itemId == itemIdFromOuterSelect' condition to every occurrence of InventTrans and InventSettlement in the query. This approach brought good results: processing of non-financial transfers in my case took only 1-1.5 minutes. Maybe in your case, if you have many items with transfers, the whole logical block will take about 30-40 minutes to execute, but that is still much better than 30-40 minutes per item dimension group. Well, maybe this approach is not the most perfect one. Maybe I should have tried to combine the outermost select (to select the item list) with an inner update_recordset.
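A sketch of the split, with illustrative names (updatePerItem() stands for the original statement with the added itemId range; it is not a standard method):

```
// Outer query: just the distinct itemIds that still have open transfers.
InventTrans inventTransItems;

while select ItemId from inventTransItems
    group by ItemId
    // where <the original 'non-closed transfer' conditions go here>
{
    // Inner statement: the original 11-table select/update, with an
    // itemId == inventTransItems.ItemId range added to every occurrence
    // of InventTrans and InventSettlement, so SQL Server is pushed
    // towards a per-item execution plan.
    this.updatePerItem(inventTransItems.ItemId); // hypothetical helper
}
```

The point is not elegance but plan stability: with the itemId range in place, SQL Server has an obvious index to seek on for each iteration.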
Maybe those U-locks would not be so nasty after all: if the statement takes only 5-10 seconds to execute, the benefits of fast execution might outweigh the potential lock conflicts. But for my case it was an acceptable solution, and the customer kept using it.

As far as I understand, the whole feature was first designed for DAX2012 (where the related statement is always executed per item) and then backported to DAX2009 without too much testing. Also, to my knowledge, MS is currently working on a patch for this feature in DAX2009.

Story #2: Wrong query execution plan can ruin the helper-based algorithm in Axapta

On another project, the customer often had performance issues with inventory closing cancellation.

In the beginning, it was merely an issue of total execution time. When they ran IC cancellation in interactive mode, it took around 3-5 hours to complete. I pointed out to them that if you execute IC cancellation in batch mode, it runs many times faster than interactively. The trick is simple: when IC cancellation is started, it checks the mode of execution. If it is running in batch mode, it checks which items have inventSettlement records in the inventory closing being canceled, then spawns a batch task of class InventCostClosingCancel_WorkInvent for every such item. Next, it iterates over all indirect cost journals created by the inventory closing being canceled and, for every cost journal, spawns a batch task of class InventCostProdCancel_WorkProd. Then all these tasks start to execute (up to the batch server's maximum number of threads at a time), which brings a significant (like 5-10 times) increase in performance. (It is worth mentioning that all batch tasks are spawned with an empty batch group, so you have to configure at least one of your batch servers to serve batch jobs from this batch group.)

So, after they started to run inventory closing cancellation in batch mode, the operation time dropped to 45-60 minutes (the progress percentage in the batch screen advanced by 1-2 percent every minute). But from time to time, a batch mode cancellation went exceptionally slowly (like 0.1-0.15% per minute). I decided that SQL Server sometimes generates silly execution plans, and prescribed running update statistics on inventTrans/inventSettlement/prodTableJour before running IC cancellation. For a while this well-known home remedy against spontaneous slowness of queries helped to resolve the issue. But eventually, some 8-10 months after go-live, inventory closing cancellation became very slow even after a statistics update, so I had to investigate the issue a bit more.

When I checked the list of connections, I found a lot of blocked processes from the batch server, so it was a locking issue, not an issue of a wrong execution plan (or at least that is what I decided after the first look). When I ran SQL Profiler, I also found that the system was generating several deadlocks per second. Finally, I started to check the list of running queries and the resources they were waiting for. I found that all the processes were waiting on a newly inserted key in inventSettlement.DateVoucherIdx or prodTableJour.DateVoucherIdx. (I concluded that they were waiting on a new key because when I tried to select data from the tables using the %%lockres%% pseudo-column, I was unable to find the record being locked.) I assumed that a transaction inserts a record, then everyone waits on it, and then the transaction gets rolled back because of deadlock detection. (This was the only reasonable explanation of why all connections were waiting on a non-existent record.) The waiting statements were the update_recordset statements from InventCostClosingCancel_WorkInvent.duplicateSettlements() and InventCostProdCancel.prodTableJour() respectively:

After I had found and identified the places in the X++ code which were causing the deadlocks, I was finally able to understand the issue. The meaning of IC cancellation is quite simple: it inserts an exact copy of the closing records, but with a reversed sign. Since the insert_recordset statement in X++ does not support expressions in the select list, the author of the code found a simple pattern to overcome this: first the system inserts a copy of the old data (inventSettlement/prodTableJour) 'as is' (only the voucher id and date are changed); next, a special update_recordset statement (which supports some basic expressions in the setting clause) updates the newly inserted records, inverting the values of the amount fields. Since this insert and update are made in one transaction, no other user can read our half-cooked records with the un-inverted sign.
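The pattern looks roughly like this (field and variable names are shortened and illustrative, not the exact standard code):

```
ttsbegin;

// Step 1: copy the settlements of the closing being canceled 'as is',
// only with the new voucher and date (variables are allowed in the
// select list of insert_recordset; expressions are not).
insert_recordset inventSettlement
    (ItemId, QtySettled, CostAmountSettled, TransDate, Voucher)
    select ItemId, QtySettled, CostAmountSettled, cancelDate, cancelVoucher
    from inventSettlementOrig
    where inventSettlementOrig.Voucher == closingVoucher;

// Step 2: invert the sign of the freshly inserted copies;
// update_recordset does support simple expressions in the setting clause.
update_recordset inventSettlement
    setting QtySettled        = -1 * inventSettlement.QtySettled,
            CostAmountSettled = -1 * inventSettlement.CostAmountSettled
    where inventSettlement.Voucher == cancelVoucher;

ttscommit;
```

Step 2 is exactly the update_recordset whose locking behavior causes the deadlocks described below when SQL Server scans the voucher+date index instead of an item-based one.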

From the application logic point of view, since every helper thread processes its own itemId or journalId, no lock conflicts or deadlocks should occur. But since an update statement in SQL (and an update_recordset statement in X++) has a habit of locking EVERY record it reads, the locking logic suddenly becomes non-deterministic. If SQL Server uses an execution plan with an index scan on the ItemId/JournalId index, the threads avoid locking conflicts, since each of them updates its own personal set of keys. But from time to time SQL Server has a bad day and selects the voucher+date index for both queries. In this case, the 1st helper thread inserts records for item A and locks the related key entries in the voucher+date index; then the 1st thread starts to iterate over ALL index entries for the given voucher and date (which are common to all helpers, since all inventSettlement records of the cancellation use the same voucher and date). Sooner or later, this scan tries to read the uncommitted inserts of the 2nd helper thread, which causes the 1st thread to be blocked until the 2nd thread commits its transaction. The 2nd thread, in turn, sooner or later starts to iterate over all the index entries for the same voucher and date, and sooner or later it hits the locks on the records inserted by the 1st thread. SQL Server, on finding the classical deadlock situation (two processes waiting on each other), terminates one of the two (or several) deadlocking connections, causing the affected threads, one by one, to redo the same transaction.

In both cases, I fixed the issue by building a very specific index to prevent any potentially conflicting index reads in these pieces of code. In the first case (for inventSettlement) it was an index on TransDate+Voucher+ItemId; in the second case (for prodTableJour), an index on TransDate+Voucher+ProdId. Since then, SQL Server has always used these indexes for the related update_recordset statements, thus eliminating this silly deadlock issue. (It is silly because it was not caused by some real-life conflict over the same resources; it was just a result of mis-optimization of a query by SQL Server.)

If you do not believe me or just want to reproduce the 2nd issue on your own test installation, try to disable the ItemDateIdx index on inventSettlement and the NumDateIdx index on prodTableJour. Then try to run inventory closing cancellation in batch mode with a SQL Profiler deadlock trace running. Moreover, when I was preparing this article, I managed to run into the same issue on the standard CONTOSO database on a standard international version of DAX2009 RU7. It seems that the issue can be quite frequent, and even an average installation can benefit from building the two additional indexes I mentioned in the previous paragraph.

Lessons learned:

Locking in SQL Server can be non-deterministic. The set of records locked can be defined not by your application logic, but by some random fluctuation of the SQL Server optimizer.

Be careful with the update_recordset statement. It can place locks on records you would never expect. Think about building appropriate indexes or rewriting the query if you are updating a table with highly competitive access patterns.

Sometimes a bad query plan can bring not only performance issues, but also concurrency issues. Do not forget to do all the usual hoodoo to keep the SQL optimizer happy: update statistics regularly, rebuild indexes regularly, and so on. If that does not help, try to rewrite the query…

Of course, the primary goal for these guys (or these ladies, which looks more likely) is to build an HR database with contacts of potential hires, since they always ask you to fill out your contact data at the end of the survey. Yet the results of the survey, which they promise to send to everyone who submits it, can be very interesting for everyone in the Dynamics AX community. I have filled out this questionnaire already, because I am interested in three things:

Quite naturally, I want to know which country has the highest salary for a Dynamics AX specialist. 😉 Especially after this salary is normalized by, say, the Expatistan cost of living index.

I want to figure out the proportion between the cost of software (Dynamics AX in our case) and the cost of implementation. Say, 11 years ago, when I started my career as an Axapta specialist, a typical DAX project budget in Russia was around $100-120K for software and about $80-90K for consulting. Now, after 10 years of high oil prices and the consequent growth in dollar wages, the average salary in the Dynamics AX market in Russia has grown fourfold. Now, even with the considerable increase in the Dynamics AX price per user seat, the cost of consulting is typically about 200% of the software purchase cost. I wonder which country has, say, the highest annual-wage to user-license-price ratio, and which country the lowest. Actually, it looks a bit peculiar to me that Microsoft uses almost the same sales price for all countries regardless of their GDP per capita and average salary. I guess that if Microsoft's pricing policy were more flexible, they could greatly increase their user base in less-well-to-do countries. It might pay off pretty well in the future, but it requires a more strategic kind of thinking than the typical Microsoft executive has…

I also want to compare average Dynamics CRM salaries year over year. I think we are witnessing a kind of commoditization of the Dynamics CRM market. In most cases, Dynamics CRM is used simply as a development environment for making ANY kind of custom development. No significant knowledge of consulting in general, or of sales management principles in particular, is required to enter this market. Most Dynamics CRM specialists (at least the people I have had a chance to meet and work with) are techies who studied Dynamics CRM in a row of other Microsoft technologies (somewhere after ASP.NET and MS SQL, but before BizTalk). So I expect a drop in average Dynamics CRM salaries in the future. I recently read that Dynamics CRM revenue for Microsoft increased by 30% in the last quarter. It might be a good indication of a forthcoming salary drop: demand for CRM specialists will increase sharply and then drop sharply as well. Sooner or later, knowledge of Dynamics CRM may become a kind of commodity knowledge, like knowledge of C# or MS SQL. It can still bring money, but not as much as 2-3 years ago…

So, I hope that this survey will be representative enough to provide every participant with relevant data. That’s why it is so important for everyone to fill it out…

Today, I will share some experience related to setting up and fine-tuning MRP for best performance.

The most significant MRP-related improvement of DAX2009 over older versions is the ability to execute MRP in multi-threaded mode, where the planning process is spread over several independent processes executed in parallel. As usual, multi-threaded execution works only when MRP is executed in batch mode, since the helper threads are spawned as additional tasks inside the same batch job. This is the standard way to support multi-threaded execution in Axapta.

Basic setup

The number of helper threads is specified on the second tab (Scheduling helpers) of the MRP startup dialog. What number of helpers should be used? In my experience, MRP scales very well. On my current project we use 15 helpers (and 1 main thread). We tried to increase the number of helpers to 31, but did not find any significant performance difference. Generally, it depends on the configuration of your batch server(s) and your database server. When you increase the number of helper threads, check resource utilization on both the batch server and the database server; if one of them becomes saturated after yet another increase, it is time to stop or, maybe, time to upgrade your hardware. I also want to mention that I have had positive experience with spreading MRP over several batch servers. My personal recommendation is the following: first, check every batch server which will participate in MRP for the number of CPU cores it has; then configure every batch server to run 2 x Number_Of_Cores batch threads (in Administration->Setup->Server configuration); then set the number of helpers to the total number of batch threads across all batch servers serving your MRP batch group MINUS 1. (For example, two 8-core batch servers give 2x2x8 = 32 batch threads, so you would configure 31 helpers.) Remember that the main batch thread also consumes one thread from your batch server group capacity. Also, if you are going to run some other batch processes on the same batch server(s) during the MRP run, you may want to decrease the number even further, to the total thread capacity minus 2 or 3.

The next important parameter to discuss is the item distribution factor. What does it mean? In the very first stage of MRP (the Update phase, to be precise), the system allocates all items (all items in inventTable, or all items which fit the query specified in the MRP startup dialog) into chunks. Every chunk is a kind of unit of work which is processed by one thread. During the phases of MRP which are executed in parallel (data regeneration, coverage planning, futures planning and action messages), every thread grabs a chunk, processes it, then grabs another chunk, processes it, and so on, until all chunks for the given stage and BOM level are marked as processed.

Size of the chunk is calculated as Item_Distribution_Factor*(Number_Of_Helpers+1).

On the one hand, smaller chunks ensure smooth processing and an even allocation of items between threads. The smaller the chunk, the lower the chance that a thread grabs a chunk with a high number of computationally intensive items. Say one item takes from 1 to 10 seconds to be coverage planned. If the chunk size is only 1 item, then in the worst case the latest thread finishes the BOM level only 10 seconds after the earliest thread. If the chunk size is 300 items, then in the worst-case scenario the difference in processing time is 300*10 seconds - 300*1 second = 2700 seconds = 45 minutes. It means that in this worst-case scenario, there is a good chance that most helper threads will be doing nothing, waiting 30-45 minutes for the last, unlucky thread to finish. That would increase planning time and also lead to non-optimal usage of hardware (since most of the threads would be idle while waiting for the end of the BOM level). On the other hand, allocation of a chunk to a thread is a competitive process which leads to short-term database locks. Several threads often try to allocate the same chunk in parallel; only one of them succeeds, while the others repeat the allocation process until each grabs its own chunk. So chunk allocation can itself become a bottleneck if the number of chunks is too high and the chunk size too small. In my experience, a reasonable chunk size is somewhere between 10 and 60. To find the optimal chunk size and distribution factor, you can simply make several tests with different distribution factors. Also, you can tick the 'Track item process duration' checkbox in the MRP dialog and then check typical item planning times in Master planning->Inquiries->Unfinished scheduling processes->Inquiries->Item process duration. If item process duration varies a lot, you can benefit from a smaller chunk size; if it does not, an increase in chunk size will probably be more beneficial.

Another potential way to improve performance (I have never tried it, though) is to put randomly ordered items into a chunk. Currently, when the system creates chunks, it simply iterates over inventTable ordered by itemId, so a chunk contains items with similar itemIds. Since items with the same planning complexity often have sequential itemIds, this leads to an uneven distribution of items between chunks: some chunks consist of plain purchased items, while others consist of complex BOM items which require complex resource planning on many work center groups. If you add a special MRPOrder field to inventTable, fill it with a random number during item creation, and then sort by this field during chunk generation, you get a more even distribution of items between chunks.
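A hedged sketch of the idea (MRPOrder is a custom field, not standard; the random source via CLR interop is just one option):

```
// InventTable.insert() -- fill the custom MRPOrder field on item creation,
// so that a later 'order by MRPOrder' during chunk generation spreads
// items of similar complexity randomly across chunks.
public void insert()
{
    System.Random rnd = new System.Random(); // CLR interop; any RNG will do

    this.MRPOrder = rnd.Next();
    super();
}
```

You would also need a one-off job to backfill MRPOrder for existing items, and a change to the chunk generation code to sort by the new field instead of itemId.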

Infant mortality issue

A frequent problem of parallel MRP runs is early termination of the helper threads. Say we started our MRP as usual, but 15 minutes later we see that all our helper batch tasks have terminated with the strange message "Nothing to process". (You can see this message if you click the Log button in the Batch task form before the main thread of the MRP batch terminates.) Then the only remaining main thread continues to run MRP in single-threaded mode (very slowly).
Here is what is happening:

The main thread spawns the helper threads and continues its work.

The helper threads continuously check the status of the process, waiting for it to advance to "Deleting and inserting data".

The main thread deletes net requirements and planned orders for the plan being regenerated (and this usually takes a lot of time).

The helper threads, after waiting 15 minutes for a status update, terminate with the "Nothing to process" message.

The main thread runs MRP in single-threaded mode.

First of all, the standard MRP logic has an obvious bug in the helper thread behavior. While waiting for a status update from the main thread, a helper keeps reading the process status table (reqProcessList) continuously, without any pause between SELECT statements. It is easy to see that, say, 16 threads issuing 70-100 statements per second each will send about 1500 statements per second to the DB server. This query volume can easily consume an average DB server's capacity, leaving it no time for useful work (like the deletion of net requirements in the main thread). To prevent this behavior, modify the run() method of the ReqProcessExternalThread class, which contains the ill-fated status-waiting loop. Find the end of the while(true){} loop and insert a sleep(500); statement into it. This adds a 0.5-second delay between re-reads of the process status, giving your database server time to breathe.
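The fixed wait loop looks roughly like this. The real code is X++ in ReqProcessExternalThread.run(); this Python rendering just shows the pattern: poll the status, but pause between reads so 16 helpers generate about 32 queries per second instead of ~1500.

```python
import time

# Poll a status source until it reaches the target phase, sleeping between
# reads; give up when the timeout expires (the "Nothing to process" exit).
def wait_for_status(read_status, target, pause=0.5, timeout=3600.0):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if read_status() >= target:   # e.g. "Deleting and inserting data"
            return True
        time.sleep(pause)             # the fix: sleep(500) in the X++ loop
    return False                      # timed out, helper terminates
```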

Another issue to fix is the low timeout value for helper threads. If you have a realistically large working set of net requirements, deletion of the plan takes much longer than the 15 minutes expected by the developers of this functionality. To increase the timeout, open the class declaration of the same class (ReqProcessExternalThread) and modify the definition of the WAITFORPROCESSSTATUSUPDATE macro. Try changing its value from 15 to at least 60, or maybe even 90.

Slow end-of-level processing.

Often you can see in the Unfinished Scheduling Processes form that during the coverage phase, at the end of a BOM level, the system works very slowly. It looks almost hung for 30-40 minutes, but then suddenly advances to the next level. If you start AOS tracing on the batch server, you will see that all helper threads are doing nothing (they are simply re-reading the process status, waiting for the advance to the next BOM level), while one unlucky thread is busy calculating coverage for a portion of the items. Also, in the Unfinished Scheduling Processes form, you can notice that the number of items on the level decreases! Ten minutes ago you had more than 15000 items on the level, now it is slightly less than 15000, and in a few minutes it will be less than 14500.

What’s happening?

The main reason for the issue is incorrect BOM level data in inventTable. When the system allocates items between chunks, every chunk has a BOM level attached. But sometimes, during explosion of a BOM for, say, level 5, the system finds a component item whose BOM level (according to inventTable) is 2. Coverage for this item has already been calculated. How does the system calculate coverage for this item again?

The line for the item in ReqProcessItemListLine (the list of lines in a chunk) is marked as 'IsLevelChanged'

The system finds or creates a special 'Spare' chunk for items which were pushed to the next level because of a sudden level change. This chunk's line in reqProcessItemList is marked as 'Spare'.

The system reassigns the item's line in ReqProcessItemListLine to the Spare chunk

During processing of the next BOM level, the system checks whether the item was 'pushed' from level to level. If the item's level was changed, the system deletes the planned orders previously created for it on the incorrect BOM level.

The main performance issue of this approach is that the system does not control the size of the 'Spare' chunk at all. I witnessed cases where the size of this chunk exceeded 1000 items. If planning every item takes, say, 4 seconds, then the unlucky thread which grabs this chunk will spend more than an hour processing it. Since this chunk has the highest number for a BOM level, it is usually grabbed at the very end of BOM level processing. Then one thread processes this chunk for an hour, while all the other threads do nothing, simply waiting for our unlucky thread.

The first fix that comes to mind is to recalculate BOM levels before every full regeneration planning. Unfortunately, it does not always help. Sometimes inventTable holds incorrect BOMLevel data only because we changed our BOM structure without recalculating BOM levels. But often the reason for a level change is that the real-life nesting of items in production BOMs does not match the theoretical structure described in the master-data BOMs.

Say we have a shiny brass-headed bolt which is used only to screw a label with the name of our company onto the final assembled good. Naturally, after calculation it gets BOM level 1. Then, during production of a deeply nested sub-BOM of our finished good, someone decides to use the same bolt to screw a much smaller label onto one of the subcomponents. He drops the line with the standard bolt from the production order's BOM and adds a new line with the shiny brass-headed one. (Maybe this is not best practice, but you simply cannot update the standard BOM structure for every possible small change requested by a customer.) If you recalculate BOM levels, this bolt will still have BOMLevel 1. But if you run MRP, you will find that its coverage is actually performed on level 5 or 6. By the way, one of the very last stages of MRP is the BOMLevel update: during this stage, the system updates inventTable with the actual BOM level data gathered during MRP processing, not with the theoretical BOM levels from the master-data BOM structures.

So the only way to resolve the issue is to change the system's behavior to create many smaller Spare chunks per level instead of one large Spare chunk. To accomplish this, you need to modify the getSpareListNum() method of the ReqProcessItemList table.

The only strange thing in this piece of code is the usage of a random number generator. I use it to decrease the number of potential conflicts when several threads are trying to allocate a new Spare chunk at once, which leads to duplicate key exceptions.
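Since the original X++ listing did not survive into this post, here is a hedged Python sketch of the idea only: keep several bounded spare chunks per level instead of one unbounded one, and start probing from a random slot so concurrent threads rarely race for the same new chunk number. The cap and slot count are assumptions, not values from any real implementation.

```python
import random

MAX_SPARE_ITEMS = 50   # assumed cap on items per spare chunk
SPARE_SLOTS = 8        # assumed number of spare chunks per level

def spare_chunk_num(level, chunk_sizes, rng=random):
    # chunk_sizes: dict mapping (level, slot) -> current item count.
    # A random starting slot spreads concurrent allocations across slots,
    # reducing duplicate-key collisions when two threads insert at once.
    start = rng.randrange(SPARE_SLOTS)
    for probe in range(SPARE_SLOTS):
        slot = (start + probe) % SPARE_SLOTS
        if chunk_sizes.get((level, slot), 0) < MAX_SPARE_ITEMS:
            return slot            # this spare chunk still has room
    return SPARE_SLOTS             # all slots full: open an extra chunk
```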

Since Microsoft has released the most recent update of Dynamics AX (I mean SP1 Rollup 7), I would like to point the readers of my blog to a very important change in this version, which can dramatically increase the performance of one of the critical but slow operations in the production module of Axapta.

I am talking about the BOM calculation process, which in previous versions could only be started in single-threaded mode. I have heard of implementations where BOM recalculation for all BOMs in the system took more than 2 days.

Sometime last December, Microsoft released hotfix KB2487033, which added multi-threaded execution of BOM calculation; this feature is now included in Rollup 7.

This nice feature works only if the BOM calculation is started not for a given ItemId but for all BOMs in the system (say, from the costing version form), and only if the user specified batch-mode execution for the job. In this case, the system behaves the following way:

BOM levels are recalculated

Tables bomCalcItemTask and bomCalcItemInventoryDimensionTask are populated with the item ids and item inventory dimensions to calculate. (These two steps are also executed in the standalone, single-threaded mode of calculation.)

In single-threaded mode, the system then iterates over the items to calculate, starting from the lowest-level components (with the numerically highest BOM level).

In multi-threaded mode, the system SPAWNS a certain number of helper tasks for EVERY BOM level to be calculated. It also sets dependencies between the tasks of each BOM level, so tasks of BOM level 7 start only after the tasks of BOM levels 8 and 9 have completed, tasks of BOM level 5 depend on the tasks of BOM levels 9, 8, 7 and 6, and so on…

Then, in multi-threaded mode, the system simply calculates every BOM level in parallel. It is quite a meditative experience to watch the progress of tasks in your BOM calculation batch job. 🙂 It also saves you a lot of time. In my case, calculation time dropped from one night to something like 20-30 minutes, and 10 minutes of that were taken by the BOM level recalculation (which, I guess, is inherently not parallelizable by its nature).
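The per-level task wiring can be modeled in a few lines. This is plain Python, not the real X++ batch API: every helper task of a BOM level depends on all tasks of the numerically higher (deeper) levels, so within a level tasks run in parallel and levels run in sequence, deepest first.

```python
# Build the dependency map, then derive the execution "waves":
# each wave is the set of tasks whose dependencies are all complete.
def build_dependencies(levels, tasks_per_level):
    deps, deeper = {}, []
    for level in sorted(levels, reverse=True):       # e.g. 9, 8, 7, ...
        for t in range(tasks_per_level):
            deps[(level, t)] = list(deeper)          # wait for all deeper tasks
        deeper += [(level, t) for t in range(tasks_per_level)]
    return deps

def execution_waves(deps):
    done, waves = set(), []
    while len(done) < len(deps):
        ready = sorted(t for t in deps
                       if t not in done and all(d in done for d in deps[t]))
        waves.append(ready)
        done.update(ready)
    return waves    # one wave of parallel tasks per BOM level
```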

Nevertheless, there are a few issues, which I would like to share here:

The BOM calculation process is not robust. If one of the BOMs has an error (quite typical if data were imported from a legacy system), then on the first BOM with an error, one of the BOM calculation tasks will terminate. Then the next task will try to recalculate the same BOM and terminate again, then the next, and so on. As a result, after the first calculation you will usually find that only 20-30 percent of your items have new prices. The solution is simple: put a try/catch block somewhere inside the BOMCalcJob_All.runBOMCalculation() method, so an error in one BOM does not terminate the whole job.
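The pattern behind that fix is per-BOM error isolation. The real change is a try/catch inside BOMCalcJob_All.runBOMCalculation() in X++; this Python loop just illustrates the shape: one broken BOM is logged and skipped instead of killing the whole helper task.

```python
# Iterate over the BOMs, catching the failure of any single calculation
# so the loop continues; failed BOMs are collected for later review.
def calculate_all(bom_ids, calc_one, log):
    failed = []
    for bom in bom_ids:
        try:
            calc_one(bom)                       # may raise for a broken BOM
        except Exception as err:
            log(f"BOM {bom} failed: {err}")     # record and move on
            failed.append(bom)
    return failed
```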

When the system calculates the number of helper threads to use for the BOM calculation, it implicitly assumes that all available batch servers will be used. Say you have 3 batch servers and every server has a maximum of 8 batch threads specified in its batch server configuration; the system will then spawn 24 helper threads for every BOM level. First of all, that number itself is too big; normally 8-12 threads are enough. Moreover, if you are going to use only one batch server (out of three) for the BOM calculation batch group, these 24 helper threads will be executed in three waves (8 threads each). So I recommend modifying the BomCalcJobItemTaskBatchManager.numberOfBatchSessions() method to return a more reasonable number of threads if you have more than one batch server or if your batch servers have too many batch threads available.
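A saner sizing rule than "sum of the thread limits of all batch servers" might look like this. Function and parameter names are illustrative, not from the AX API: cap the helpers at what the servers actually assigned to the batch group can run at once, and at a fixed ceiling (8-12 proved enough in practice).

```python
# Cap the helper count by the capacity of the batch group actually used
# and by a practical ceiling, instead of summing every server's maximum.
def helper_threads(threads_per_server, servers_in_group=1, ceiling=12):
    return min(threads_per_server * servers_in_group, ceiling)
```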

Also, as a kind of shameless self-promotion 🙂, I want to mention that for one of my projects I made a clone of this feature which calculates in parallel only ONE BOM, with all its dependent sub-BOMs and items.

I remember the first time our team developed a solution which was supposed to quickly calculate the GL balance for a specific GL account/dimension combination. It was 2001, and we were developing cost allocation functionality.

We had just started to work with Axapta 2.1, so our first approach was simply to calculate a GL balance by summing up the LedgerTrans table from the beginning of time. Soon after we went live, we realized that the brute-force approach is not always the most successful way of doing things in Axapta.

Luckily, we found the ledgerBalance and ledgerBalanceDim tables. The first contained the summed-up GL turnover per GL account and period; the second contained the summed-up GL turnover per GL account, GL dimensions and period. After we switched to calculating GL balances from these two tables, we saw a dramatic improvement in performance, since 200-300 ledgerTrans records were packed into one record of the balance tables. (We had around 10-15 postings per GL account daily, and our GL period was a month.)

A little later (around fall 2001), we found out that these two tables were more part of the problem than part of the solution. 🙂 Whenever you design a table which holds summed-up data for quick retrieval, remember that an update of the data in that table will hold a lock on the table's record until the end of the transaction. So if Axapta was implemented in a company which did not have a very detailed chart of accounts (say, only 3-4 GL accounts were used to hold customer payment balances), posting a multi-line payment journal could prevent everyone else from posting ANY other customer-related document for 5-10 minutes: somewhere at the beginning of the transaction the system updated the balance data, and that data stayed locked until the end of the transaction. Combined with the mutual locking during inventory updates, which I discussed in my article "History of Inventory Locking", this had rather catastrophic consequences for the system's stability and performance. A couple of carefully planted orders/journals with the same GL accounts and inventory items could make the system unusable for a whole business day :). The only remedy was to post nearly all documents via the batch server, because the batch server naturally prevented any competition for resources by converting several potentially parallel processes into one sequential queue. This approach could work for accountancy-centric implementations, but any real-time process, like WMS, simply could not be fixed this way. That is why I never believed any success stories about WMS implementations in Axapta made BEFORE the release of Axapta 3.0.

The situation was relieved in Axapta 3.0. The two tables were replaced with the LedgerBalanceTrans/ledgerBalanceDimTrans tables. To prevent lock chains and deadlocks during balance updates, the developers of Axapta introduced two additional fields into the new tables: the transaction date and the funnily named field 'LedgerBalancesVariant'. The first thing we realized was that balance data is now stored per date, not per period. Initially the meaning of the Variant field was not clear to me. Soon I found out that the system actually keeps up to 20 (TWENTY) GL balance records for the same date and the same account/dimension combination. During an update, the system selects a number from 1 to 20 and updates/inserts the balance record with that Variant number. (Actually, it is not a completely random number: it is calculated as Variant = (SessionId mod 20), so updates of the same account/dimension combination from the same session go to the same balance record. If a true random() value were used, updates from the same session and the same DB transaction would be scattered across several records even if it was the only transaction for the combination that day.) Thus the chance of long locks was greatly decreased. Even with a very coarse-grained chart of accounts, the probability of two users updating the same record at the same time was now only about 5%. When I first studied this concept, I thought that beyond 300-400 concurrent users I might increase the number of these 'variants' to 30-40, but this idea proved to be wrong: I never saw any serious lock contention on ledger balance data after the release of version 3.0. Of course, anyone monitoring process locking in MS SQL would see occasional waiting locks on ledgerBalanceTrans/ledgerBalanceDimTrans from time to time, but in general, balance data ceased to be the main bottleneck of system performance.
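The Variant trick in miniature: each session hashes to one of 20 balance rows per account/dimension/date, so two concurrent sessions block each other only when their session ids collide modulo 20. This tiny sketch is only an illustration of the arithmetic, not AX code.

```python
VARIANTS = 20   # Axapta 3.0 kept up to 20 balance records per combination

def variant_for(session_id):
    return session_id % VARIANTS    # Variant = (SessionId mod 20)

def sessions_collide(a, b):
    # Two sessions touch the same balance record (and may block each other)
    # only when they map to the same variant slot: roughly a 1-in-20 chance.
    return variant_for(a) == variant_for(b)
```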

On the other hand, this solution actually traded query performance for update performance. After balance data had been scattered across 20 variants and stored per date, the size of the ledger balance tables became comparable to the size of the ledger transactions table itself: typically, LedgerBalanceDimTrans grew to around 50-70% of ledgerTrans. From then on, using ledgerBalanceDimTrans instead of ledgerTrans for reporting became an unnecessary complication, since query times for both were comparable, while the ledger balance data sometimes lacked details important for particular reports. Of course, in many cases ledgerBalanceDimTrans was still a little more convenient than ledgerTrans directly, since one record of the balance table holds both debit and credit turnover, while fetching the corresponding data from ledgerTrans with 'group by Crediting' gives the debit and credit turnovers in different records.

When I first saw the ledger balance tables in Dynamics AX 2009, I was very surprised to see that the Variant field had gone. First, I had not expected any changes in this area, since it had stopped being problematic back in 2003. Second, I saw no way for this mechanism to exist without the Variant field.

It turns out that the designers of the financial subsystem in Axapta realized that the actual results of the Variant algorithm were satisfactory, but not perfect. On one side, we lost almost all of the DB space and query time savings of the ledgerBalance* tables; on the other side, we could still get lock conflicts on ledger balance updates (though less frequently). So the designers decided to store ONE balance record per ONE ledger voucher. In the worst case (if we post mostly one-line documents), the number of ledgerBalance* records will equal the number of ledgerTrans records. In the realistic case (at least 50-70% of documents are multi-line), we get about the same number of records as with the Variant mechanism. Since the balance data has not been a query performance mechanism since 2003 anyway, the actual size of the ledgerBalances* tables does not matter much; it is enough that they do not outgrow ledgerTrans itself. On the other side, since EVERY ledger voucher now has its own balance record, there is no chance of lock conflicts at all.

One interesting point worth mentioning about the new mechanism is the way these balance updates are made now:

All balance updates made in one ledgerVoucher are kept in a RecordSortedList (thus in memory), somewhere in the LedgerBalancesList/LedgerBalancesPostingList classes.

As soon as posting of the ledgerVoucher has finished, or the list becomes too large to hold in memory, it is flushed to the quasi-temporary table LedgerBalancesTransDelta.

At the end of the ledgerVoucher, the system opens a separate database connection and uses it to execute an insert_recordset statement which inserts new balance records by summing up ledgerBalancesTransDelta. This update uses an optimistic locking mechanism (with transaction retry), but since it is a separate transaction with very little actual database update, it can be rolled back and retried without much overhead. (And this is the exact reason to use a separate transaction for this update: if the main database transaction were used, a lock conflict would cause great overhead. Imagine a 20000-line FA depreciation journal being rolled back and posted again because of a balance data lock conflict.)

Finally, the records belonging to the current session are deleted from ledgerBalancesTransDelta. (That is why I call this table quasi-temporary.)
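The steps above can be modeled in miniature with sqlite. Table and column names here are illustrative, not the real AX schema: per-voucher deltas are buffered in memory, flushed to a quasi-temporary delta table, merged into the balance table by an aggregating INSERT over a separate connection, and the session's delta rows are then deleted.

```python
import os
import sqlite3
import tempfile

db = os.path.join(tempfile.mkdtemp(), "ledger.db")
main = sqlite3.connect(db)
main.execute("CREATE TABLE balance_delta (session INT, account TEXT, amount REAL)")
main.execute("CREATE TABLE balance (account TEXT, voucher TEXT, amount REAL)")

session_id, buffer = 42, {}        # buffer stands in for the RecordSortedList

def post_line(account, amount):    # called for every voucher line
    buffer[account] = buffer.get(account, 0.0) + amount

post_line("140100", 250.0)
post_line("140100", -50.0)
post_line("601500", 120.0)

# Flush the in-memory buffer into the quasi-temporary delta table.
main.executemany("INSERT INTO balance_delta VALUES (?, ?, ?)",
                 [(session_id, acc, amt) for acc, amt in buffer.items()])
main.commit()

# Separate connection (standing in for the separate AX transaction):
# one aggregated balance record per account, then clean up our deltas.
side = sqlite3.connect(db)
side.execute("""INSERT INTO balance (account, voucher, amount)
                SELECT account, 'V-000042', SUM(amount)
                FROM balance_delta WHERE session = ?
                GROUP BY account""", (session_id,))
side.execute("DELETE FROM balance_delta WHERE session = ?", (session_id,))
side.commit()
```

Because the merge and cleanup run over their own connection, a conflict there can be retried cheaply without touching the (much larger) posting transaction.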

Although, as I said, I have never seen an actual Dynamics AX 3-4 implementation which really suffered from ledger balance bottlenecks, such problems might arise in a 3000-4000 user implementation. (I have personally never seen an implementation with more than 700-800 users.) Overall, the whole history of the changing approach to balance data in DAX looks like a good example of software design and engineering.