Pushing Sitecore to the limits and beyond

Tag Archives: Pipelines

A while ago, we at Pentia took over a massive Sitecore solution, which after 15 years of upgrades and development the maintenance cost consumed the entire digital budget of the customer.

In other words, the client was at a crossroad – to build new or renovate.

For this client the answer was relatively easy:

Firstly, the number of features and functionalities in the platform is vast, and just to scope and specify the entire platform was a massive, if not impossible, undertaking – and one which would claim a large number of resources internally and externally.

Secondly, while building a new platform (a massive task), the existing platform would have to be kept alive and slowly (painfully slowly) phased out over time. This means double resources for development, maintenance and operations.

Thirdly – and probably the most deterring factor – the change management involved in retraining the thousands of staff involved in and around the platform and across departments was substantial and disruptive to the entire organisation.

Therefore, a renovation project was established, and the first task was to reduce technical debt for the solution.

Reducing maintenance cost

One of the best ways to reduce technical debt is to reduce the code base, less code == less maintenance cost. In this case we managed to delete 28% of the code base, here are a few key figures for the solution when we took it over.

900+ sites (over ½ million items)

15 years old (multiple upgrades from Sitecore 4.x to 8.2 and single migration)

Custom Attributes

The point of this attribute is to clearly mark that a loosely referenced class, method or interface is indeed needed by the solution.

In other words, it indicates that a class, method or interface is used, even though it has no references. It’s possible to add a text to explain how and where it is used.

Obsolete

Whilst .net provides the Obsolete custom attribute; there are some missing options to indicate that the code is obsolete, and can be removed when a condition is met:

Specific date

Specific release is in production

3rd party system is updated to a specific version

The point of this attribute was therefore to allow us to plan the renovation project in stages and remove code when the referring parts were cleaned up.

Refactor

During this project we ran into many pieces of code, classes and structures which were in dire need of refactoring. But because of constraints in time, code not deployed, lack of knowledge, dependencies, multiple version of 3rd party system, or for some other reason it was not possible at that time.

Therefore, the best we could do was add this attribute and define why it should be refactored, and why it hasn’t been refactored.

The purpose of this attribute was therefore documentation and planning of the renovation process

Introducing a “Ensure Code Is Obsolete” Service

It is very difficult to ensure that code is obsolete and is never called and that is why it is so difficult to delete code.

What we needed was a somewhat conclusive measurement if the running code was being executed.

What we decided to do was to introduce code in the solution that collected data on code executed across all running solution instances and aggregated the data and presented the results, to ensure that the code was not required.

The IIncrementCountService interface was introduced to provide the ability to count how often the code is executed and then send the results to be aggregated with the other instances, by the content management server.

Implementation Challenges

The Content Management, Content Delivery, Utility & API instances are in different network zones without access to each other.

The implementation must have a minimum impact on performance, network traffic, database storage.

Not introduce any new databases and or tables.

As we do not have access to production environment apart from the Sitecore Client, it is not possible log the data the file system.

Sitecore Remote Events

Remote events (see this blog for a good introduction) provide the perfect mechanism to allow all instances to send their counter data to the Content Management service which is responsible for aggregating the data and presenting the results.

You must be careful with events as if you flood the event queue table it can kill the performance of ALL your sitecore instances.

The following configuration was introduced (see my blog post on Type Safe Settings) so the IncrementCount function will only raises an event when one of the following is true:

The count exceeds 1000

The threshold of 15 minutes is reached

A new day starts

This ensures that the event queue is not overloaded and will minimize performance impact, network & database usage.

The IncrementLocalCountService class is responsible for incrementing the count, caching it locally and raising the event to notify the Content Management server, when one of the afore mention threshold is met.

Who is responsible for aggregating the results?

The content Management is responsible for aggregating the results. It requires some extra configuration, to register that it will subscribe to handle remote events, raise the event and it then handle the remote event (see blog for more information).

Where is the Data Saved?

Ideally it should be saved in its own SQL database.

Unfortunately, we were not allowed to introduce and new databases and or tables, so we had to use the sitecore IDTable. The CounterRepository is responsible for retrieving, updating and persisting the counters in the IDTable.

Presenting the results

No magic here a simple counter.aspx pages, which reads from the CounterRepository and displays it in a table, with the option to clear the database. Also some code to ensure that only Sitecore administrators can access the page. See Part 2 in the series.

It proved difficult to determine what the Sitecore configuration was in each environment, especially for the content delivery servers, as it was not possible to call showconfig.aspx.

Solution

Each time the application starts, it writes out the contents of the merged Sitecore configuration to a file in the logs folder. The file name contains the instance name, date and time created. So in addition to seeing the current configuration, you can also see how it changes over time (very useful after a deploy where nothing works).

Get the merged Sitecore configuration

It turned out to be very simple to implement, as it only takes one line of code to get the merged Sitecore configuration:

Create a custom pipeline processor class

Create a processor for the initialize pipeline, so each time Sitecore is started the processor will be called to ensure that the configuration is saved. Create a public class with a public member called Process, which accepts a parameter of type PipelineArgs. The code below is all that is needed.

Configuration Changes

The processor has to be added to the initialize pipeline, I would recommend you create an include file to achieve this, but for the sake of clarity I have added it directly to the web.config, see below.

Now every-time Sitecore is started it writes out the configuration, so it is easy to get the configuration and monitor how it changes for all environments over time.

I hope this helps you untangle the Sitecore includes which at times can be a nightmare.

The customer wanted the ability to suspend scheduled publishing, but could still make manual publishes (i.e. started from the Sitecore client).

Each time a publish is started it runs the publish pipeline. Therefore it is possible to insert a custom pipeline step at the beginning (see below) to do the following:

Identify if it was a scheduled publish

Check if a check-box in Sitecore is ticked

If both conditions are met – abort the publish pipeline to stop the publish

Unfortunately aborting the publish pipeline is not enough 😦

In the initial code I would abort the pipeline using AbortPipeline() (see below) as I assumed this was enough to stop the publish. The pipeline was aborted and no items were published, but the code that starts the pipeline still updated the properties table indicating that the publish had succeeded:-(

Side affect

This had the side effect that when the schedule publishing was enabled again, any items that were modified or created whist the publishing was disabled would not be published as when scheduled publishing was resumed Sitecore believed that they had already been published.

Solution

After checking the code using reflector I determined if I threw an exception, it would ensure that the properties table was not updated. So the publish was completely cancelled, and when scheduled incremental publishing was resumed it will publish all the items that have been modified since the last successful publish, and not since the last aborted publish.

How to identify a scheduled publish

Not the nicest solution but it works! I check the publish context user which can have the following values:

The user logged into sitecore – If publish is started from the Sitecore client

sitecore\Anonymous – if the publish is started by the scheduler

If the value is sitecore\Anonymous I know that it is a scheduled publish.

A major concept change from sheer UI to SPEAK; is that continuation/control flow has moved from the server to the client. Now the client is responsible for control flow/state validation and the server is ideally responsible for answering simple/stateless requests.

Effectively a lot of server side C# code is moving to JavaScript, and to help with the complexity and to mirror the functionality/concepts available server side, Sitecore has introduced as part of the SPEAK framework client-side pipelines.

Client Pipelines

This is a very brief introduction and I will go into more detail in follow up post.

A pipeline is a set of one or more steps that can be executed in a specific order; each step shares the same context.

Each step can change the context and pass it to the next step until all steps have been completed, and or the pipeline is aborted.

Each step can contain both client and server side logic or just client side logic.

The diagram below illustrates a simple pipeline that is executed when a button is clicked.

Example

So what can we use pipeline for? well one example could be when you delete an item, the idea is to keep each step as short and sweet as possible:

Step 1 – Can Delete – check if the current user has the required permissions to delete the item.

Step 4 – Check Clone Links – as you have to warn the user that any cloned items will become real items.

Step 5 – Execute – delete the item

Step 6 – Has links – show the broken link dialog to fix any broken links

Step 7 – Un-clone items.

Limitations

Unfortunately the current implementation does not provide the ability to define the order the steps are executed in, using a configuration file or the web.config. The order is defined by a property called priority that is defined within the JavaScript itself, see the code below.

Each step needs to know about other steps in order to be called in the appropriate sequence.

Difficult to see a clear picture for a Pipeline (in which order steps are executed).

If you want to insert/remove an extra step it an existing Sitecore speak pipeline you have to modify Sitecore’s source code.

It increases the complexity of upgrade.

Hope

I hope that future version of SPEAK will support declarative declaration of pipeline steps, similar to what is available server side (i.e. pipelines defined in the web.config and or include configuration files).

My next post will go into more detail about how to create and initiate pipelines, hope you enjoyed my first ever post, Alan