This post follows on somewhat from my recent posts on running async startup tasks in ASP.NET Core. Rather than discuss a general approach to running startup tasks, this post discusses an example of a startup task that was suggested by Ruben Bartelink. It describes an interesting way to try to reduce the latencies seen by apps when they've just started, by pre-building all the singletons registered with the DI container.

You can't just have framework code in your applications. Inevitably, developers have to put some actual functionality in their apps, and if performance isn't a primary focus, things can start to slow down. As the app gets bigger, more and more services are registered with the DI container, you pull in data from multiple locations, and you add extra features where they're needed.

The first request after an app starts up is particularly susceptible to slowing down. There's lots of work that has to be done before a response can be sent. However, this work often only has to be done once; subsequent requests have much less work to do, so they complete faster.

I decided to do a quick test of a very simple app, to see the difference between that first request and subsequent requests. I created the default ASP.NET Core web template with individual authentication using the .NET Core 2.2 SDK:

dotnet new webapp --auth Individual --name test

For simplicity, I tweaked the logging in appsettings.json to write request durations to the console in the Production environment:
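The exact settings aren't reproduced here, but a minimal sketch of that kind of configuration might look like the following. It assumes the ASP.NET Core 2.2 logging category Microsoft.AspNetCore.Hosting.Internal.WebHost, which writes the "Request finished in ..." messages at Information level. The Default level of Warning is an assumption, chosen only to keep the rest of the console output quiet.

```json
{
  "Logging": {
    "LogLevel": {
      "Default": "Warning",
      "Microsoft.AspNetCore.Hosting.Internal.WebHost": "Information"
    }
  }
}
```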

Next, I hit the home page of the app (https://localhost:5001) and recorded the duration for the first request logged to the console. I hit Ctrl+C to close the app, started it again, and recorded another duration for the "first request".

Obviously this isn't very scientific; it's not a proper benchmark, but I just wanted a feel for it. For those interested, I'm using a Dell XPS 15" 9560, which has an i7-7700 and 32GB RAM.

I ran the "first request" test 20 times, and got the mean results shown below. I also recorded the times for the second and third requests.

| Request | Mean duration ± Standard Deviation |
|---|---|
| 1st request | 315ms ± 12ms |
| 2nd request | 4.3ms ± 0.6ms |
| 3rd request | 1.4ms ± 0.3ms |

After the 3rd request, all subsequent requests took a similar amount of time.

As you can see, there's a big difference between the first request and the second request. I didn't dive too much into where all this comes from, but some quick tests show that the vast majority of the initial hit is due to rendering Razor. As a quick test, I added a simple API controller to the app:
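A controller along these lines is enough for that kind of test. This is only a sketch: the class name, the /warmup route, and the "OK" response are hypothetical choices, and it relies on the template's existing AddMvc() and UseMvc() calls to pick up attribute-routed controllers.

```csharp
using Microsoft.AspNetCore.Mvc;

// Hypothetical minimal API controller: no views, no model binding,
// just a constant string, so Razor is never involved in the response.
public class WarmupController : ControllerBase
{
    // Attribute routing, so no conventional route needs to be configured.
    [HttpGet("/warmup")]
    public string Index() => "OK";
}
```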

Hitting this controller for the first request instead of the default Razor Index page drops the first request time to ~90ms. Removing the MVC middleware entirely (and responding with a 404) drops it to ~45ms.

Pre-creating singleton services before the first request

So where is all this latency coming from for the first request? And is there a way we can reduce it so the first user to hit the site after a deploy isn't penalised as heavily?

To be honest, I didn't dive in too far. For my experiments, I wanted to test one potential mitigation proposed by Ruben Bartelink: instantiating all the singletons registered with the DI container before the first request.

Services registered as singletons are only created once in the lifetime of the app. If they're used by the ASP.NET Core framework to handle a request, then they'll need to be created during the first request. If we create all the possible singletons before the first request then that should reduce the duration of the first request.

The WarmupServicesStartupTask class implements IStartupTask (from part 2 of my series) which requires that you implement ExecuteAsync(). This fetches all of the singleton registrations out of the injected IServiceCollection, and tries to instantiate them with the IServiceProvider. Note that I call GetServices() (plural) rather than GetService() as each service could have more than one implementation. Once all services have been created, the task is complete.

The IServiceCollection is where you register your implementations and factory functions inside Startup.ConfigureServices. The IServiceProvider is created from the service descriptors in IServiceCollection, and is responsible for actually instantiating services when they're required.

The GetSingletons() method is what identifies the services we're going to instantiate. It loops through all the ServiceDescriptors in the collection, and filters to only singletons. We also exclude the WarmupServicesStartupTask itself to avoid any potential weird recursion. Next we filter out any services that are open generics (like ILogger<T>) - trying to instantiate those would be complicated by having to take into account type constraints, so I chose to just ignore them. Finally, we select the type of the service, and get rid of any duplicates.
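Putting that description together, the task might look something like the sketch below. The IStartupTask declaration is my assumption of the interface's shape from part 2 of the series, included only so the example stands alone, and the field names are arbitrary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;

// Assumed shape of the IStartupTask interface from part 2 of the series.
public interface IStartupTask
{
    Task ExecuteAsync(CancellationToken cancellationToken = default);
}

public class WarmupServicesStartupTask : IStartupTask
{
    private readonly IServiceCollection _services;
    private readonly IServiceProvider _provider;

    public WarmupServicesStartupTask(IServiceCollection services, IServiceProvider provider)
    {
        _services = services;
        _provider = provider;
    }

    public Task ExecuteAsync(CancellationToken cancellationToken = default)
    {
        foreach (Type serviceType in GetSingletons(_services))
        {
            // GetServices (plural) resolves every implementation registered for the type.
            _provider.GetServices(serviceType);
        }

        return Task.CompletedTask;
    }

    private static IEnumerable<Type> GetSingletons(IServiceCollection services)
    {
        return services
            // Only singletons are created once and then reused.
            .Where(d => d.Lifetime == ServiceLifetime.Singleton)
            // Skip the warmup task itself to avoid any recursion.
            .Where(d => d.ImplementationType != typeof(WarmupServicesStartupTask))
            // Skip open generics such as ILogger<T>.
            .Where(d => !d.ServiceType.ContainsGenericParameters)
            .Select(d => d.ServiceType)
            .Distinct();
    }
}
```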

By default, the IServiceCollection itself isn't added to the DI container, so we have to add that registration at the same time as registering our WarmupServicesStartupTask:
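As a rough sketch, both registrations can be wrapped in one extension method. The AddWarmupStartupTask name is hypothetical, registering the task as transient is an assumption, and the IStartupTask declaration is again my guess at the interface from part 2, repeated so the example compiles on its own.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;

// Assumed shape of the IStartupTask interface from part 2 of the series.
public interface IStartupTask
{
    Task ExecuteAsync(CancellationToken cancellationToken = default);
}

public static class WarmupRegistrationExtensions
{
    // Hypothetical helper: registers a startup task and exposes the
    // IServiceCollection itself so the task can take it as a dependency.
    public static IServiceCollection AddWarmupStartupTask<TTask>(this IServiceCollection services)
        where TTask : class, IStartupTask
    {
        services.AddTransient<IStartupTask, TTask>();

        // TryAdd avoids a duplicate registration if the collection was already added.
        services.TryAddSingleton<IServiceCollection>(services);

        return services;
    }
}
```

Inside Startup.ConfigureServices this would be called as services.AddWarmupStartupTask<WarmupServicesStartupTask>().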

And that's all there is to it. I repeated the test again with the WarmupServicesStartupTask, and compared the results to the previous attempt:

| Request | Mean duration ± Standard Deviation |
|---|---|
| 1st request, no warmup | 315ms ± 12ms |
| 1st request, with warmup | 289ms ± 11ms |

I know, right! Almost knocked you off your chair. We shaved 26ms off the first request time.

I have to admit, I was a bit underwhelmed. I didn't expect an enormous difference, but still, it was a tad disappointing. On the positive side, it is roughly an 8% reduction in the first request duration and required very little effort, so it's not all bad.

Just to make myself feel better about it, I did an unpaired t-test between the two apps and found that there was a statistically significant difference between the two samples.

| Statistic | Value |
|---|---|
| t | 7.1287 |
| degrees of freedom | 38 |
| standard error of difference | 3.589 |
| p | <0.0001 |
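For reference, these figures are consistent with a standard pooled-variance unpaired t-test. The working below assumes 20 runs per app (which matches the 38 degrees of freedom) and plugs in the rounded means and standard deviations from the results above, so it only approximately reproduces the reported values, which presumably come from the unrounded measurements.

```latex
s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}
    = \sqrt{\frac{19 \cdot 12^2 + 19 \cdot 11^2}{38}} \approx 11.5\,\text{ms}

SE = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
   = 11.5 \sqrt{\frac{2}{20}} \approx 3.64\,\text{ms}

t = \frac{\bar{x}_1 - \bar{x}_2}{SE} = \frac{315 - 289}{3.64} \approx 7.1,
\qquad df = n_1 + n_2 - 2 = 38
```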

Still, I wondered if we could do better.

Creating all services before the first request

Creating singleton services makes a lot of sense as a way to reduce first request latency. Assuming the services will be required at some point in the lifetime of the app, we may as well take the hit instantiating them before the app starts, instead of in the context of a request. This only gave a marginal improvement for the default template, but larger apps may well see a much bigger improvement.

Instead of just creating the singletons, I wondered if we could just create all of the services our app uses in the startup task; not only the singletons, but the scoped and transient services.

On the face of it, it seems like this shouldn't give any real improvement. Scoped services are created new for each request, and are thrown away at the end (when the scope ends). And transient services are created new every time. But there's always the possibility that creating a scoped service could require additional bootstrapping code that isn't required by singleton services, so I gave it a try.

GetSingletons() is renamed to GetServices(), and no longer filters the services to singletons only.

ExecuteAsync() creates a new IServiceScope before requesting the services, so that the scoped services are properly disposed at the end of the task.
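With those two changes applied, the task might look roughly like this. As before, this is a sketch rather than a definitive implementation, and the IStartupTask declaration is my assumption of the interface's shape from part 2 of the series.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;

// Assumed shape of the IStartupTask interface from part 2 of the series.
public interface IStartupTask
{
    Task ExecuteAsync(CancellationToken cancellationToken = default);
}

public class WarmupServicesStartupTask : IStartupTask
{
    private readonly IServiceCollection _services;
    private readonly IServiceProvider _provider;

    public WarmupServicesStartupTask(IServiceCollection services, IServiceProvider provider)
    {
        _services = services;
        _provider = provider;
    }

    public Task ExecuteAsync(CancellationToken cancellationToken = default)
    {
        // Resolve inside a scope so scoped (and disposable transient) services
        // are disposed when the task finishes.
        using (IServiceScope scope = _provider.CreateScope())
        {
            foreach (Type serviceType in GetServices(_services))
            {
                scope.ServiceProvider.GetServices(serviceType);
            }
        }

        return Task.CompletedTask;
    }

    private static IEnumerable<Type> GetServices(IServiceCollection services)
    {
        return services
            // No lifetime filter: singleton, scoped and transient are all included.
            .Where(d => d.ImplementationType != typeof(WarmupServicesStartupTask))
            .Where(d => !d.ServiceType.ContainsGenericParameters)
            .Select(d => d.ServiceType)
            .Distinct();
    }
}
```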

I ran the test again, and got some slightly surprising results. The table below shows the first request time without using the startup task (top), when using the startup task to only create singletons (middle), and using the startup task to create all the services (bottom).

| Request | Mean duration ± Standard Deviation |
|---|---|
| 1st request, no warmup | 315ms ± 12ms |
| 1st request, singleton warmup | 289ms ± 11ms |
| 1st request, all services warmup | 198ms ± 8ms |

That's a mean reduction in first request duration of 117ms, or 37%. No need for the t-test to prove significance here! I can only assume that instantiating some of the scoped/transient services triggers some lazy initialization which then doesn't have to be performed when a real request is received. JIT compilation time may be coming into play too.

Even with the startup task, there's still a big difference between the first request duration, and the second and third requests which are only 4ms and 1ms respectively. It seems very likely there's more that could be done here to trigger all the necessary MVC components to initialize themselves, but I couldn't see an obvious way, short of sending a real request to the app.

It's worth remembering that the startup task approach shown here shouldn't only improve the duration of the very first request. As different parts of your app are hit for the first time, most initialisation should already have happened, hopefully smoothing out the spikes in request duration for your app. But your mileage may vary!

Summary

In this post I showed how to create a startup task that loads all the singletons registered with the DI container on app startup, before the first request is received. I showed that loading all services in particular, not just singletons, gave a large reduction in the duration of the first request. Whether this task will be useful in practice will likely depend on your application, but it's simple to create and add, so it might be worth trying out! Thanks again to Ruben Bartelink for suggesting it.