Dissecting the local functions in C# 7

Local functions are a new feature in C# 7 that allows defining a function inside another function.

When to use a local function?

The main idea of local functions is very similar to anonymous methods: in some cases creating a named function is too expensive in terms of cognitive load on a reader. Sometimes the functionality is inherently local to another function and it makes no sense to pollute the "outer" scope with a separate named entity.

You may think that this feature is redundant because the same behavior can be achieved with anonymous delegates or lambda expressions. But this is not always the case. Anonymous functions have certain restrictions and their performance characteristics can be unsuitable for your scenarios.

Use Case 1: eager preconditions in iterator blocks

Here is a simple function that reads a file line by line. Do you know when the ArgumentNullException will be thrown?
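The original listing is not reproduced here, so the following is only a minimal sketch of what such a method could look like. ReadLineByLine and ProcessQuery are the names used in the text; the LazyValidation class, the BuildQuery helper and the exact LINQ operators are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LazyValidation
{
    // Iterator block: nothing in this body runs until the first MoveNext call.
    public static IEnumerable<string> ReadLineByLine(string fileName)
    {
        if (fileName == null) throw new ArgumentNullException(nameof(fileName));

        foreach (var line in File.ReadAllLines(fileName))
        {
            yield return line;
        }
    }

    // Hypothetical helper: builds a lazy query on top of the iterator.
    public static IEnumerable<string> BuildQuery(string fileName)
    {
        return ReadLineByLine(fileName)
            .Select(line => line.Trim())
            .Where(line => line.Length > 0);
    }

    // The query is enumerated here, far away from the invalid argument.
    public static int ProcessQuery(IEnumerable<string> query)
    {
        return query.Count();
    }
}
```

Calling BuildQuery(null) returns normally; the ArgumentNullException only appears once ProcessQuery starts enumerating.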

Methods with yield return in their body are special. They are called iterator blocks and they are lazy. This means that such methods execute on demand: the first block of code in them runs only when the client of the method calls MoveNext on the resulting iterator. In our case, it means that the error will surface only in the ProcessQuery method, because all the LINQ operators are lazy as well.

Obviously, this behavior is not desirable because the ProcessQuery method does not have enough information about the context of the ArgumentNullException. It would be better to throw the exception eagerly: when a client calls ReadLineByLine, not when a client processes the result.

To solve this issue we need to extract the validation logic into a separate method. This is a good candidate for an anonymous function, but anonymous delegates and lambda expressions do not support iterator blocks (*).
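A local function is one way to split the method into an eager part and a lazy part. The sketch below is an assumption about how that could be written; the EagerValidation class and the ReadLineByLineImpl name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class EagerValidation
{
    // Not an iterator block itself, so the check runs as soon as the method is called.
    public static IEnumerable<string> ReadLineByLine(string fileName)
    {
        if (fileName == null) throw new ArgumentNullException(nameof(fileName));

        return ReadLineByLineImpl();

        // The local function is the iterator block; it captures fileName.
        IEnumerable<string> ReadLineByLineImpl()
        {
            foreach (var line in File.ReadAllLines(fileName))
            {
                yield return line;
            }
        }
    }
}
```

With this shape the exception is thrown at the call site of ReadLineByLine, before any query is built on top of the result.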

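Use Case 2: eager preconditions in async methods

Async methods have a similar issue with exception handling: an exception thrown in a method marked with the async keyword (**) manifests itself as a faulted task:

    string fileName = null;
    // No exceptions
    var task = GetAllTextAsync(fileName);
    // The following line will throw
    var lines = await task;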

(**) Technically, async is a contextual keyword, but this doesn't change my point.

You may think that it makes little difference when the error happens. But this is far from the truth. A faulted task means that the method itself failed to do what it was supposed to do: the problem is in the method or in one of the building blocks that the method relies on. A violated precondition is a different kind of failure, because it points at the caller.

Eager precondition validation is especially important when the resulting task is passed around the system. In this case, it would be extremely hard to understand when and what went wrong. A local function can solve this issue.
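A possible shape for such a method is sketched below. GetAllTextAsync is the name used in the text; the AsyncEagerValidation class, the GetAllTextImplAsync name and the use of StreamReader are assumptions, not the original implementation.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AsyncEagerValidation
{
    // Not marked async, so the argument check throws synchronously.
    public static Task<string> GetAllTextAsync(string fileName)
    {
        if (fileName == null) throw new ArgumentNullException(nameof(fileName));

        return GetAllTextImplAsync();

        // Failures in here end up in the returned task.
        async Task<string> GetAllTextImplAsync()
        {
            using (var reader = new StreamReader(fileName))
            {
                return await reader.ReadToEndAsync();
            }
        }
    }
}
```

An invalid argument now fails at the call to GetAllTextAsync, while I/O errors still surface through the task.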

Use Case 3: local function with iterator blocks

I find it very annoying that you can't use iterators inside a lambda expression. Here is a simple example: if you want to get all the fields in a type hierarchy (including the private ones) you have to traverse the inheritance hierarchy manually. But the traversal logic is method-specific and should be kept as local as possible.
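One way this could look is sketched below. The FieldExplorer class, the GetAllFields and EnumerateFields names and the two sample types are hypothetical; only the idea of a local iterator walking the base types comes from the text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical sample types used only to exercise the traversal.
public class SampleBase
{
    private int baseField = 1;
    public int BaseValue => baseField;
}

public class SampleDerived : SampleBase
{
    private int derivedField = 2;
    public int DerivedValue => derivedField;
}

public static class FieldExplorer
{
    public static FieldInfo[] GetAllFields(Type type)
    {
        if (type == null) throw new ArgumentNullException(nameof(type));

        return EnumerateFields().ToArray();

        // Local iterator: walks from the given type up to System.Object.
        IEnumerable<FieldInfo> EnumerateFields()
        {
            const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public |
                                       BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

            for (var current = type; current != null; current = current.BaseType)
            {
                foreach (var field in current.GetFields(flags))
                {
                    yield return field;
                }
            }
        }
    }
}
```

The manual walk is needed because reflecting over the derived type alone does not return private fields declared in its base classes.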

Use Case 4: recursive anonymous method

An anonymous function can't reference itself by default. To work around this restriction you have to declare a local variable of a delegate type and then capture that variable inside the lambda expression or anonymous delegate.
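The workaround, and the local-function alternative, could be sketched as follows. The Recursion class and the factorial example are assumptions chosen for illustration.

```csharp
using System;

public static class Recursion
{
    // Lambda version: the delegate variable must exist before the lambda can capture it.
    public static int FactorialWithLambda(int n)
    {
        Func<int, int> fact = null;
        fact = x => x <= 1 ? 1 : x * fact(x - 1);
        return fact(n);
    }

    // Local function version: the function can simply call itself by name.
    public static int FactorialWithLocalFunction(int n)
    {
        return Fact(n);

        int Fact(int x) => x <= 1 ? 1 : x * Fact(x - 1);
    }
}
```

The local function needs neither the extra delegate variable nor the two-step initialization.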

A closure allocation and a delegate allocation will occur if a local function captures locals/arguments and is converted to a delegate:

    public void Baz(int arg)
    {
        // Local function captures an enclosing variable.
        // The compiler will instantiate a closure and a delegate
        Action a = EmptyFunction;
        return;

        void EmptyFunction() { Console.WriteLine(arg); }
    }

A local function captures a local variable/argument and an anonymous function captures a variable/argument from the same scope.

This case is way more subtle.

The C# compiler generates a different closure type per lexical scope (method arguments and top-level locals reside in the same top-level scope). In the following case the compiler will generate two closure types:
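The original listing is missing here, so this is only a sketch of a method with two lexical scopes. The ScopeDemo class and the variable names are assumptions; the statement that two closure types result is the one made in the text.

```csharp
using System;

public static class ScopeDemo
{
    public static int DifferentScopes(int arg)
    {
        int total = 0;

        {
            // Nested scope: both lambdas capture 'local' from this scope.
            int local = 42;
            Func<int> a = () => local;
            Func<int> b = () => local;
            total += a() + b();
        }

        // Top-level scope: this lambda captures the argument.
        Func<int> c = () => arg;
        return total + c();
    }
}
```

Here the lambdas a and b capture from the nested scope, while c captures from the top-level scope that also holds the method arguments.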

This means that the lifetime of the closure instance is bound to the lifetime of the func field: the closure stays alive as long as the delegate func is reachable from the application. This can drastically prolong the lifetime of the VeryExpensiveObject, which is, basically, a memory leak.
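For context, the situation described in this paragraph could be sketched as follows. The func field and VeryExpensiveObject are named in the text; the LeakDemo class, the body of VeryExpensiveObject and the InvokeStored helper are hypothetical.

```csharp
using System;

// Hypothetical stand-in for an object that is costly to keep alive.
public sealed class VeryExpensiveObject
{
    public readonly byte[] Payload = new byte[1024];
}

public sealed class LeakDemo
{
    private Func<int> func;

    public void ImplicitCapture(int arg)
    {
        var o = new VeryExpensiveObject();

        // Captures 'o' from the top-level scope.
        Func<int> a = () => o.GetHashCode();
        Console.WriteLine(a());

        // Captures 'arg' from the same scope, so it shares the closure with 'a'.
        Func<int> b = () => arg;

        // Storing 'b' keeps the shared closure, and therefore 'o', reachable.
        func = b;
    }

    // Hypothetical helper that invokes the stored delegate.
    public int InvokeStored() => func();
}
```

Even though b never touches o, both variables live in the same closure instance, which the stored delegate keeps reachable.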

A similar issue happens when a local function and a lambda expression capture variables from the same scope. Even if they capture different variables, the closure type is shared, causing a heap allocation.

All the locals from the top-level scope now become part of the closure class, causing a closure allocation even when the local function and the lambda expression capture different variables.
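A method of the kind described here might look like the sketch below. The SharedClosureDemo class and the variable names are assumptions; the allocation behavior is the one stated in the text and is not measured by this example.

```csharp
using System;

public static class SharedClosureDemo
{
    public static int ImplicitAllocation(int arg)
    {
        if (arg == int.MaxValue)
        {
            // Rarely taken branch: the lambda captures 'arg' from the top-level scope.
            Func<int> a = () => arg;
            return a();
        }

        int local = 42;
        return Local();

        // The local function captures 'local' from the same top-level scope.
        int Local() => local;
    }
}
```

Because arg and local live in the same scope, the lambda and the local function end up sharing one closure, even on the path where the lambda is never created.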

Local functions 101

Here is a list of the most important aspects of local functions in C#:

Local functions can define iterators.

Local functions are useful for eager validation in async methods and iterator blocks.

Local functions can be recursive.

Local functions are allocation-free if they are not converted to delegates.

Local functions are slightly more efficient than anonymous functions due to a lack of delegate invocation overhead (****).

Local functions can be declared after a return statement, separating the main logic from the helpers.

Local functions can "hide" a function with the same name declared in the outer scope.

Local functions can be async and/or unsafe; no other modifiers are allowed in C# 7.

Local functions can't have attributes in C# 7.

Local functions are not very IDE friendly: there is no "extract local function" refactoring (yet), and if code with a local function is partially broken you'll get a lot of "squiggles" in the IDE.
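Two of the points from this list, declaring a helper after the return statement and hiding an outer function with the same name, could be illustrated with the following sketch. The ListDemo class and its method names are hypothetical.

```csharp
using System;

public static class ListDemo
{
    // Outer method with the same name as the local function below.
    private static string Describe() => "outer method";

    public static string UseLocal()
    {
        // Inside this method the name binds to the local function.
        return Describe();

        // Declared after the return statement, away from the main logic.
        string Describe() => "local function";
    }

    // Outside of UseLocal the name still refers to the outer method.
    public static string UseOuter() => Describe();
}
```

Keeping helpers below the return statement lets the main logic read top to bottom without interruption.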

(****) To get these numbers you have to manually “decompile” a local function to a regular function. The reason for that is simple: a function as simple as “fn” is inlined by the runtime, and the benchmark won’t show you the real invocation cost. To get these numbers I used a static function marked with the NoInlining attribute (unfortunately, you can’t use attributes with local functions).

Thank you Sergey for the great article.
I have two questions:
1. What is the root cause of the behavior when the same closure type is used across multiple delegates/functions? Isn’t it the CLR team’s failure?
2. Are there any reasons to use closures, or situations when they are a vital necessity? Most of the time I prefer to pass a possible closure candidate as a function argument.

1. I don’t think this is a mistake by the CLR team (actually, by the C# team, because the closure is created by the compiler, not by the runtime). Most likely it is a tradeoff between potential issues like the one I’ve mentioned with VeryExpensiveObject and the general cost of creating closure types and instances.
2. I didn’t get this. Could you please explain what you mean by “Most of the time I prefer to pass a possible closure candidate as a function argument.”?

> I mean that it’s difficult for me to find out an example when we need closures.
The question should be the other way around: you don’t need closures, you need the expected behavior of a local or anonymous function that allows it to use variables/arguments/instance/static fields naturally. Then the compiler will decide how to achieve this and what is needed for that. If your local/anonymous function doesn’t use anything from the enclosing context, then the generated code is more optimal. If it DOES use something, then the compiler has to glue together the context and the generated function in one entity – a closure.

I found another situation where a heap allocation happens. (It doesn’t seem like you covered it in this article, but maybe it’s a special case of one of the situations you mentioned.) I blogged about it in detail at http://faithlife.codes/blog/2017/08/local-functions-and-allocations/ but in summary, a local iterator method or local async function that captures a local variable will allocate the compiler-generated class for its backing state machine, even if the outer function never invokes the local function.

Your ReadLineByLine sample method exhibits this behaviour, which you can see by examining the IL in SharpLab: http://bit.ly/2xYxOWr

Some comments:
1. Cases 1 and 3 are basically the same (iterator blocks)
2. Case 2 can be achieved using async delegates as well
3. Case 4 is pretty minor
4. The first DifferentScopes_b should be DifferentScopes_a and the second “Body of the lambda ‘a'” should be “Body of the lambda ‘b'”
5. Console.WriteLine(func()) should be Console.WriteLine(a())
6. “stays alive until” should be “stays alive as long”
7. ImplicitAllocation_b__0() should be ImplicitAllocation_a__0()

Executing LocalFunctionInvocation once took 0.0142 ns. At 3 GHz, executing a single instruction is going to take 0.3333 ns. That tells me that the loop that was executing the benchmark has been eliminated, otherwise you couldn’t get less time per invocation than it takes to execute a single instruction. So you’re probably measuring some benchmarking overhead, not how long it actually takes to execute the invocation.

Svick, you’re absolutely right. The numbers were that low because the runtime inlined the local function. I’ve updated the post with new data. The local functions are still faster than delegates, but the difference is more reasonable now.