Sunday, 14 July 2013

Variable Lifting in Lambda Expressions

In my previous blog post about the event-driven programming pattern, there was a code block that I want to make sure is well explained.

private Task<string> CallLongRunningServiceWithCallback()
{
  var tcs = new TaskCompletionSource<string>();
  var client = new ServiceReference1.Service1Client();

  client.LongFunctionCallback += (o, args) =>
    {
      if (args.Error != null)
      {
        tcs.TrySetException(args.Error);
        return;
      }
      if (args.Cancelled)
      {
        tcs.TrySetCanceled();
        return;
      }
      tcs.TrySetResult(args.Result);
    };
  client.RunLongFunctionAsync();
  // Do some other work…
  return tcs.Task;
}

How come a variable declared outside the callback function can be accessed inside it? Isn’t this against the variable scoping rules? Why doesn’t the compiler report an error? Many programmers may have these questions (at least a few of my teammates asked them), and that is why I wanted to cover the topic in a separate blog post.

If you have not been keeping a close eye on what each C# and .NET release introduced, you may have missed this feature. Lambda expressions arrived in C# 3.0, and a lambda expression can refer to the local variables of the method in which it is declared. (Anonymous methods in C# 2.0 could already capture local variables in the same way.) In the above example, tcs is a local variable of the CallLongRunningServiceWithCallback method, but it is also available inside the lambda attached to LongFunctionCallback. This behaviour is called variable lifting, and it has its roots in lambda calculus.

Lambda (λ) has been used in mathematics for a long time. The formal systems that are nowadays called λ-calculus and combinatory logic were both invented in the 1920s, and their aim was to describe the most basic properties of function-abstraction, application and substitution in a very general setting. In 1985 Thomas Johnsson introduced lambda lifting. After that it was added to a few programming languages, and Microsoft brought lambda expressions to C# in version 3.0, which shipped with .NET Framework 3.5 in late 2007.
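To make the capture behaviour concrete, here is a minimal sketch that is independent of the service example. The names CaptureDemo and MakeCounter are made up for illustration. It shows a lambda that keeps reading and writing a local variable even after the method that declared it has returned.

```csharp
using System;

public static class CaptureDemo
{
    // Hypothetical helper: returns a delegate that keeps using the local
    // variable 'count' even after MakeCounter has returned.
    public static Func<int> MakeCounter()
    {
        int count = 0;          // local to MakeCounter, captured by the lambda
        return () => ++count;   // the lambda reads and writes the captured local
    }

    public static void Main()
    {
        Func<int> next = MakeCounter();
        Console.WriteLine(next());   // prints 1
        Console.WriteLine(next());   // prints 2
    }
}
```

Each call to MakeCounter gets its own count, so two counters created this way do not interfere with each other.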

How does variable lifting work behind the scenes? A lambda may use some local variables of its own and some variables that belong to the enclosing method. The variables declared inside the lambda (including its parameters) are called bound variables, and the enclosing method’s variables that the lambda uses are called free variables. Free variables are “lifted” into the lambda. The compiler does a lot of legwork for you to capture the state of these variables and preserve them beyond their normal lifetimes. More formally, when the compiler encounters a lambda expression that has free variables, it lifts those variables into a class called a closure. An instance of the closure can outlive the method call in which the free variables were declared. The compiler rewrites every access to such a variable in the method so that it goes through the closure instance. It will create a closure class similar to the following. Please note that the actual name of the class is auto-generated, so it will be different.

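// Approximation only: the real class name is compiler-generated and is
// deliberately not a valid C# identifier.
public class Closure_1
{
  // The lifted local variable now lives in a field of the closure.
  public TaskCompletionSource<string> tcs;

  // The body of the lambda becomes an instance method of the closure.
  public void LongFunctionCallback(object o, CallbackArgs args)
  {
    if (args.Error != null)
    {
      tcs.TrySetException(args.Error);
      return;
    }
    if (args.Cancelled)
    {
      tcs.TrySetCanceled();
      return;
    }
    tcs.TrySetResult(args.Result);
  }
}

// The original method is rewritten roughly like this.
private Task<string> CallLongRunningServiceWithCallback()
{
  var closure = new Closure_1();
  closure.tcs = new TaskCompletionSource<string>();
  var client = new ServiceReference1.Service1Client();

  client.LongFunctionCallback += closure.LongFunctionCallback;
  client.RunLongFunctionAsync();
  // Do some other work…
  return closure.tcs.Task;
}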

As you can see, the tcs variable is lifted into the closure class as a field. The compiler also rewrites the method in which the lambda is declared: it instantiates the closure class, uses the closure’s tcs field in place of the local variable, and subscribes the closure’s LongFunctionCallback method to the event instead of the lambda.
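The same transformation can be tried out on a small self-contained example. The following is only a sketch: ClosureByHand, GreetingClosure and both factory methods are hypothetical names, and the hand-written class is an approximation of what the compiler produces, not its actual output.

```csharp
using System;

public static class ClosureByHand
{
    // Hand-written stand-in for the compiler-generated closure class.
    private sealed class GreetingClosure
    {
        public string prefix;               // the lifted local variable

        public string Greet(string name)    // the body of the lambda
        {
            return prefix + name;
        }
    }

    // What the programmer writes: the lambda captures the local 'prefix'.
    public static Func<string, string> WithLambda()
    {
        string prefix = "Hello, ";
        return name => prefix + name;
    }

    // Roughly what the compiler turns it into: the local becomes a field
    // and the delegate points at a method on the closure instance.
    public static Func<string, string> WithClosureClass()
    {
        var closure = new GreetingClosure();
        closure.prefix = "Hello, ";
        return closure.Greet;
    }

    public static void Main()
    {
        Console.WriteLine(WithLambda()("Ada"));        // prints Hello, Ada
        Console.WriteLine(WithClosureClass()("Ada"));  // prints Hello, Ada
    }
}
```

Both delegates behave the same way, which is the point: the lambda version is a shorthand for the closure-class version.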

I hope this helps you understand my earlier blog post about event-driven programming. You will see similar TaskCompletionSource code in many places on MSDN and on other blogs, but I could not find anyone explaining variable lifting in the same post. There are many interesting scenarios you can try out to dig deeper into how variable lifting works. For example, what happens if I change the lifted variable after the lambda expression has been created? Scenarios like this are explained very well by Timothy Ng in his MSDN Magazine article.
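As a starting point for the question of changing a lifted variable after the lambda is created, here is a small sketch with hypothetical names (LateChangeDemo, Run). Because the lambda and the method share one variable through the closure, the lambda sees the later assignment.

```csharp
using System;

public static class LateChangeDemo
{
    public static string Run()
    {
        string message = "before";
        // The lambda captures the variable itself, not a copy of its value.
        Func<string> read = () => message;
        // This assignment happens after the lambda was created...
        message = "after";
        // ...but the lambda still observes it when it finally runs.
        return read();
    }

    public static void Main()
    {
        Console.WriteLine(Run());   // prints after
    }
}
```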

1 comment:

  1. Ah.. And here i thought that closures existed only in javascript. Nice info, thanks.
