Saturday, 20 July 2013

Spending too much time debugging?

Have you ever noticed your programmers using the Visual Studio debugger too frequently during the development phase? Many team leads or architects ignore this behavior, or do not even notice it, because it boils down to basic programming skills that we take for granted. Debugging is usually slow because it takes time for the Visual Studio debugger (or any other) to load all the symbols into memory. I think that using the debugger too frequently during development shows a lack of confidence in one’s understanding of how the program works, or sometimes a lack of understanding of the fundamentals of the framework (e.g. .NET, Java, etc.). This affects the productivity and velocity of the team.

During ground-up (from scratch) development of a program or any functionality, small or big, I prefer to have a mental picture of what is going to happen in my code. I first spend time working out which tasks are involved, and then create classes and function signatures. These are often called contracts. They are just plain vanilla functions without any logic in them. If it’s a database, create the data model. If it’s a .NET program, create a class diagram with properties and functions. If a function has a return value, just return an empty value so that it compiles. Make sure all the code compiles. No debugging yet.

Then I write unit tests for the public functions and run them. I try to write each function as if it were a completely isolated piece of functionality. Obviously all the unit tests will fail at this point. This takes time, but it is the foundation for any new program or functionality. Many architects call this Test Driven Development. It makes perfect sense that companies spend thousands of dollars educating programmers to adopt this practice.
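
Here is a minimal sketch of these two steps in C#. The InvoiceCalculator class and its Total function are hypothetical names made up for illustration. The test is written as a plain method so that the sketch does not assume any particular unit testing framework; in a real project it would be an MSTest, NUnit or xUnit test.

```csharp
using System;

// Step 1: the "contract". The signature is settled, but the body is only a
// stub that returns an empty value so that the project compiles.
public static class InvoiceCalculator
{
    public static decimal Total(decimal[] lineAmounts)
    {
        return 0m; // stub: the real logic is added later
    }
}

// Step 2: a test written against the contract before the logic exists.
public static class InvoiceCalculatorTests
{
    // Returns true when the behaviour matches the expectation.
    public static bool Total_SumsAllLineAmounts()
    {
        decimal actual = InvoiceCalculator.Total(new[] { 10.50m, 4.50m });
        return actual == 15.00m;
    }
}

public static class Program
{
    public static void Main()
    {
        // Reports FAIL for as long as Total is still a stub.
        bool passed = InvoiceCalculatorTests.Total_SumsAllLineAmounts();
        Console.WriteLine(passed ? "PASS" : "FAIL");
    }
}
```

The failing test is the point: it pins down what Total is expected to do before any logic is written, and it turns green once the stub is replaced with a real implementation.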

Once the unit tests are written, start adding logic to the functions. It’s important to note that you don’t know who will call a function in the future or what parameter values (valid or invalid) will be passed. We should always validate all parameters first: e.g. for a string parameter, check it with string.IsNullOrEmpty or string.IsNullOrWhiteSpace and, if the check fails, throw ArgumentNullException for a null value or ArgumentException for an empty one. Remember the basic principle “garbage in, garbage out” you learned in college?
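
A minimal sketch of such parameter validation follows. CustomerService and BuildGreeting are hypothetical names used only for illustration; the exception types and string.IsNullOrWhiteSpace are the standard .NET ones.

```csharp
using System;

public static class CustomerService
{
    // Hypothetical function used only to illustrate parameter validation.
    public static string BuildGreeting(string customerName)
    {
        // Reject null explicitly so the caller learns which argument was missing.
        if (customerName == null)
        {
            throw new ArgumentNullException(nameof(customerName));
        }

        // Reject values that are present but carry no content.
        if (string.IsNullOrWhiteSpace(customerName))
        {
            throw new ArgumentException(
                "Customer name must not be empty or white space.",
                nameof(customerName));
        }

        // Only valid input reaches the actual logic.
        return "Hello, " + customerName.Trim() + "!";
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(CustomerService.BuildGreeting(" Ann "));

        try
        {
            CustomerService.BuildGreeting("   ");
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("Rejected: " + ex.ParamName);
        }
    }
}
```

Each guard clause deserves its own unit test, so that the invalid cases are exercised by the test run rather than discovered in the debugger.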

No debugging required yet. I then keep adding logic to my function and run the unit tests to test it. When all my unit tests pass, I move on to the next function and repeat the same steps. Once all the functions are done, run the program end to end. At this point you may encounter integration issues, but they should be simple to fix because you have already made sure that each function works as expected on its own.

I call this a very basic programming skill. It can often be tempting not to write unit tests for a very simple function. You can cover such functions through other unit tests, but in my opinion the majority of functions should have unit tests of their own.

In Visual Studio, pressing F5 always runs the program under the debugger. Pressing Ctrl + F5 runs it without the debugger.

There can be exceptions to this process. Sometimes you are working on a defect and just don’t have time to understand the whole workflow of the program (what happens from start to end), or it may be someone else’s code and you have been given a specific task to fix a very narrow piece of functionality or a defect. Someone might have already narrowed it down to the line of code or function that needs the fix. This usually happens on maintenance projects. Before you change the line of code you want to check the values of the variables, so you debug the program. You have the option of logging, but if any reverse engineering is involved the debugger is the best friend that comes to your rescue. However, as I explained earlier, for ground-up development of a new program or the addition of a new feature, there should be no compromise on discipline.

Sunday, 14 July 2013

Variable Lifting in Lambda Expressions

In my previous blog post about the event driven programming pattern, there was a code block that I want to make sure is well explained.

private Task<string> CallLongRunningServiceWithCallback()
{
  var tcs = new TaskCompletionSource<string>();
  var client = new ServiceReference1.Service1Client();

  client.LongFunctionCallback += (o, args) =>
    {
      if (args.Error != null)
      {
        tcs.TrySetException(args.Error);
        return;
      }
      if (args.Cancelled)
      {
        tcs.TrySetCanceled();
        return;
      }
      tcs.TrySetResult(args.Result);
    };
  client.RunLongFunctionAsync();
  // Do some other work…
  return tcs.Task;
}

How come a variable declared outside the callback function can be accessed inside that function? Isn’t this against the variable scoping rules? Why doesn’t the compiler throw an error? Many programmers may have these questions (at least a few of my teammates asked this), and that is why I wanted to cover it in a separate blog post.

If you have not been keeping a close eye on what each .NET version upgrade introduces, you may have missed this addition, which arrived with lambda expressions in C# 3.0. If you use a lambda expression, you can refer to the local variables of the method in which the lambda is declared. In the above example, tcs is a local variable of the CallLongRunningServiceWithCallback method, but it is also available inside the lambda that handles the LongFunctionCallback event. This behaviour is called variable lifting, and it has its roots in lambda calculus. Lambda (λ) has been used in mathematics for a long time. The formal systems that are nowadays called λ-calculus and combinatory logic were both invented in the 1920s, and their aim was to describe the most basic properties of function-abstraction, application and substitution in a very general setting. In 1985 Thomas Johnsson introduced lambda lifting. After that it was added to a few programming languages, and Microsoft brought lambda expressions to .NET with C# 3.0, which shipped with .NET Framework 3.5 in 2007.

How does variable lifting work behind the scenes? A lambda function may use some local variables of its own and some variables from the method that encloses it. The local variables inside the lambda are called bound variables, and the enclosing method’s variables are called free variables. Free variables are “lifted” into the lambda. The compiler does a lot of leg work for you to capture the state of these variables and preserve them beyond their normal lifetimes. More formally, when the compiler encounters a lambda expression that has free variables, it lifts the free variables into a class called a closure. The closure’s lifetime extends beyond the normal lifetime of the free variables hoisted into it. The compiler rewrites every access to such a variable in the method so that it uses the one within the closure instance. It will create a closure class similar to the following. Please note that the actual name of the class is auto-generated and so it could be different.

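// Simplified sketch of what the compiler generates. The real class name is
// auto-generated, and CallbackArgs stands in for the proxy's event args type.
public class _closure$_1
{
  // The lifted local variable becomes a field of the closure class.
  public TaskCompletionSource<string> tcs;

  // The body of the lambda becomes an instance method of the closure class.
  public void LongFunctionCallback(object o, CallbackArgs args)
  {
    if (args.Error != null)
    {
      tcs.TrySetException(args.Error);
      return;
    }
    if (args.Cancelled)
    {
      tcs.TrySetCanceled();
      return;
    }
    tcs.TrySetResult(args.Result);
  }
}

// The original method is rewritten to work through a closure instance.
private Task<string> CallLongRunningServiceWithCallback()
{
  var closure = new _closure$_1();
  closure.tcs = new TaskCompletionSource<string>();
  var client = new ServiceReference1.Service1Client();

  client.LongFunctionCallback += closure.LongFunctionCallback;
  client.RunLongFunctionAsync();
  // Do some other work…
  return closure.tcs.Task;
}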

As you can see, the tcs variable is lifted into the closure class, where it lives as a field. The compiler also rewrites the method in which the lambda is declared: it instantiates the closure class, uses the closure’s tcs field in place of the local variable, and hooks up the closure class’s LongFunctionCallback method as the handler for the event.

I hope this helps in understanding my earlier blog post about event driven programming. You will see similar TaskCompletionSource code in many places on MSDN and other blog sites, but I could not find anyone explaining variable lifting in the same post. There are many interesting scenarios that you can try out to dig deeper into how variable lifting works. What will happen if I change the lifted variable after the lambda expression? Such scenarios are explained very well by Timothy Ng in his MSDN Magazine article.
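
As a small experiment along those lines, here is a minimal sketch with made-up names (LiftingDemo, ChangeAfterLambda and MakeCounter are not from the earlier post). It assumes nothing beyond the standard Func delegate and shows two consequences of lifting: the lambda sees later changes to the lifted variable, and the lifted variable outlives the method that declared it.

```csharp
using System;

public static class LiftingDemo
{
    // The lambda captures the variable itself, not a copy of its value.
    public static int ChangeAfterLambda()
    {
        int factor = 2;
        Func<int, int> multiply = x => x * factor;

        factor = 10;        // changed after the lambda was created
        return multiply(3); // uses the current value of factor: 30, not 6
    }

    // The lifted variable lives on in the closure after this method returns.
    public static Func<int> MakeCounter()
    {
        int count = 0;
        return () => ++count;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(LiftingDemo.ChangeAfterLambda()); // prints 30

        Func<int> next = LiftingDemo.MakeCounter();
        next();
        Console.WriteLine(next()); // prints 2
    }
}
```

Both results follow from the rewrite described above: factor and count are fields of a closure instance, so the lambda and the enclosing method read and write the same storage.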

Saturday, 6 July 2013

Responsibility Matrix for Architecture / Design Creation Activity

In a large project, there may be many people who have some role in the creation and approval of project deliverables. Sometimes this is pretty straightforward, such as one person writing a document and one person approving it. In other cases, there may be many people who have a hand in the creation, and others that need to have varying levels of approval. The Responsibility Matrix is a tool used to define the general responsibilities for each role on a project. The matrix can then be used to communicate the roles and responsibilities to the appropriate people associated with the team. This helps set expectations and ensures people know what is expected from them.

C - Create
I - Provide Input
R - Review
A - Approve
N - Notify
M - Manage

Let’s say the Product Owner wants to add new functionality to the product. Typically the business team or the product owner provides the requirements. The requirements document should also have its own responsibility matrix defined. However, since this is an architect’s blog, I will focus on the responsibility matrix for the architecture or design activity that creates the design artifacts.

In a large project you typically have one or more enterprise architects and one or more application architects. However, it’s important to assign primary ownership of a new feature’s design to one enterprise architect, who looks at the overall product, and one application architect from the area where the new feature is being added. In this case, I would define the responsibilities as follows:

C - The creator of the design is the one application architect whose application is most affected by the new feature. This architect is primarily responsible for creating the design document.

R - All other architects review the design document for design flaws, compliance with best practices and company standards, integration issues with their applications, etc. In other words, all other architects have review responsibility.

I - Business Analysts, Project Managers and Product Owners hold the responsibility of providing input here. Their input is critical when it comes to explaining the business problem. In my experience, whenever the responsibility matrix is not clearly defined, PMs, business analysts or product owners start making design suggestions or sometimes force a specific design. This is usually the case when people in these roles have some technical background in their career. They remember how they did it in the past and start dictating how it should be done, or start questioning the architects. This is why the responsibility matrix plays such a crucial role in large projects: it helps avoid these conflicts.

A - Finally, the enterprise architect has to make the call on the design and approve it. It’s important to have this role played by a single person to keep accountability clear.

N - Once the design is approved, the Project Manager should be notified. The important contribution of the project manager here is scope management, prioritization and risk management. All project managers know what this means.

M - The delivery team, including technical managers, team leads and developers, then manages the design from here: creating the low level design, providing estimates to the PMs and fitting the work into the release plan. Finally, the team implements the feature according to the agreed release plan.

Again, one can ask whose responsibility it is to create the responsibility matrix. IMO it’s the team’s job, but the primary owner is the project manager. If you have reached this line, thanks for reading this blog post. Please provide your comments.