Thursday 5 September 2013

Code Quality: Diagnosability of your application

Code coverage, unit tests, and static analysis are the measures typically associated with code quality. I want to touch on a few more important aspects, starting with diagnosability.

Diagnosability means the ability to diagnose an issue quickly and accurately. Defects are always going to be there; the question is, once a tester finds a defect, how quickly and accurately can you diagnose it and know where to fix it?

In other words, how well does your application facilitate troubleshooting and diagnosing defects? Developers are always fans of debuggers. If you can attach a debugger, you can find the root cause of any crazy defect. However, you are not always going to have the liberty of debugging. It takes time, and replicating the issue requires a setup matching the production database along with the various other systems involved. You may have conditional logic that gets hit only in a specific state of the application. Debugging requires you to replicate that exact state in your dev environment so you can step through the code and see which lines are executed, along with the values of the variables involved.

Sometimes (or most of the time) there is not enough time to do all this setup for debugging. Management may demand an urgent fix, and that's a fair expectation.

In such scenarios, diagnosability comes to the rescue, provided developers have built the application with diagnosability in mind. The simplest solution is to add enough logging or trace statements redirecting to a log file, the event log, or the console, depending on the nature of the application. Many architects prefer the event log for critical errors since it can be monitored through tools like SCOM.

In .NET, the System.Diagnostics namespace provides all the supporting classes necessary to instrument the logic flow of an application into event logs or trace files. There are also many third party paid and open source loggers out there.

The System.Diagnostics namespace supports trace levels such as information, warning, and error, and you can use them to control the amount of logging, because too much logging can also slow down performance.
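
As a minimal sketch of what this looks like in practice (the source name "MyApp" and the OrderProcessor class are placeholders of mine, not from any real project):

using System;
using System.Diagnostics;

public class OrderProcessor
{
  // One TraceSource per component; the level can be changed in app.config
  // without recompiling the application.
  private static readonly TraceSource trace =
    new TraceSource("MyApp", SourceLevels.Warning);

  public void Process(int orderId)
  {
    // Emitted only when the switch level includes Information.
    trace.TraceEvent(TraceEventType.Information, 0,
      "Processing order {0}", orderId);
    try
    {
      // ... actual work ...
    }
    catch (Exception ex)
    {
      // Emitted at Warning level and above, so it survives in production.
      trace.TraceEvent(TraceEventType.Error, 1,
        "Order {0} failed: {1}", orderId, ex);
      throw;
    }
  }
}

Raising the switch level in production keeps the log volume (and the performance cost) down; lowering it during troubleshooting gives you the detailed flow.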

Also, it's important not to log sensitive data such as user names, confidential information, or encryption-related material into log files. Mostly log the identifiers of entities and any decision variables needed to determine the state of the application.

However, the next question that comes to my mind is: what about third party code, or shared modules from other teams across the enterprise? Logging standards may not be consistent across the board, or it may be legacy code that you can't change to add diagnosability support. For this, .NET provides a way to IL-inject code and extend the diagnosability of any application. There is one tool I am currently evaluating called dynaTrace; I'll publish my findings on it later, but I would prefer such tools for new projects. The complexity of enterprise applications has gone up exponentially, and we are already in the era of multi-platform, multi-device, multi-language, multi-geography, multi-tenant, multi-xyz applications. Diagnosability becomes very important, and it is directly proportional to the maintenance budget.

Thursday 8 August 2013

Let's agree to disagree

While doing design or architecture definition projects, some clients can afford just one enterprise architect on the team, and he is the final decision maker. The matter gets complicated, however, when there are multiple enterprise architects or solution architects. As everyone knows, we have a wide range of options available in the technology landscape; there are multiple ways of architecting a solution to address a business problem. Each architecture may follow different methods, methodologies and practices, yet still achieve the business goal. Within Microsoft technologies alone there are solutions and products that do similar things. They exist for whatever reasons, and we don't want to go into that discussion here; those reasons can be valid, e.g. legacy support. The point I want to convey is that when I play the architect role as a consultant in various companies, I come across this conflict very often. Companies spend lots of $$ discussing and arguing about "build vs buy", one technology over another, one tool vs another, etc. I always ask what company standards are followed across the enterprise and go from there. But sometimes you need to be innovative and think beyond the existing best practices, because every application is unique. You cannot stop people from thinking differently, so discussions and arguments are bound to happen. It's a very slippery slope to keep spending time endlessly on such things.


That is why one important quality I believe all architects should have is the ability to "agree to disagree". Wiktionary defines this as "To tolerate each other's opinion and stop arguing; to acknowledge that an agreement will not be reached".

Monday 5 August 2013

SOA Services using WCF over MSMQ

Support for asynchronous web services is an important part of a Service Oriented Architecture (SOA) implementation, and it requires stable, reliable delivery of messages to services. Microsoft's MSMQ platform guarantees message delivery, and WCF supports the net.msmq binding so that you can turn existing long running web services into fire and forget services. WCF saves a lot of the coding effort of sending, receiving and peeking MSMQ queues; the WCF platform does all that for us with just some configuration file settings. However, installing and configuring net.msmq services requires some work and an understanding of how non-http activation works. In this post, I am going to start with configuring your machine to make it ready for fire and forget services.
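
To give a flavour of what such a service looks like, here is a minimal sketch of a fire and forget contract that can be exposed over net.msmq; the contract and method names are placeholders of mine:

using System.ServiceModel;

[ServiceContract]
public interface IOrderSubmissionService
{
  // IsOneWay = true is required for queued operations: the client gets no
  // reply, MSMQ simply guarantees the message will be delivered.
  [OperationContract(IsOneWay = true)]
  void SubmitOrder(string orderXml);
}

public class OrderSubmissionService : IOrderSubmissionService
{
  public void SubmitOrder(string orderXml)
  {
    // Long running processing happens here, possibly long after the
    // client has moved on.
  }
}

The endpoint itself is then declared in the configuration file using the netMsmqBinding, pointing at a private queue.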

Let's start with installing the Windows Communication Foundation HTTP and non-HTTP activation features, and also the Windows Process Activation Service.

[Screenshot: enabling the WCF HTTP and non-HTTP Activation features and the Windows Process Activation Service in the Windows Features dialog]

Make sure the following Windows services are started and running:
1. Message Queuing
2. Net.Msmq Listener Adapter
3. Windows Process Activation Service

You should see the following message in the Event Viewer:

[Screenshot: the expected message in Event Viewer]

In some cases, depending on the sequence in which you installed the components (for example, .NET 3.5 WCF HTTP Activation), you may get an error when you try to visit the WCF service in a browser. The following is the error message you may get when you run an application hosted on Internet Information Services (IIS):

Could not load type 'System.ServiceModel.Activation.HttpModule' from assembly 'System.ServiceModel, Version=3.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'.
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

Exception Details: System.TypeLoadException: Could not load type 'System.ServiceModel.Activation.HttpModule' from assembly 'System.ServiceModel, Version=3.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'.

Visit http://support.microsoft.com/kb/2015129 for the solution. I modified my applicationHost.config to add the runtimeVersionv2.0 precondition as shown below.

<add name="ServiceModel" type="System.ServiceModel.Activation.HttpModule, System.ServiceModel, Version=3.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" preCondition="managedHandler,runtimeVersionv2.0" />

At this point I am able to visit my WCF service through a web browser. Also, when I call the WCF service through the MSMQ binding, I can see my MSMQ message being picked up by the Net.Msmq Listener Adapter service, and finally my WCF service is invoked. I will walk you through my source code in the next post.


Saturday 20 July 2013

Spending too much time debugging?

Ever observed that your programmers are using Visual Studio debugging too frequently during the development phase? Many team leads or architects may ignore this behavior, or may not even notice it, because it boils down to basic programming skills that we take for granted. Debugging is usually slow because it takes time for the Visual Studio (or any other) debugger to load all the symbols into memory. I think debugging too frequently during development shows a lack of confidence in one's understanding of how the program works, or sometimes a lack of understanding of the fundamentals of the framework (e.g. .NET, Java, etc.). This affects the productivity and velocity of the team.

During ground-up, from-scratch development of a program or any functionality, small or big, I prefer to have a mental picture of what is going to happen in my code. I first spend time on which tasks are involved and create classes and function signatures. These are often called contracts: just plain vanilla functions without any implementation in them. If it's a database, create the data model. If it's a .NET program, create the class diagram with properties and functions. If a function has a return value, just return an empty value so that it will compile. Let's make sure all the code compiles. No debugging yet.
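
For example, a contract-only skeleton for a hypothetical invoice module might start out like this, compilable but with no real logic yet:

public class InvoiceCalculator
{
  // Signatures are fixed first; bodies come later.
  public decimal CalculateTotal(int invoiceId)
  {
    // Empty value just so the code compiles.
    return 0m;
  }

  public bool IsValid(int invoiceId)
  {
    // Empty value just so the code compiles.
    return false;
  }
}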

Then I write unit tests for the public functions and run the tests. I try to write functions as if they were completely isolated pieces of functionality. Obviously, all the unit tests will fail at this point. This takes time, but it's the foundation for any new program or functionality. Many architects call this Test-Driven Development, and it makes perfect sense why companies spend thousands of dollars educating programmers to adopt this practice.
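
A minimal sketch of such a test against the hypothetical skeleton above, using MSTest (it fails until CalculateTotal is actually implemented):

using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class InvoiceCalculatorTests
{
  [TestMethod]
  public void CalculateTotal_KnownInvoice_ReturnsExpectedTotal()
  {
    var calculator = new InvoiceCalculator();

    decimal total = calculator.CalculateTotal(42);

    // Fails against the empty skeleton; passes once the logic is in place.
    Assert.AreEqual(100.50m, total);
  }
}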

Once unit test development is done, start adding real logic to the functions. It's important to note that you don't know who will call a function in the future and what parameter values (valid or invalid) will be passed. We should always validate all parameters first; e.g. for a string parameter, use string.IsNullOrEmpty or string.IsNullOrWhiteSpace and throw ArgumentNullException in the negative case. Remember the basic principle "garbage in, garbage out" you learned in college?
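
For instance, a hypothetical function taking a string parameter would guard itself like this before doing any real work:

public void RegisterCustomer(string customerName)
{
  // Reject garbage input up front instead of failing somewhere deep inside.
  if (string.IsNullOrWhiteSpace(customerName))
  {
    throw new ArgumentNullException("customerName");
  }

  // ... real work with a known-good parameter ...
}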

Still no debugging required. I keep adding logic to a function and run the unit tests to verify it. When all my unit tests pass, I move on to the next function and repeat the same steps. Once all the functions are done, I run the program end to end. At this point you may encounter integration-related issues, but they should be very simple to fix because you have already verified that each function works as expected at the unit level.

I call this very basic programming skill. Many times it can be tempting not to write unit tests for a very simple function. You can cover such functions through other unit tests, but in my opinion the majority of functions should have their own unit tests.

In Visual Studio, pressing F5 runs the program under the debugger, while Ctrl + F5 runs it without the debugger.

There can be exceptions to this process. Sometimes you are working on a defect and you just don't have time to understand the whole workflow of the program, i.e. what happens from start to end; or it may be someone else's code and you have been given a specific task to fix a very pointed piece of functionality or a defect. Someone might have already narrowed it down to a line of code or a function that needs a fix. This usually happens on maintenance projects. Before you change the line of code, you want to check the values of the variables, so you debug the program. You have the option of using logging, but if there is any reverse engineering involved, the debugger is the best friend that comes to your rescue. However, as I explained earlier, for ground-up development of a new program or the addition of a new feature, there is no compromise on discipline.

Sunday 14 July 2013

Variable Lifting in Lambda Expressions

In my previous blog post about the event driven programming pattern, there was a code block that I want to make sure is well explained.

private Task<string> CallLongRunningServiceWithCallback()
{
  var tcs = new TaskCompletionSource<string>();
  var client = new ServiceReference1.Service1Client();

  client.LongFunctionCallback += (o, args) =>
    {
      if (args.Error != null)
      {
        tcs.TrySetException(args.Error);
        return;
      }
      if (args.Cancelled)
      {
        tcs.TrySetCanceled();
        return;
      }
      tcs.TrySetResult(args.Result);
    };
  client.RunLongFunctionAsync();
  // Do some other work…
  return tcs.Task;
}

How come a variable declared outside the callback function can be accessed inside the function? Isn't this against the variable scoping rules? Why doesn't the compiler throw an error? Many programmers may have these questions (at least a few of my teammates asked them), and that is why I wanted to cover it in a separate blog post.

If you do not keep a close eye on what is introduced in each .NET version upgrade, you may have missed this addition. If you use a lambda expression, you can refer to the local variables of the method in which your lambda is declared. In the above example, tcs is defined as a local variable of the CallLongRunningServiceWithCallback method, but it is also available inside the lambda registered for the LongFunctionCallback delegate. This behaviour is called variable lifting, and it derives from lambda calculus. Lambda (λ) has been used in mathematics for a long time: the formal systems nowadays called λ-calculus and combinatory logic were invented in the 1920s and 1930s, and their aim was to describe the most basic properties of function abstraction, application and substitution in a very general setting. In 1985, Thomas Johnsson introduced lambda lifting as a compilation technique. It was later added to a few programming languages, and Microsoft formally introduced it with lambda expressions in C# 3.0, released with .NET Framework 3.5 in 2007.

How does variable lifting work behind the scenes? A lambda function may contain some of its own local variables and some variables from the enclosing function. The variables local to the lambda are called bound variables, and the enclosing function's variables are called free variables. Free variables are "lifted" into the lambda. The compiler does a lot of legwork for you to capture the state of these variables and preserve them outside of their normal lifetimes. More formally, when the compiler encounters a lambda expression that has free variables, it lifts the free variables into a class called a closure. The closure's lifetime extends beyond the lifetime of the free variables hoisted into it, and the compiler rewrites the variable accesses in the method to go through the closure instance. It will create a closure class similar to the following. Please note that the actual name of the class is auto-generated, so it will be different.

public class _closure$_1
{
  // The free variable is lifted into a public field so that it outlives
  // the method call that declared it.
  public TaskCompletionSource<string> tcs;

  // The body of the lambda becomes an ordinary method on the closure class,
  // reading and writing the lifted field instead of a local variable.
  public void LongFunctionCallback(object o, CallbackArgs args)
  {
    if (args.Error != null)
    {
      tcs.TrySetException(args.Error);
      return;
    }
    if (args.Cancelled)
    {
      tcs.TrySetCanceled();
      return;
    }
    tcs.TrySetResult(args.Result);
  }
}

As you can see, the tcs variable is lifted into the closure class as a field. The compiler then also rewrites the main method where the lambda is declared: it instantiates the closure class, uses the closure's tcs field in place of the calling function's local variable, and subscribes the closure's LongFunctionCallback method to the event.
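
Conceptually, the rewritten method then looks something like the following sketch (again illustrative only; the real generated names differ):

private Task<string> CallLongRunningServiceWithCallback()
{
  // Compiler-generated: the closure instance replaces the local variable.
  var closure = new _closure$_1();
  closure.tcs = new TaskCompletionSource<string>();

  var client = new ServiceReference1.Service1Client();

  // The lambda is now just a method on the closure instance.
  client.LongFunctionCallback += closure.LongFunctionCallback;

  client.RunLongFunctionAsync();

  // Do some other work…

  return closure.tcs.Task;
}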

I hope this helps in understanding my earlier blog post about event driven programming. You will see similar TaskCompletionSource code in many places on MSDN and other blog sites, but I could not find anyone explaining variable lifting in the same post. There are many interesting scenarios you can try out to dig deeper into how variable lifting works. What happens if you change the lifted variable after the lambda expression? Such scenarios are explained very well by Timothy Ng in an MSDN Magazine article.

Saturday 6 July 2013

Responsibility Matrix for Architecture / Design Creation Activity

In a large project, there may be many people who have some role in the creation and approval of project deliverables. Sometimes this is pretty straightforward, such as one person writing a document and one person approving it. In other cases, there may be many people who have a hand in the creation, and others who need to give varying levels of approval. The Responsibility Matrix is a tool used to define the general responsibilities for each role on a project. The matrix can then be used to communicate the roles and responsibilities to the appropriate people associated with the team. This helps set expectations and ensures people know what is expected of them.

C - Create
I - Provide Input
R - Review
A - Approve
N - Notify
M - Manage

Let's say the Product Owner wants to add new functionality to the product. Typically the business team or product owner provides the requirements, and the requirements document should have its own responsibility matrix defined. However, since this is an architect's blog, I will highlight the responsibility matrix for the architecture or design activity of creating design artifacts.

In a large project you typically have one or more enterprise architects and one or more application architects. However, it's important to assign primary ownership of a new feature's design to one enterprise architect who looks at the overall product and one application architect for the area where the new feature is being added. In this case, I would define the responsibilities as follows:

C - The creator of the design is the one application architect whose application is most affected by the new feature. This architect is primarily responsible for creating the design document.

R - All other architects review the design document for design flaws, best practices or company standards compliance, integration issues with their applications, etc. In other words, all other architects have review responsibility.

I - Business Analysts, Project Managers and Product Owners hold the responsibility of providing input here. It is critical to get input from them in explaining the business problem. In many cases I have seen that whenever the responsibility matrix is not clearly defined, PMs, business analysts or product owners start providing design suggestions, or sometimes force a specific design. This is usually the case when these roles have some technical background in their career; they remember how they did things in the past and start dictating how it should be done, or start questioning the architects. This is why the responsibility matrix plays such a crucial role in large projects, and conflicts can be avoided.

A - Finally, the enterprise architect has to make a call on the design and approve it. It's important to have this role played by a single person to keep accountability clear.

N - Once the design is approved, the Project Manager should be notified. The important contribution of the project manager here is scope management, prioritization and risk management. All project managers know what this means.

M - The delivery team, including technical managers, team leads and developers, then manages it: taking it to the next level of low level design, providing estimates to PMs, fitting it into the release plan, and finally implementing the feature based on the agreed upon release plan.

Again, one can ask whose responsibility it is to create the responsibility matrix. IMO it's the team's job, but the primary owner is the project manager. If you have reached this line, thanks for reading this blog post. Please provide your comments.

Tuesday 4 June 2013

Evolution of Asynchronous Programming (EAP) in .NET

The asynchronous programming model in .NET makes it very easy for developers to work with event-based asynchronous pattern (EAP) code, starting with the Task Parallel Library in .NET 4.0 and async/await in .NET 4.5. We have been trained long enough by .NET programming practices to use an event based pattern for any asynchronous work. For example, say you want to perform a long running operation. When calling this long running function, you don't want your client code to make a blocking call; instead, a very famous pattern that many architects suggest here is to ask your client to pass a callback function or delegate, and that callback function will be called as soon as the long running operation is over. This is nothing but the EAP pattern.

As shown in the following example, the client makes a call to the long running function and waits for it to complete. This is a blocking call.

private string CallLongRunningService()
{
  var client = new ServiceReference1.Service1Client();

  // Blocking call: the calling thread waits here until the service returns.
  return client.RunLongFunction();
}


Now, instead of just waiting for the long function to complete, the client wants to do some other work in the meantime. Here many architects would suggest using the EAP pattern and calling the long function asynchronously. So we refactor CallLongRunningService into CallLongRunningServiceWithCallback and register a callback function to be called when the long function completes.

private void CallLongRunningServiceWithCallback()
{
  var client = new ServiceReference1.Service1Client();

  client.LongFunctionCallback += LongFunctionCallback;

  client.RunLongFunctionAsync();
}

private void LongFunctionCallback(object sender, CallbackArgs args)
{
  // Store the result in a field so that other code can access it later.
  m_result = args.Result;
}

As you can guess, the client code also needs to be refactored to consume the refactored CallLongRunningServiceWithCallback and reconcile its logic with the callback function. We have now split the logic across two functions: the main function and the callback function.

This is where TaskCompletionSource and the Task Parallel Library (TPL) come to the rescue. Using TaskCompletionSource, you can refactor the above code as follows:


private Task<string> CallLongRunningServiceWithCallback()
{
  var tcs = new TaskCompletionSource<string>();

  var client = new ServiceReference1.Service1Client();

  client.LongFunctionCallback += (o, args) =>
    {
      if (args.Error != null)
      {
        tcs.TrySetException(args.Error);
        return;
      }
      if (args.Cancelled)
      {
        tcs.TrySetCanceled();
        return;
      }

      tcs.TrySetResult(args.Result);
    };

  client.RunLongFunctionAsync();

  // Do some other work…

  return tcs.Task;
}


The client application gets the task as the return value and has to call task.Result to get the result of the long running function. In the code above, see the comment line "Do some other work…": this is where you can do something else while the long function is doing its job. The following is the code you would write to consume the long running function with callback.

Task<string> t = CallLongRunningServiceWithCallback();

Console.WriteLine(t.Result);


Where is the blocking call here? In the above lines of code, the call to t.Result is a blocking call on the current thread. You need this anyway, because we don't know how long the long function will take to complete. But the best thing about this pattern is that it allows us to not just wait: we can carry out other activities while the function is running. Also, the TPL code is much simpler than our earlier version where the callback was a separate function. TPL provides TaskCompletionSource to help us track the result of an async task in a consistent way across projects.
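
As a side note, on .NET 4.5 the same Task<string> can also be consumed without blocking at all, using async/await; a minimal sketch:

private async Task PrintResultAsync()
{
  // await yields the thread instead of blocking on t.Result.
  string result = await CallLongRunningServiceWithCallback();
  Console.WriteLine(result);
}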

Thursday 28 March 2013

All IT professionals are business professionals (also)


There is a funny distinction people make between being an IT professional and a business professional. This is fine as far as describing what type of skills I have. However, in a successful organization it is essential to realize that we are all working towards solving business problems. I cannot be a purely IT professional and just focus on IT problems. The way I look at it, I am helping solve a business problem using my IT skills. This takes me back to the fundamental reason for existence in any organization. The moment we focus on the reason for our existence in an organization, that we are here to solve a business problem or achieve a business goal, our perspective changes. We all become business professionals. Of course we will use different skills to contribute towards solving a problem or achieving a goal, but it is very important to start from the business reason.

As an architect, when I am asked to design a software system, I start from the organization's goals: what are they trying to accomplish, what business problem are we solving, and only then move on to the solution architecture. In some projects I have seen an architect given a bunch of requirement documents and functional specifications and asked to create the software architecture. I believe the architect should be involved during the requirements gathering discussions and the functional specification process as well, along with business users and business analysts. The architect may not play an active role there, but it is important for that role to be part of the process, interact with business users and business analysts, and hear the information first hand rather than from some document. Architecture is built through collaboration with various parties, and a document has limitations on how much you can capture in it.

Wednesday 6 February 2013

Hello World

I have started using this blog instead of my earlier blog on live.com. I am finding it much more useful, and it's easy to post to this blog from my iPhone. Let's see how it goes.