.NET & Architecture Thoughts: March 2010

Friday, March 19, 2010

Avoiding Cursors in SQL Server 2005

In this post, I will show you a technique which replicates a cursor in SQL Server. The idea is to avoid the overhead of using cursors in SQL Server.

Problem with using Cursors

Some time ago, I used cursors in one of my stored procedure. This procedure returned data which was consumed by a reporting screen. I was working with a limited amount of data for testing purpose so the screen worked great (at least for a while). But when the same stored procedure ran on the production machine with tables having hundreds of thousands of rows, the screen just went slow like anything. The stored procedure returned the data in almost 10 minutes (only). This was acceptable to me :) but not to my management LOL.

I started to debug the stored procedure and it finally dawned on me that the cursors in the stored procedure were causing the application to choke. After doing some googling binging, it was clear to me that that cursors have a performance overhead.

Cursors in SQL Server are great when we need to process data on a row by row basis. Usually a query returns a static dataset, However; at times, we need to process the data for each row before it is returned. This is where cursors come in handy. However cursors can have a performance issue if not used properly. I will not go into details of describing problems associated with cursors. You can find many useful articles online on this subject.

So the hundred thousand million dollar question is that how we can we avoid cursors but still get the same functionality. The point is to be able to process each row before we can return the data. Well there are many techniques to replicate a cursor. In the following section, I will show you the use of Temporary Table to implement cursor functionality.

Using Temporary Table to avoid Cursors

A Temporary Table is used to store data temporarily. When using a temporary table, the table and its data is stored on the disk in the tempdb database. A temporary table can be local or global. A local temporary table is only visible to user who created it and only within the connection which created it. Local variables get dropped when the connection is closed. The basic syntax of creating a Temporary Table is as follows:

CREATE TABLE #MyTable
(
 ID  int,
 Address varchar (50)
)

On the other hand, a global temporary table is visible to anyone connected to the SQL Server instance. These tables are dropped when the last connection to the table is closed. The syntax for creating temporary table is as follows:

CREATE TABLE ##MyTable
(
 ID  int,
 Address varchar (50)
)

To replicate cursors, I will use temporary table in the following code.

Code Listing

USE Northwind

-- Ref 1 - Create temporary table
CREATE TABLE #EmployeeData 
(
 employeeID  INT UNIQUE CLUSTERED,   
 lastName  VARCHAR (20),  
 firstName  VARCHAR (20),  
 isProcessed  BIT
)

-- Ref 2 - Insert Data in temporary table
INSERT INTO 
 #EmployeeData
 SELECT
  EmployeeID, LastName, FirstName, 0
 FROM
  Employees


DECLARE @employeeID INT
DECLARE @fName  VARCHAR (20)
DECLARE @lName  VARCHAR (20)

-- Ref 3 - Loop through data in the temporary table
WHILE EXISTS (SELECT employeeID FROM #EmployeeData WHERE isProcessed = 0)
BEGIN


 -- Ref 4 - Get top employeeID with flag not set
 SELECT TOP (1) 
  @employeeID = employeeID,
  @lName = lastName,
  @fName = firstName
 FROM 
  #EmployeeData 
 WHERE 
  isProcessed = 0

 -- Now you can process the data. For bravity, I will only print the full-name
 PRINT @fName + ' ' + @lName
 
 -- Ref 4 - Set flag for the currently selected employeeID
 UPDATE #EmployeeData SET isProcessed = 1 WHERE employeeID = @employeeID
END

DROP TABLE #EmployeeData

Let me explain the above reference points one by one:

Ref 1: Create a temporary table. The fourth column will be used as a flag to traverse each row replicating a cursor
Ref 2: Insert table from Employee table into temporary table. Set the flag column equal to 0
Ref 3: Traverse each row in temporary table using the fourth column. Loop through each row until flag is equal to 0
Ref 4: Update the flag to 1 of the currently selected row. This way the control will move to the next row with flag equal to 0

I hope the above code is easy to understand now. One question remains that what if we want to traverse the temporary table again. The answer is that all you have to do is update the table by setting the flag equal to 0 using the following regular Update statement:

Update 
#EmployeeData 
Set 
 isProcessed = 0

Now you are ready to process each row again using the above code.

I hope that this post has been useful. Please feel free to share your thoughts. And yes, stay tuned for more :)…

Tuesday, March 16, 2010

Single Responsibility Principle

In this post, I will talk about the Single Responsibility Principle. Before I dive into this discussion, let us briefly talk about software design.

Over the years, software complexity has increased exponentially. Diverse business needs are dictating complex software design. Programmers are burdened with learning and writing better code. A programmer either has to write new software or modify an existing code. Writing new software is easy, however; modifying existing code with a bad design is both difficult to understand and refractor. The question is ‘what is a bad software design’ and how do we identify it? Well to answer it, bad design shares the following flaws:

Rigid: Changes are hard to make
Fragile: Fixing something breaks something else
Immobile: Extension to existing code is difficult

A code is Rigid when making changes is difficult. Usually you find yourself scratching your head when tasked to change some code written by another programmer. This could be due to two reasons. First, there is no proper flow between classes/modules or second, proper coding standards (such as design patterns) are not followed. Such code is always difficult to understand and more difficult to change.

A code is Fragile when making a change breaks something else. How many times has it happened that your boss told you to fix something (which you did proudly) but discovered later that you broke something else?

Finally a code is Immobile when it cannot be reused or is difficult to extend. For example, it is always advisable to design a generic Data Access Layer (DAL) which can connect to any database. However, if the DAL is written targeting a specific database and the top management decides to use a different database later on, the programmer ends up cursing himself for writing a general DAL.

The above design issues can be avoided by adhering to proven best practices in the software industry. This is where design patterns come into play. In this post we will talk about ‘Single Responsibility Principle (SRP)’. Following posts will also discuss:

• Open Closed Principle (OCP)
• Liskov Substitution Principle (LSP)
• Dependency Inversion Control (DIC)

Single Responsibility Principle (SRP)

Single Responsibility Principle states that ‘a class or module should have only one responsibility’ where a responsibility is a ‘reason for change’. In plain English this means that a class should serve a single purpose (responsibility). Only the responsibility dictates a change in the class. If a class has more than one responsibility then it will lead to bad design later on. Changes made to a class should not affect others. This principle follows Dijkstra’s ‘Separation of Concern (SOC)’ principle.

Let me demonstrate this concept with a simple example. Consider the following class:

public class Employee {

 public EmployeeDTO GetEmployee (int employeeID) 
{
  // perform database lookup and get the employee
 }

 public double CalculateSalary (int employeeID)
 {
  // perform database lookup and calculate salary
}
}

The above Employee class is serving two purposes. First, it returns an employee and second it calculates the salary of the employee. The first function is justified, however; the Employee class should know anything about calculating the salary of an employee. This means every time the salary calculating formula changes, the Employee has to be modified. This process should be handled by a separate class. So the above code can be modified as following:

public class Payroll 
{
 public double CalculateSalary (int employeeID)
 {
  // calculate and return the employee salary
 }
}

public class Employee {

 public EmployeeDTO GetEmployee (int employeeID) 
{
  // perform database lookup and get the employee
 }

 public double CalculateSalary (int employeeID)
 {
  Payroll pay = new Payroll ();
  return pay.CalculateSalary (employeeID);
}
}

I am sure you can spot the change in the above code. The responsibility of calculating the salary is now handled by the Payroll class. The Employee class is no longer concerned about how the salary is calculated. It just invokes a method of the Payroll class and gets the required results. The Payroll class can be changed without affecting the Employee class and vice versa.

One may argue that the Single Responsibility Principle can result in a plethora of classes. The answer is yes it can. However, having more classes which are simple to manage is better than having a few but complex classes. Managing complex classes is always a burden on the programmer. Simple classes make refracting easy for any future changes.

I hope this post has been helpful to under the Single Responsibility Principle. In the following posts, I will discuss some more principles mentioned above. So stay tuned for more.

Tuesday, March 9, 2010

Software Architect

One of the sessions I attended at Tech-Ed MEA 2010 was on ‘Agile Architect’. This interactive session was steered by Scott Hanselman and Aaron Skonnard who gave valuable insight into the role of a software architect within an organization. So let us jump right into the discussions in the following sections.

Organizational Role

A software architect usually reports to the Chief Technology Officer or CTO. The CTO sets the future technology tone for an organization. An architect, on the other hand, articulates the vision of the CTO. He understands the business domain to the heart and conceptualizes the overall business needs into proper software requirements.

It is important to understand that a software architect is both a technical and political figure in an organization. He understands the business strategies. He understands the business both inside and out including business rules, policies and decisions. In short he must see the big picture when making his decisions.

Technical Aspect

A good software architect must be technology savvy. He understands the technology (to be used) in and out. He is well versed with latest technical trends, programming languages, software methodologies and development tools. This may seem to be an overstatement, however; being thorough from a technical point-of-view has two advantages. First he gets the respect of his team and hence they listen to him. Second, the developers always have someone to talk to when faced with a technical issue.

An architect should be a programmer at heart. This doesn’t necessarily mean that he writes production code, however; coding is a norm for him. A personal project or small business application can keep him on his toes.

One trait of an architect is to experiment with different technology. An architect’s search for cutting edge technology leads to user friendly business applications which are a boost for the business.

One important trait of an architect is to remember the basics of his IT education. Over the years, software architects forget about basics including data structures, databases, software engineering etc. A good architect remembers the basics which help him better design and implement software systems.

Last but not the least is that an architect always comes with a working prototype of the system to be developed. This simple point cannot be emphasized enough. A working prototype is critical in convincing the top management, ensuring user satisfaction and communicating with team members. The prototype doesn’t need to be a complete application. But it must show the big picture to all the stakeholders involved.

Problem Solving Skills

One question often asked by under-training architects is that do they need to know design patterns and software methodologies. The answer is both yes and no. Every designer handles a problem differently but proven best practices have their weight. They reduce the overall complexity and are optimized for performance. An architect may have good problem solving skills, however; knowing best practices always goes in his favor.

There is a saying that any solution which can be expressed in numbers is better than others. An architect evaluates his design and solution from a complexity point of view. For example, he may use Line of Codes (LoC) to determine the complexity of the code. Breaking the system into subparts and applying different complexity matrices allows him to perform a systematic evaluation of the system.

Personality Traits

As an architect, every developer must adapt some personality traits at an early stage of his career. He has the urge to ask ‘WHY’. Talking to different people within an organization lets him understand the business strategy. A good architect is often spotted talking to users and taking notes. This trait makes him different from the ‘Average Joe’.

When designing a system, an architect follows values principles such as conforming to standards, applying best practices and adhering to architectural values. For example, even if he is tasked to design a small application in limited amount of time, the application has a proper architecture. He understands that every application grows over time and becomes a maintenance nightmare if not architected properly.

I hope that in this post, you have learnt along the lines. Do share your thoughts on this. Stay tuned for more…

Friday, March 5, 2010

Microsoft Tech-Ed 2010 Middle-East Conference

I have just returned from Microsoft Tech-Ed Middle-East 2010 conference. This was the first time Tech-Ed was held for the Middle-East region in Dubai.

It was a wonderful experience attending this conference with Microsoft’s top IT gurus around. Personally I was delighted to meet Scott Hanselman and Aaron Skonnard. Both these guys are top .NET Professionals in the industry. Not only they are very friendly, they relayed a plethora of information to Tech-Ed attendents. I was thoroughly impressed by Scott Hanselman who has a humorous and unique style of delivering the presentations.

In the next few days, I will be blogging about some of the cool stuff I learnt at Tech-Ed. So stay tuned for more…

Pages