Tuesday, October 21, 2008

LINQ Explained – Part 1

This is the first installment of a multi-part tutorial series on Language Integrated Query (LINQ). In this series, my goal is to give readers a detailed overview of LINQ. LINQ comes as a built-in feature of Visual Studio 2008; however, it can also be used with Visual Studio 2005 by downloading the May 2006 CTP.

What is LINQ

LINQ is a programming model that lets us query and modify data independently of the data source. It is a set of extensions that adds native support for queries to the .NET Framework. With LINQ, queries have become first-class citizens in .NET languages such as C# and VB.NET. By providing an abstraction over different data domains, LINQ offers a unified approach to managing data.

Why use LINQ

Today (and always :-) developers are responsible for managing data in their applications. The data belongs to different data domains, and each domain comes with its own set of rules: SQL is used for relational databases, XQuery and the DOM are used to handle XML documents, and different Application Programming Interfaces (APIs) are used to manage text files, objects, graphs, the Registry, Active Directory and so on. Developers therefore have to master several data domains for the same purpose: handling data. Wouldn't it be nice to have a single set of rules for all our data requirements? This is where LINQ comes in handy. LINQ provides a unified programming model to manage data from different data sources. With LINQ, we can invest our effort in the business logic instead of worrying about the syntax needed to manage data.

LINQ syntax and working

With LINQ, the notion of a query is now a built-in concept in the .NET Framework. The LINQ syntax (known as a Query Expression) is reminiscent of SQL. However, this syntax is not limited to relational databases; it applies across all the data domains under the LINQ umbrella. The following is a simple example of a LINQ query that operates on a string array:

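    // Source data: an in-memory array of strings.
    string[] fruits = { "apple", "banana", "orange", "pineapple", "carrot" };

    // Query expression: keep only the two fruits we are interested in.
    var query =
        from fruit in fruits
        where fruit == "orange" || fruit == "pineapple"
        select fruit;

    // Iterating over the query produces the matching items.
    foreach (var fruit in query)
    {
        ListBox1.Items.Add(fruit);
    }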

We will have a detailed look at LINQ syntax in the following posts. For now, let us see what the above code does. We have an array of strings, and a SQL-like query operates on this array. The query result is assigned to a variable declared with var. Note that var is not a type in itself: the compiler infers the actual type, which here is IEnumerable<string>. The foreach loop iterates through the result and displays each item. Simple, isn't it?
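
To make the role of var concrete, here is a minimal console sketch of the same query with the inferred type written out explicitly. The FruitDemo class and the SelectFruits method are names made up for this illustration, and Console.WriteLine stands in for the ListBox used above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FruitDemo
{
    // Hypothetical helper: runs the same query as the fruit example.
    public static IEnumerable<string> SelectFruits(string[] fruits)
    {
        // Writing IEnumerable<string> instead of var changes nothing;
        // it is the type the compiler infers for this query.
        IEnumerable<string> query =
            from fruit in fruits
            where fruit == "orange" || fruit == "pineapple"
            select fruit;

        return query;
    }

    public static void Main()
    {
        string[] fruits = { "apple", "banana", "orange", "pineapple", "carrot" };

        // Prints "orange" and then "pineapple", one per line.
        foreach (string fruit in SelectFruits(fruits))
        {
            Console.WriteLine(fruit);
        }
    }
}
```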

The point worth noting is that the same syntax applies to a relational database, DataSets, XML files or any other supported data domain. Our interface for handling the data remains the same, while the data domain at the other end can change depending on our requirements. This ability to use the same set of rules to access data across different domains is what makes LINQ stand out. I am sure that, by now, you have started to see the strength of LINQ. The LINQ architecture depends on many .NET Framework and language features such as generics, delegates, anonymous methods and extension methods. My next post will provide a detailed overview of these features.
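
As a rough illustration of how those features fit together, the fruit query shown earlier can also be written as a direct call to the Where extension method with a lambda expression, which is approximately what the compiler produces from the query syntax. This is only a sketch; the MethodSyntaxDemo class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MethodSyntaxDemo
{
    // Query-expression form, as used in the fruit example.
    public static IEnumerable<string> WithQuerySyntax(string[] fruits)
    {
        return from fruit in fruits
               where fruit == "orange" || fruit == "pineapple"
               select fruit;
    }

    // Method form: Where is an extension method on IEnumerable<T>,
    // and the lambda is converted to a Func<string, bool> delegate.
    public static IEnumerable<string> WithMethodSyntax(string[] fruits)
    {
        return fruits.Where(fruit => fruit == "orange" || fruit == "pineapple");
    }

    public static void Main()
    {
        string[] fruits = { "apple", "banana", "orange", "pineapple", "carrot" };

        // Both forms yield the same sequence, so this prints True.
        Console.WriteLine(WithQuerySyntax(fruits).SequenceEqual(WithMethodSyntax(fruits)));
    }
}
```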

LINQ comes in many flavors (LINQ Providers) to manage different data domains. Don't confuse the LINQ syntax with these flavors. The syntax remains the same (with slight variations) across providers, but the features and the returned objects may vary from one provider to another. For example, the same LINQ syntax will fetch an element or node from an XML document but a DataRow from a DataSet. Each LINQ Provider is responsible for converting the LINQ expression into a form compatible with the underlying data source. LINQ has the following flavors:

LINQ to Objects: Queries in-memory collections of objects
LINQ to SQL: Handles data from SQL Server & SQL Server Compact databases
LINQ to Entities: Queries the entity objects of an ADO.NET Entity Framework data model
LINQ to DataSet: Queries data held in DataSets
LINQ to XML: Handles data from XML Documents
Third Party: In the future we will see more providers written by third parties for different data sources.

A detailed discussion of these providers will be the topic of future posts.
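
As a small taste of how the syntax carries over between providers, the sketch below runs the same from/where/select shape against an XML document using LINQ to XML. The sample document, its element names, and the XmlFruitDemo class are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlFruitDemo
{
    // Hypothetical sample document; the element names are made up.
    public const string SampleXml =
        "<fruits>" +
        "<fruit>apple</fruit>" +
        "<fruit>orange</fruit>" +
        "<fruit>pineapple</fruit>" +
        "</fruits>";

    public static IEnumerable<XElement> SelectFruits(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Same from/where/select shape as the array query, but each
        // result is an XElement rather than a string.
        return from fruit in doc.Descendants("fruit")
               where fruit.Value == "orange" || fruit.Value == "pineapple"
               select fruit;
    }

    public static void Main()
    {
        // Prints "orange" and then "pineapple", one per line.
        foreach (XElement fruit in SelectFruits(SampleXml))
        {
            Console.WriteLine(fruit.Value);
        }
    }
}
```

Only the data source and the type of the results change here; the query keywords stay the same as in the array example.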


Summary

This installment provided an overview of Language Integrated Query. LINQ is a powerful technology that provides a unified programming model to manage data from different data sources. Queries now have native support in the .NET Framework. The LINQ syntax (Query Expression) resembles SQL syntax. There are several LINQ Providers for different data sources. The following are some useful links for learning more about LINQ:

The LINQ Project
ScottGu on LINQ

In the next article, we will look at some of the C# language features that you need to understand in order to work with LINQ efficiently. So stay tuned…

4 comments:

  1. Like it. I will go through all
    -Satalaj

  2. Good articles on LINQ, with a simple & understandable manner of explaining the topic.

  3. As is explained in many places, LINQ statements are translated into regular code, using extension methods like OrderBy() and Where(). Note that these extension methods can be used directly (i.e. outside of a LINQ statement), and as such are very effective extensions to the language.
