LINQ and Extension methods

Have you ever wished that a base class had a particular method? What about interfaces? Wouldn’t it be great to define a method on an interface along with its implementation? Any class that then implemented the interface would get this implementation for free.

In the past this was achieved with static utility classes. Unfortunately this clutters your code with the names of these utility classes and dilutes its expressiveness. Let’s say we have a utility class that gets the words and the word count from a string. Don’t worry too much about the implementation, just the general structure.

public static class StringUtilities
{
   private static readonly Regex wordsRegex = new Regex(@"\w+");

   public static IEnumerable<string> GetWords(string source)
   {
      return from word in wordsRegex.Matches(source).Cast<Match>()
             select word.Value;
   }

   public static int WordCount(string source)
   {
      return GetWords(source).Count();
   }
}

To use this in our code we would have to do something like this:

var sentence = "The quick brown fox jumps over the lazy dog";

// Display each of the words
foreach (var word in StringUtilities.GetWords(sentence))
{
   Console.WriteLine(word);
}

// Display the word count
Console.Write("Total Words: ");
Console.WriteLine(StringUtilities.WordCount(sentence));

Look at all that clutter. The truth in this context is that we are really performing an action on the sentence. Wouldn’t it be better if we could just call sentence.GetWords() or sentence.WordCount() instead? It would certainly be more readable. Extension methods make this all possible. Here’s our updated StringUtilities class that creates the extension methods:

public static class StringUtilities
{
   private static readonly Regex wordsRegex = new Regex(@"\w+");

   public static IEnumerable<string> GetWords(this string source)
   {
      return from word in wordsRegex.Matches(source).Cast<Match>()
             select word.Value;
   }

   public static int WordCount(this string source)
   {
      return GetWords(source).Count();
   }
}

We’ve added the this keyword before the type of the first parameter of each method. The rest of the code has been left untouched. So now we can use the extension methods like so:

var sentence = "The quick brown fox jumps over the lazy dog";

// Display each of the words
foreach (var word in sentence.GetWords())
{
   Console.WriteLine(word);
}

// Display the word count
Console.WriteLine("Total Words: {0}", sentence.WordCount());

Doesn’t that read better? We have been able to push the implementation details (the name of the static utility class) out of our code.
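
The same trick answers the question about interfaces from the start of this post. Here is a minimal sketch; INamed, NamedExtensions, Person and Robot are names I made up for illustration. The extension method is written once against the interface, and every class that implements the interface can call it.

```csharp
// Hypothetical interface used only for this illustration.
public interface INamed
{
    string Name { get; }
}

public static class NamedExtensions
{
    // Written once against the interface; available on every implementer.
    public static string Greeting(this INamed source)
    {
        return "Hello, " + source.Name;
    }
}

// Two unrelated implementers that never define Greeting themselves.
public class Person : INamed
{
    public string Name { get; set; }
}

public class Robot : INamed
{
    public string Name { get { return "RX-7"; } }
}
```

With these in place, new Person { Name = "Ada" }.Greeting() and new Robot().Greeting() both compile, even though neither class declares a Greeting method.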

How to enable an extension method

In order to use an extension method, the static class that defines it must be in the current namespace or in a namespace imported with a using directive. Once that’s done you can call extension methods just as you would any normal instance method.
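
As a rough sketch of that rule, assume the extension method lives in one namespace and the calling code in another. The namespace names TextTools and MyApp and the Report class are hypothetical; only the using directive matters here.

```csharp
using System.Text.RegularExpressions;
using TextTools; // remove this line and sentence.WordCount() no longer compiles

namespace TextTools
{
    // Hypothetical namespace that holds the extension method.
    public static class StringUtilities
    {
        private static readonly Regex wordsRegex = new Regex(@"\w+");

        public static int WordCount(this string source)
        {
            // MatchCollection exposes Count directly, so no LINQ is needed here.
            return wordsRegex.Matches(source).Count;
        }
    }
}

namespace MyApp
{
    public static class Report
    {
        public static int CountWords(string sentence)
        {
            // Found by the compiler because TextTools is imported above.
            return sentence.WordCount();
        }
    }
}
```

An extension method is still an ordinary static method, so TextTools.StringUtilities.WordCount(sentence) keeps working even without the using directive.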

What does this have to do with LINQ?

LINQ is all about extension methods. When you import the System.Linq namespace it comes with a whole bundle of extension methods. Most of them act on IEnumerable<T> and can be used to write your LINQ queries in method syntax. Let’s look at this query:

from item in items
where item.Price < 1
select item.Name

This query finds the items that are under one dollar and returns their names. We can write this query in method syntax like so:

items.Where(item => item.Price < 1).Select(item => item.Name)

It’s not quite as readable (although that is a matter of opinion), but it gives a good indication of what is going on (and further demonstrates why select is at the end). These methods also take advantage of lambda expressions (which I’ll discuss in a future post).
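
To try both forms side by side, here is a small self-contained sketch. The Item class and the QueryDemo wrapper are assumptions of mine; the post only tells us that an item has a Name and a Price.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of an item; only Name and Price appear in the queries.
public class Item
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class QueryDemo
{
    // Query syntax, as written in the post.
    public static List<string> CheapNamesQuerySyntax(IEnumerable<Item> items)
    {
        var query = from item in items
                    where item.Price < 1
                    select item.Name;
        return query.ToList();
    }

    // The same query written as calls to the Where and Select extension methods.
    public static List<string> CheapNamesMethodSyntax(IEnumerable<Item> items)
    {
        return items.Where(item => item.Price < 1)
                    .Select(item => item.Name)
                    .ToList();
    }
}
```

Both methods return the same names in the same order for any input, because the compiler translates the query syntax into the method calls.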

There are other useful extension functions that work with queries. Some of the ones you’ll use most often are:

Fortunately you aren’t limited to using these extension methods on LINQ queries. They are designed to work on any class that implements IEnumerable<T>. This means you can use them directly on a lot of the classes already in the .NET base class library.
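
For instance, arrays and List<T> both implement IEnumerable<T>, so the LINQ extension methods are available on them without writing a query at all. The sketch below uses Max, Count, Any, OrderByDescending and First purely as examples; the CollectionDemo class is a made-up wrapper, and these are not necessarily the methods you’ll reach for most often.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CollectionDemo
{
    public static string Summarize()
    {
        int[] numbers = { 5, 3, 8, 1 };                       // arrays implement IEnumerable<int>
        var names = new List<string> { "Ann", "Bob", "Cy" };  // so does List<string>

        int largest = numbers.Max();                          // largest element
        int evens = numbers.Count(n => n % 2 == 0);           // how many match a condition
        bool anyShort = names.Any(n => n.Length < 3);         // does any element match?
        string last = names.OrderByDescending(n => n).First(); // alphabetically last name

        return string.Format("{0} {1} {2} {3}", largest, evens, anyShort, last);
    }
}
```

Calling CollectionDemo.Summarize() returns "8 1 True Cy".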

What about old non-generic IEnumerable?

There are a lot of classes in the .NET framework that don’t implement IEnumerable<T> but instead implement the non-generic interface IEnumerable. A perfect example is MatchCollection, used by regular expressions. When we enumerate over a MatchCollection each element is typed as object, which we then need to cast to a Match. Until we do this cast we can’t access any of the properties of Match. Fortunately there are a couple of LINQ extension methods designed to help out when dealing with IEnumerable: Cast<T>(), which we used earlier in GetWords, and OfType<T>().

If you want to see OfType<T>() in action, copy and paste the following example into LINQPad. (You’ll need to select C# Statement(s) from the language drop down).

var items = new object[]{"a string", 22, Math.PI};

items.OfType<string>().Dump("OfType<string>");
items.OfType<int>().Dump("OfType<int>");
items.OfType<double>().Dump("OfType<double>");

LINQPad has its own extension method Dump() which is used to output results to the LINQPad window. You’ll see that each individual dump returns a strongly typed IEnumerable<T> object. In this example items actually implements IEnumerable<object>. Fortunately these methods don’t discriminate and happily work their magic on any IEnumerable<T> as well.
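
The difference between the two methods shows up on a mixed non-generic collection: OfType<T>() skips elements of other types, while Cast<T>() throws an InvalidCastException when it reaches one. Here is a minimal sketch outside LINQPad; NonGenericDemo and its method names are made up for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class NonGenericDemo
{
    // OfType<T>() keeps only the elements that really are strings.
    public static List<string> OnlyStrings(IEnumerable source)
    {
        return source.OfType<string>().ToList();
    }

    // Cast<T>() insists that every element is a string and throws otherwise.
    public static bool CastFails(IEnumerable source)
    {
        try
        {
            source.Cast<string>().ToList(); // ToList() forces the enumeration
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

Given new ArrayList { "a", 1, "b" }, OnlyStrings returns the two strings and CastFails returns true. Use Cast<T>() when you know every element has the same type (as with MatchCollection) and OfType<T>() when you want to filter.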

Still more to come

There is still plenty more that I will post about LINQ. In my next post I’ll look at deferred execution, what it means and how you can take advantage of it.


Monday, February 16th, 2009 at 7:44 am under LINQ.
  • http://billhogsett.com Bill Hogsett

    I am trying to use a lambda expression in a LINQ For Each, and your post is as close as I have found.

    This works:

    For Each g In searchResults _
                  .Cast(Of Match)() _
                  .GroupBy(Function(m) m.Value) _
                  .OrderBy(Function(m) m.Count).ThenBy(Function(m) m.Key)
       theOutput.Add(g.Key, g.Count)
    Next
    

    This does not work:

    dim myordering = Function(m) m.Count
    For Each g In searchResults _
                  .Cast(Of Match)() _
                  .GroupBy(Function(m) m.Value) _
                  .OrderBy(myordering).ThenBy(Function(m) m.Key)
       theOutput.Add(g.Key, g.Count)
    Next
    

    The error message says that Count is not an element of Grouping.

    Any suggestions for a fix?

    Thanks.

    Bill

    p.s. Nice article