Mixing LINQ Providers and LINQ to Objects

By Bill Wagner

September 2010

Some code I was writing for Azure gave me an excellent opportunity to discuss the differences between
IEnumerable<T> and IQueryable<T> and explain many of the terms you'll see in literature about LINQ
providers. Understanding these issues will help you leverage both types to the fullest.

Almost every .NET developer should be familiar with IEnumerable<T>: It's the interface that
enumerates all the elements in a sequence. LINQ to Objects is built on IEnumerable<T>. Any sequence
that implements IEnumerable<T> can act as a LINQ data source.

When a data source provides its own native query language, developers may create a LINQ provider to
leverage that language. A LINQ provider translates LINQ queries into specific API calls against a data
source. It enables you to write LINQ queries in C#, and have those queries executed based on the native
language of the data source. For example, the LINQ to SQL provider translates LINQ queries into T-SQL.
Working with IQueryable<T> feels just like working with IEnumerable<T>: You enumerate a sequence
of elements, applying filters, projections, or other transformations to the elements in the sequence. It
looks the same, but this behavior is implemented by using different libraries.
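
To make the "different libraries" point concrete, here is a minimal sketch. It uses an in-memory array wrapped with AsQueryable() as a stand-in for a real LINQ provider; the class and method names (BindingDemo, FilterInMemory, FilterQueryable) are made up for illustration. The two queries are written identically, but the compiler binds one to Enumerable.Where (which takes a delegate) and the other to Queryable.Where (which takes an expression tree).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BindingDemo
{
    // The source is typed as IEnumerable<int>, so Where binds to
    // Enumerable.Where and receives a compiled Func<int, bool>.
    public static IEnumerable<int> FilterInMemory(int[] numbers)
    {
        IEnumerable<int> source = numbers;
        return source.Where(n => n > 2);
    }

    // The source is typed as IQueryable<int>, so the same lambda binds to
    // Queryable.Where and is captured as an Expression<Func<int, bool>>.
    public static IQueryable<int> FilterQueryable(int[] numbers)
    {
        IQueryable<int> source = numbers.AsQueryable();
        return source.Where(n => n > 2);
    }

    public static void Main()
    {
        int[] numbers = { 5, 1, 4, 2, 3 };
        Console.WriteLine(string.Join(", ", FilterInMemory(numbers)));  // prints "5, 4, 3"
        Console.WriteLine(string.Join(", ", FilterQueryable(numbers))); // prints "5, 4, 3"
    }
}
```

The results match here only because the stand-in provider runs the expression tree in memory. A real provider would translate that tree into its own query language instead.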

I was writing queries against Azure Table Storage, using the StorageClient assembly. The
TableServiceContext class gives you access to a LINQ provider for Azure Table Storage through
the CreateQuery() method. One of the queries in my Azure application sorted entities by date. The
query looked something like this:

var records = this.CreateQuery<EntityType>()
    .Where(item => item.PartitionKey == region)
    .AsEnumerable()
    .OrderByDescending(item => item.DateUpdated);

Notice the AsEnumerable() call in the middle of the query above. Why do you suppose that was added
to this query? Equally important, why do you suppose it was added at that location in the query? The
answer lies in the design of Azure Table Storage and in the TableServiceContext implementation of
IQueryable<T>.

There is a major difference between IEnumerable<T> and IQueryable<T>. An implementation built on
IQueryable<T> produces expression trees that capture the query's intent, so that it can be translated
into the native language of a specific data source. In my sample above, the TableServiceContext will
examine the query and generate the necessary REST POST or GET requests against the table source.
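
As a small illustration of what "expression tree" means in practice, the sketch below inspects the Expression property of an IQueryable. It again uses an in-memory array wrapped with AsQueryable() rather than a real provider, and the names ExpressionTreeDemo and OutermostOperator are invented for this example.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Returns the name of the outermost query operator recorded in the tree,
    // or "(none)" when the tree is just the data source itself.
    public static string OutermostOperator(IQueryable query)
    {
        return query.Expression is MethodCallExpression call
            ? call.Method.Name
            : "(none)";
    }

    public static void Main()
    {
        IQueryable<int> source = new[] { 3, 1, 2 }.AsQueryable();
        IQueryable<int> query = source.Where(n => n > 1).OrderByDescending(n => n);

        // Nothing has been executed yet; the query is only a description.
        Console.WriteLine(OutermostOperator(source)); // prints "(none)"
        Console.WriteLine(OutermostOperator(query));  // prints "OrderByDescending"
    }
}
```

A provider walks this same structure to decide what to send to its data source.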

That's the key to this discussion: An implementation of IQueryable<T> translates an expression tree to a
native representation of the same logic for its particular source type. (In most cases, this
implementation should provide better performance than the LINQ to Objects implementation for that
data source.) In the case of Azure Table Storage, the implementation understands queries that include
Where clauses using the partition key or the row key. Many methods that are part of the query syntax
pattern are not implemented in the TableServiceContext class. Any unsupported method in your
query (like the OrderByDescending in my example above) will throw an exception.
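
The following is a rough sketch of how a limited provider might reject operators it cannot translate. It is not the TableServiceContext implementation: WhereOnlyValidator and ValidatorDemo are hypothetical names, and the rule "only Where is allowed" is a simplification chosen for the example.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical validator: accepts Queryable.Where and rejects every other
// Queryable operator it finds while walking the expression tree.
public sealed class WhereOnlyValidator : ExpressionVisitor
{
    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        if (node.Method.DeclaringType == typeof(Queryable) && node.Method.Name != "Where")
        {
            throw new NotSupportedException(
                "Operator '" + node.Method.Name + "' is not supported by this provider.");
        }
        // Keep walking so nested operators (the source of this call) are checked too.
        return base.VisitMethodCall(node);
    }
}

public static class ValidatorDemo
{
    public static bool IsSupported(IQueryable query)
    {
        try
        {
            new WhereOnlyValidator().Visit(query.Expression);
            return true;
        }
        catch (NotSupportedException)
        {
            return false;
        }
    }

    public static void Main()
    {
        IQueryable<string> source = new[] { "east", "west", "east" }.AsQueryable();

        Console.WriteLine(IsSupported(source.Where(r => r == "east"))); // prints "True"
        Console.WriteLine(IsSupported(
            source.Where(r => r == "east").OrderByDescending(r => r))); // prints "False"
    }
}
```

A real provider typically surfaces the NotSupportedException to the caller when the query is executed, which is the failure described above.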

This is common behavior with most IQueryable implementations. Some handle more cases than others,
but with all IQueryable implementations, you'll find that not all C# concepts can be translated
accurately to other languages. Even though LINQ to SQL and Entity Framework are powerful
implementations of IQueryable, neither can translate arbitrary C# methods into SQL statements.
Queries containing calls to methods you've written will usually generate exceptions.

In every case when you use a LINQ provider, you'll need to learn that provider's limitations. Whatever
provider you use, you'll need to understand the general techniques for mixing IEnumerable with
IQueryable.

Going back to the example I showed at the beginning of this article, now it's clear why the
AsEnumerable() call was added: Without it, the query always throws an exception, and never returns
any results:

// This query always throws an exception
// when run against Azure Table Storage.
var records = this.CreateQuery<EntityType>()
    .Where(item => item.PartitionKey == region)
    .OrderByDescending(item => item.DateUpdated);

Now, let's drill into some of the details on why to use AsEnumerable(), and why AsEnumerable() goes
where I put it in the query. In general, the answer is easy: you want to leverage the LINQ provider to
minimize network traffic, and thereby improve performance. Of course, the standard disclaimer now
applies: like all performance recommendations, these are general guidelines, and you should measure
your specific scenario.

IQueryable implementations produce expression trees that enable the LINQ provider to examine the
expression and implement it in the most efficient manner for the data source. In the case of Azure
Table Storage, the storage API supports Where clauses that test for equality on the PartitionKey and the
RowKey. (Entity Framework and other LINQ providers support richer queries.) The Azure Table Storage
API supports this limited set of queries because that is the only filtering supported by the underlying
REST API. This IQueryable implementation forces you to structure your queries so that you properly
leverage the underlying system.

The general rule is to place any query clauses that are implemented by the LINQ provider first. Follow
those clauses with the AsEnumerable() call, and then finish with those clauses that are not implemented
by the LINQ provider. By contrast, this variation of my first example would increase network traffic
between the VMs in the Azure fabric. Instead of transmitting only those entities that match a given
region, you'll transfer every entity, and filter the entities on the client side.

// Inefficient: every entity is returned
// from storage, and the entire table
// is examined using LINQ to Objects.
var records = this.CreateQuery<EntityType>()
    .AsEnumerable()
    .Where(item => item.PartitionKey == region)
    .OrderByDescending(item => item.DateUpdated);

Azure Table Storage is optimized for very large tables, so the Azure team expects that your Azure
tables will have very large numbers of records. Transferring an entire table between machines in the
Azure cloud would be a very expensive operation. You want to avoid transferring all those records when
you really only want a small subset. You would have similar problems if you sprinkled your Entity
Framework queries with AsEnumerable() calls: you would retrieve far too much data from your
database, and you'd filter that data using LINQ to Objects instead of leveraging the database engine.
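
The cost difference can be simulated without any cloud service. In the sketch below, FakeTable is a hypothetical stand-in for a remote table that counts every entity it hands to the client; Entity, Query(), All(), and PlacementDemo are likewise invented for illustration and are not part of any Azure API. Query() plays the role of a filter the provider can run at the source, and All() plays the role of AsEnumerable() placed first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Entity
{
    public string PartitionKey { get; set; } = "";
    public DateTime DateUpdated { get; set; }
}

// Hypothetical remote table that counts the entities sent to the client.
public sealed class FakeTable
{
    private readonly List<Entity> rows;
    public int Transferred { get; private set; }

    public FakeTable(IEnumerable<Entity> rows) { this.rows = rows.ToList(); }

    // Simulates a filter evaluated at the source: only matches are transferred.
    public IEnumerable<Entity> Query(string partitionKey)
    {
        foreach (var row in rows)
        {
            if (row.PartitionKey == partitionKey) { Transferred++; yield return row; }
        }
    }

    // Simulates pulling the whole table to the client.
    public IEnumerable<Entity> All()
    {
        foreach (var row in rows) { Transferred++; yield return row; }
    }
}

public static class PlacementDemo
{
    // 1,000 entities spread evenly over ten partitions, "p0" through "p9".
    public static FakeTable BuildTable()
    {
        return new FakeTable(Enumerable.Range(0, 1000).Select(i => new Entity
        {
            PartitionKey = "p" + (i % 10),
            DateUpdated = new DateTime(2010, 1, 1).AddMinutes(i)
        }));
    }

    public static void Main()
    {
        var filteredAtSource = BuildTable();
        var first = filteredAtSource.Query("p3")
            .OrderByDescending(e => e.DateUpdated)
            .ToList();
        // prints "100 results, 100 transferred"
        Console.WriteLine(first.Count + " results, " + filteredAtSource.Transferred + " transferred");

        var filteredOnClient = BuildTable();
        var second = filteredOnClient.All()
            .Where(e => e.PartitionKey == "p3")
            .OrderByDescending(e => e.DateUpdated)
            .ToList();
        // prints "100 results, 1000 transferred"
        Console.WriteLine(second.Count + " results, " + filteredOnClient.Transferred + " transferred");
    }
}
```

Both queries return the same 100 entities. Only the amount of data moved to get them differs.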

That answers both questions I posed when I showed this sample. The final step is to explain the
different ways to force your query logic to use LINQ to Objects instead of a LINQ provider. I prefer using
AsEnumerable(). However, you can also force LINQ to Objects by using ToList() or ToArray(). The
following two statements will produce the same results, although they produce them in very different
ways:

// evaluate query and use list storage:
var recordsFromList = this.CreateQuery<EntityType>()
    .Where(item => item.PartitionKey == region)
    .ToList()
    .OrderByDescending(item => item.DateUpdated);

// evaluate query and use array storage:
var recordsFromArray = this.CreateQuery<EntityType>()
    .Where(item => item.PartitionKey == region)
    .ToArray()
    .OrderByDescending(item => item.DateUpdated);

I prefer the call to AsEnumerable() because its intent is clearer. AsEnumerable() casts an
IQueryable<T> to an IEnumerable<T>, while ToList() and ToArray() evaluate the query and store the
results in a temporary storage location. Depending on the memory storage needs, AsEnumerable() is
likely faster than ToList() or ToArray(). However, that's a micro-optimization that needs to be measured.
Using AsEnumerable() makes it clear that you want to execute the remaining query operations using
LINQ to Objects.
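
The difference between casting and evaluating can be observed with a counter. This is a minimal sketch using an in-memory iterator as the data source; DeferredDemo, Source(), and Pulled are illustrative names, not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many elements have been pulled from the source so far.
    public static int Pulled;

    public static IEnumerable<int> Source()
    {
        foreach (var n in new[] { 3, 1, 2 })
        {
            Pulled++;
            yield return n;
        }
    }

    public static void Main()
    {
        IQueryable<int> query = Source().AsQueryable();

        // AsEnumerable() only changes the static type; nothing is read yet.
        IEnumerable<int> lazy = query.AsEnumerable().OrderByDescending(n => n);
        Console.WriteLine(Pulled); // prints "0"

        // ToList() runs the query immediately and copies the results.
        IEnumerable<int> eager = query.ToList().OrderByDescending(n => n);
        Console.WriteLine(Pulled); // prints "3"

        // The AsEnumerable() version reads the source only when enumerated.
        Console.WriteLine(string.Join(", ", lazy)); // prints "3, 2, 1"
        Console.WriteLine(Pulled);                  // prints "6"
    }
}
```

Whether the deferred version is actually faster in a given application still has to be measured, as noted above.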

There's one last API to mention, and one last reason to prefer AsEnumerable(): There is also an
AsQueryable() method in LINQ. AsQueryable() converts an IEnumerable<T> to an IQueryable<T>. If the
source already implements IQueryable, you will work with that IQueryable implementation. However, if
the sequence is not an IQueryable implementation, AsQueryable() returns an IQueryable
implementation that works against in-memory sequences. This is exactly what happens when the
source is a List<T> or an array.
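
The two behaviors of AsQueryable() can be checked with reference comparisons. In this sketch, an array wrapped with AsQueryable() stands in for a real provider, and the names RoundTripDemo, RoundTripKeepsProvider, and ListLosesProvider are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RoundTripDemo
{
    // AsEnumerable() followed by AsQueryable() hands back the original IQueryable.
    public static bool RoundTripKeepsProvider(IQueryable<int> provider)
    {
        IEnumerable<int> asEnumerable = provider.AsEnumerable();
        IQueryable<int> roundTrip = asEnumerable.AsQueryable();
        return ReferenceEquals(provider, roundTrip);
    }

    // ToList() copies the results; AsQueryable() on the copy creates a new
    // in-memory IQueryable that has no connection to the original provider.
    public static bool ListLosesProvider(IQueryable<int> provider)
    {
        List<int> copy = provider.ToList();
        IQueryable<int> fromList = copy.AsQueryable();
        return !ReferenceEquals(provider, fromList) && fromList is EnumerableQuery<int>;
    }

    public static void Main()
    {
        IQueryable<int> provider = new[] { 1, 2, 3 }.AsQueryable();

        Console.WriteLine(RoundTripKeepsProvider(provider)); // prints "True"
        Console.WriteLine(ListLosesProvider(provider));      // prints "True"
    }
}
```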

In other words, AsQueryable() reverses the effect of AsEnumerable() so you can work with a specific
LINQ provider again. If you used ToList() or ToArray() to store the result sequence in a List<T> or an
array, you can only work with that sequence using LINQ to Objects.

Remember the general guidance for working with LINQ providers. Whenever your provider implements
IQueryable, you want to leverage the IQueryable implementation for everything you can. However, not
every LINQ provider will implement the entire LINQ query pattern. Many times (as in the case of Azure
Table Storage) this is by design to highlight the design goals of the storage API. That means you should
put the query clauses implemented by the LINQ provider first, follow those with a call to
AsEnumerable(), and finish the query with any other logic that must be implemented using LINQ to
Objects.

2014 Microsoft