Note: the Vietnamese version of this article is available at the link below.
https://duongnt.com/expression-tree-linq-vie
LINQ and the IQueryable<T> interface can be used to execute queries on a data source. We provide an Expression Tree to LINQ, and the provider we use transforms that Expression Tree into a suitable query for the data source. Normally, we use lambda expressions to build that Expression Tree. But in case we want more flexibility, we can create the Expression Tree manually.
You can download all sample code from the link below.
https://github.com/duongntbk/ExpressionTreeDemo
The test data source
The DTO classes
In production code, we would normally use a database as the data source. But in this article, an in-memory source is enough. Below are the DTO classes.
public class Person
{
public string Name { get; set; }
public int Age { get; set; }
public DateTime Dob { get; set; }
public override string ToString()
{
return $"Name: {Name}, Age: {Age}, Dob: {Dob}";
}
}
public class Document
{
public string Title { get; set; }
public DateTime IssuedBy { get; set; }
public override string ToString()
{
return $"Title: {Title}, IssuedBy: {IssuedBy}";
}
}
public class Recipe
{
public string Name { get; set; }
public IList<string> Ingredients { get; set; } = new List<string>();
public override string ToString()
{
return $"{Name}: {{ {string.Join(", ", Ingredients)} }}";
}
}
The test data
And we create a few data points for testing.
{ // People
{ "Name": "John Doe", Age: 42, Dob: "1980-01-01 00:00:00" },
{ "Name": "Jane Doe", Age: 41, Dob: "1981-01-01 00:00:00" },
{ "Name": "Baby Doe", Age: 12, Dob: "2010-01-01 00:00:00" }
},
{ // Documents
{ "Title": "Birth Certificate", IssuedBy: "1980-01-01 00:00:00" },
{ "Title": "College Degree", IssuedBy: "2003-08-01 00:00:00" },
{ "Title": "Marriage Certificate", IssuedBy: "2008-01-01 00:00:00" }
}
{ // Recipes
{ Fried Rice: { eggs, rice, oil, vegetables } },
{ Omelette: { eggs, butter, oil } },
{ Pho: { pho, chicken, spice } },
{ Sandwich: { bread, ham, vegetables } }
}
Create query with lambda expression
The difference between LINQ for IEnumerable<T> and for IQueryable<T>
At first glance, the LINQ predicate to query data from an IEnumerable<T> and an IQueryable<T> looks identical. But there is an important difference.
- For an IEnumerable<T>, the predicate is a delegate.
- For an IQueryable<T>, the predicate is an Expression Tree.
When we write a lambda expression where an IQueryable<T> method expects an Expression Tree, the C# compiler builds that Expression Tree for us.
And we can easily convert an IEnumerable<T> into an IQueryable<T>.
var peopleList = new List<Person> { /* the people listed above */ };
var people = peopleList.AsQueryable();
var documentsList = new List<Document> { /* the documents listed above */ };
var documents = documentsList.AsQueryable();
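To make the delegate versus Expression Tree distinction concrete, the sketch below assigns the same lambda to a Func<Person, bool> and to an Expression<Func<Person, bool>>. The trimmed-down Person class, the IsAdult names and the sample values are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class DelegateVsExpressionDemo
{
    // The lambda compiled to executable code: it can only be invoked.
    public static readonly Func<Person, bool> IsAdultDelegate = p => p.Age >= 18;

    // The same lambda captured as data: a provider can inspect and translate it.
    public static readonly Expression<Func<Person, bool>> IsAdultExpression = p => p.Age >= 18;

    public static void Main()
    {
        var peopleList = new List<Person>
        {
            new Person { Name = "John Doe", Age = 42 },
            new Person { Name = "Baby Doe", Age = 12 }
        };

        // Enumerable.Where accepts the delegate.
        var adults = peopleList.Where(IsAdultDelegate).Select(p => p.Name);
        // Queryable.Where accepts the Expression Tree.
        var adultsFromQueryable = peopleList.AsQueryable().Where(IsAdultExpression).Select(p => p.Name);

        Console.WriteLine(string.Join(", ", adults));              // John Doe
        Console.WriteLine(string.Join(", ", adultsFromQueryable)); // John Doe
        // The body of the Expression Tree is a node we can examine at runtime.
        Console.WriteLine(IsAdultExpression.Body.NodeType);        // GreaterThanOrEqual
    }
}
```

Both queries return the same items here because the in-memory provider simply compiles the Expression Tree. A database provider would instead walk the tree and translate it into a query.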
Expression Tree with lambda expression
This is how we use lambda expressions to retrieve all names and titles.
var names = people.Select(p => p.Name);
var titles = documents.Select(p => p.Title);
And this is how we find all people born after 1980/12/31 and all documents issued after 2000/01/01.
var filteredPeople = people.Where(p => p.Dob > new DateTime(1980, 12, 31));
var filteredDocuments = documents.Where(d => d.IssuedBy > new DateTime(2000, 1, 1));
As you can see, we have to hard-code the attribute name in the lambda expression, which makes it hard to write generic methods. Below are some cases where a generic method is useful.
- A method that receives an attribute name as an argument, then retrieves all values of that attribute.
- A method to filter a collection by creation time (Dob or IssuedBy) without hard-coding the attribute names.
- etc.
Manually create Expression Tree
A predicate for an IQueryable<T> has the type Expression<Func<TSource, TResult>>. In a broad sense, we have two types of LINQ predicates.
- Expression<Func<TSource, TResult>>: used in methods that retrieve data. For example: Select/SelectMany/Max/...
- Expression<Func<TSource, bool>>: used in methods that filter data. For example: First/Where/Single/...
Expression Tree to retrieve data
We will start from a simple case: retrieve values of an attribute from a collection. It is equivalent to the code below, but we don’t have to hard-code the attribute name.
var values = collection.Select(c => c.<AttributeName>);
// For the people collection: var names = people.Select(p => p.Name);
You can find the complete code here.
private static IQueryable<TColumn> GetField<TSource, TColumn>(IQueryable<TSource> collection, string columnName)
{
var collectionTypeExpr = Expression.Parameter(typeof(TSource)); // a ParameterExpression
var columnPropertyExpr = Expression.Property(collectionTypeExpr, columnName); // a MemberExpression
var predicate = Expression.Lambda<Func<TSource, TColumn>>(columnPropertyExpr, collectionTypeExpr);
return collection.Select(predicate);
}
We need to provide the collection as an IQueryable<TSource>, and the attribute name is simply a string. Notice that the type of the return value is also generic. We then create two expressions: a ParameterExpression that stands for one item of the collection, and a MemberExpression for the attribute we want to retrieve. After that, we use the Expression.Lambda method to create an Expression Tree. The predicate variable can then be used to retrieve attribute values from the collection.
As mentioned earlier, the type of our predicate is Expression<Func<TSource, TColumn>>.
Here is the result of our code.
var names = GetField<Person, string>(_people, nameof(Person.Name));
// John Doe
// Jane Doe
// Baby Doe
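Because the attribute name is now a plain string, a typo is no longer caught at compile time. The sketch below repeats the GetField technique and shows that building the tree throws an ArgumentException for an unknown attribute or a mismatched TColumn. The TryGetField wrapper and the FieldSelector class are hypothetical helpers for illustration, not part of the sample repository.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class FieldSelector
{
    // Builds item => item.<columnName> by hand, as in GetField above.
    public static IQueryable<TColumn> GetField<TSource, TColumn>(
        IQueryable<TSource> collection, string columnName)
    {
        var parameterExpr = Expression.Parameter(typeof(TSource), "item");
        // Throws ArgumentException when TSource has no such property.
        var propertyExpr = Expression.Property(parameterExpr, columnName);
        // Throws ArgumentException when the property type does not match TColumn.
        var selector = Expression.Lambda<Func<TSource, TColumn>>(propertyExpr, parameterExpr);
        return collection.Select(selector);
    }

    // Hypothetical wrapper: reports failure instead of throwing.
    public static bool TryGetField<TSource, TColumn>(
        IQueryable<TSource> collection, string columnName, out IQueryable<TColumn> result)
    {
        try
        {
            result = GetField<TSource, TColumn>(collection, columnName);
            return true;
        }
        catch (ArgumentException)
        {
            result = null;
            return false;
        }
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "John Doe", Age = 42 },
            new Person { Name = "Jane Doe", Age = 41 }
        }.AsQueryable();

        Console.WriteLine(TryGetField<Person, string>(people, "Name", out var names)); // True
        Console.WriteLine(string.Join(", ", names));                                   // John Doe, Jane Doe
        Console.WriteLine(TryGetField<Person, string>(people, "Nmae", out _));         // False (typo)
        Console.WriteLine(TryGetField<Person, string>(people, "Age", out _));          // False (int is not string)
    }
}
```

Whether to swallow the exception like this or let it propagate depends on where the attribute name comes from. For names supplied by end users, a boolean result may be more convenient.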
Expression Tree to filter data
For the next example, we will create a generic method to filter collection items by creation date. For Person, we use Dob. And for Document, we use IssuedBy. It is equivalent to the code below.
var values = collection.Where(c => c.<AttributeName> > lowerBound && c.<AttributeName> < upperBound);
// For the people collection: var filterData = people.Where(p => p.Dob > lowerBound && p.Dob < upperBound);
You can find the complete code here. Below are some interesting lines.
var olderThanExpr = Expression.GreaterThan(timePropertyExpr, Expression.Constant(lower));
This BinaryExpression is the lower bound for our creation time. We use the GreaterThan method.
var newerThanExpr = Expression.LessThan(timePropertyExpr, Expression.Constant(upper));
This BinaryExpression is the upper bound for our creation time. We use the LessThan method.
var timeRangeExpr = Expression.And(newerThanExpr, olderThanExpr);
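With the two expressions above, we can finally create our time range, which is also a BinaryExpression. It is simply a combination of the lower bound and the upper bound. Note that Expression.And corresponds to the non-short-circuiting & operator, while Expression.AndAlso is the counterpart of &&.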
var predicate = Expression.Lambda<Func<TSource, bool>>(timeRangeExpr, parameterExpr);
As mentioned in a previous section, the type of our predicate is Expression<Func<TSource, bool>>.
Here is the result of our code.
var peopleInRange = GetInRange(
_people, nameof(Person.Dob), new DateTime(1980, 12, 31), new DateTime(1995, 1, 1));
// Name: Jane Doe, Age: 41, Dob: 1981/01/01 0:00:00
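Since only a few lines are quoted above, here is one way the whole method might look. This is a sketch rather than the code from the repository: the method signature, the RangeFilter class and the parameter name timeColumnName are assumptions, and it combines the bounds with Expression.AndAlso (the && operator) where the snippet above uses Expression.And.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Person
{
    public string Name { get; set; }
    public DateTime Dob { get; set; }
}

public class Document
{
    public string Title { get; set; }
    public DateTime IssuedBy { get; set; }
}

public static class RangeFilter
{
    // Builds item => item.<timeColumnName> > lower && item.<timeColumnName> < upper.
    public static IQueryable<TSource> GetInRange<TSource>(
        IQueryable<TSource> collection, string timeColumnName, DateTime lower, DateTime upper)
    {
        var parameterExpr = Expression.Parameter(typeof(TSource), "item");
        var timePropertyExpr = Expression.Property(parameterExpr, timeColumnName);
        var afterLowerExpr = Expression.GreaterThan(timePropertyExpr, Expression.Constant(lower));
        var beforeUpperExpr = Expression.LessThan(timePropertyExpr, Expression.Constant(upper));
        // AndAlso short-circuits like &&; the article's snippet uses And instead.
        var timeRangeExpr = Expression.AndAlso(afterLowerExpr, beforeUpperExpr);
        var predicate = Expression.Lambda<Func<TSource, bool>>(timeRangeExpr, parameterExpr);
        return collection.Where(predicate);
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "John Doe", Dob = new DateTime(1980, 1, 1) },
            new Person { Name = "Jane Doe", Dob = new DateTime(1981, 1, 1) },
            new Person { Name = "Baby Doe", Dob = new DateTime(2010, 1, 1) }
        }.AsQueryable();
        var documents = new List<Document>
        {
            new Document { Title = "Birth Certificate", IssuedBy = new DateTime(1980, 1, 1) },
            new Document { Title = "College Degree", IssuedBy = new DateTime(2003, 8, 1) }
        }.AsQueryable();

        // The same method serves both types; only the attribute name changes.
        var peopleInRange = GetInRange(people, nameof(Person.Dob), new DateTime(1980, 12, 31), new DateTime(1995, 1, 1));
        var documentsInRange = GetInRange(documents, nameof(Document.IssuedBy), new DateTime(2000, 1, 1), new DateTime(2010, 1, 1));

        Console.WriteLine(string.Join(", ", peopleInRange.Select(p => p.Name)));     // Jane Doe
        Console.WriteLine(string.Join(", ", documentsInRange.Select(d => d.Title))); // College Degree
    }
}
```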
How to use an instance method in an Expression Tree?
In the previous example, we only compared the attribute values with some constants. What if we want to call an instance method on those attributes as part of the filtering process? For example, we can find all people whose name starts with Jo with the following code.
var peopleStartsWithJo = people.Where(p => p.Name.StartsWith("Jo"));
Here is the complete code. Let’s go through the important lines.
var methodInfo = typeof(string).GetMethods()
.Single(m => m.Name == nameof(string.StartsWith) &&
m.GetParameters().Length == 1 &&
m.GetParameters().Single().ParameterType == typeof(string));
To execute a method in our Expression Tree, we need to retrieve its method info. Because the string.StartsWith method is overloaded with several signatures (four on .NET Core and later), we have to find the one that takes a single string parameter. We can cache this method info to improve performance.
var startsWithExpr = Expression.Call(columnProperty, methodInfo, Expression.Constant(prefix));
We use the Expression.Call method to create a MethodCallExpression. There are quite a few overloads for the Call method. The overload above is used to call instance methods.
The result of our code is as we expected.
var peopleStartWithJo = GetTextFieldStartsWith(_people, nameof(Person.Name), "Jo");
// Name: John Doe, Age: 42, Dob: 1980/01/01 0:00:00
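Putting the pieces together, the method might look roughly like the sketch below. It is not the repository's code: the signature and the TextFilter class are assumptions, the method info is cached in a static field as suggested above, and the overload is looked up with Type.GetMethod(string, Type[]) as a shorter alternative to filtering GetMethods().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public class Person
{
    public string Name { get; set; }
}

public static class TextFilter
{
    // Looked up once and reused: the string.StartsWith(string) overload.
    private static readonly MethodInfo StartsWithMethod =
        typeof(string).GetMethod(nameof(string.StartsWith), new[] { typeof(string) });

    // Builds item => item.<columnName>.StartsWith(prefix).
    public static IQueryable<TSource> GetTextFieldStartsWith<TSource>(
        IQueryable<TSource> collection, string columnName, string prefix)
    {
        var parameterExpr = Expression.Parameter(typeof(TSource), "item");
        var columnProperty = Expression.Property(parameterExpr, columnName);
        // Instance call: the first argument is the object the method is invoked on.
        var startsWithExpr = Expression.Call(columnProperty, StartsWithMethod, Expression.Constant(prefix));
        var predicate = Expression.Lambda<Func<TSource, bool>>(startsWithExpr, parameterExpr);
        return collection.Where(predicate);
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "John Doe" },
            new Person { Name = "Jane Doe" },
            new Person { Name = "Baby Doe" }
        }.AsQueryable();

        var result = GetTextFieldStartsWith(people, nameof(Person.Name), "Jo");
        Console.WriteLine(string.Join(", ", result.Select(p => p.Name))); // John Doe
    }
}
```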
How to use a generic static method in an Expression Tree?
In the final example, we will call a generic static method as part of our Expression Tree. It is equivalent to the following code to filter recipes by ingredients.
var recipesWithEggs = recipes.Where(r => r.Ingredients.Contains("eggs"));
Please find the complete code here. We already know how to call a method in an Expression Tree. But there are some differences when handling generic static methods instead of instance methods.
var methodInfo = typeof(Enumerable).GetMethods()
.Single(m => m.Name == nameof(Enumerable.Contains) && m.GetParameters().Length == 2);
var containsMethod = methodInfo.MakeGenericMethod(typeof(TField));
The code to find the correct overload is similar, but we also need to provide a concrete type for the generic method. This method info can also be cached if needed.
var containsExpr = Expression.Call(containsMethod, columnProperty, Expression.Constant(value));
Although we still use the Expression.Call method, this time we use an overload that supports static methods.
This code can correctly filter recipes that contain a specific ingredient.
var recipeWithEggs = GetWithFieldContainValue(_recipes, nameof(Recipe.Ingredients), "eggs");
// Fried Rice: { eggs, rice, oil, vegetables }
// Omelette: { eggs, butter, oil }
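For completeness, here is a sketch of how the whole method could be assembled from the lines above. The signature and the CollectionFilter class are assumptions. The open generic definition of Enumerable.Contains is cached, and only the MakeGenericMethod step depends on TField.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public class Recipe
{
    public string Name { get; set; }
    public IList<string> Ingredients { get; set; } = new List<string>();
}

public static class CollectionFilter
{
    // Open generic definition of Enumerable.Contains<T>(IEnumerable<T>, T).
    private static readonly MethodInfo ContainsDefinition = typeof(Enumerable).GetMethods()
        .Single(m => m.Name == nameof(Enumerable.Contains) && m.GetParameters().Length == 2);

    // Builds item => Enumerable.Contains(item.<columnName>, value).
    public static IQueryable<TSource> GetWithFieldContainValue<TSource, TField>(
        IQueryable<TSource> collection, string columnName, TField value)
    {
        var parameterExpr = Expression.Parameter(typeof(TSource), "item");
        var columnProperty = Expression.Property(parameterExpr, columnName);
        // Close the generic method over the element type we are searching for.
        var containsMethod = ContainsDefinition.MakeGenericMethod(typeof(TField));
        // Static call: no instance, the collection is passed as the first argument.
        var containsExpr = Expression.Call(
            containsMethod, columnProperty, Expression.Constant(value, typeof(TField)));
        var predicate = Expression.Lambda<Func<TSource, bool>>(containsExpr, parameterExpr);
        return collection.Where(predicate);
    }

    public static void Main()
    {
        var recipes = new List<Recipe>
        {
            new Recipe { Name = "Fried Rice", Ingredients = { "eggs", "rice", "oil", "vegetables" } },
            new Recipe { Name = "Omelette", Ingredients = { "eggs", "butter", "oil" } },
            new Recipe { Name = "Pho", Ingredients = { "pho", "chicken", "spice" } }
        }.AsQueryable();

        var withEggs = GetWithFieldContainValue(recipes, nameof(Recipe.Ingredients), "eggs");
        Console.WriteLine(string.Join(", ", withEggs.Select(r => r.Name))); // Fried Rice, Omelette
    }
}
```

Passing typeof(TField) to Expression.Constant keeps the constant's static type aligned with the method's second parameter, even if value happens to be null.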
Conclusion
Sometimes when working with LINQ, I notice lambda expressions with similar logic, with only a few differences in attribute names. In that case, I have two choices.
- Use the Dynamic LINQ library.
- Create one Expression Tree to handle all data sources and eliminate code duplication.