How to use your computed properties in predicates and projections.

If you're using an ORM, it's not uncommon to have computed properties in addition to the ones that are stored directly in the database. Unfortunatly, these computed properties don't work with Entity Framework out of the box. In this post I'm going to discuss the problem and suggest various ways of mitigating it.

For example, consider the following entities that I will use as an example throughout this post.

public class Customer
{
  public string FirstName { get; set; }
  public string LastName { get; set; }
  public virtual ICollection<Holding> Holdings { get; private set; }

  [NotMapped]
  public decimal AccountValue
  {
    get { return Holdings.Sum(x => x.Value); }
  }
 
  [NotMapped]
  public string FullName
  {
    get { return FirstName + " " + LastName; }
  }
}

public class Holding
{
  public virtual Stock Stock { get; set; }
  public int Quantity { get; set; }

  [NotMapped]
  public decimal Value
  {
    get { return Quantity * Stock.Price; }
  }
}

public class Stock
{
  public string Symbol { get; set; }
  public decimal Price { get; set; }
}

As you can see, there are several computed properties including one that aggregates data from a collection and another that uses data from a reference. These properties work fine once you've obtained the entities and want to work with them. The problems start when you want to use them within a query or as part of a projection. For example, the following query will throw an exception:

var customers = ctx.Customers.Where(c => c.AccountValue > 10);

Likewise, this projection will also fail:

var result = ctx.Customers
  .Select(c => new
  {
    FullName = c.FullName,
    AccountValue = c.AccountValue
  });

In the first case we're trying to use a computed property within a predicate, but Entity Framework doesn't know how to convert AccountValue to SQL code. In the projection, Entity Framework doesn't know how to convert FullName to SQL code either. In both cases, it will fail to formulate a SQL query and throw an exception because it just doesn't know what to do with the properties that aren't directly stored in the database. Let's take a look at some of the ways we can get around this problem.

Don't Use Computed Properties

Part of the problem here is that Entity Framework and LINQ to Entities just doesn't know how to look at the actual code for each property. It tries to map the property name to the database and when it doesn't find a match it gives up. However, Entity Framework does actually know how to translate a large number of primitive operations and LINQ methods into SQL code. They just can't be hidden behind properties. The queries above could be rewritten as:

var customers = ctx.Customers.Where(c => c.Holdings.Sum(h => h.Quantity * h.Stock.Price) > 10);
var result = ctx.Customers
  .Select(c => new
  {
    FullName = c.FirstName + " " + c.LastName,
    AccountValue = c.Holdings.Sum(h => h.Quantity * h.Stock.Price)
  });

Notice how the code for each computed property is used directly in the query. Also notice that in the case ofAccountValue we had to not only expand it's code, but also the code of Holding.Value. There can be no computed properties anywhere in the query code.

By now, it should be obvious why this doesn't work for anything but the simplest cases. It can become very complex when you have nested computed properties. It's also not DRY because you end up repeating the computation code in every query. And when the underlying logic changes you have to go find every query and change that too.

Materialize The Entities

This is the most common suggestion for getting around this problem. Entity Framework only has an issue if you try and execute the computed properties on the database server. It works just fine when you run the computations against fully materialized entities. To make this work you just have to materialize the entities (I.e., fetch data from the database server)before using any computed properties. With this approach the queries would look like:

var customers = ctx.Customers.ToList().Where(c => c.AccountValue > 10);
var result = ctx.Customers
  .ToList()
  .Select(c => new
  {
    FullName = c.FullName,
    AccountValue = c.AccountValue
  });

See that extra call to .ToList() in there? That forces Entity Framework to go to the database, execute whatever portion of the query it's seen thus far, and return a List<T> that can continue to be operated on. By the time the above queries get to the .Where() and .Select() calls, we're operating on a List<Customer> collection instead of on the database.

Let's think about what this means though. If we're fetching data from the database before executing the .Where()condition, then we're pulling down all the rows! Likewise, by going to the database before the .Select() projection, we're fetching all the columns and filtering after we've gotten them. This works okay for small tables and simple data. However, with any meaningful database it falls apart and your performance will quickly suffer.

Encapsulate

This approach gets a little more complicated. In this case we'll put the computed logic inside an extension method that we can use as if it were an actual LINQ operation. It will take the unmaterialized query and apply whatever additional computation we need. It's essentially like the first example of just not using computed properties except it puts the logic somewhere where it can be reused from other queries. Given the following extension method:

public static IQueryable<Customer> WhereAccountValueIsHigh(this IQueryable<Customer> customers)
{
  return customers.Where(c => c.Holdings.Sum(h => h.Quantity * h.Stock.Price) > 10);
}

We can now write:

var customers = ctx.Customers.WhereAccountValueIsHigh();

That extension can be reused wherever we want customers with high account values. Likewise, for the projection we could create the following extension:

public class CustomerData
{
  string FullName { get; set; }
  decimal AccountValue { get; set; }
}

public static IQueryable<CustomerData> SelectCustomerData(this IQueryable<Customer> customers)
{
  return customers.Select(c => new CustomerData
  {
    FullName = c.FirstName + " " + c.LastName,
    AccountValue = c.Holdings.Sum(h => h.Quantity * h.Stock.Price)
  });
}

Notice that I also need an actual projection type for the extension in this case, otherwise it wouldn't know what to use for the generic parameter of the return IQueryable. This extension can be called like:

var customers = ctx.Customers.SelectCustomerData();

Both extensions let you move the computation somewhere that is at least a little bit reusable. For example, I could now also write the following to chain them together:

var customers = ctx.Customers.WhereAccountValueIsHigh().SelectCustomerData();

We still have a number of problems with this approach. The first is that we still have to duplicate the computation within the property and then again within the extension. This may be fine if you have a computation that you really only need to use in queries (in which case you can remove it as a property), but if not you're still stuck maintaining it in two places. You also still have to expand any nested called to computed properties. For example, if the .WhereAccountValueIsHigh()extension had used the Holding.Value property directly, you still would have gotten an exception. This approach does not work around the limitations of the LINQ to Entity query provider not recognizing computed properties, it just puts the actual computations somewhere a little more reusable. Finally, this approach is not very composable. If you had another view that needed an additional property (like the customers initials) you would either need to create an additional extension for the new view or add the initials calculation to .SelectCustomerData() and then do an additional projection to remove it from the first view that doesn't need it.

Expression Trees and LINQ to Entities

All of the approaches so far have attempted to work around the problem by ensuring that the computed properties never reach LINQ to entities and therefore never get translated to SQL. Before looking at a different type of approach, let's back up a moment and consider how LINQ to entities actually does translate all those LINQ methods and primitive operations to SQL. Under the hood, LINQ to entities reads Expression Trees, which is a fundamental concept of the entire LINQ system. Essentially, expression trees represent code in a high-level abstraction. LINQ to entities knows how to understand an expression tree and convert most (but not all) of the corresponding commands into SQL statements. In addition, all of the LINQ extensions like .Where() and .Select() essentially build up the expression tree that LINQ to entities eventually processes.

The question then becomes, how can we pass our own expression tree to LINQ to entities? And how can be construct such an expression tree that represents the logic in our computed properties? Thankfully, there are a number of folks who've talked this very question.

Write Your Own Expressions

One approach would be to just write your own `Expression`. For example:

public readonly Expression<Func<Customer, bool>> AccountValueIsHigh = c => c.Holdings.Sum(h => h.Quantity * h.Stock.Price) > 10;

This can then be used like:

var customers = ctx.Customers.Where(AccountValueIsHigh);

One problem with this approach is that we can't use an Expression directly in a .Select() call because the compiler has no idea how to convert the Expression into something that can be assigned.

LINQ Expression Projection

To address the problem I just mentioned and provide a way to use your own Expression object directly in .Select()projection calls, Asher Barak has created a project called LINQ Expression Projection. It was originally explained in this CodeProject article and allows you to essentially wrap your expression in something that can be used as an assignment. For example, given the following Expression:

public readonly Expression<Func<Customer, decimal>> AccountValueExpression = c => c.Holdings.Sum(h => h.Quantity * h.Stock.Price);

You would write:

var customers = ctx.Customers
  .AsExpressionProjectable()
  .Select(c => new
  {
    AccountValue = AccountValueExpression.Project<decimal>()
  });

Notice the extra call to .AsExpressionProjectable(), which prepares the query for projection and sets up a scan for all instances of the .Project<T>() extension method. Once found, the library swaps the appropriate Expression in for the assignment at runtime.

Linq.Translations

Damien Guard, David Fowler, and Colin Meek has done something similar in a much more generic way. TheirLinq.Translations library was first described in a blog post on Damien's blog. The idea is that you set up an Expression-like variable that can be used within the computed property. Then when you write a query, you can just use the computed property without worrying about the Expression at all. The only extra thing you need is to make a final call to.WithTranslations() in the LINQ query to convert all the computed properties to their Expression equivalents. In usage, it would look something like this:

public class Customer
{
  public string FirstName { get; set; }
  public string LastName { get; set; }

  private static readonly CompiledExpression<Customer, string> fullNameExpression
     = DefaultTranslationOf<Customer>.Property(e => e.FullName).Is(e => e.FirstName + " " + e.LastName);

  [NotMapped]
  public string FullName
  {
    get { return fullNameExpression.Evaluate(this); }
  }
}
var customers = ctx.Customers
  .Select(c => new
  {
    FullName = c.FullName
  })
  .WithTranslations();

While this is somewhat similar to the LINQ Expression Projection project, it trades complexity during definition for ease of use in queries.

DelegateDecompiler

Hopefully you're still with me because this last project blew my mind. I first learned about DelegateDecompiler from this blog post on Jimmy Bogard's blog. While DelegateDecompiler still attempts to pass an Expression to the query provider, it takes the burden of creating the Expression out of your hands. It does this by compiling any computed properties into IL then decompiling the IL back into an expression tree. This way it can transparently handle just about any computation you give it including nested computed properties, custom methods, etc. The one caveat is that even though it may seem like it, it's not magic and can't make Entity Framework and LINQ to Entities recognize methods, classes, etc. that they wouldn't have if you had used them directly.

Using DelegateDecompiler is stupidly simple. You just decorate your computed properties with ComputedAttribute and then call .Decompile() in your query. For example:

public class Customer
{
  public string FirstName { get; set; }
  public string LastName { get; set; }

  [NotMapped]
  [Computed]
  public string FullName
  {
    get { return FirstName + " " + LastName; }
  }
}
var customers = ctx.Customers
  .Select(c => new
  {
    FullName = c.FullName
  })
  .Decompile();

That's all there is to it. There's even a DelegateDecompiler.EntityFramework library that you can use if you want to support more advanced Entity Framework functionality such as async queries (though it's not necessary for simple Entity Framework usage). In fact, you can make it even easier to use by automatically processing NotMappedAttribute to indicate computed properties. To do this, just create a new Configuration class:

public class DelegateDecompilerConfiguration : DefaultConfiguration
{
    public override bool ShouldDecompile(MemberInfo memberInfo)
    {
        // Automatically decompile all NotMapped members
        return base.ShouldDecompile(memberInfo) || memberInfo.GetCustomAttributes(typeof(NotMappedAttribute), true).Length > 0;
    }
}

And then register it like this:

DelegateDecompiler.Configuration.Configure(new DelegateDecompilerConfiguration());

Conclusions

As you can see, there are a lot of different ways of working around the lack of support for computed properties in Entity Framework and LINQ to Entities. Which approach you choose depends largely on the factors of your project such as database complexity, query complexity, reuse requirements, etc. Hopefully one of these methods will help you tame all that computation and help release your data from your database.