Attaching detached POCO to EF DbContext - simple and fast

Introduction

Recently I was playing around with Entity Framework (EF) and evaluating it for some projects. I had a very hard time figuring out how to attach a detached object graph to a DbContext in a fast and reliable way. Here I am sharing a simple AttachByIdValue() method implementation that can do this for you. If you are not interested in a full explanation of the problem, jump straight to the method implementation and start attaching your objects.

The Problem

Let’s say we are using EF in a web app to implement a page for managing an Order and its Order Lines. So we have a parent-child relation (Order and Order Lines) and some reference data that is displayed but won’t be updated (Customer and Products).

We would typically query the above object graph from the database (DB) using EF and send it to the client (browser). When the client sends this object graph back to the server, we would like to persist it, and in order to do so we must first attach it to the DbContext.

The question is how to attach this detached graph without reloading it from the DB and applying the changes on top. Reloading from the DB is a performance hit and it is invasive. If I couldn’t do it without reloading I would discard EF, because this is a very basic task that I expect my ORM to solve easily. Luckily, I found a solution after a lot of digging.

Add() or Attach()

There are two methods for attaching detached objects, Add() and Attach(), and both receive the graph root object (Order). Add() attaches all objects in the graph and marks them as Added, while Attach() also attaches all objects in the graph but marks them as Unchanged.

Since our object graph will usually contain new, modified and unchanged data, our only option is to use one of these two methods to attach the full graph, and then traverse the graph and correct the state of each entry.

So which method should we choose?

Actually, Attach() is not an option because it can cause key conflicts due to duplicate key values among objects of the same type. If we have an Order with two new Order Lines, both Order Lines would probably have Id = 0. Attaching this Order with Attach() would fail because Attach() marks these two Order Lines as Unchanged, and EF insists that all existing entities of a type have unique primary keys. This is why we will use Add() for attaching.

Resolving new and modified data by Id value

The question is how we will know the state of each object in the graph (New/Modified/Unchanged/Deleted). Because detached objects are not tracked, the only reliable way would be to reload the object graph from the DB, and as I stated before, I don’t want to do that because of the performance cost.

We can use a simple convention: if Id > 0 the object is modified, and if Id = 0 the object is new. This convention is simple, but it has drawbacks:

  • We can’t detect unchanged objects, so unchanged data will also be written to the DB. On the bright side, these object graphs should not be that big, so this should not be a performance issue.
  • Deleting objects must be handled with custom logic, e.g. by having something like an Order.DeletedOrderLines collection.
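The custom delete logic from the second point could look roughly like the sketch below. It is only an illustration: the Order and OrderLine classes, the RemoveLine method and the exact shape of DeletedOrderLines are assumptions, not part of the original solution. The idea is that removed lines which already exist in the DB are remembered, so that the server can later mark each of them as deleted (for example with context.Entry(line).State = EntityState.Deleted) before calling SaveChanges(). The DeletedOrderLines collection would presumably have to be excluded from the EF mapping.

```csharp
using System.Collections.Generic;

public class OrderLine
{
    public long Id { get; set; }
}

public class Order
{
    public long Id { get; set; }

    public List<OrderLine> OrderLines { get; } = new List<OrderLine>();

    // Hypothetical helper collection: lines that were removed from the order
    // but still exist in the DB and therefore need an explicit delete.
    public List<OrderLine> DeletedOrderLines { get; } = new List<OrderLine>();

    // Removes a line from the order. Returns false if the line was not part of it.
    public bool RemoveLine(OrderLine line)
    {
        if (!OrderLines.Remove(line))
        {
            return false;
        }

        // A line with Id == 0 was never saved, so there is nothing to delete in the DB.
        if (line.Id != 0)
        {
            DeletedOrderLines.Add(line);
        }

        return true;
    }
}
```

Lines that were never saved (Id = 0) are simply dropped, which keeps the same Id convention used for detecting new objects.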

In order to read the Id value when attaching objects, all entities will implement the IEntity interface.

public interface IEntity
{ 
    long Id { get; }
}  
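To make the convention concrete, here is a minimal sketch of the Id check on its own, without any EF dependency. The ResolvedState enum, the IdConvention class and the SampleEntity class are hypothetical names used only for illustration. In the real method the result maps to EntityState.Added and EntityState.Modified.

```csharp
using System;

// Repeated from above so that the snippet compiles on its own.
public interface IEntity
{
    long Id { get; }
}

// Hypothetical stand-in for the two EF entity states the convention can produce.
public enum ResolvedState
{
    Added,
    Modified
}

public static class IdConvention
{
    // Id == 0 means the DB has not assigned a key yet, so the entity is new.
    // Any other Id means the entity already exists and is treated as modified.
    public static ResolvedState Resolve(IEntity entity)
    {
        if (entity == null)
        {
            throw new ArgumentNullException(nameof(entity));
        }

        return entity.Id == 0 ? ResolvedState.Added : ResolvedState.Modified;
    }
}

// Hypothetical entity used only to exercise the convention.
public class SampleEntity : IEntity
{
    public long Id { get; set; }
}
```

Note that the check is Id != 0 rather than Id > 0, which matches the final method and assumes that keys generated by the DB are never zero.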

Ignoring reference data

Each object graph can contain reference (read-only) data. In our case, when we are saving an Order, we might have Product and Customer objects in the graph, but we know that we don’t want to save them to the DB. We know that we should save only the Order and its Order Lines. EF, on the other hand, doesn’t know that. This is why AttachByIdValue accepts a set of child types that should be attached for saving along with the Order. All objects in the graph that are neither the root nor of a child type will be attached to the context but marked as Unchanged, so they won’t be saved to the DB.

To save only the Order (without Order Lines) we should call:

myContext.AttachByIdValue(order, null);
myContext.SaveChanges();  

To save the Order together with its Order Lines we should call:

myContext.AttachByIdValue(order, new HashSet<Type>() { typeof(OrderLine) });
myContext.SaveChanges(); 

Of course, the above HashSet<Type> can be cached in a static field to avoid building a new set every time an object is attached.

private static readonly HashSet<Type> OrderChildTypes = new HashSet<Type>() { typeof(OrderLine) };
...
myContext.AttachByIdValue(order, OrderChildTypes);
myContext.SaveChanges();   

The final solution

/// <summary>
/// Attaches entity graph to context using entity id to determine if entity is new or modified.
/// If Id is zero then entity is treated as new, otherwise it is treated as modified.
/// If we want to save more than just the root entity then child types must be supplied.
/// If an entity in the graph is neither the root nor of a child type it will be attached but not saved
/// (it will be treated as unchanged).
/// </summary>
/// <param name="context">The context.</param>
/// <param name="rootEntity">The root entity.</param>
/// <param name="childTypes">The child types that should be saved with root entity.</param>
public static void AttachByIdValue<TEntity>(this DbContext context, TEntity rootEntity, HashSet<Type> childTypes)
    where TEntity : class, IEntity
{
    // mark root entity as added
    // this action adds whole graph and marks each entity in it as added
    context.Set<TEntity>().Add(rootEntity);
    // in case root entity has id value mark it as modified (otherwise it stays added)
    if (rootEntity.Id != 0)
    {
        context.Entry(rootEntity).State = EntityState.Modified;
    }
    // traverse all entities in context (hopefully they are all part of graph we just attached)
    foreach (var entry in context.ChangeTracker.Entries<IEntity>())
    {
        // we are only interested in graph we have just attached
        // and we know they are all marked as Added 
        // and we will ignore root entity because it is already resolved correctly
        if (entry.State == EntityState.Added && entry.Entity != rootEntity)
        {
            // if no child types are defined for saving then just mark all entities as unchanged
            if (childTypes == null || childTypes.Count == 0)
            {
                entry.State = EntityState.Unchanged;
            }
            else
            {
                // ask ObjectContext for the real entity type because we might have a reference to a dynamic proxy
                // and we wouldn't want to handle the Type of the dynamic proxy
                // (ObjectContext is in System.Data.Entity.Core.Objects in EF6, System.Data.Objects in earlier versions)
                Type entityType = ObjectContext.GetObjectType(entry.Entity.GetType());
                // if type is not a child type then it should not be saved, so mark it as unchanged
                if (!childTypes.Contains(entityType))
                {
                    entry.State = EntityState.Unchanged;
                }
                else if (entry.Entity.Id != 0)
                {
                    // if entity should be saved with root entity
                    // then if it has an id mark it as modified
                    // else leave it marked as added
                    entry.State = EntityState.Modified;
                }
            }
        }
    }
}  

One gotcha

As I explained earlier, EF insists that all existing entities have unique primary keys, and this is why you cannot attach two unchanged objects of the same type with the same Id to a DbContext. This shouldn’t happen in general, but I have found one edge case where it might occur. Let’s say we are loading an Order, its Order Lines and Products, and we have two different Order Lines pointing to the same Product. Normally EF will give both Order Lines a reference to the same Product object, unless you are loading your data using AsNoTracking to get better performance, in which case each Order Line gets a reference to a separate Product object that is equal by all values. I didn’t find this behavior documented anywhere; I discovered it by accident while struggling to attach objects to the DbContext.
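One possible workaround, sketched below under assumptions, is to make all Order Lines that refer to the same Product Id share a single Product instance before the graph is attached. The Product and OrderLine classes and the ReferenceDeduplicator helper are hypothetical and are not part of AttachByIdValue. The sketch keeps the first instance seen for each Id and assumes that duplicates really carry the same values.

```csharp
using System.Collections.Generic;

public class Product
{
    public long Id { get; set; }
    public string Name { get; set; }
}

public class OrderLine
{
    public long Id { get; set; }
    public Product Product { get; set; }
}

public static class ReferenceDeduplicator
{
    // Makes every line that refers to the same Product Id share one Product instance.
    // The first instance encountered for an Id becomes the shared one.
    public static void ShareProductInstances(IEnumerable<OrderLine> lines)
    {
        var canonical = new Dictionary<long, Product>();

        foreach (var line in lines)
        {
            if (line.Product == null)
            {
                continue;
            }

            Product first;
            if (canonical.TryGetValue(line.Product.Id, out first))
            {
                // Duplicate by Id: point this line at the instance seen earlier.
                line.Product = first;
            }
            else
            {
                canonical.Add(line.Product.Id, line.Product);
            }
        }
    }
}
```

Calling ReferenceDeduplicator.ShareProductInstances(order.OrderLines) before attaching should leave only one Product object per Id in the graph, so marking the products as Unchanged no longer runs into the duplicate key problem.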

posted @ 2014-05-07 15:50  happyu0223