Unit Testing with the Repository Pattern

justin.fisher posted on Friday, October 01, 2010

My development team has been struggling to test the logic in our business objects because of the massive amount of data that must be inserted into and removed from the database before the tests run. We constantly run into situations where the test setup cannot drop data because of key constraints, or where a change to the structure of the classes forces us to update all our test data insert and delete scripts. Generally speaking, this whole process has been very time consuming and prone to error. We are looking to the repository pattern to help alleviate some of this headache.

I have been reviewing some interesting blog posts from Peter Stromquist discussing the use of the repository pattern with CSLA. I have also reviewed Rocky's DPRepos example provided with the Core380 video series, where he creates repository classes that pass IDataReader instances between the repository and the business logic. I have been able to merge these two examples with some success.

I have had success implementing a Linq to SQL data access layer and a mock data access layer, but some on our development team have suggested that instead of Linq to SQL we should be using Entity Framework for object persistence and retrieval within the repository classes. Personally, I don't see much of a difference between these two technologies, aside from the fact that Entity Framework (EF) provides a mapping layer where I can map my database fields into my DTO object. This feature does not seem particularly useful, since I am going to map my database fields into my DTO and then, inside the DataPortal_xyz methods, map the DTO again into my business object properties.
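To illustrate the mock data access layer idea, here is a minimal sketch of an in-memory repository that could stand in for the database during tests. Every name in it (CustomerDto, ICustomerRepository, MockCustomerRepository, CustomerNameLoader) is hypothetical and not taken from the thread, and CustomerNameLoader is only a stand-in for the mapping that would normally happen inside a DataPortal_xyz method.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical DTO; the real DTO classes are not shown in this thread.
public class CustomerDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical contract shared by the real DAL and the mock DAL.
public interface ICustomerRepository
{
    CustomerDto Get(int id);
    void Insert(CustomerDto dto);
    void Delete(int id);
}

// In-memory implementation for tests: no database, so no insert and
// delete scripts and no key constraints to work around.
public class MockCustomerRepository : ICustomerRepository
{
    private readonly Dictionary<int, CustomerDto> _rows = new Dictionary<int, CustomerDto>();

    public CustomerDto Get(int id)
    {
        CustomerDto dto;
        return _rows.TryGetValue(id, out dto) ? dto : null;
    }

    public void Insert(CustomerDto dto)
    {
        if (dto == null) throw new ArgumentNullException("dto");
        _rows[dto.Id] = dto;
    }

    public void Delete(int id)
    {
        _rows.Remove(id);
    }
}

// Stand-in for business logic that reads a DTO and maps it to its own state.
public class CustomerNameLoader
{
    private readonly ICustomerRepository _repository;

    public CustomerNameLoader(ICustomerRepository repository)
    {
        _repository = repository;
    }

    // Returns the customer name, or null when the row does not exist.
    public string LoadName(int id)
    {
        CustomerDto dto = _repository.Get(id);
        return dto == null ? null : dto.Name;
    }
}
```

A test would construct MockCustomerRepository, insert only the rows it needs, and pass the repository to the code under test, so each test starts from a clean, isolated data set.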

I am hesitant to use EF right now because I don't know of a context manager available inside CSLA that supports EF. I'm also not familiar with EF and would have to deal with the learning curve of getting up to speed with the technology. Does anyone out there have any experience with using the repository pattern along with EF? Does anyone have any general feedback about using EF with CSLA? Are there benefits of EF over Linq to SQL that I am overlooking?

tmg4340 replied on Friday, October 01, 2010

CSLA does have an ObjectContextManager, which supports context managers for EF ObjectContext objects. So you should be good there.

One of the more obvious "benefits" of EF over L2S is that Microsoft has stopped developing L2S. EF has been decided as "the way to go" in the future. I say "benefit" because you'll have to decide whether that's important to you. From my perspective, Microsoft intends to develop EF as their "official ORM", so you can probably assume over time it will become more robust. They're already building in features to make it easier to test using EF, and they are taking the first steps towards the NHibernate "build the model and let NHibernate create the database" concept. Again, you have to decide whether those are benefits to you.

I don't consider the learning curve from L2S to EF to be particularly large either. Many of the same concepts apply, and while there certainly are differences, I don't think you'd have a hard time catching up.

The mapping features available in EF could be useful, since it saves you a transfer step - i.e. one less object to move your data through. Given the "massive amounts of data" you seem to be contending with, eliminating a mapping step you have to write and maintain could be beneficial. I've also heard the SQL that EF generates is better than what L2S does, so you could see some efficiency gains there.

I actually haven't used EF with a CSLA project yet, but I use EF quite a bit at my job, and I don't see how it could work any worse than L2S and CSLA (which seems to work pretty well).

HTH

- Scott

edore replied on Friday, October 01, 2010

Hi!

I agree with Scott about the learning curve and the overall benefits of using EF over L2S. Regarding the Repository pattern implementation, my team implemented it very successfully and we are particularly happy with this decision. The repository to instantiate for each BO is configured in a Unity config section, so we can even have multiple implementations, say a base one plus one per customer. It also enables us to use multiple DAL technologies depending on the specific needs in terms of performance and functionality. Actually, that is what we do most frequently: we use EF (all repositories and contexts in a specific assembly) for the main CRUD stuff, but when we need performance we use SQL Client (the repository and context are also in a specific assembly). Some integrations (basically mapping) with a legacy system through web services are implemented in another assembly.
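As a rough illustration of swapping DAL implementations behind one contract, the sketch below uses a small hand-rolled registry. It is not Unity's API and not our actual configuration; RepositoryRegistry, IOrderRepository, EfOrderRepository and SqlClientOrderRepository are all hypothetical names invented for this example, and a real container would read the mapping from its config section instead of from code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical repository contract used by a business object.
public interface IOrderRepository
{
    string Describe();
}

// Hypothetical implementation that would use EF for the main CRUD work.
public class EfOrderRepository : IOrderRepository
{
    public string Describe() { return "EF"; }
}

// Hypothetical implementation that would use SqlClient where performance matters.
public class SqlClientOrderRepository : IOrderRepository
{
    public string Describe() { return "SqlClient"; }
}

// Hand-rolled stand-in for a dependency injection container (NOT Unity's API).
public static class RepositoryRegistry
{
    private static readonly Dictionary<Type, Func<object>> Factories =
        new Dictionary<Type, Func<object>>();

    // Registers, or replaces, the factory used for a repository contract.
    public static void Register<TRepository>(Func<TRepository> factory)
        where TRepository : class
    {
        if (factory == null) throw new ArgumentNullException("factory");
        Factories[typeof(TRepository)] = () => factory();
    }

    // Creates a repository through whichever factory is currently registered.
    public static TRepository Resolve<TRepository>()
        where TRepository : class
    {
        Func<object> factory;
        if (!Factories.TryGetValue(typeof(TRepository), out factory))
        {
            throw new InvalidOperationException(
                "No repository registered for " + typeof(TRepository).Name);
        }
        return (TRepository)factory();
    }
}
```

The business object only ever asks for IOrderRepository, so changing the registration from EfOrderRepository to SqlClientOrderRepository changes the DAL without touching the business logic.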

All in all, we can switch the DAL implementation very easily.

Hope this helps!

justin.fisher replied on Friday, October 01, 2010

How do you typically structure your repository objects? Do you normally create a repository object per business object or do you use one repository object per table in your database?

Typically we will have a few business objects utilizing the same table. For instance, we use an info object to navigate between instances of a detail object. Would you recommend using a single DAL for the customer table, or would you have one DAL for CustomerInfo and another for CustomerDetail? It would seem that a single DAL would end up returning a lot of data that the business object never uses. If it returned the whole customer row to CustomerInfo, which only needs a name and an identifier, CustomerInfo would ignore the address, phone number, and all the other data stored in each customer row. The DAL is not smart enough to determine that the calling code does not need the full customer row. Do you have any examples of the approach you are using, or any resources you would recommend for review?

edore replied on Monday, October 04, 2010

Hi Justin,

We do have one Repository and one Context object per business object. Yes, this leads to a lot of objects, but if you think about it, all basic CRUD operations are almost identical; all that differs are the column names and the table name. While building the Contexts and Repositories, we ended up creating generic abstract classes that expose the standard operations and some subclasses that are responsible for implementing the repetitive algorithms. The concrete subclasses only contain method overrides that expose the table-specific details and do the mapping between EF and a DTO. So the context and repository are pretty small concrete classes. Moreover, the RO (read-only) entities are often the result of some aggregation or of more complex queries on a view or on multiple tables. This confirms that it is not a good idea to mix up the RO and editable root contexts and repositories.

Here are our base classes for the RO entities' repository and context:

    // Base repository for read-only (RO) entities.
    public abstract class SelectableEntityRepositoryBase<TDto, TEntityId> : IRepository<TDto, TEntityId>
        where TDto : DtoBase<TEntityId>, new()
    {
        // Creates an empty DTO for the caller to populate.
        public TDto CreateDto()
        {
            return new TDto();
        }

        /// <summary>
        /// Exposes the context.
        /// </summary>
        /// <returns></returns>
        public abstract SelectableEntityContextBase<TDto, TEntityId> CreateContext();
    }

    // Base context for read-only (RO) entities: query operations only.
    public abstract class SelectableEntityContextBase<TDto, TEntityId>
        where TDto : DtoBase<TEntityId>
    {
        public abstract IList<TDto> GetList();

        public abstract TDto Get(TEntityId primaryKey);
    }

Here are the base classes for an editable root's repository and context:

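    // Base repository for editable roots. CreateDto() is inherited from
    // SelectableEntityRepositoryBase, so it is not redeclared here.
    public abstract class EntityRepositoryBase<TDto, TEntityId> : SelectableEntityRepositoryBase<TDto, TEntityId>, IRepository<TDto, TEntityId>
        where TDto : DtoBase<TEntityId>, new()
    {
        // Named CreateEditableContext (a name chosen for this listing) because an
        // abstract CreateContext() with a different return type would hide the
        // inherited abstract member, which the compiler rejects (CS0533).
        /// <summary>
        /// Exposes the editable context.
        /// </summary>
        /// <returns></returns>
        public abstract EntityContextBase<TDto, TEntityId> CreateEditableContext();

        // Satisfies the inherited read-only contract with the same context.
        public override SelectableEntityContextBase<TDto, TEntityId> CreateContext()
        {
            return CreateEditableContext();
        }
    }

    // Base context for editable roots: adds write operations to the read-only base.
    public abstract class EntityContextBase<TDto, TEntityId> : SelectableEntityContextBase<TDto, TEntityId>
        where TDto : DtoBase<TEntityId>
    {
        public abstract void Insert(TDto dto);
        public abstract void Update(TDto dto);

        public abstract void Delete(TEntityId primaryKey);
    }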

So these classes are pretty obvious. The most interesting one is a subclass of EntityContextBase called ExtendedEntityContextBase. In fact, this is the class responsible for implementing our repetitive CRUD algorithms. Since this may depend on many factors or tastes, I won't provide it here.
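To give an idea of what a concrete context can look like, here is a small self-contained sketch of an in-memory subclass. It is not ExtendedEntityContextBase and not our production code. The two context base classes are repeated from above so the sketch compiles on its own, the Id property on DtoBase is an assumption (DtoBase is not shown in this post), and InMemoryEntityContext and ProductDto are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape: DtoBase is not shown in the post, so the Id property is a guess.
public abstract class DtoBase<TEntityId>
{
    public TEntityId Id { get; set; }
}

// Repeated from the post so that this sketch is self-contained.
public abstract class SelectableEntityContextBase<TDto, TEntityId>
    where TDto : DtoBase<TEntityId>
{
    public abstract IList<TDto> GetList();
    public abstract TDto Get(TEntityId primaryKey);
}

public abstract class EntityContextBase<TDto, TEntityId> : SelectableEntityContextBase<TDto, TEntityId>
    where TDto : DtoBase<TEntityId>
{
    public abstract void Insert(TDto dto);
    public abstract void Update(TDto dto);
    public abstract void Delete(TEntityId primaryKey);
}

// Hypothetical context that keeps DTOs in a dictionary instead of a database.
public class InMemoryEntityContext<TDto, TEntityId> : EntityContextBase<TDto, TEntityId>
    where TDto : DtoBase<TEntityId>
{
    private readonly Dictionary<TEntityId, TDto> _rows = new Dictionary<TEntityId, TDto>();

    public override IList<TDto> GetList()
    {
        return _rows.Values.ToList();
    }

    // Returns null when no row has the given key.
    public override TDto Get(TEntityId primaryKey)
    {
        TDto dto;
        return _rows.TryGetValue(primaryKey, out dto) ? dto : null;
    }

    public override void Insert(TDto dto)
    {
        if (_rows.ContainsKey(dto.Id))
            throw new InvalidOperationException("Duplicate key: " + dto.Id);
        _rows.Add(dto.Id, dto);
    }

    public override void Update(TDto dto)
    {
        if (!_rows.ContainsKey(dto.Id))
            throw new KeyNotFoundException("No row with key: " + dto.Id);
        _rows[dto.Id] = dto;
    }

    public override void Delete(TEntityId primaryKey)
    {
        _rows.Remove(primaryKey);
    }
}

// Hypothetical DTO used only to exercise the context.
public class ProductDto : DtoBase<int>
{
    public string Name { get; set; }
}
```

An EF-backed context would have the same five overrides, with the dictionary calls replaced by queries and by the mapping between the EF entity and the DTO.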

As you can see, the Contexts always return TDto objects, which are then mapped to a BO. The TDto objects are just POCOs created from the data read from the DB. Since we use EF, we must map the EF entity to a TDto before returning it to the caller. One may think this is a big overhead, but frankly, with parallelism it is very fast.
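The post does not say how the parallel mapping is done, so the following is only a sketch of one possible approach, using PLINQ (AsParallel) over entities that have already been loaded into memory. CustomerEntity, CustomerSummaryDto and CustomerMapper are hypothetical names, and the entity here is a plain class standing in for an EF-generated one.

```csharp
using System.Collections.Generic;
using System.Linq;

// Plain class standing in for an EF-generated entity (hypothetical).
public class CustomerEntity
{
    public int CustomerId { get; set; }
    public string CustomerName { get; set; }
}

// Hypothetical POCO DTO returned to the business layer.
public class CustomerSummaryDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class CustomerMapper
{
    // Maps a single entity to its DTO.
    public static CustomerSummaryDto ToDto(CustomerEntity entity)
    {
        return new CustomerSummaryDto
        {
            Id = entity.CustomerId,
            Name = entity.CustomerName
        };
    }

    // Maps a materialized sequence of entities in parallel.
    // Assumption: the query has already been executed and its results are in
    // memory, because an EF context is not safe to use from several threads.
    // AsOrdered keeps the output in the same order as the input.
    public static IList<CustomerSummaryDto> ToDtos(IEnumerable<CustomerEntity> entities)
    {
        return entities
            .AsParallel()
            .AsOrdered()
            .Select(e => ToDto(e))
            .ToList();
    }
}
```

Whether the parallel version actually pays off depends on the number of rows and the cost of each mapping, so it is worth measuring against a plain Select before adopting it.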

I hope this helps and if you have further questions I'll be glad to help!

Etienne