Is there anyone out there who can provide some guidance on DataMapper best practices?

JonStonecash posted on Thursday, April 22, 2010

As a part of some "architectural studies" for an upcoming project, I have been looking at the DataMapper class. We are looking at using data transfer objects (DTOs) to communicate with the database and probably MVVM objects to communicate with WPF. One of the downsides with each of these is the need to transfer data from one object to another; e.g., from the DTO to the business object and back. On a recent project using DTOs, a significant source of "oh, crap" errors sprang from changes in the DTOs (which were code generated) that did not get into the business objects. Dropping a field/column will get you a compile error, thus getting the attention of the developer. The effects of adding a property and not including it in the various "copy lists" are typically much more subtle. That is, the typical pattern was to add the new thing to the database, regenerate the DTO code, and add the property to the business object, but forget to add the transfer to the fetch or to the update logic.

At first glance, the DataMapper would seem to be a way to handle this automagically. However, in working with this functionality, I have bumped my nose several times:

One of the prime drivers for the use of DTO and MVVM objects is the ability to unit test everything completely. I have worked out how to structure things so that I can do true unit tests. For my studies, I am writing unit tests using RhinoMocks. DataMapper does not like the proxy object generated by RhinoMocks. I can avoid most of the problems by suppressing exceptions, but that leaves me feeling somewhat dirty.
DataMapper does its work by invoking the properties on the object. This is appropriate for the DTO and MVVM objects but not so much for the business objects. Without DataMapper, I would be using LoadProperty and bypassing a lot of the events and checks. I can suppress the checks, but again I am left feeling a bit dirty.
DataMapper is not particularly friendly when there is no setter for the property. This makes it painful to use for the primary key values that typically do not have a setter. This makes it useless for read-only objects. Granted, the typical read-only "info" object only has a couple of properties, but that is not always the case.

It would seem to me that it would be useful to have a more specialized data mapper that was smarter in at least two ways:

It would only attempt copies for the set of properties present on both objects, minus any properties on an ignore list.
If the target object was derived from BusinessBase (and thus had a FieldManager), the mapper would use LoadProperty/SetProperty to store data into the object. This would eliminate the problem with read-only properties.
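
For illustration, here is a minimal sketch of the first idea only: copy the properties that exist on both objects, minus an ignore list. It uses plain reflection and no CSLA types. IntersectionMapper, CustomerDto, and CustomerTarget are hypothetical names invented for the example. The second idea (storing values through LoadProperty) is not shown, because it needs access to the protected members of the business object.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper, not part of CSLA: copies only the properties that
// exist on both objects, skipping anything named in the ignore list.
public static class IntersectionMapper
{
    // Returns the names of the properties that were actually copied.
    public static IList<string> Map(object source, object target, params string[] ignoreList)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (target == null) throw new ArgumentNullException("target");

        var ignored = new HashSet<string>(ignoreList ?? new string[0]);
        var copied = new List<string>();
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance;

        foreach (PropertyInfo sourceProp in source.GetType().GetProperties(flags))
        {
            if (ignored.Contains(sourceProp.Name)) continue;
            if (sourceProp.GetGetMethod() == null) continue;          // no public getter
            if (sourceProp.GetIndexParameters().Length > 0) continue; // skip indexers

            PropertyInfo targetProp = target.GetType().GetProperty(sourceProp.Name, flags);

            // Skip properties that are missing, lack a public setter,
            // or have an incompatible type on the target.
            if (targetProp == null || targetProp.GetSetMethod() == null) continue;
            if (!targetProp.PropertyType.IsAssignableFrom(sourceProp.PropertyType)) continue;

            targetProp.SetValue(target, sourceProp.GetValue(source, null), null);
            copied.Add(sourceProp.Name);
        }

        return copied;
    }
}

// Hypothetical types used only to exercise the sketch.
public class CustomerDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Notes { get; set; }
}

public class CustomerTarget
{
    private int _id;
    public int Id { get { return _id; } }   // no setter, like a typical primary key
    public string Name { get; set; }
    public string Region { get; set; }
}
```

With these types, IntersectionMapper.Map(dto, target) copies only Name: Id has no setter on the target, Notes does not exist on the target, and Region does not exist on the source. Returning the list of copied names is one possible way to let a unit test notice when a newly added property is being skipped.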

I am looking for feedback on a couple of issues. One, is my intended use of DataMapper reasonable? Two, has anyone done something along this line? If so, is anyone willing to share?

I certainly can come up with a more-or-less reasonable solution using code generation with partial classes and even partial methods. Even so, this seems like a more fragile solution. DataMapper (or a variation) seems like the better way to go.

Jon Stonecash

RockfordLhotka replied on Thursday, April 22, 2010

I emailed you directly regarding "MVVM objects", so I'll ignore that here.

In terms of using DTOs to persist data, I'm right there with you - great idea! But I think you are putting them at a peer level with the business objects, and I surely wouldn't do that. Instead, I'd treat the DTOs like the parameters passed to and from a database via stored procedure calls.

In other words, you are constructing a data access layer (DAL) that consumes and provides DTOs. That's good, because it decouples the caller of the DAL from anything beyond simple .NET types. And it is good because this sort of DAL is easily mocked. (Hey DAL! You have a huuuge nose!)

But it is still just a DAL - which means it should be invoked from DataPortal_XYZ methods or from an object factory (subclass ObjectFactory).

If you invoke the DAL from DP_XYZ you can use all the standard friendly data portal and child data portal constructs, which is nice. Let CSLA manage the metastate properties for you, etc. Better still, you don't need to break encapsulation because the code interacting with your business object property values is inside the business object. This is honestly my preferred approach - use a provider or DI model to select the concrete DAL and away you go. Easy to do, easy to mock/test, easy to maintain.

If you invoke the DAL from factory objects, you can use all the standard stuff in ObjectFactory, including LoadProperty/ReadProperty/MarkAsChild/MarkNew/MarkOld/etc to get/set property values and manage the metastate properties as required. This is more work, and requires that you have a deeper understanding of the metastate property behaviors so you get everything done correctly. But it can be very powerful if you create specialized subclasses of ObjectFactory that meet your particular technical requirements.

And if you use the object factory approach you can do your mocking at the object factory level (implement a custom factory loader and swap out the entire set of factory objects), or by using a provider or DI model to select the concrete DAL.

The object factory approach isn't as easy and requires a deeper understanding of CSLA behaviors, but it can be more powerful.

In neither case do you need DataMapper, because you aren't dealing with peer level objects. The DTOs are always subservient to the DP_XYZ or object factory code. When using DP_XYZ you preserve encapsulation - it is the purest OO solution. When using factory objects, the ObjectFactory base class helps you break encapsulation in ways that aren't too terrible, and at least are easy to do.

JonStonecash replied on Friday, April 23, 2010

Rocky,

My approach is right in line with your comments. I have looked at the Object Factory and it seems a bit too massive for my needs. I am planning on using a dependency injection / provider model for the DTO/DAL functionality. Within that approach, the very specific issue that I am concerned with is in the DataPortal_xxx methods. Let me narrow the discussion down to DataPortal_Insert where:

1. The logic creates an appropriate DTO object.
2. The logic copies the contents of the Business Object into the DTO.
3. The logic sends the DTO off to the DAL to be inserted in the persistent store.
4. The logic captures the side effects of the insert (generated primary key, if we do not use a GUID, timestamps, values diddled by defaults and triggers, etc).

What I am concerned with is the statements in the code that handle step #2. Without a data mapper, I will have one DTO.XXX = BO.XXX for each property (or LoadProperty(XXXProperty, DTO.XXX) for Fetch). If the business object contains a significant number of properties, that can add up to a lot of code that has to be generated and maintained. If the set of properties is stable, we can write the code once and get on with other things. If the set of properties is volatile, it is all too easy to forget to add a line in each of the DP_XXX methods (or the common methods called by these methods). Been there, done that, got the (now ratty) T-shirt to prove it.
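
To make the insert steps concrete, here is a rough sketch with the hand-written copy lines in place. It deliberately uses plain classes rather than CSLA base classes: in a real business object the body of Insert would live in DataPortal_Insert and use ReadProperty/LoadProperty. OrderDto, IOrderDal, Order, and FakeOrderDal are hypothetical names, and the fake DAL stands in for whatever a provider or DI model would supply.

```csharp
using System;

// Hypothetical DTO and DAL contract; none of these names come from CSLA.
public class OrderDto
{
    public int Id { get; set; }
    public string Customer { get; set; }
    public decimal Total { get; set; }
    public DateTime LastChanged { get; set; }
}

public interface IOrderDal
{
    // Inserts the row and writes generated values back into the DTO.
    void Insert(OrderDto dto);
}

// Plain class standing in for a business object.
public class Order
{
    public int Id { get; private set; }
    public string Customer { get; set; }
    public decimal Total { get; set; }
    public DateTime LastChanged { get; private set; }

    public void Insert(IOrderDal dal)
    {
        // Create the DTO.
        var dto = new OrderDto();

        // Copy business object -> DTO, one line per property.
        // A newly added property is silently lost if its line is forgotten here.
        dto.Customer = Customer;
        dto.Total = Total;

        // Hand the DTO to the DAL.
        dal.Insert(dto);

        // Capture the side effects of the insert.
        Id = dto.Id;
        LastChanged = dto.LastChanged;
    }
}

// In-memory fake DAL for tests; assigns sequential keys and a fixed timestamp.
public class FakeOrderDal : IOrderDal
{
    private int _nextId = 1;

    public void Insert(OrderDto dto)
    {
        dto.Id = _nextId++;
        dto.LastChanged = new DateTime(2010, 4, 23);
    }
}
```

The two assignment lines in the middle of Insert are the "copy list" at issue: nothing in the compiler complains if a third property is added to both classes but never copied.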

It just seemed to me that an intelligent DTO data mapper could reduce the friction of this particular aspect of using DTOs to transport data. The current data mapper seems to have echoes of an earlier version of CSLA. What I was looking for was something more attuned to the managed properties approach.

Let me toss in two additional points: First, I will be using project-specific base classes that derive from each of the CSLA business base classes. This is to simplify the handling of the dependency injection; I want the business objects to be tightly focused on business concerns rather than plumbing. I anticipate that I could add logic in these base classes to make the mapping process easier. Second, we will be using code generation to build the core of the DTO objects. There will be some hand-worked DTO classes and extensions to the generated code through partial classes and methods, but the vast majority of the classes will be simple enough to crank out like potato chips. If I go down the data mapper route, each of these classes could inherit from a hand-crafted DTOBase class that could provide some helper logic (either directly or through composition).

Any thoughts?

Jon Stonecash

RockfordLhotka replied on Friday, April 23, 2010

The DataMapper exists to address two specific scenarios - copying data into and out of a Web Form postback and copying data into and out of a DTO for a web service or WCF service. In both cases the operation is outside the object and is conceptually at a peer level - the interface object is a peer to the business object.

You can use a DataMap to get more advanced, including mapping to fields - but that assumes you have private fields - and it would require that you maintain the map, so I don't think it addresses your concern.

Certainly you could create something like DataMapper that does what you are talking about - though it might need to subclass ObjectFactory to get access to the LoadProperty/ReadProperty methods necessary to break encapsulation.

JonStonecash replied on Saturday, April 24, 2010

Just to tie this one off, here is what I ended up doing:

I wrote a static CoreDataMapper class with two public methods: one to copy the contents of a data transfer object (DTO) to a business object (BO), and a second to copy the contents of the BO to the DTO. The DTO objects implement (what turned out to be) a "marker" interface that provides a tiny bit of type safety. I will probably switch this to "object" at some time in the future; that way, I am not restricted on what I can use as a DTO. The BO objects implement an IMapBusiness interface. I want to use the data mapper with items eventually derived from BusinessBase and ReadOnlyBusinessBase, and the interface seems to be the most straightforward way of getting that to happen.

The IMapBusiness interface has three methods: First, GetRegisteredProperties returns a List<IPropertyInfo> for the object. Second, LoadNamedProperty invokes the protected LoadProperty method of the BO. Third, ReadNamedProperty invokes the protected ReadProperty method of the BO. Each of these methods is a one-line "pass through" method. As I noted earlier in this thread, we are extending the standard CSLA business base classes with our own base classes. [This is a technique that I strongly recommend; it allows for project level extensions of CSLA without making changes to CSLA itself.] I implemented this interface in our base classes for BusinessBase and ReadOnlyBusinessBase as an explicit interface; the explicit implementation keeps the methods out of Intellisense and allows me to be less guilty about breaking the encapsulation of the business objects.
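
The actual code was not posted, so what follows is only a guess at the shape of such an interface and its explicit implementation. To compile without CSLA, the sketch defines its own stand-in IPropertyInfo (mimicking Csla.Core.IPropertyInfo) and a stand-in base class whose dictionary plays the role of the field manager. The parameter lists of LoadNamedProperty and ReadNamedProperty, and the SimplePropertyInfo, ProjectBusinessBase, and Invoice types, are all assumptions.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for Csla.Core.IPropertyInfo so the sketch compiles without CSLA.
public interface IPropertyInfo
{
    string Name { get; }
    Type Type { get; }
}

public class SimplePropertyInfo : IPropertyInfo
{
    public SimplePropertyInfo(string name, Type type) { Name = name; Type = type; }
    public string Name { get; private set; }
    public Type Type { get; private set; }
}

// The three methods named in the post; the parameter shapes are guesses.
public interface IMapBusiness
{
    List<IPropertyInfo> GetRegisteredProperties();
    void LoadNamedProperty(IPropertyInfo property, object value);
    object ReadNamedProperty(IPropertyInfo property);
}

// Stand-in for a project-level base class that would derive from BusinessBase<T>.
// The protected methods mimic the protected LoadProperty/ReadProperty members.
public abstract class ProjectBusinessBase : IMapBusiness
{
    private readonly List<IPropertyInfo> _registered = new List<IPropertyInfo>();
    private readonly Dictionary<string, object> _fields = new Dictionary<string, object>();

    // Real CSLA registers properties statically; this is simplified.
    protected IPropertyInfo Register(string name, Type type)
    {
        var info = new SimplePropertyInfo(name, type);
        _registered.Add(info);
        return info;
    }

    protected void LoadProperty(IPropertyInfo property, object value)
    {
        _fields[property.Name] = value;
    }

    protected object ReadProperty(IPropertyInfo property)
    {
        object value;
        return _fields.TryGetValue(property.Name, out value) ? value : null;
    }

    // Explicit implementations: reachable only through an IMapBusiness reference.
    List<IPropertyInfo> IMapBusiness.GetRegisteredProperties()
    {
        return new List<IPropertyInfo>(_registered);
    }

    void IMapBusiness.LoadNamedProperty(IPropertyInfo property, object value)
    {
        LoadProperty(property, value);
    }

    object IMapBusiness.ReadNamedProperty(IPropertyInfo property)
    {
        return ReadProperty(property);
    }
}

// Hypothetical business object with a read-only key-style property.
public class Invoice : ProjectBusinessBase
{
    private readonly IPropertyInfo _numberProperty;

    public Invoice() { _numberProperty = Register("Number", typeof(int)); }

    public int Number { get { return (int)(ReadProperty(_numberProperty) ?? 0); } }
}
```

A mapper would cast the business object to IMapBusiness, walk the registered properties, and call LoadNamedProperty for each matching DTO value. Because the members are implemented explicitly, ordinary code holding an Invoice reference sees only Number, which has no setter.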

The intent of the CoreDataMapper is to reduce the amount of code that I have to write in the DataPortal_XXX and Child_XXX methods. The data mapper is not all that clever; at this point, there are several situations (including child collections) that it does not handle. I added a "DoNotAutoMap" attribute that can be applied to the properties of a BO to warn off the data mapper. In those cases, I must revert to writing the copy logic myself.
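
An opt-out attribute of this kind could look roughly like the sketch below. Only the name "DoNotAutoMap" comes from the post; the AttributeUsage settings, the AutoMapFilter helper, and the ProjectInfo class are assumptions made for the example, and the filter works on public properties through reflection rather than on CSLA's registered properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Marker attribute: a mapper should leave decorated properties alone.
[AttributeUsage(AttributeTargets.Property, AllowMultiple = false, Inherited = true)]
public sealed class DoNotAutoMapAttribute : Attribute
{
}

// Hypothetical helper that lists the properties a mapper may touch.
public static class AutoMapFilter
{
    public static List<string> GetMappableNames(Type type)
    {
        return type
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => !Attribute.IsDefined(p, typeof(DoNotAutoMapAttribute), true))
            .Select(p => p.Name)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }
}

// Hypothetical business object with a child collection that is mapped by hand.
public class ProjectInfo
{
    public int Id { get; set; }
    public string Name { get; set; }

    [DoNotAutoMap]
    public List<string> Tasks { get; set; }
}
```

For ProjectInfo, GetMappableNames returns Id and Name and leaves out Tasks, so the copy logic for the child collection stays in hand-written code.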

Of course, none of the above has been tested in a full project implementation. I fully expect that a fair amount of it will undergo changes as we make our way through the project.