I recently had to create a multi-threaded CSLA process that updates an object based on a long-running web service call. Depending on what the web service returns, we may or may not update the object (only if something has changed). I am using the C# ThreadPoolManager to make the application multi-threaded.
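For context, the fan-out pattern I'm describing looks roughly like this — a sketch only, with `RunAll` and the names standing in for my actual code; the callback would do the web service call and the conditional Save():

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class BatchRunner
{
    // Queues one work item per object on the ThreadPool and blocks until
    // every item has been processed. 'process' stands in for the
    // long-running web service call plus the conditional Save().
    public static void RunAll<T>(IList<T> items, Action<T> process)
    {
        int pending = items.Count;
        if (pending == 0) return;
        using (var allDone = new ManualResetEvent(false))
        {
            foreach (T item in items)
            {
                T captured = item; // don't capture the loop variable itself
                ThreadPool.QueueUserWorkItem(delegate
                {
                    try { process(captured); }
                    finally
                    {
                        // last worker out signals the waiting thread
                        if (Interlocked.Decrement(ref pending) == 0)
                            allDone.Set();
                    }
                });
            }
            allDone.WaitOne();
        }
    }
}
```

The Interlocked counter plus ManualResetEvent is the .NET 3.5 way to wait for a batch (CountdownEvent only arrived in .NET 4).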
I noticed that when there was a significant number of changes, the memory usage of my program jumped dramatically. I traced the increase to the "Object".Save() method. To rule out my own code, I changed the logic to look like this:
protected override void DataPortal_Update()
{
    // intentionally left empty to rule out my own code
}
This eliminated any possible user code that could cause a memory leak. My collection has about 4000 objects, and by the time the process is done the memory usage is approaching 3 GB. I explicitly make sure all of the objects are de-referenced and that GC.Collect() is called. Any other tips on what I can do to reduce my memory footprint?
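The explicit cleanup I'm doing looks roughly like this (a sketch, not my actual code — the helper name is made up):

```csharp
using System;
using System.Collections.Generic;

static class Cleanup
{
    // De-references every processed object and asks the GC to collect.
    // WaitForPendingFinalizers plus a second Collect reclaims objects
    // that were still reachable through finalizers on the first pass.
    public static void ReleaseAll<T>(List<T> processed) where T : class
    {
        for (int i = 0; i < processed.Count; i++)
            processed[i] = null;      // drop each reference
        processed.Clear();

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
    }
}
```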
CSLA Version: 3.8.2
.Net : 3.5 SP1
So after hours of trying to figure this out, I found the issue. For a local data portal, the CSLA framework creates a clone of the object before saving, to guard against data corruption:
if (!proxy.IsServerRemote && ApplicationContext.AutoCloneOnUpdate)
{
  // when using local data portal, automatically
  // clone original object before saving
  ICloneable cloneable = obj as ICloneable;
  if (cloneable != null)
    obj = cloneable.Clone();
}
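For anyone curious why that clone is so expensive: a serialization-based deep clone (which, as I understand it, is roughly how CSLA's ObjectCloner works — this is my sketch, not the framework's code) briefly holds the original graph, the serialized byte buffer, and the new copy all at once:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class Node
{
    public int Value;
    public Node Child;
}

static class ObjectClonerSketch
{
    // Deep-clones an object graph by serializing it to an in-memory
    // buffer and reading it back. During this call the process holds
    // the original graph, the byte buffer, AND the clone - which is
    // why cloning a large graph on every save spikes memory.
    public static T DeepClone<T>(T graph)
    {
        using (var buffer = new MemoryStream())
        {
            var formatter = new BinaryFormatter();
            formatter.Serialize(buffer, graph);      // copy #1: the bytes
            buffer.Position = 0;
            return (T)formatter.Deserialize(buffer); // copy #2: the clone
        }
    }
}
```

Multiply that by several concurrent ThreadPool workers each saving a complex graph and the spike adds up fast.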
This was causing my multi-threaded application to hog memory (the data structure is quite complex). So I added the following config setting to remedy the issue:
<add key="CslaAutoCloneOnUpdate" value="false"/>
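For anyone else hitting this, the key goes in the standard appSettings section of your app.config or web.config (shown here with the surrounding elements for context):

```xml
<configuration>
  <appSettings>
    <!-- disable CSLA's clone-before-save on the local data portal -->
    <add key="CslaAutoCloneOnUpdate" value="false"/>
  </appSettings>
</configuration>
```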
This should really be called out more clearly, because setting this option to true has a HUGE performance impact on applications. With it set to true, my memory was spiking to 2 GB and each save was taking 2-3 seconds. With it set to false, my memory hovers around 50 MB and each save takes 40-50 ms.
I really suspect that the memory for the cloned object is not being released correctly.
It's certainly true that cloning the object has a performance impact, but the alternative is worse IMHO.
If you turn this off and encounter an exception anywhere during the save process, your object is left in an indeterminate state. The database may have rolled back the update, but changes made to your objects during the save are retained. For example, if you were inserting a new object, child objects in the graph might be marked as old and assigned autonumbering values (e.g. IDENTITY with SQL Server) that don't exist in the database because the insert was rolled back. If you tried to repeat the save operation, these objects would be presumed clean and would not be saved. This is a disaster in the making.
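To make that concrete, here's a small self-contained simulation — the Order type and its Update method are made up for illustration, standing in for a business object whose DataPortal_Insert assigns an identity and marks the object old before the database call fails:

```csharp
using System;

class Order
{
    public bool IsNew = true;
    public int Id;

    // Simulates a save that assigns an IDENTITY value and marks the
    // object "old" before the database call fails and rolls back.
    public void Update()
    {
        Id = 42;
        IsNew = false;
        throw new InvalidOperationException("insert rolled back");
    }
}

class CloneDemo
{
    static void Main()
    {
        // Without cloning: the failed save leaves the caller's object
        // half-updated, so a retry would treat it as already inserted.
        var noClone = new Order();
        try { noClone.Update(); } catch (InvalidOperationException) { }
        Console.WriteLine("no clone: IsNew=" + noClone.IsNew + ", Id=" + noClone.Id);
        // prints: no clone: IsNew=False, Id=42

        // With cloning (what AutoCloneOnUpdate gives you): the save runs
        // against a copy, and the original is only replaced on success.
        var original = new Order();
        var clone = new Order { IsNew = original.IsNew, Id = original.Id };
        try { clone.Update(); original = clone; }
        catch (InvalidOperationException) { }
        Console.WriteLine("cloned: IsNew=" + original.IsNew + ", Id=" + original.Id);
        // prints: cloned: IsNew=True, Id=0
    }
}
```

In the cloned case the original graph is still new and unassigned, so retrying the save works; in the no-clone case the retry would silently skip the insert.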
Also, the AutoCloneOnUpdate setting only has an effect when using the local data portal. If you scale up to 3 or more tiers, the equivalent processing is done to serialize the object to the remote server. So turning this off has no impact except when using a local data portal -- and if you ever switch to a remote data portal, you'll incur the "clone" overhead anyway.
Copyright (c) Marimer LLC