Transactions and SQL Connection Efficiencies

Old forum URL: forums.lhotka.net/forums/t/6302.aspx


DCottle posted on Friday, January 30, 2009

I have two questions, related but distinct:

1.  During the update/insert process for an object with child object graphs, a very normal operation is to get back the primary key for an inserted parent and pass it to the children so they can update their foreign keys with the parent's newly created primary key.  Using the new .NET TransactionScope, what would happen if the last of these children failed in its update?  I know the data in the database would all get rolled back, but what about the parent and all of the children that were updated before the one that failed... how would their primary / foreign key data get rolled back to the previous state?

2.  The company I am currently working for has an in-house Data Access Library (DAL) that they have optimized to handle many hundreds of transactions per second.  Using a DataSet with relations established, they construct all of the command objects (NOT executing them at this time), and once they are all constructed they open a connection and execute each of the command objects representing the data hierarchy in rapid sequence.  The only work that happens between one execute and the next is updating a primary / foreign key if necessary.  This ensures that their connection objects are open for the minimum amount of time possible, and thus reduces the chances of the locking or deadlocks you could find in a large-volume application.  Specifically, how does the performance of CSLA's built-in DataPortal_Update compare with this optimized scenario?  I know (or at least I think I know) that CSLA will open a connection for me with the first reference to the ConnectionManager, and as long as I have any objects in my object tree being updated with that connection it will hold onto the connection as long as needed.

How much more likely is a blocking or deadlock situation using this method employed by CSLA?  Do you have any recommendations we should follow to prevent those situations?

Thanks

ajj3085 replied on Friday, January 30, 2009

DCottle:
1.  During the update/insert process for an object with child object graphs, a very normal operation is to get back the primary key for an inserted parent and pass it to the children so they can update their foreign keys with the parent's newly created primary key.  Using the new .NET TransactionScope, what would happen if the last of these children failed in its update?  I know the data in the database would all get rolled back, but what about the parent and all of the children that were updated before the one that failed... how would their primary / foreign key data get rolled back to the previous state?


They wouldn't.  You'd get back an exception, and the original object is cloned before saving, so it shouldn't be left in an inconsistent state.  Note that some earlier versions of Csla didn't do the cloning if you're using the local DataPortal, so you need to do that step yourself to prevent corrupting the in-memory object.
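As a rough illustration of that clone-before-save step (this is only a sketch; CustomerEdit, GetCustomer and the Name property are made-up names for illustration, not part of CSLA):

CustomerEdit customer = CustomerEdit.GetCustomer(42);
customer.Name = "New name";
try
{
  // Work on a copy so a failed save can't leave the bound object with
  // half-assigned primary/foreign key values.
  CustomerEdit temp = customer.Clone();
  temp = temp.Save();
  customer = temp;  // only adopt the saved graph once the save succeeds
}
catch (Csla.DataPortalException)
{
  // The database rolled back and 'customer' is still the untouched
  // pre-save graph, so its key values were never modified.
}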

DCottle:
2.  The company I am currently working for has an in-house Data Access Library (DAL) that they have optimized to handle many hundreds of transactions per second.  Using a DataSet with relations established, they construct all of the command objects (NOT executing them at this time), and once they are all constructed they open a connection and execute each of the command objects representing the data hierarchy in rapid sequence.  The only work that happens between one execute and the next is updating a primary / foreign key if necessary.  This ensures that their connection objects are open for the minimum amount of time possible, and thus reduces the chances of the locking or deadlocks you could find in a large-volume application.  Specifically, how does the performance of CSLA's built-in DataPortal_Update compare with this optimized scenario?  I know (or at least I think I know) that CSLA will open a connection for me with the first reference to the ConnectionManager, and as long as I have any objects in my object tree being updated with that connection it will hold onto the connection as long as needed.

Csla leaves it up to you to do your data access, so it offers nothing in this regard.  Your DataPortal_Insert or DataPortal_Update code must be set up to keep the connection open as little as possible, so you could even continue to use your custom DAL in the same manner.  I'll also point out that this is how Linq to Sql (and likely Linq to Entities) works: they figure out all the statements, then execute them in a single batch.
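For example (just a sketch using plain ADO.NET; the table and column names, _name, _children and _connectionString are made up for illustration), a DataPortal_Update could follow the same prepare-then-execute approach as the custom DAL:

protected override void DataPortal_Update()
{
  // Build every command first, so the connection is open only while executing.
  var insertParent = new SqlCommand(
    "INSERT INTO Parent (Name) VALUES (@name); SELECT SCOPE_IDENTITY();");
  insertParent.Parameters.AddWithValue("@name", _name);

  var childCommands = new List<SqlCommand>();
  foreach (var child in _children)
  {
    var cmd = new SqlCommand(
      "INSERT INTO Child (ParentId, Description) VALUES (@parentId, @desc)");
    cmd.Parameters.AddWithValue("@desc", child.Description);
    cmd.Parameters.Add("@parentId", SqlDbType.Int);  // filled in after the parent insert
    childCommands.Add(cmd);
  }

  using (var cn = new SqlConnection(_connectionString))
  {
    cn.Open();
    insertParent.Connection = cn;
    int parentId = Convert.ToInt32(insertParent.ExecuteScalar());

    foreach (var cmd in childCommands)
    {
      cmd.Connection = cn;
      cmd.Parameters["@parentId"].Value = parentId;
      cmd.ExecuteNonQuery();
    }
  }  // the connection was open only for the execute loop
}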


HTH

DCottle replied on Friday, January 30, 2009

That is not exactly true as far as CSLA leaving it up to me to do my data access.  It does in fact provide a ConnectionManager that will take advantage of the .NET TransactionScope object, but by utilizing the ConnectionManager I am pretty much forcing my connection to stay open during the entire update process, am I not?

It is this time comparison that concerns me: normal CSLA Parent/Child/Grandchild updates versus the optimized DAL that batches all of the command objects and sends them in rapid succession.

ajj3085 replied on Friday, January 30, 2009

The connection manager has nothing directly to do with TransactionScope.  Its purpose is to allow the use of a single, shared connection, but you don't have to use it if your custom DAL will do this for you.

You can use the TransactionScope object directly, or via an attribute, but neither method requires the use of the ConnectionManager.  It's there as a convenience for those that don't use a custom DAL.
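For instance (a sketch only, assuming a CSLA 3.6-style business object; "MyDb" and the table/field names are made up), ConnectionManager and the Transactional attribute can be combined like this:

[Transactional(TransactionalTypes.TransactionScope)]
protected override void DataPortal_Update()
{
  // GetManager returns the same open connection to every nested call that
  // asks for "MyDb", so the parent and its children share one connection
  // inside the one TransactionScope started by the data portal.
  using (var ctx = ConnectionManager<SqlConnection>.GetManager("MyDb"))
  {
    using (var cmd = new SqlCommand(
      "UPDATE Parent SET Name = @name WHERE Id = @id", ctx.Connection))
    {
      cmd.Parameters.AddWithValue("@name", _name);
      cmd.Parameters.AddWithValue("@id", _id);
      cmd.ExecuteNonQuery();
    }
    FieldManager.UpdateChildren(this);  // children use GetManager("MyDb") too
  }
}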

DCottle replied on Friday, January 30, 2009

ajj3085:
The connection manager has nothing directly to do with TransactionScope.  Its purpose is to allow the use of a single, shared connection, but you don't have to use it if your custom DAL will do this for you.

You can use the TransactionScope object directly, or via an attribute, but neither method requires the use of the ConnectionManager.  It's there as a convenience for those that don't use a custom DAL.

Got it.  I understand that I do not have to use the ConnectionManager, nor the TransactionScope.  But using the ConnectionManager does give me the ability to use the TransactionScope, because each of my calls for a new connection will return the same one, which keeps the transaction context in effect.

I understand I do not HAVE to use these features.  I am simply stating that I WANT to use the features if I can do so while still maintaining efficient data access and avoiding the locking scenarios described above.

So back to my original question: does anyone have any input as to how scalable it would be (say, using just regular ADO.NET) to use the ConnectionManager and the TransactionScope to manage our database inserts/updates?

Has anyone else noticed locking issues when scaling up with the ConnectionManager and TransactionScope features provided with CSLA?

JoeFallon1 replied on Friday, January 30, 2009

In my opinion the use of the ConnectionManager will be fine. It will operate just as fast as the custom DAL code.

They are both doing similar things: opening a connection, and then processing SQL commands as fast as they can. There is nothing special about either of them.

When you set up a transaction for use in ADO.Net you will have the same potential issues with deadlocks as with your custom DAL, which uses transactions too. To me they are equivalent.

The key idea when coding your Insert/Update/Delete operations inside a transaction is to keep it open for as short a time as possible. In other words, when you do an insert, do not turn around and do a Fetch, and then some massive processing, while the transaction is still uncommitted. Also, once the transaction is open, be sure each of your queries uses it to prevent other issues. This is easier to overlook than you might think. You could be updating many different BOs, and one of them might do a quick lookup during processing; if that lookup does not use the transaction it won't see the new data, or it could deadlock waiting for the transaction to commit.
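To illustrate that last point (a sketch only, with made-up table names, using an explicit ADO.NET transaction): every command created while the transaction is open must be handed that transaction, including any quick lookup.

using (var cn = new SqlConnection(connectionString))
{
  cn.Open();
  using (var tr = cn.BeginTransaction())
  {
    var insert = new SqlCommand(
      "INSERT INTO Orders (CustomerId) VALUES (@custId); SELECT SCOPE_IDENTITY();",
      cn, tr);
    insert.Parameters.AddWithValue("@custId", customerId);
    int orderId = Convert.ToInt32(insert.ExecuteScalar());

    // The lookup is enlisted in the same transaction; without passing tr it
    // would not see the uncommitted insert and could block or deadlock.
    var lookup = new SqlCommand(
      "SELECT COUNT(*) FROM Orders WHERE CustomerId = @custId", cn, tr);
    lookup.Parameters.AddWithValue("@custId", customerId);
    int count = Convert.ToInt32(lookup.ExecuteScalar());

    tr.Commit();
  }
}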

Joe

 

RockfordLhotka replied on Saturday, January 31, 2009

DCottle:

Has anyone else noticed locking issues when scaling up with the ConnectionManager and TransactionScope features provided with CSLA?

As Joe points out, you can get in trouble. But the trouble is the kind you'll get into with any DAL.

If all your code is running within a transaction, then you have to "get in and out fast" and avoid doing a lot (or any) data reads or other long-running work.

If you use the Transactional attribute on a DataPortal_XYZ method, then all your server-side code will run within the transaction. The data portal starts the transaction before calling your code, and ends it after your code completes.

You need to be aware that this is the case, and realize that all your code is inside a transaction, and so you can have NO LONG RUNNING BEHAVIORS in your server-side code.

If you need long-running behaviors, then don't use the Transactional attribute. Use manual transactions, and create your own TransactionScope object. You should realize that the Transactional attribute really only saves you a couple of lines of code. Instead of this:

[Transactional(TransactionalTypes.TransactionScope)]
protected override void DataPortal_Update()
{
  // update here
}

You have this:

protected override void DataPortal_Update()
{
  using (var tr = new TransactionScope())
  {
    // update here
    tr.Complete(); // mark the scope complete so it commits instead of rolling back
  }
}

Arguably you've added just a few lines of code: the braces and the call to Complete()  ;)

Either way, you can use ConnectionManager - it has nothing at all to do with the Transactional attribute.

The thing I want you to notice is that in the second code block, you could execute code before or after entering the transactional scope. You could run long-running code outside the transaction.

However, I also want to point out that you'll cascade the calls to your child objects inside the transaction scope in both cases. So if the child needs to do some long-running code outside the transaction, you'll have to devise your own scheme by which the root can invoke the child before or after the transactional scope block.
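One way that could look (a sketch only; GenerateLargeReport, PrepareForUpdate and UpdateSelf are hypothetical methods used for illustration):

protected override void DataPortal_Update()
{
  // Long-running work runs before the transaction exists...
  byte[] report = GenerateLargeReport();
  foreach (var child in _children)
    child.PrepareForUpdate();  // hypothetical pre-transaction hook on the child

  // ...so the transactional window stays as small as possible.
  using (var tr = new TransactionScope())
  {
    UpdateSelf(report);
    FieldManager.UpdateChildren(this);
    tr.Complete();
  }
}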

Copyright (c) Marimer LLC