BeginFetch and FetchAsync on local DataPortal

Old forum URL: forums.lhotka.net/forums/t/11821.aspx


ngm posted on Tuesday, February 05, 2013

Rocky,

You confirmed the following for the next CSLA release at ( http://forums.lhotka.net/forums/p/11814/54722.aspx#54722 ):

"All that said, the local data portal will be changing in the next release of 4.5. Currently when you do an async call (BeginFetch or FetchAsync) against a local data portal the call is ultimately synchronous unless your DP_Fetch code does something async. That behavior is fundamentally different from the remote data portal behavior, and is already the cause of substantial confusion.

As a result, what I'm working on right now is making the local data portal always spin work off to a background thread if the top-level call was BeginFetch or FetchAsync. That way the local data portal directly emulates the remote data portal to eliminate that source of confusion."

Personally I'm totally against this move, and I believe it can reduce the number of CSLA use cases. Let me explain briefly.

Async methods in libraries (especially general frameworks like CSLA) should guarantee real asynchrony rather than offloading, i.e. merely wrapping a synchronous method so that it is invoked on another thread.

If you look at the majority of the BCL, that's exactly how it's done: a WCF proxy or a file stream will not put your call on a worker thread just because you called the async variant of a method.
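To make the distinction between asynchrony and offloading concrete, here is a minimal sketch using a file read. The class and method names (AsyncVersusOffload, ReadTextAsync, ReadTextOffloaded) are made up for illustration; only the BCL calls are real.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AsyncVersusOffload
{
    // Asynchronous: the read is awaited, so no thread is dedicated to
    // waiting on the I/O. The stream is opened with useAsync: true to
    // request asynchronous file I/O from the operating system.
    public static async Task<string> ReadTextAsync(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, useAsync: true))
        using (var reader = new StreamReader(stream))
        {
            return await reader.ReadToEndAsync();
        }
    }

    // Offloading: a synchronous read wrapped in Task.Run. The caller is not
    // blocked, but a thread-pool thread is blocked for the whole read.
    public static Task<string> ReadTextOffloaded(string path)
    {
        return Task.Run(() => File.ReadAllText(path));
    }

    public static async Task Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "orders");
        try
        {
            Console.WriteLine(await ReadTextAsync(path));
            Console.WriteLine(await ReadTextOffloaded(path));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Both methods return a Task the caller can await; the difference is only in what happens to threads while the I/O is in flight.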

This brings in two major questions.

1) Is local data portal capable of delivering real async functionality?

2) What's the ultimate goal you want to achieve with this offloading?

With regard to the remote data portal, with which you're looking to provide call symmetry, its source of asynchrony is the particular network stack (usually a WCF proxy). The local data portal has no such component, so the need for asynchrony there is very questionable. However, data access written by the business developer can be async, most commonly by opening db connections or file / network streams asynchronously. In that case, the local data portal is expected to follow that async path.

That brings us to the second question. If the only aim is to offload the data portal invocation from the UI thread, then I'm afraid CSLA's agnosticism about client and UI technology is being jeopardized here.

Consider my service or server-side data portal code that invokes the client data portal. Do I really want this invocation possibly taking two threads from the thread pool, and allocating / switching contexts, just because it's configured to use the local data portal there?

var yesterdayOrders = await Orders.GetOrdersAsync(yesterdayCriteria);

var unpaidOrders = await Orders.GetOrdersAsync(unpaidCriteria);

On the other hand, a consumer of the synchronous method GetOrders can always easily control offloading from the UI thread by invoking it from Task.Run.

I know that nowadays people take async methods as "invoked on the new thread" easily, but I strongly believe this will align itself once it gets established. However, CSLA shouldn't make API surface even more confusing.

I would like to hear your thoughts.

Thanks,

- ngm

 

RockfordLhotka replied on Wednesday, February 06, 2013

Many of my design thoughts were captured in my blog as I was doing the work for .NET 4.5 last year. For example:

http://www.lhotka.net/weblog/CSLADataPortalChangesInVersion45.aspx

There are two basic goals.

First, on the client-side data portal I want to enable the use of the await keyword when calling the data portal. To this end the data portal exposes CreateAsync, FetchAsync, UpdateAsync, ExecuteAsync, and DeleteAsync methods. As a side-effect, the pre-existing BeginXYZ methods (such as BeginFetch) now delegate to these new async methods behind the scenes.

Second, the DataPortal_XYZ and factory object methods should support the async keyword. For example:

private async Task DataPortal_Fetch()
{
}

It is important to remember that there is never a direct flow-through of the client-side data portal to the server-side data portal. The client-side business object graph is cloned, and that's what flows to the logical server. The term "flow" is implemented by a proxy/host channel. The simplest channel is the LocalProxy, and the most widely used is probably WcfProxy. Ultimately the call to the "server" (the host) is synchronous. This is because the server is assumed to be an actual server, and the call is assumed to be a network call on which we wait for a response.

In the new 4.5.11 prerelease I just put online, the LocalProxy now ensures that the logical server-side code is running on a background thread from the thread pool, thus emulating the remote data portal. Really it was a bug when I removed this behavior in the original 4.5 release, but I was trying to be clever. As it turns out, my being clever just caused confusion and heartburn - not only for a lot of people on this forum, but for myself as well (I was burned a couple times by my own breaking change...).

Fwiw, my goal for the past 17 years has been for the data portal to abstract the concept of the network. Every single time I've deviated from that goal (each time for noble ideals) I've ended up returning to the core goal: when you call the client-side data portal you should get the same logical behavior regardless of whether the "server-side" code runs local or remote.

What exists now in 4.5.11 is extremely consistent and straightforward.

When the client calls BeginFetch or FetchAsync you know that it won't be a blocking call - the logical server-side code will not run on the UI thread. You don't know if it will run locally or remotely, but you can be assured that it won't block the UI thread so there's no difference in behavior between local and remote data portal configurations.

When the client calls Fetch you know that it will be a blocking call. You don't know on what thread (or physical location) the server-side code will run, but you are assured that the Fetch call itself is blocking.

When you implement your DP_XYZ method you can implement it as sync or async. If you implement it as sync (void) it is no different from the behavior you've had since 2001. If you implement it as async (async Task) you can be assured that the data portal will await your result, and within your DP_XYZ code you can do all the fancy task/await stuff you desire.
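One way a dispatcher could honor both method shapes is to invoke the method and await the result only when it is a Task. The sketch below is an assumption about the general technique, not the data portal's actual code; SketchDispatcher, SyncOrder and AsyncOrder are hypothetical names.

```csharp
using System;
using System.Reflection;
using System.Threading.Tasks;

// Hypothetical dispatcher: calls a parameterless method by name and awaits it
// only if it turns out to return a Task.
public static class SketchDispatcher
{
    public static async Task InvokeAsync(object target, string methodName)
    {
        MethodInfo method = target.GetType().GetMethod(
            methodName,
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
        if (method == null)
            throw new MissingMethodException(target.GetType().Name, methodName);

        object result = method.Invoke(target, null);

        // A void method yields null here; an async Task method yields the Task.
        if (result is Task task)
            await task;
    }
}

// Sync implementation (void).
public class SyncOrder
{
    public bool Loaded;
    private void DataPortal_Fetch() { Loaded = true; }
}

// Async implementation (async Task).
public class AsyncOrder
{
    public bool Loaded;
    private async Task DataPortal_Fetch()
    {
        await Task.Delay(10); // stands in for async data access
        Loaded = true;
    }
}

public static class DispatcherDemo
{
    public static async Task Main()
    {
        var syncOrder = new SyncOrder();
        var asyncOrder = new AsyncOrder();
        await SketchDispatcher.InvokeAsync(syncOrder, "DataPortal_Fetch");
        await SketchDispatcher.InvokeAsync(asyncOrder, "DataPortal_Fetch");
        Console.WriteLine(syncOrder.Loaded && asyncOrder.Loaded);
    }
}
```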

The other area of interest here is exception handling. This is because exceptions that occur during an async operation are not singular, they are a collection. The data portal is aware of this, and it is aware how you called the data portal. If you called the data portal using one of the new async methods then the entire exception collection flows back (well, on a pure .NET app - not so in WinRT, WinPRT, Silverlight, WP8, or .NET using MobileFormatter). But if you called the data portal with a sync method or a BeginXYZ method the data portal strips the top exception out of the collection and returns it - thus preserving backward compatibility with pre-4.5 behaviors.
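The "collection versus top exception" distinction can be seen with the plain TPL, independent of CSLA. This is a minimal sketch; AsyncExceptionShapes and its members are hypothetical names, and TopException only mimics the idea of stripping the top exception out of the collection.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncExceptionShapes
{
    private static async Task FailAsync(string message)
    {
        await Task.Yield();
        throw new InvalidOperationException(message);
    }

    // Returns the whole collection of failures recorded on the combined task.
    public static async Task<AggregateException> CollectAllAsync()
    {
        Task all = Task.WhenAll(FailAsync("first"), FailAsync("second"));
        try
        {
            await all; // await rethrows only one of the inner exceptions
        }
        catch (InvalidOperationException)
        {
            // Swallowed here so the full collection can be inspected below.
        }
        return all.Exception;
    }

    // Mimics the backward-compatible path: hand back a single exception.
    public static Exception TopException(AggregateException collection)
    {
        return collection.InnerExceptions[0];
    }

    public static async Task Main()
    {
        AggregateException all = await CollectAllAsync();
        Console.WriteLine(all.InnerExceptions.Count);
        Console.WriteLine(TopException(all).GetType().Name);
    }
}
```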

ngm replied on Thursday, February 07, 2013

Rocky,

Thank you for clarifying the design goals.

I read that blog article a while ago. While it points very clearly to the way you unified Data Portal with its new async support, it really doesn't touch this topic we've got going on here at all.

I completely agree that one of the main goals behind Data Portal is to abstract the concept of the network (although 17 years is too much even for something as complex as network is hehe, just kidding).
Once you agree to abstract something, in some ways you take responsibility for all those abstracted bits and pieces. Abstracting the network is a good thing for sure, but providing faux asynchrony in the name of symmetry is probably not so good in my opinion. It's been said that too many sacrifices are made by always pursuing software symmetry.

Philosophy aside, I think I was right when pointing out that the basis of the issue is pretty much the UI, i.e. offloading from the UI thread. That's probably the biggest issue I've got with this decision. Why would you assume that the client-side Data Portal is always invoked from an interactive client, i.e. one with a UI thread? Isn't this pretty big coupling for something as generic as the Data Portal?

Also you mentioned above that one of the goals was to enable the use of the await keyword when calling the Data Portal. That's exactly what I want to use here, but you ultimately require offloading to the other thread in order for me to await the method.

The reason I posted this topic isn't really just to express my concern about this design decision but rather to stress how my scenarios will suffer in the light of the new changes.

Let's say I've got Order BO. When it's on the server-side Data Portal, prior to its persistence, it has to coordinate interaction with several other BOs such as Customer, ProductList and GeneralLedger. Order simply calls factory method on each of those objects in order to fetch them. Since server-side Data Portal is using local proxy, fetching will be done on the very same physical tier:

protected new async Task DataPortal_Update()
{
 var customer = await Customer.GetCustomerAsync(_customerId);
 // Interact with customer
 var product = await ProductList.GetProductListAsync(this);
 // Interact with product
 var generalLedger = await GeneralLedger.GetGeneralLedgerAsync(_tranId);
 // Interact with generalLedger
 
 // Persist Order
}

So, in this very common scenario where client Data Portal is used with non-interactive client such as server-side Data Portal, these three requests will be offloaded to at least three potentially different threads. I say potentially because they would come from thread pool and therefore might be reused. I also say at least because if I had my data access in these three objects implemented to do its work on separate thread that would add to the number of involved threads.
More complex interaction, such as when some of these objects interact with other objects, would just keep multiplying the number of threads potentially used.

That means a lot of context switching, queue synchronization and allocation of other objects, just for the sake of offloading in case there happens to be an interactive client on the UI thread.

Let's consider for a sec that everything stays like it was in v4.5.10. The very same scenario above would be executed on the request thread unless there's async data access implementation.

- ngm

RockfordLhotka replied on Thursday, February 07, 2013

ngm wrote:

"Let's say I've got Order BO. When it's on the server-side Data Portal, prior to its persistence, it has to coordinate interaction with several other BOs such as Customer, ProductList and GeneralLedger. Order simply calls factory method on each of those objects in order to fetch them. Since server-side Data Portal is using local proxy, fetching will be done on the very same physical tier:

[...]

So, in this very common scenario where client Data Portal is used with non-interactive client such as server-side Data Portal, these three requests will be offloaded to at least three potentially different threads. I say potentially because they would come from thread pool and therefore might be reused. I also say at least because if I had my data access in these three objects implemented to do its work on separate thread that would add to the number of involved threads."

That is a very good point. I had a feeling there was some issue I was missing by changing LocalProxy, but hadn't figured out what it was.

I wonder if one solution to this is to only spin the work onto a background thread if the LogicalExecutionLocation isn't Server. I think that'd address this issue, and will do some experimentation to see.

RockfordLhotka replied on Thursday, February 07, 2013

Yes, this appears to work quite well as a solution. Once the flow of execution has crossed from logical client to logical server the LocalProxy stops spinning work onto the thread pool, allowing normal async/await behaviors to work as expected.
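The rule described here (offload only while the logical location is still the client) can be sketched as follows. Everything in this block is hypothetical: SketchLocalProxy and LogicalLocation are stand-ins rather than CSLA types, and AsyncLocal<T> (available from .NET 4.6) is used only to keep the sample self-contained; it is not a claim about how CSLA tracks LogicalExecutionLocation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public enum LogicalLocation { Client, Server }

// Hypothetical stand-in for a local proxy.
public static class SketchLocalProxy
{
    // Ambient value that flows with the async call chain; defaults to Client.
    private static readonly AsyncLocal<LogicalLocation> Location =
        new AsyncLocal<LogicalLocation>();

    public static LogicalLocation Current => Location.Value;

    public static Task<T> FetchAsync<T>(Func<Task<T>> serverSideFetch)
    {
        if (Location.Value == LogicalLocation.Server)
        {
            // Already on the logical server: run inline and let normal
            // async/await behavior apply.
            return serverSideFetch();
        }

        // Top-level call from the logical client: hop to the thread pool
        // and mark the flow as being on the logical server.
        return Task.Run(async () =>
        {
            Location.Value = LogicalLocation.Server;
            return await serverSideFetch();
        });
    }
}

public static class LogicalLocationDemo
{
    public static async Task<(LogicalLocation during, LogicalLocation after, bool nestedRanInline)> RunAsync()
    {
        var inner = await SketchLocalProxy.FetchAsync(async () =>
        {
            int serverThread = Environment.CurrentManagedThreadId;
            // Nested call, as a DataPortal_XYZ method might make.
            int nestedThread = await SketchLocalProxy.FetchAsync(
                () => Task.FromResult(Environment.CurrentManagedThreadId));
            return (SketchLocalProxy.Current, serverThread == nestedThread);
        });
        return (inner.Item1, SketchLocalProxy.Current, inner.Item2);
    }

    public static async Task Main()
    {
        var r = await RunAsync();
        Console.WriteLine($"{r.during} {r.after} {r.nestedRanInline}");
    }
}
```

In this sketch only the outermost call pays for a thread-pool hop; nested calls made from the "server-side" delegate start on whatever thread that delegate is already running on.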

Of course that's still a little risky - but in a way I'm happy to accept. It is "risky" because if you do something that causes the TPL to spin you off onto a thread you do lose ApplicationContext. But you should know that you wrote code that caused that to happen, so you can take steps to flow the context to the new thread.

One relatively easy way to accomplish this is to use Csla.Threading.BackgroundWorker, because it dispatches work onto the thread pool, but also ensures that ApplicationContext flows to the new thread.

ngm replied on Thursday, February 07, 2013

RockfordLhotka wrote:

"Yes, this appears to work quite well as a solution. Once the flow of execution has crossed from logical client to logical server the LocalProxy stops spinning work onto the thread pool, allowing normal async/await behaviors to work as expected.

Of course that's still a little risky - but in a way I'm happy to accept. It is "risky" because if you do something that causes the TPL to spin you off onto a thread you do lose ApplicationContext. But you should know that you wrote code that caused that to happen, so you can take steps to flow the context to the new thread.

One relatively easy way to accomplish this is to use Csla.Threading.BackgroundWorker, because it dispatches work onto the thread pool, but also ensures that ApplicationContext flows to the new thread."

I see where you're coming with LogicalExecutionLocation.

However, I still believe that SynchronizationContext would be a much better thing to lean on. It would even enable scenarios where an interactive client with a UI thread composes several invocations wrapped in a new task, such as:

await Task.Run(async () => {

_orders = await Orders.GetOrdersAsync();

_product = await Product.GetProductAsync(id);

_transactions = await Transactions.GetTransactionsAsync();

});

LogicalExecutionLocation will not help here; all three requests will still each spin off a thread. If you inspect SynchronizationContext.Current inside the task, it should be null, since the user invoked a new task.
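The SynchronizationContext idea could look roughly like the sketch below: offload only when the caller has a synchronization context, and run inline otherwise. SketchContextAwareProxy is a hypothetical name, and a dedicated thread with a plain SynchronizationContext installed stands in for a UI thread, which is an assumption made purely to keep the sample runnable in a console app.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical proxy that decides based on the caller's synchronization context.
public static class SketchContextAwareProxy
{
    public static bool ShouldOffload()
    {
        return SynchronizationContext.Current != null;
    }

    public static Task<T> FetchAsync<T>(Func<Task<T>> fetch)
    {
        return ShouldOffload() ? Task.Run<T>(fetch) : fetch();
    }

    // results[0]: was a top-level call from the UI-like thread offloaded?
    // results[1]: did a call composed inside Task.Run execute inline?
    public static bool[] Probe()
    {
        var results = new bool[2];
        var uiLike = new Thread(() =>
        {
            SynchronizationContext.SetSynchronizationContext(new SynchronizationContext());
            int uiThread = Environment.CurrentManagedThreadId;

            results[0] = FetchAsync(() => Task.FromResult(Environment.CurrentManagedThreadId))
                .GetAwaiter().GetResult() != uiThread;

            results[1] = Task.Run(async () =>
            {
                int poolThread = Environment.CurrentManagedThreadId;
                int fetchThread = await FetchAsync(
                    () => Task.FromResult(Environment.CurrentManagedThreadId));
                return fetchThread == poolThread;
            }).GetAwaiter().GetResult();
        });
        uiLike.Start();
        uiLike.Join();
        return results;
    }

    public static void Main()
    {
        bool[] r = Probe();
        Console.WriteLine($"{r[0]} {r[1]}");
    }
}
```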

- ngm

 

ngm replied on Thursday, February 07, 2013

Well, I'm not quite sure you've got LogicalExecutionLocation established at the point in LocalProxy where you need to decide whether to offload onto background thread, right?

Probably you meant ExecutionLocation which is based on the proxy itself i.e. whether the proxy is remote or not.

However, the issue I explained above, which really multiplies as the number of objects you're interacting with grows, is not specific to server-side interaction at all. It can easily become a bottleneck on an interactive client that uses the local proxy as well.

Consider a Unit of Work object (BTW I like that concept of yours). A UI developer might be calling a single async factory method on it:

var ot = await OrderTransactionUoW.GetOrderTransactionUoWAsync(tranId);

The data portal works locally, i.e. via the local proxy. Here's what the UoW object's data access looks like:

private async Task DataPortal_Fetch(Guid transactionId)
{
  var tranInfo = await TransactionInfo.GetTransactionInfoAsync(transactionId); // 1st LocalProxy.FetchAsync
  var gl = await GeneralLedger.GetGeneralLedgerAsync(); // 2nd LocalProxy.FetchAsync

  // Load the lazy collection only if it has not been loaded yet
  if (!gl.AreItemsLoaded)
    await gl.LoadItemsAsync(); // 3rd LocalProxy.FetchAsync - lazy loaded collection

  var glItems = gl.Items;  // can't call await on lazy loaded property, must load it otherwise will get null here

  if (!tranInfo.AreDetailsLoaded)
    await tranInfo.LoadDetailsAsync(); // 4th LocalProxy.FetchAsync - lazy loaded collection

  foreach (var tranDetail in tranInfo.Details)
  {
    this.GeneralLedgerItems.Add(glItems[tranDetail.GeneralLedgerPostId]);
  }

  this.Customer = await Customer.GetCustomerAsync(tranInfo.CustomerId); // 5th LocalProxy.FetchAsync
}

This will become a thread switching nightmare. Everything is happening on the client. WP or RT machines are not going to be happy with this for sure ;)
From the UI developer's perspective, she just wants to get a single UoW object, and that object's interaction with the other BOs is not terribly complex here, although not the smartest design for sure.

Now, think for a second: what if every one of those BOs has its data access async "by the book", i.e. opening a connection or file stream asynchronously? That's exactly why I'm doing all those awaits in DataPortal_Fetch after all. But those asynchronous IO benefits would be totally neutralized by the fact that in order for me to await anything I've got to be coming from another thread, and that's for every single await.

And again, all this spinning just for the sake of saving UI guy from invoking:

Task.Run(async () =>
   await OrderTransactionUoW.GetOrderTransactionUoWAsync(tranId));

That would achieve offloading from the UI thread, and it would still get the best possible scalability depending on the data access strategy in the BOs.

But even after this, if you're still in the game of bringing the client Data Portal closer to the UI, then I would probably check whether SynchronizationContext.Current is set before offloading. That should be able to provide the "dual personality" behavior of LocalProxy?
In my opinion, that's 100% the job of the ViewModel and it's not a concern of CSLA's inner channel such as LocalProxy for sure.

Take care,

- ngm

RockfordLhotka replied on Thursday, February 07, 2013

ExecutionLocation is physical, and LogicalExecutionLocation is logical.

So if I'm on a client using the local data portal then ExecutionLocation is always Client. But LogicalExecutionLocation changes as the processing flows through the data portal to reflect the logical switch from "client" to "server".

As a result, the first time LocalProxy is invoked the logical location is Client because the flow of processing hasn't traversed to the logical server.

Subsequent times LocalProxy is invoked (from your code in your DP_XYZ methods) the logical location is Server, regardless of the physical location.

The same logical location change occurs regardless of whether the data portal is configured to run local or remote.

 

btw, your Task.Run example is invalid UI code, because the ApplicationContext won't automatically flow onto that background thread. If a UI developer does write that line of code they have introduced a bug.

If they really want to manually spin the work onto a background thread they need to either write their own task launcher, or use Csla.Threading.BackgroundWorker to achieve the correct result.
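A "task launcher" of the kind mentioned here might capture the ambient context on the calling thread and restore it on the pool thread. The sketch below shows that idea with a hypothetical thread-static AmbientContext; it is not Csla.ApplicationContext and not the implementation of Csla.Threading.BackgroundWorker.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical ambient context kept in thread-local storage.
public static class AmbientContext
{
    [ThreadStatic] private static string _user;

    public static string User
    {
        get => _user;
        set => _user = value;
    }
}

// Hypothetical launcher that copies the ambient context to the pool thread.
public static class ContextFlowingLauncher
{
    public static Task<T> Run<T>(Func<T> work)
    {
        string captured = AmbientContext.User; // read on the calling thread
        return Task.Run(() =>
        {
            string previous = AmbientContext.User;
            AmbientContext.User = captured;     // restore on the pool thread
            try { return work(); }
            finally { AmbientContext.User = previous; }
        });
    }

    // results[0]: value seen via plain Task.Run; results[1]: via the launcher.
    public static string[] Probe()
    {
        var results = new string[2];
        var caller = new Thread(() =>
        {
            AmbientContext.User = "alice";
            results[0] = Task.Run(() => AmbientContext.User).GetAwaiter().GetResult();
            results[1] = Run(() => AmbientContext.User).GetAwaiter().GetResult();
        });
        caller.Start();
        caller.Join();
        return results;
    }

    public static void Main()
    {
        string[] r = Probe();
        Console.WriteLine($"plain: {r[0] ?? "(null)"}, launcher: {r[1]}");
    }
}
```

With plain Task.Run the thread-static value set on the calling thread is not visible on the pool thread, which is the bug described above; the launcher makes the copy explicit.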

Finally, I should point out that 4.5.10 isn't actually the desired behavior. Since 2007 the BeginFetch and similar methods have spun the data portal call onto a background thread. The 4.5.10 behavior broke/changed that behavior, causing an obvious backward compatibility issue. This is because I didn't want to implement and maintain three entirely different data portal process flows. Two is bad enough, and it turns out that async/await can be used to simulate the old-style event-based model.

Again, in the final analysis the data portal needs to provide consistent behavior regardless of local or remote configuration. 4.5.10 broke that contract, and that's the bug I'm fixing in what will be the 4.5.20 release.

ngm replied on Thursday, February 07, 2013

I understood what you want to achieve with LogicalExecutionLocation. But again, if you come to the root of an issue as I pointed out already, it's really not related to whether it's executing on the logical server or logical client side. Even you pointed out that it can be too fragile.

What you really want to determine is whether the invocation to async method on client Data Portal comes from UI thread, if so, it will offload that work. That would cover both execution on logical server and logical client side.

You're right that the Task.Run example is invalid. Personally, I would be much more worried about that issue than about offloading from the UI thread, i.e. I cannot use typical .NET async mechanisms such as APM, Thread, ThreadPool or most importantly Tasks because the context doesn't flow. I have to use BackgroundWorker, which is going back to the old era of event-driven asynchrony.

I think you should reconsider alternative methods of passing context, ones that are compatible with the .NET infrastructure. I pointed to CallContext in one of the previous posts.

This looks like the Achilles heel of CSLA now that it has opened its async doors. The more widely you accept asynchrony throughout the CSLA plumbing, the more issues you'll have when relying on TLS for storing context. BackgroundWorker really just postpones the issue.

- ngm

 

 

RockfordLhotka replied on Thursday, February 07, 2013

We need to take this one step at a time. Continually changing the goal is ok, as long as the first mole is whacked before the next one shows up.

With that in mind, the change I made to LocalProxy earlier today solves the UoW scenario.

You are right, it doesn't solve the scenario where the UI developer makes numerous calls - those calls will spin onto the thread pool of the client.

The thing is, this isn't really as bad as you make it sound. Mostly because the overhead of serialization/deserialization and network overhead (or file I/O) is almost infinitely higher than the cost of transitioning the work to a thread on the thread pool. Yes, this behavior might cost a few nanoseconds, but that's noise compared to the milliseconds (or seconds) involved in the rest of the process.

Now if we can find a way to intelligently eliminate that overhead I want to do that - I'm not lobbying for worse performance :)   I am just suggesting that this issue isn't really a big one in the scheme of things - where the UoW scenario _was_ a big deal.

 

It is probably worth exploring the SynchronizationContext idea. That's tricky though, because those work differently (or not at all) in various environments CSLA supports, such as:

I know that LogicalExecutionLocation works consistently across these environments, because I've made that happen as they've each arrived on the scene over the years. I also know that SynchronizationContext does not work consistently across these environments. Thus the effort involved in testing its use across all these scenarios with local and remote data portal configurations is pretty large.

fwiw, I'm under time pressure to get 4.5.20 online. Enough so that I know for sure I can't test all those scenarios to see how SynchronizationContext changes between them (and to find corresponding workarounds where it doesn't provide the right solution).

RockfordLhotka replied on Thursday, February 07, 2013

fwiw, on WinRT the CSLA BackgroundWorker uses async/await for its underlying implementation :)

I agree that flowing context is a problem. I have yet to find a solution that'll work in all the environments where CSLA is used...

ngm replied on Thursday, February 07, 2013

Your challenge here is kind of bigger than CSLA: getting a huge chunk of plumbing unified and consistent across all emerging and legacy platforms. That's a very bold goal nowadays. Take MobileFormatter for example; that thing can be used to simulate NDCS across new constrained platforms without using CSLA at all.

I didn't think for even a split second that you have to implement what I suggest here, nor do I know if it works or makes sense because, as you're stating above, someone needs to look at the holistic picture and that's gotta be you my friend ;)

Also I'm not here to criticize, far from it. CSLA has let me and my teams deliver and maintain mission-impossible projects over years and years. Even when not using CSLA due to a client's specific requirements, the concepts and overall paradigm that you established carried me through.

Finally, I'm here to see how you're gonna solve that next platform fragmentation issue around the corner :)

That being said, these days I'm deep into evaluating CSLA's new goodies for upcoming assignments and that's the reason I'm throwing my suggestions from time to time, hoping CSLA can benefit out of them at some point.

- ngm

 

RockfordLhotka replied on Thursday, February 07, 2013

ngm wrote:

"That being said, these days I'm deep into evaluating CSLA's new goodies for upcoming assignments and that's the reason I'm throwing my suggestions from time to time, hoping CSLA can benefit out of them at some point."

Well that is a done deal - you found and helped solve (at least for now) what would have been a nasty bug on the server-side data portal. I really appreciate that!

Copyright (c) Marimer LLC