Asynchronity on the Data Portal server-side

Old forum URL: forums.lhotka.net/forums/t/11814.aspx


ngm posted on Saturday, February 02, 2013

Over the last few days I've been examining the Data Portal mechanisms, and I was going through my notes.

Let me just say that it's astonishing to see that the main principles behind the Data Portal from over a decade ago are still in place, with enhanced capabilities and further separated responsibilities, yet unified to provide the same distributed pipeline across today's increasing number of client / server technologies.

Back to my notes: I spotted that some parts of the Data Portal server side do not have all of their invocation paths implemented as async, and thus can potentially block worker threads and degrade server-side performance and scalability in general. Most notably, they do not always invoke the underlying data access logic asynchronously.

The Mobile WCF Data Portal uses MobileRequestProcessor, which serves both factory and regular data access invocations, but sadly both are synchronous. The latter is synchronous because it delegates a sync call to the client Data Portal, and the former because it invokes a sync factory method through MethodCaller.

Even though the non-mobile WCF Data Portal invokes the server Data Portal directly, it does so synchronously. I don't see why this is the case when Csla.Server.DataPortal is completely async.

This discussion might be indirectly relevant ( http://forums.lhotka.net/forums/t/11797.aspx ).

Is there any barrier preventing asynchrony on the Data Portal server side?

Thanks,

- ngm

RockfordLhotka replied on Tuesday, February 05, 2013

The server-side data portal is never actually async, because the top-level invocation from WCF (or any other data portal host) is ultimately synchronous.

My goal was to enable the scenario where you implement async behaviors within your DataPortal_Fetch or factory methods.

It is also the case that the sync data portal remains entirely intact. This is important if I want to continue to support existing ASP.NET and WCF service implementations. There's a fair amount of refactoring required for an ASP.NET application to work with async calls behind web pages, so clearly the data portal needs to retain its sync behaviors for that environment.

(and for all existing sync Windows Forms and WPF apps - of which there are many of course)

 

All that said, the local data portal will be changing in the next release of 4.5. Currently when you do an async call (BeginFetch or FetchAsync) against a local data portal the call is ultimately synchronous unless your DP_Fetch code does something async. That behavior is fundamentally different from the remote data portal behavior, and is already the cause of substantial confusion.

As a result, what I'm working on right now is making the local data portal always spin work off to a background thread if the top-level call was BeginFetch or FetchAsync. That way the local data portal directly emulates the remote data portal to eliminate that source of confusion.

ngm replied on Tuesday, February 05, 2013

Rocky,

I'm not quite sure we're on the same page. I'm specifically referring to asynchrony on the server side of the Data Portal.

More precisely, by asynchrony I don't mean multi-threading per se, and the server side doesn't care at all whether the proxy implementation on the client is sync or async.

What I want to achieve with my scalable Data Portal host is for it to follow all of its async paths, i.e. if I implement a data access method (in a factory or in the BO itself) as async, it should be awaited throughout the whole call stack. Then it's up to the business developer to decide whether that data access is invoked on a worker or IO thread, for example by spinning up a new thread or by invoking the db connection asynchronously.

Why would Csla.Server.Hosts.WcfHost block the current request thread, as in this Fetch implementation:

        result = portal.Fetch(request.ObjectType, request.Criteria, request.Context, true).Result;

WCF supported server-side asynchrony even before async/await was baked in. It was called the server async pattern, implemented with APM. It's just that it's now much easier to achieve.

Again, it has nothing to do with client asynchrony, i.e. the client should be able to invoke this service either synchronously or asynchronously.

- ngm

 

ngm replied on Tuesday, February 05, 2013

I just tried making WcfHost's Fetch async, along with my ObjectFactory's Fetch, which is async as well.

The Fetch factory implementation started a new thread simulating a very long operation.

Good and bad news.

It works pretty well, freeing the request thread. Once the factory completed its work, the request resumed where it was supposed to.

The issue is that ApplicationContext is not capable of flowing the global context. I tested this with a WPF app; a web app might be in a better position because its ApplicationContextManager relies on HttpContext instead of a thread-local slot. The principal seems to flow correctly, though.

Obviously it's not trivial, but personally I think it's worth pursuing.

- ngm

 

RockfordLhotka replied on Wednesday, February 06, 2013

GlobalContext should flow one-way, from the caller to the callee. If that doesn't happen then that's a bug. But if Principal is flowing then I suspect the context dictionaries are flowing as well - but again, bugs are possible.

You are right that it doesn't automatically flow back to the caller. It really can't, because you could have numerous async operations running, so how would CSLA merge the multiple GlobalContext objects back into the caller's single instance?

Since 2007 with the introduction of BeginFetch, etc. the GlobalContext has flowed back to the DataPortal instance object on the client. It is up to the app developer to get that dictionary and merge it back into the UI thread's context dictionary. So few apps use GlobalContext (which is good - it is expensive!) that this hasn't been an issue. At least nobody has ever brought it up as an issue over the past few years.

ngm replied on Thursday, February 07, 2013

You mean ClientContext should flow one-way? GlobalContext should flow in both directions, right?

The reason Principal flows and ClientContext / GlobalContext don't is that System.Threading.ExecutionContext captures System.Security.SecurityContext, which preserves the current principal. The contexts are stored in TLS (at least with Csla.Xaml.ApplicationContextManager) and therefore are not part of the captured ExecutionContext. The current culture is not preserved either.

What I proposed initially in this post is to enable the WCF infrastructure to release the requesting thread if there's asynchronous data access, and thus achieve better scalability. However, this issue with context flow is not specific to the modification I made above. If a business developer today starts data access (or any other logic) on a separate thread, the context will not flow in either direction, to or from that thread.

As far as inbound context goes, I think it would be reasonable to expect it to flow throughout the whole logical call stack, no matter how many threads it spans. One solution might be to store the contexts as part of the logical call, using the LogicalSetData / LogicalGetData methods of System.Runtime.Remoting.Messaging.CallContext instead of TLS. ExecutionContext should then be able to flow the context.
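The difference between thread-local storage and storage that travels with ExecutionContext can be seen in a small sketch. This is not CSLA code: ContextFlowDemo and its fields are hypothetical names, and the sketch uses AsyncLocal<T> (added in .NET 4.6, after this thread) as a stand-in for CallContext.LogicalSetData / LogicalGetData, on the assumption that both flow with ExecutionContext in the same way.

```csharp
using System;
using System.Threading;

// Hypothetical demo type, not part of CSLA.
public static class ContextFlowDemo
{
    // Thread-local slot: every thread sees its own copy.
    [ThreadStatic]
    private static string _threadLocalValue;

    // Flows with ExecutionContext (stand-in for CallContext logical data).
    private static readonly AsyncLocal<string> _flowedValue = new AsyncLocal<string>();

    public static (string ThreadLocal, string Flowed) ReadOnNewThread()
    {
        _threadLocalValue = "set by caller";
        _flowedValue.Value = "set by caller";

        string seenThreadLocal = null;
        string seenFlowed = null;

        // Thread.Start captures the caller's ExecutionContext by default.
        var worker = new Thread(() =>
        {
            seenThreadLocal = _threadLocalValue; // the new thread's own, empty slot
            seenFlowed = _flowedValue.Value;     // the value captured from the caller
        });
        worker.Start();
        worker.Join();

        return (seenThreadLocal, seenFlowed);
    }
}
```

In this sketch the worker thread sees no value in the thread-static field, while the flowed value arrives intact.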

Regarding outbound context, i.e. passing the context out of the executing thread: if I recall correctly, calling LogicalSetData on the CallContext will not make that value available to the caller thread, the one that offloaded work onto another thread. However, if a reference to an object such as a dictionary is already set on the caller thread, the executing thread should be able to add items or change properties through that reference. That's probably exactly what you want in the ApplicationContextManager implementation.
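A minimal sketch of that outbound behavior follows, again with AsyncLocal<T> assumed as a stand-in for the CallContext logical data slot and with OutboundContextDemo as a hypothetical name. Replacing the slot's value on the worker does not reach the caller, but mutating a dictionary the caller already put there does.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical demo type, not part of CSLA.
public static class OutboundContextDemo
{
    private static readonly AsyncLocal<Dictionary<string, string>> _context =
        new AsyncLocal<Dictionary<string, string>>();

    public static (bool SameInstance, string Added) RunWorker()
    {
        var callerDictionary = new Dictionary<string, string>();
        _context.Value = callerDictionary;

        var worker = new Thread(() =>
        {
            // Mutating the shared dictionary is visible to the caller.
            _context.Value["status"] = "written by worker";

            // Replacing the slot only changes the worker's copy of the context.
            _context.Value = new Dictionary<string, string>();
        });
        worker.Start();
        worker.Join();

        callerDictionary.TryGetValue("status", out string added);

        // The caller's slot still points at the original dictionary.
        return (ReferenceEquals(_context.Value, callerDictionary), added);
    }
}
```

Concurrent writers to such a shared dictionary would still need their own synchronization; the sketch avoids that by joining the worker before reading.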

While I agree with the problem you stated above about merging potentially multiple GlobalContext objects back into the caller's single instance, I believe it relates much more to the client side than to the server side. The biggest difference is that the server-side Data Portal has a logical call established no matter how many threads it switches between. Even if some parallelism occurs somewhere within the logical call and sets context values, that should be treated as a typical multithreading issue.

The fact that GlobalContext is not used much (yes, I've used it very rarely) doesn't mean it should be inconsistent.

Still, getting fully into this async game on the server side of the house would be a pretty bold achievement for CSLA.

- ngm

 

RockfordLhotka replied on Thursday, February 07, 2013

I agree that it would be very cool for the data portal to support async channels. This is something I've thought about off and on since WCF introduced the concept years ago, but it just has never floated to the top of my priority list.

It is challenging to consider that the client might call the data portal and not get a response for seconds, minutes, or hours (or days) because the server is totally async. If you take this to its logical conclusion there's a pretty fundamental impact on the overall design and expectations around the data portal.

For example:

order.ReadyToShip = true;
await order.SaveAsync();

This call could start an async workflow process on the server that won't complete until tomorrow when the shipping department is able to fulfill. Obviously the client won't sit around waiting until then :)

Not to say that supporting true async message-based server interactions isn't a good idea, because I think it is a good idea. But it is to say that this requires some careful thought.

What WCF bindings would be allowed? How would the ASP.NET Web API be supported? Do we still provide any support for synchronous channels, or completely switch to an async channel model? (probably yes, otherwise the data portal would act differently depending on your channel, and that'd be a mess)

ngm replied on Thursday, February 07, 2013

Rocky,

Async channels would be awesome, but that requires serious reengineering of the infrastructure.

But I'm not talking about async channels here at all. Maybe my explanation of the final goal is pretty muddy, but basically I'm talking about service asynchrony, usually called the async pattern in WCF terms. It's something the ASP.NET Web API you mentioned supports as well.

Here's one article on MSDN that describes exactly what I want to achieve here ( http://blogs.msdn.com/b/wenlong/archive/2009/02/09/scale-wcf-application-better-with-asynchronous-programming.aspx ). It's pretty old and covers APM rather than async / await, but it's the one I found, and the essence is the same.

- ngm

 

RockfordLhotka replied on Thursday, February 07, 2013

I think I accomplish the goal of that MSDN post. In fact, the data portal holds true to one of their tenets:

"One important thing to note is that WCF is a fully decoupled platform. This means that the client-side asynchrony has nothing to do with the service-side asynchrony"

The data portal does the same thing regardless of whether you use WCF or not. The client-side asynchrony and server-side asynchrony have nothing to do with each other.

In this thread our discussion is (as I understand it) focused on the server side, and specifically within the context of a WCF host.

The data portal requires that the top-level service/operation API be based on a synchronous WCF binding. As you and I note, async bindings would be awesome, but that's a pretty tall order... Not likely to happen anytime soon.

The data portal host (WcfPortal) creates an instance of Csla.Server.DataPortal and invokes its async methods, but with the .Result property so the invocation is synchronous at the top level. This allows the actual workflow to use all the cool async/await features, but ensures that the WCF service call itself doesn't return to the client until the server work is complete.

The entire flow of control from Csla.Server.DataPortal down to the SimpleDataPortal or FactoryDataPortal (the two types that invoke YOUR code) is a series of async methods.

This is because the IDataPortalServer interface is used to flow the calls through the pipeline, and all its methods return Task<DataPortalResult>. So pretty much by definition the entire call chain on the server is a series of async/await calls.

The exceptions being, again, at the top level the .Result property is used to prevent returning to the client prematurely. And at the bottom of the chain SimpleDataPortal and FactoryDataPortal examine YOUR DataPortal_XYZ or factory methods to see if they are 'void' or 'async Task' and the calls to your code are adjusted accordingly.

As a result, if your DataPortal_XYZ method is 'async Task' then SimpleDataPortal will await your method, and inside your method you can use the full set of async/await behaviors.
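The general idea of checking whether a target method returns a task and awaiting it only in that case can be sketched as follows. This is not the actual SimpleDataPortal code: AsyncAwareInvoker, SyncOrder and AsyncOrder are hypothetical names used only for illustration, and Task.Delay stands in for a real async database call.

```csharp
using System;
using System.Reflection;
using System.Threading.Tasks;

// Hypothetical helper, not a CSLA type.
public static class AsyncAwareInvoker
{
    public static async Task InvokeAsync(object target, string methodName, params object[] args)
    {
        MethodInfo method = target.GetType().GetMethod(
            methodName,
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
        if (method == null)
            throw new MissingMethodException(target.GetType().Name, methodName);

        // A 'void' method returns null here and has already finished.
        object returned = method.Invoke(target, args);

        // An 'async Task' method hands back its task, which is awaited
        // so the caller resumes only after the method's work completes.
        if (returned is Task task)
            await task;
    }
}

// Hypothetical business types with a sync and an async data access method.
public class SyncOrder
{
    public bool Loaded;
    private void DataPortal_Fetch(int id) { Loaded = true; }
}

public class AsyncOrder
{
    public bool Loaded;
    private async Task DataPortal_Fetch(int id)
    {
        await Task.Delay(10); // stand-in for an awaited database call
        Loaded = true;
    }
}
```

Either kind of method can then be called through the same awaitable entry point, for example `await AsyncAwareInvoker.InvokeAsync(order, "DataPortal_Fetch", 42);`.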

For example, you might await the result from a database call as suggested in the MSDN article. Behind the scenes Microsoft is nice enough to use IO completion ports in their implementation, so your await is truly async and won't block a worker thread.

At least so goes my understanding.

ngm replied on Thursday, February 07, 2013

Got you!

Now I know where confusion lies:

RockfordLhotka

"The data portal host (WcfPortal) creates an instance of Csla.Server.DataPortal and invokes its async methods, but with the .Result property so the invocation is synchronous at the top level. This allows the actual workflow to use all the cool async/await features, but ensures that the WCF service call itself doesn't return to the client until the server work is complete."

There's no need to hold the server-side request thread in order to keep the client's request alive, i.e. to avoid returning it. That's baked in, and even better baked in with Task-based contracts / service implementations. It used to be a total mess with APM and passing state between BeginXXX and EndXXX methods.

My PoC up there, where I modified WcfHost's Fetch, was:

    [OperationBehavior(Impersonation = ImpersonationOption.Allowed)]
    public async Task<WcfResponse> Fetch(FetchRequest request)
    {
      Csla.Server.DataPortal portal = new Csla.Server.DataPortal();
      object result;
      try
      {
        result = await portal.Fetch(request.ObjectType, request.Criteria, request.Context, true);
      }
      catch (Exception ex)
      {
        result = ex;
      }
      return new WcfResponse(result);
    }

- ngm

 

RockfordLhotka replied on Thursday, February 07, 2013

Two questions then.

First, does this change the service contract? I'm guessing not - it is just an internal implementation difference?

Second, it sounds like WCF is doing the same thing SimpleDataPortal is doing - i.e. detecting that the target method returns a task and therefore invoking it in an async-friendly manner. Do you agree?

ngm replied on Thursday, February 07, 2013

RockfordLhotka

Two questions then.

First, does this change the service contract? I'm guessing not - it is just an internal implementation difference?

That depends on what you mean by service contract. It doesn't change the actual service contract on the wire, i.e. the WSDL, but the interface of the service, IWcfPortal, has to be changed for sure. However, the proxy on the client can stay the same.

RockfordLhotka

Second, it sounds like WCF is doing the same thing SimpleDataPortal is doing - i.e. detecting that the target method returns a task and therefore invoking it in an async-friendly manner. Do you agree?

That's exactly it.

Anyway, that's how I found this blocking in the first place: by following all your awaits through the whole server-side call stack. It's not because my application server stopped serving requests ;)

- ngm

 

RockfordLhotka replied on Thursday, February 07, 2013

I made this change (a little trickier than expected because I had to split the .NET 4 from the 4.5 implementation).

It seems smooth enough - at least the existing tests pass and my sample apps run :)

ngm replied on Thursday, February 07, 2013

That's nice!

Yeah, .NET 4 is still in the APM world. But it shouldn't take too much to support that as well.

You'll need a separate, APM-based IWcfPortal for .NET 4, something like:

  [OperationContractAttribute(AsyncPattern = true)]
  [UseNetDataContract]
  IAsyncResult BeginFetch(FetchRequest request, AsyncCallback callback, object asyncState);

  WcfResponse EndFetch(IAsyncResult result);

Setting AsyncPattern = true on the attribute is important here.

That interface should be used both for the proxy (ChannelFactory) and the service (WcfPortal).

Implementation-wise, in WcfProxy you should be able to use Task.Factory.FromAsync to get a task from the BeginFetch / EndFetch methods on the proxy. That gets rid of all the BackgroundWorker clutter, and .NET 4 clients are not at the disadvantage of spinning up a separate thread.
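As a rough sketch of wrapping a Begin / End pair in a task, assuming a hypothetical FakeApmService in place of the real WCF channel and plain strings in place of FetchRequest / WcfResponse:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical APM-style service standing in for the WCF channel proxy.
public class FakeApmService
{
    public IAsyncResult BeginFetch(string request, AsyncCallback callback, object state)
    {
        // The state passed here becomes the task's IAsyncResult.AsyncState.
        var tcs = new TaskCompletionSource<string>(state);
        Task.Run(() => "response to " + request).ContinueWith(t =>
        {
            tcs.SetResult(t.Result);
            if (callback != null) callback(tcs.Task);
        });
        return tcs.Task;
    }

    public string EndFetch(IAsyncResult result)
    {
        return ((Task<string>)result).Result;
    }
}

public static class ApmToTask
{
    // Wraps the Begin/End pair in a Task without a BackgroundWorker or a dedicated thread.
    public static Task<string> FetchAsync(FakeApmService service, string request)
    {
        return Task<string>.Factory.FromAsync<string>(
            service.BeginFetch, service.EndFetch, request, null);
    }
}
```

The returned task can be awaited on .NET 4.5, or combined with ContinueWith on .NET 4.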

Oh boy, do I hate that BackgroundWorker, especially used so low in the Data Portal stack ;)

As for WcfHost, I would go with something along these lines:

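        public IAsyncResult BeginFetch(FetchRequest request, AsyncCallback callback, object state)
        {
            ...
            // Passing state here makes it the task's IAsyncResult.AsyncState, as APM expects.
            var tcs = new TaskCompletionSource<WcfResponse>(state);

            portal.Fetch(...request params...).ContinueWith(t =>
            {
                // As in the Task-based version, an exception becomes the response payload.
                tcs.SetResult(new WcfResponse(t.IsFaulted ? (object)t.Exception.InnerException : t.Result));
                if (callback != null) callback(tcs.Task);
            }, TaskContinuationOptions.ExecuteSynchronously);
            ...

            return tcs.Task;
        }

        public WcfResponse EndFetch(IAsyncResult result)
        {
            // The IAsyncResult is the Task<WcfResponse> returned from BeginFetch.
            return ((Task<WcfResponse>)result).Result;
        }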

Didn't try, but that should do it.

Please note that this pattern will not start an additional task / thread; it relies on any await lower in the call stack. However, if nothing is offloaded and no async IO is used, this BeginFetch might be long running, which is certainly not good practice. So you might decide to wrap the call to portal.Fetch asynchronously, just to ensure that BeginXXX returns as soon as the task has been started.

- ngm

 

RockfordLhotka replied on Friday, February 08, 2013

I don't think I'll put in the time/effort for .NET 4 though. Always looking forward, that's the way to be! :)

Copyright (c) Marimer LLC