Web Service data portal regularly hanging in our testing - how to troubleshoot?

Web Service data portal regularly hanging in our testing - how to troubleshoot?

Old forum URL: forums.lhotka.net/forums/t/8909.aspx

rsbaker0 posted on Tuesday, May 11, 2010

We're testing the web service data portal in pre-production and it is "hanging" on a regular basis for reasons we don't understand. The typical error message is "The operation has timed out", with a call stack like this:

   at System.Web.Services.Protocols.WebClientProtocol.GetWebResponse(WebRequest request)
   at System.Web.Services.Protocols.HttpWebClientProtocol.GetWebResponse(WebRequest request)
   at System.Web.Services.Protocols.SoapHttpClientProtocol.Invoke(String methodName, Object[] parameters)
   at Csla.WebServiceHost.WebServicePortal.Update(Byte[] requestData)
   at Csla.DataPortalClient.WebServicesProxy.Update(Object obj, DataPortalContext context)
   at Csla.DataPortal.Update(Object obj)
   at Csla.DataPortal.Update[T](T obj)
   at Csla.DataPortal.Execute[T](T obj)

Recyling the application pool for the service in IIS immediately corrects the problem (but we have it set to automatically reset on a regular basis anyway but perhaps that is not working).

The memory being taken up by the pool does not seem excessive enough to cause the problem (50-80MB typically), and after recycling the pool it almost immediately allocates about the same amount of memory again anyway and continues to operate for a time.

When the service goes "down", it doesn't even respond to pasting the web service address into a browser, so it's like the entire application pool is toast. (The portal we are testing is the only thing running in the pool).

Any suggestions for how I might troubleshoot this?  We use log4net and I'd be willing to log all the data portal calls for a period if I knew the best place to do this. (If the last logged call is always the same, this would be a good clue).

RockfordLhotka replied on Tuesday, May 11, 2010

If you are running 3.8 the best place to log is in an IAuthorizeDataPortal provider, since that is the first thing called on every data portal invocation.

"First thing" is relative. It is called after:

  1. IIS has done whatever it does to handle an inbound client request
  2. ASP.NET has done whatever it does to handle an inbound client request
  3. the asmx service has been loaded and invoked by ASP.NET
  4. CSLA has deserialized the inbound byte stream into an object graph
  5. the server-side ApplicationContext has been set

But still, it is the first place you can get into the call chain to log the request within the context of CSLA.

You could also get into the call chain in ASP.NET, global.asax events are invoked when using asmx services.

rsbaker0 replied on Tuesday, May 11, 2010

That's a good thought (and gave me exactly what I needed in terms of where to look), but our 3.8 development is still in progress and everyone is on 3.5.1 at the moment. Also, I'd like to be able to bracket each call (e.g. record the entry and exit to the extent possible with timestamps).

Making minor changes to and rebuilding CSLA itself is something we have done before. If I just go into the Csla.Server.DataPortal class and put simple logging calls into Create, Fetch, Update, and Delete, will I pretty much have covered it? (Seems like I would be somewhere between your 4 and 5 above).

RockfordLhotka replied on Tuesday, May 11, 2010

If you have custom base classes you can always put the calls into the pre- and post-processing DataPortal methods (DataPortal_OnDataPortalInvoke, etc).

rsbaker0 replied on Tuesday, May 11, 2010

I have an implementation in place and this is going to turn out to be very useful.  Based on what I am seeing, you might consider building some sort of formal logging interface into CSLA (rather like you did with your IAuthorizeDataPortal, with the default implementation being a NOP).

I had previously corrected a problem with SetLogicalExecutionLocation to store the "server depth" in the LocalContext to handle re-entrancy issues. So, in addition to being able to log the entry and exit of each data portal call, I can use this for the call depth and am using this to "indent" the log entries with padding so you can see the depth of each call. So the calls appear as tree structure in the log file.

I also added a static request number that increments with each call so you can match an entering call with another one potentially much later in the log as well as see when the application pool has been recycled and the request numbers start over again.

Since the calls are timestamped, it's almost like having a full profiler trace of the data portal activity  (essentially a high level object trace of the entire application)available on demand. Exceptions that occur during the calls are logged also.  This will be very helpful for diagnostics in general. I can post more details if anyone is interested (although there wasn't much too it).

Now if it will only help me catch this particular problem...



rsbaker0 replied on Wednesday, May 12, 2010

The offending web service was hung again this morning.  I captured some information from our SQL Server before recycling the application pool.

Evidently, there were some 60+ connections to the database from the ADO.NET connection pool on the machine hosting the web service, but no open transactions, and most of the connections last activity was not current.

Ordinarily, there are only 4-5 connections shown in this pool (I'm looking at it now). Also, the queries were trivial (and in fact the most common one only occurs in one place, the log in screen for the application).  Of course, it's possible that I have some sort of connection leak in the infrastructure, but you would think this would show up elsewhere (e.g. the client  application running over the local portal uses the identical BO/data layer).

Any other thoughts on what might cause the suddenly cause the connection pool to grow (or what I might look at the next time it happens to try to gather additional diagnostic information?)

Copyright (c) Marimer LLC