Remoting hang under high volume / ConnectionManager

Old forum URL: forums.lhotka.net/forums/t/12896.aspx


ajj3085 posted on Monday, June 08, 2015

We're having a problem introducing ConnectionManager to our existing application.  While root objects usually share the SqlConnection with their children, there are a lot of places where other BOs get involved, which end up opening their own connections.  We've introduced ConnectionManager in another application successfully, and this resolved the connection pool being exhausted (it was already set to 500) and seemed to improve performance a bit too.
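
The sharing idea described above can be sketched as a small reference-counted holder. This is only a language-neutral model written in Python, not CSLA's actual ConnectionManager code; `SharedConnectionManager`, `FakeConnection`, and the `"MyDb"` name are hypothetical and exist purely for illustration. The point it shows is that nested callers asking for the same database receive the same connection, and the connection closes only when the outermost caller releases it.

```python
class FakeConnection:
    """Stand-in for a real database connection (hypothetical, for illustration)."""
    opened = 0  # counts how many physical connections were created

    def __init__(self, database):
        FakeConnection.opened += 1
        self.database = database
        self.is_open = True

    def close(self):
        self.is_open = False


class SharedConnectionManager:
    """Toy reference-counted connection holder; not CSLA's implementation."""
    _managers = {}  # one manager per database name within the current context

    def __init__(self, database, open_connection):
        self.database = database
        self.connection = open_connection(database)
        self.ref_count = 0

    @classmethod
    def get_manager(cls, database, open_connection=FakeConnection):
        # Reuse the existing manager for this database if one is active.
        manager = cls._managers.get(database)
        if manager is None:
            manager = cls(database, open_connection)
            cls._managers[database] = manager
        manager.ref_count += 1
        return manager

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Close the connection only when the last user is done with it.
        self.ref_count -= 1
        if self.ref_count == 0:
            self.connection.close()
            del self._managers[self.database]
        return False


def fetch_root_with_child():
    """A root object and its child both ask for a manager for the same database."""
    with SharedConnectionManager.get_manager("MyDb") as root_mgr:
        with SharedConnectionManager.get_manager("MyDb") as child_mgr:
            same = root_mgr.connection is child_mgr.connection
            open_inside = child_mgr.connection.is_open
        open_after_child = root_mgr.connection.is_open
    return same, open_inside, open_after_child, root_mgr.connection.is_open
```

In this model, any caller that takes a manager and never releases it keeps the reference count above zero, so the underlying connection would stay open indefinitely.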

However we're having trouble in our main app, which has a much larger user base.  I think we've narrowed the issue to some BOs which are used by the client to get updated information frequently, approximately every minute or so.

The issue seems to be that sometimes under high load, the ASP.NET AppPool hosting the remoting (yes, remoting) site hangs.  New remoting connections are accepted, but stay forever waiting for a response, and everything just stops.  The only solution is doing an IISRESET, which hard kills the w3wp.exe process hosting the remoting site (it doesn't respond to a normal recycle either, although IIS still thinks things are fine).

I've had a look at the Csla code, and I see it's doing some locking which is necessary on 2-tier apps but probably less relevant for hosting under ASP.NET.  I've seen the thread about that here too, so I think it's unlikely to be our issue.  I'm wondering if somehow async / await in combination with ConnectionManager is causing connections not to be returned to the pool (we have some errors logged indicating this).

Anyone else encountered this?

RockfordLhotka replied on Monday, June 08, 2015

Is there a way to try using WCF instead of Remoting? Microsoft hasn't recommended using Remoting for client/server calls since around 2004, so perhaps there's some issue lurking there?

ajj3085 replied on Tuesday, June 09, 2015

RockfordLhotka
Is there a way to try using WCF instead of Remoting? Microsoft hasn't recommended using Remoting for client/server calls since around 2004, so perhaps there's some issue lurking there?

Well we are going to be moving to HttpProxy (to support Xamarin clients) but that is a bit down the road so in the short term we're stuck with remoting.  Do you think ConnectionManager isn't playing nice specifically with remoting though?  We did a hotfix to remove the new users of ConnectionManager and it seems to have resolved the hang issue.

RockfordLhotka replied on Tuesday, June 09, 2015

My suggestion around remoting is just to eliminate one possible issue.

But if you are confident that it is ConnectionManager, that's pretty telling.

Do you have any idea what specifically is causing the issue?

ajj3085 replied on Tuesday, June 09, 2015

RockfordLhotka
My suggestion around remoting is just to eliminate one possible issue.

But if you are confident that it is ConnectionManager, that's pretty telling.

Do you have any idea what specifically is causing the issue?

Yes, I'm fairly confident it's ConnectionManager.  The issue seems to be, ironically, that the connections are not being properly returned to the pool.  We already have the pool size at 500, from the 100 default.

Last year, when we were still on Csla 3.7 we tried to do a wholesale conversion from manually new'ing up SqlConnection to using ConnectionManager.  Quite literally a find/replace.  It worked well for a few days, but then we got these hangs.  We did see some errors logged about timeouts trying to get a connection from the pool.  We reverted the changes and redeployed, and things went back to normal.  The timing of the hang seems to be as our load increases (when our East coast customers begin coming online through when our West coast customers do).
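
The symptom described above (timeouts getting a connection from the pool as load builds) is what any bounded pool produces when some connections are never returned. The following is a minimal sketch in Python under stated assumptions: `BoundedPool`, `PoolTimeout`, and `run_requests` are hypothetical names, the pool is a simple queue of slots rather than the real ADO.NET pool, and the leak rate is invented for illustration. It does not diagnose where the leak in our application actually is.

```python
import queue


class PoolTimeout(Exception):
    """Raised when no pooled connection becomes free within the timeout."""


class BoundedPool:
    """Toy fixed-size pool; each slot stands in for one pooled connection."""

    def __init__(self, max_size):
        self._slots = queue.Queue(maxsize=max_size)
        for slot_id in range(max_size):
            self._slots.put(slot_id)

    def acquire(self, timeout):
        # Wait up to `timeout` seconds for a free slot, then give up.
        try:
            return self._slots.get(timeout=timeout)
        except queue.Empty:
            raise PoolTimeout("timed out waiting for a pooled connection")

    def release(self, slot_id):
        self._slots.put(slot_id)

    def available(self):
        return self._slots.qsize()


def run_requests(pool, count, leak_every):
    """Run `count` sequential requests; every `leak_every`-th one never releases.

    Returns (completed_requests, first_request_that_timed_out_or_None).
    """
    completed = 0
    for n in range(1, count + 1):
        try:
            slot = pool.acquire(timeout=0.01)
        except PoolTimeout:
            return completed, n
        if n % leak_every != 0:
            pool.release(slot)  # the normal path returns the connection
        completed += 1
    return completed, None
```

With a pool of 5 slots and a leak on every tenth request, the first 50 requests succeed and the 51st is the first to time out. A larger pool only delays that point, which would be consistent with the hang appearing after a few days and at busy times even with the pool size raised to 500.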

We had hoped that our planned Csla 4.5 update would have some changes that would resolve this.  Since upgrading about six months ago (we are on the most current version), we've been slowly converting things to use ConnectionManager as we touch the BOs for other reasons.

We have a bunch of background threads (our SaaS app is a WinForms client) polling for things at various times.  In this last release we did some refactoring and added one more, and moved something that was not previously in the DB to the DB; both of these threads are basically new hits to the DB, and we did them with ConnectionManager.  The hangs returned, and I did find more errors indicating timeouts getting a connection from the pool.  I changed the server code used by these two threads to just new up the connection, and our hang has gone away again, so far.  We'll see; we just did the hotfix on Sunday, so I want to give it a few more days, but based on our previous experience I don't think the issue will return.

So, that's all I have, and of course we've only seen this in production, so we haven't had a lot of opportunity to investigate as we have to get the system back online.  The issue does seem to only manifest during our busy times, which are mornings and especially weekends (Saturday / Sunday), so I suspect load has something to do with it.

RockfordLhotka replied on Tuesday, June 09, 2015

I'll add an issue in GitHub so we remember to look through the ConnectionManager code at some point. Clearly this won't be easy to find/fix unless we get very lucky.

RockfordLhotka replied on Tuesday, June 09, 2015

https://github.com/MarimerLLC/csla/issues/377

ajj3085 replied on Wednesday, June 10, 2015

RockfordLhotka
I'll add an issue in GitHub so we remember to look through the ConnectionManager code at some point. Clearly this won't be easy to find/fix unless we get very lucky.

Thanks Rocky.  I suspect if we're the first ones hitting this it probably isn't too important; and hopefully as we move to EF anyway this issue will go away.

Copyright (c) Marimer LLC