Here is a brief rundown of the web service:
The web service encapsulates calls to a Windows Workflow Foundation (WF) library. Version 1 used WF's native ability to create web services, but the only way you can do authentication there is to use impersonation for the ASPNET worker process. That wasn't an option, and it also meant that anyone accessing the web service had to have an AD account; the Java and PHP applications hitting this can't easily pass Windows credentials to IIS. Hence version 1.1.
We already had a security framework in place that uses CSLA, so we decided to use that and pass the user name/password in a header, pretty much exactly like the PT web service does.
So we had about 75 unit tests, we implemented the security framework, and everything worked great, until we tried to see how this would perform under load compared to the original version 1 implementation. We knew it would be slower because of the extra security checks, but that was acceptable.
When we ran the load test (100 users, step load) we immediately started seeing problems. The unit tests were failing with 'not authenticated', 'bad user name' (the common one you see when you forget to turn on CSLA authentication), and all sorts of other things that indicated the security principal was being lost. Keep in mind that the unit tests in the load test all pass outside of a load test. Also keep in mind that not all the tests are failing, only roughly half, and the ones that fail vary over time. For instance, TestA may pass 5 times in a row, fail once, pass 3 times, fail 4 times, pass 12 times, etc.
So what appears to be happening, and this goes against everything I thought I knew about how web applications/web services execute, is that under load we're running into this kind of condition:
Test A Starts ---------- Test A does security check ------- Test A Ends
Test B Starts -------------- Test B does security check ------- Test B Ends
Test C Starts -------------------- Test C does security check ----- Test C Ends
Now if you expand this to what eventually becomes 100 simultaneous users, what I think is happening is this: in Test A, we've switched the ApplicationContext.User to an anonymous user (which is a CslaPrincipal) in order to do authentication (we use a remote portal, so we have to use CslaPrincipal; in this particular load test, however, we're using the local portal instead). Sometimes, while it's getting ready to make a call to the data portal, another request comes in and changes HttpContext.Current.User back to the generic principal, and that principal gets passed to and used by the DataPortal instead of the anonymous CSLA principal.
Now this goes against everything I know about web apps/services. The way I understood it, each request coming in is handled independently, so HttpContext.Current.User will always refer to the user that started the request, even if other simultaneous requests are occurring. But the behavior looks more like HttpContext.Current.User is whichever user most recently hit the system, and if you have enough simultaneous users, you will see HttpContext.Current.User differ from the beginning of the method to the end. And we all know that Csla.ApplicationContext.User maps to HttpContext.Current.User in a web app/service, so if this is the case, it would explain the errors we were getting under load.
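To make the model I had in my head concrete, here is a minimal sketch. It is in Python rather than our actual C#/CSLA code, and it assumes one thread handles one request from start to finish, with the "current user" kept in per-thread storage (`threading.local`, standing in for what I expected from HttpContext.Current.User). The names `handle_request`, `run_simulated_load`, and the user names are made up for illustration.

```python
import threading
import time

# Per-thread storage: stands in for a per-request "current user".
# Assumption: one thread handles one request from start to finish.
_context = threading.local()


def handle_request(user, results):
    """Set the current user, do some 'work', then read the user back."""
    _context.user = user           # request starts: principal is set
    time.sleep(0.01)               # other requests run during this pause
    results[user] = _context.user  # security check: who is the current user?


def run_simulated_load(count=20):
    """Run `count` overlapping requests and return what each one observed."""
    results = {}
    threads = [
        threading.Thread(target=handle_request, args=(f"user{i}", results))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


observed = run_simulated_load()
# Each request sees only the user it set, no matter how requests overlap.
print(all(name == seen for name, seen in observed.items()))
```

With per-request storage, every simulated request reads back the user it set, which is the behavior I expected from the web service and not what the load test seemed to show.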
I was able to get everything working by putting a lock around the call that authorizes the user through our CSLA library. I know this is terribly ugly and you should NEVER lock ASP.NET threads (thankfully none of our code inside the lock is long-running), but this is beyond my level of knowledge and I don't know how else to fix it.
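The shape of that workaround, again as a Python sketch and not our real code: it assumes a single shared "current user" that any request can overwrite (the behavior I suspected), and it uses hypothetical names (`current_user`, `auth_lock`, `authorize`, `run_locked_load`).

```python
import threading
import time

# Shared, process-wide "current user": models the behavior I suspected,
# where one request can overwrite the principal another request is using.
current_user = None
auth_lock = threading.Lock()


def authorize(user, results):
    """Switch the shared principal, pause, then check it, all under the lock."""
    global current_user
    with auth_lock:               # only one request authorizes at a time
        current_user = user       # switch to this request's principal
        time.sleep(0.005)         # window where another request could interfere
        results[user] = current_user


def run_locked_load(count=20):
    """Run `count` overlapping requests through the locked authorize call."""
    results = {}
    threads = [
        threading.Thread(target=authorize, args=(f"user{i}", results))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


locked = run_locked_load()
# Under the lock, each request reads back the principal it set.
print(all(name == seen for name, seen in locked.items()))
```

The lock makes the check consistent, but it also serializes every authorization, which is exactly why I consider it ugly.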
I know there has to be someone else out there who has a web app/service and uses CSLA authentication and I'm curious if they've load tested and found this, or if they have any ideas what I may be doing wrong here.
This actually turned out to be a red herring, caused by a 'catch' statement in our security framework that was swallowing a deadlock condition in the database. Once this was fixed, everything worked the way I thought it should to begin with.
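For anyone who hits the same thing, the pattern looked roughly like this. It is a Python sketch, not our real C# code; `DeadlockError`, `load_user`, and both `authenticate_*` functions are hypothetical names, and the second function is just one way to avoid hiding the error, not necessarily what our framework does now.

```python
class DeadlockError(Exception):
    """Hypothetical stand-in for the database reporting a deadlock."""


def load_user(name, fail=False):
    """Pretend data-access call; `fail` simulates a deadlock under load."""
    if fail:
        raise DeadlockError("transaction was chosen as deadlock victim")
    return {"name": name, "authenticated": True}


def authenticate_swallowing(name, fail=False):
    """Problem pattern: any error is hidden and looks like a failed login."""
    try:
        return load_user(name, fail)["authenticated"]
    except Exception:
        return False  # the deadlock is lost here


def authenticate_propagating(name, fail=False):
    """No catch-all, so the real error reaches the caller."""
    return load_user(name, fail)["authenticated"]


# Looks like 'not authenticated' even though the real cause is a deadlock.
print(authenticate_swallowing("alice", fail=True))

try:
    authenticate_propagating("alice", fail=True)
except DeadlockError as exc:
    print(f"real cause: {exc}")
```

With the catch-all in place, a deadlock under load is indistinguishable from a bad login, which is what sent me chasing the principal.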
Love those!
That is good news! Thanks for posting the resolution to the problem. I was concerned about this thread before you made your 2nd post.
Joe