batchqueue or worker thread for expensive dataportal method

Old forum URL: forums.lhotka.net/forums/t/4523.aspx


AzStan posted on Wednesday, March 19, 2008

I have an expensive database script that I need to run when a certain business object is updated.  I want the cheap part of the update to run and then return to the client.  The expensive method will run and then generate an e-mail message with the results.  The expensive operation involves the complete scripting of a new database.

It seems that the old BatchQueue service/assembly from Rocky's 2004 Expert C# Business Objects would provide this capability, but might be overkill for what may just be a long running worker thread process. 

If I do run it as a worker thread, what are the issues associated with launching (and leaving) such a thread from within IIS?  Should I use Delegate.BeginInvoke, Thread.Start, or launch it as a standalone executable?

Thanks in advance.

tmg4340 replied on Wednesday, March 19, 2008

I am not an expert here, but I think trying to do what you want could prove problematic in an IIS environment.  IIRC, IIS doesn't take into account worker threads that your process starts when determining if it's done handling the request.  IIS is stateless, so essentially every call to the server can be considered a single transaction.  Once the IIS process is done processing the request, it tears down all the scaffolding it created to handle that call.  If your worker thread isn't done by then, it's basically aborted, and your long-running process will get cut off wherever it is.  I know that ASP.NET 2.0 introduced an asynchronous page model, but I don't think that'll get you what you want.

Any proposed solution is going to depend on what your client is written in.  If it is a web client, then you're going to have issues, for the reasons I just discussed.  I'm honestly not sure what to recommend - the typical solutions in this area usually revolve around some sort of message-based solution using MSMQ and something like BatchQueue.  That obviously introduces another couple of layers of complexity to the overall architecture.

If it is a Windows client (and you're hosting your remote DataPortal within IIS), I'd probably investigate separating the two calls on your client.  The first "cheap update" would run and do its thing.  Upon successful completion, you could then issue a CommandBase-derived object call to do your long update on a separate thread on your client, using the technique of your choice to accomplish that.  Then your client can keep the worker thread going, waiting for your remote DP call to return.  Ultimately, you're not getting a return value, so all you're waiting for is to see whether you get an exception, and doing whatever you need to do to handle that.
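A rough sketch of that idea, assuming a CSLA-style CommandBase executed through the generic DataPortal.Execute call (the class name, its argument, and the LogFailure helper are all made up for illustration):

```csharp
using System;
using System.Threading;
using Csla;

[Serializable]
public class ScriptDatabaseCommand : CommandBase
{
  private string _customerId;

  public ScriptDatabaseCommand(string customerId)
  {
    _customerId = customerId;
  }

  // Invoked on the server side by the remote data portal.
  protected void DataPortal_Execute()
  {
    // ... run the expensive database scripting here,
    // then send the results e-mail ...
  }
}

// Client side (inside some method): run the remote call on a
// background thread so the UI stays responsive.  There is no
// return value; all we care about is whether it throws.
ThreadPool.QueueUserWorkItem(delegate
{
  try
  {
    DataPortal.Execute<ScriptDatabaseCommand>(
      new ScriptDatabaseCommand("ACME01"));
  }
  catch (Exception ex)
  {
    LogFailure(ex);  // hypothetical error-handling hook
  }
});
```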

The only other thing you'd have to investigate is any threading issues with the client DataPortal.  I believe Rocky has mentioned that the client DP is not thread-safe, and if your long-running process is long enough, it's conceivable the user could issue another DP call on your UI thread while your long-running call is still going.  In the end, it shouldn't be a big issue for what you're trying to accomplish, but it would certainly be worth testing.  I don't think you can deal with that with a custom proxy or transport implementation.

HTH

- Scott

JoeFallon1 replied on Thursday, March 20, 2008

I agree with Scott's comments.

I would definitely split things in two.

I have a table where I can make an entry saying that the long running job needs to occur.

Then a scheduled program runs every hour (or once a day or every 10 minutes or ....).

This program checks that table for work and then performs it if a record exists. Think of it as a simple way to implement MSMQ.
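A minimal sketch of that table-polling pattern, assuming a SQL Server work table (all table, column, and method names here are invented for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

// Body of the scheduled task: poll a work table, atomically claim
// any pending job, and run it.
static void ProcessPendingJobs(string connectionString)
{
  List<int> jobIds = new List<int>();
  using (SqlConnection cn = new SqlConnection(connectionString))
  {
    cn.Open();

    SqlCommand select = new SqlCommand(
      "SELECT JobId FROM PendingJobs WHERE Status = 'Pending'", cn);
    using (SqlDataReader dr = select.ExecuteReader())
    {
      while (dr.Read())
        jobIds.Add(dr.GetInt32(0));
    }

    foreach (int jobId in jobIds)
    {
      // Claim the row first so a second scheduler instance (or an
      // impatient user re-queuing the job) can't run it twice.
      SqlCommand claim = new SqlCommand(
        "UPDATE PendingJobs SET Status = 'Running' " +
        "WHERE JobId = @id AND Status = 'Pending'", cn);
      claim.Parameters.AddWithValue("@id", jobId);
      if (claim.ExecuteNonQuery() == 1)
      {
        RunExpensiveJob(jobId);  // hypothetical worker method
      }
    }
  }
}
```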

Joe

AzStan replied on Thursday, March 20, 2008

Thanks to Scott and Joe, your comments are helpful.

I had anticipated the need for a state management database table, similar to the one Joe mentioned, to prevent an impatient user from restarting a job that is in progress.  I'm researching the use of MSMQ to run the long job, but your frequently scheduled task would certainly get the job done.
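For reference, the MSMQ route might look roughly like this System.Messaging sketch, with the web request only enqueuing a message and a separate process doing the work (the queue path and message format are invented for illustration):

```csharp
using System;
using System.Messaging;

const string path = @".\private$\LongJobs";  // illustrative queue name

// Enqueue (the cheap part, done during the web request):
if (!MessageQueue.Exists(path))
  MessageQueue.Create(path);
using (MessageQueue q = new MessageQueue(path))
  q.Send("ScriptDatabase:ACME01");

// Dequeue (in a separate service or process, outside IIS):
using (MessageQueue q = new MessageQueue(path))
{
  q.Formatter = new XmlMessageFormatter(new Type[] { typeof(string) });
  Message m = q.Receive();  // blocks until a message arrives
  string job = (string)m.Body;
  // ... run the expensive work described by 'job' ...
}
```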

I am building a web client app in this case, so I can't have the client component re-issue the request for the long-running job.  I think I would still prefer a single call to the DataPortal anyway.

Thanks, Scott, for saving me the trouble of learning (the hard way) that IIS would kill my worker thread. 

On a more general note: I've always thought that Rocky's BatchQueue service was pretty cool, and I have wanted to implement it in an app.  I'm wondering what approach he would take today, given our new 3.0 framework.

tmg4340 replied on Thursday, March 20, 2008

I wouldn't necessarily take my IIS comments as gospel; I'm not a web expert.  I'm going off of memory here - I remember reading something like this out on the web somewhere, but it's been a while.  But even if I am wrong, I think you still have a problem, since HTTP is still a single request-response model.  Even if you can spawn a worker thread to run your long-running job, it's still within your IIS process, so the server will still wait for that thread to complete before returning to the browser.  So your worker thread doesn't gain you anything there.  And I don't know of any way to spawn an entirely new process from IIS, which is basically what you would need.

If a scheduled polling task will get the job done, I'd probably go with that.  MSMQ isn't hard, but there's no need to introduce another technology to the party unless you really need it.  And implementing some protection against restarts may actually be easier using a database table instead of writing MSMQ message management code.

- Scott

ajj3085 replied on Thursday, March 20, 2008

Actually, I think the worker thread would continue running and the page thread would return the page when it's done.  IIRC.  So the only drawback is that you have an expensive process running on one of your IIS threads, which hurts performance and may introduce other issues if another thread is spun up to deal with the same job.

That said, your suggestion is certainly the way to go; anything that can be really intensive should not be done within the IIS context, but rather in something else, perhaps a Windows service.  Not only does this keep IIS performing better, it gives you the chance to move said service to another machine entirely if the need presents itself.
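A skeleton of that kind of Windows service, polling for queued work on a timer (the class name, interval, and CheckForWork body are illustrative only):

```csharp
using System;
using System.ServiceProcess;
using System.Timers;

// Minimal Windows service that polls for queued work outside of
// IIS, so long-running jobs never tie up a web request thread.
public class JobRunnerService : ServiceBase
{
  private Timer _timer;

  protected override void OnStart(string[] args)
  {
    _timer = new Timer(60000);  // poll every minute
    _timer.Elapsed += delegate { CheckForWork(); };
    _timer.Start();
  }

  protected override void OnStop()
  {
    _timer.Stop();
  }

  private void CheckForWork()
  {
    // ... query the job table and run any pending work here ...
  }

  public static void Main()
  {
    ServiceBase.Run(new JobRunnerService());
  }
}
```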

I'm not a web expert either though. :)

Copyright (c) Marimer LLC