I'm trying to implement a background worker in an app that retrieves potentially large read-only collections and displays them in a data grid.
It's pretty easy to implement, but I'm stuck with a bit of a problem: how do you go about cancelling the retrieval?
Since the business object might be across a boundary on a data portal, it seems like a bad idea to hook it up to the events of BackgroundWorker.
Any ideas or hints on how you have handled this would be appreciated. If I learn anything more I'll post it here; I'm sure it will be of interest to others at some point.
This two-part article on the Infragistics site explains how to use the BackgroundWorker component with their grid, though I don't think it is specific to the UltraWinGrid. It does not address the data portal issue, but it still might be helpful information.
http://devcenter.infragistics.com/Support/KnowledgeBaseArticle.aspx?ArticleID=9838
http://devcenter.infragistics.com/Support/KnowledgeBaseArticle.aspx?ArticleID=9839
Thanks Steven, I saw those (I'm using the same grid, actually). It's a good article, but it addresses a different problem that I think is again hindered by using a data portal. In their article you can fill a grid dynamically in a background worker and cancel at any time as it loads each record, so the end user sees the grid filling up and can cancel at any point.
That seemed ideal, but because the read-only collection returns the complete object, and does so in the data portal fetch method, it could be running on a remote data portal and might not work very well with the events involved in the BackgroundWorker.
I've actually timed it, and the majority of the time is spent filling the collection after the database query has returned the result set. So I'm contemplating moving the collection-filling code out of the data portal and into the shared fetch method, so I could hook into it with the background worker. But I have at least 65 of those collections in my app, all debugged and in production, so instead I was thinking: what if it just does as it does now (all at once when it's done, instead of bit by bit as in that article on the Infragistics site), but the user can cancel the operation (so they can apply more filters to the query if it's taking too long)?
I did find the answer and replied to my own post with it: someone has extended the BackgroundWorker class, and I followed that concept.
Found the answer after doing some more searching: an immediately stoppable background worker:
http://weblogs.asp.net/rosherove/articles/BackgroundWorkerEx.aspx
I would be careful with a scheme like that. I didn't look at his code, so I don't know for certain, but odds are good that he's aborting the worker thread. That is dangerous in general, and is particularly problematic in a distributed setting (data portal, web service, you name it).
The reason aborting a thread is dangerous in general is that there are a few rare cases where you can get .NET itself into an indeterminate state (read: cause unpredictable crashes). I don't recall the details, but they have to do with aborting the thread while it is in the middle of constructing a new object - a new business object, a new String, or whatever.
The reason aborting a client-side thread that has made a synchronous server-side call is dangerous is that all you've done is aborted the client-side thread. The server-side call is still active, and so the processing will continue on the server until it is complete. Then the results are essentially discarded into the bit bucket. I've personally seen cases where the server-side components got confused by the lack of a client and went into infinite loops and stuff like that - though not with remoting or web services.
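The safe alternative to aborting is cooperative cancellation: the worker checks a cancel flag at well-defined points and exits cleanly, so nothing is killed mid-construction and no server call is orphaned. Here is a minimal language-agnostic sketch in Python; the flag, the fetch loop, and the page source are all illustrative stand-ins, not CSLA or BackgroundWorker APIs:

```python
import threading

cancel_requested = threading.Event()  # would be set by the UI's Cancel button

def fetch_all(page_source):
    """Cooperative cancellation: check the flag between units of work
    instead of aborting the thread mid-operation."""
    results = []
    for page in page_source:
        if cancel_requested.is_set():
            break  # stop cleanly at a safe point
        results.extend(page)
    return results

def pages():
    """Fake page source that simulates the user cancelling mid-retrieval."""
    yield [1, 2]
    cancel_requested.set()   # pretend the user clicked Cancel here
    yield [3, 4]
    yield [5, 6]

print(fetch_all(pages()))  # [1, 2] - later pages are never consumed
```

The key property is that the worker only ever stops at points it chose itself, so a synchronous server call in progress is allowed to finish before the loop notices the flag.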
I think a far better solution is to "chunk" your requests for data. Rather than having your read-only collection's factory directly call the data portal with a criteria object, instead you can have it call either another (private) read-only object or a CommandBase-derived object that goes to the server and gets just a chunk or page of data. When it returns, you add that data to the actual read-only collection and then repeat to get the next page. When there are no more pages of data, you are done.
This allows for cancellation, because between each page there's a point at which the factory method can decide not to get any more pages.
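The chunked-factory idea above can be sketched roughly as follows. This is a shape-of-the-solution illustration in Python, not CSLA code: `fetch_page` stands in for the CommandBase-style server call, and `is_cancelled` for whatever the UI exposes:

```python
PAGE_SIZE = 2

def fetch_page(source, page_number):
    """Stands in for a server round trip (a CommandBase-style call)
    that returns one page of rows. Hypothetical helper, not a CSLA API."""
    start = page_number * PAGE_SIZE
    return source[start:start + PAGE_SIZE]

def get_collection(source, is_cancelled=lambda: False):
    """Factory method: accumulate pages until an empty page comes back,
    checking for cancellation between server round trips."""
    collection, page = [], 0
    while True:
        if is_cancelled():
            break                  # user bailed out between pages
        chunk = fetch_page(source, page)
        if not chunk:
            break                  # no more pages: done
        collection.extend(chunk)
        page += 1
    return collection

rows = ["r1", "r2", "r3", "r4", "r5"]
print(get_collection(rows))                             # all five rows
print(get_collection(rows, is_cancelled=lambda: True))  # [] - cancelled up front
```

Each server call is short-lived and runs to completion, so nothing is left dangling on the server when the user cancels.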
dean: If you're lucky enough to be using SQL 2005, its new ROW_NUMBER() capability can apparently make this approach much easier - see this article:
Custom Paging in ASP.NET 2.0 with SQL Server 2005
http://aspnet.4guysfromrolla.com/articles/031506-1.aspx
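For SQL Server 2005 specifically, the page boundaries can be computed client-side and baked into a ROW_NUMBER() query. A rough sketch of building such a query (the table and column names are placeholders, and a real DAL would use parameters rather than string formatting):

```python
def paged_query(table, order_by, page, page_size):
    """Build a SQL Server 2005 ROW_NUMBER() paging query.
    Identifiers here are illustrative placeholders."""
    first = page * page_size + 1
    last = first + page_size - 1
    return (
        "SELECT * FROM ("
        f"SELECT *, ROW_NUMBER() OVER (ORDER BY {order_by}) AS RowNum "
        f"FROM {table}) AS numbered "
        f"WHERE RowNum BETWEEN {first} AND {last}"
    )

print(paged_query("Customers", "LastName", page=0, page_size=50))
```

As the thread notes below, this only helps on SQL Server 2005 and up; Firebird and SQL 2000 would each need their own paging dialect behind the DAL.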
Dean
Unfortunately we use a data access layer, so we're supporting SQL 2000 and up, as well as Firebird and probably Oracle down the road. Everything becomes so much trickier when you're supporting multiple databases. I'll have a look, though, and see if I can get some inspiration from it for alternatives. Thanks for replying.
Hmmm... definitely something to ponder and test out. I'd hate to cause an IIS server to crash when remoting; definitely not a good thing.
We initially just went with giving people the ability to filter their request, but people being people, they seem to always just go for the whole list, and in some cases it takes minutes for the request to finish.
I'm refactoring what is dozens of read-only collection based grid forms into as few forms as possible - ideally one - to cut down on redundant code and make the port to the web interface as easy as possible. It's amazing how quickly a project can grow beyond what you expected, and it's seemingly so much easier to just crank out another form when you need one.
I'd love to do the chunking thing, but I'll really have to ponder how to make it work.
For it to really work, the business object would have to report back to the client after every chunk with the results of that chunk and check whether the end user has cancelled. This seems to be moving some of the business object's functionality into the client, in a way.
Plus I'm supporting both SQL server and Firebird so I'm not sure how exactly to split the query into chunks in a DAL scenario.
Time for some experimentation I guess.
I hope this off-the-cuff response is helpful.
Thank you, it gives me much to ponder. Unfortunately we're using CSLA 1.3 (plus many modifications); there are no data portal events in 1.3 AFAIK, so I'd have to look at how that works in CSLA 2.x and see what can be done with it.
I've been looking over the 2.x book which I just got last week and it's clearly almost entirely different than the code base we're using now.
I'm considering porting the app to CSLA 2.x, but that won't happen for at least a year; the app was just released in January after two years of development and is quite huge, so it's not going to be an easy turnaround by any means.
However, all in all I think the best idea is forcing, or warning, the user if no filter is applied. I can add that functionality quite easily, and then it's clear to the user that they should be prepared to wait.
I've also identified that the largest bottleneck with large read-only collections is the code in the fetch that adds the SafeDataReader records to the collection itself, so I'm looking at how to speed that up. The actual SQL query is pretty fast on either Firebird or MSSQL for all but the most staggeringly large result sets.
Thanks again for replying so thoroughly to my dilemma.
You may want to look at upgrading to v1.52, as I believe two DataPortal events were added in v1.5.
For the record, I've been playing with this and a profiler, and have discovered that it's a very fast operation to count the result set without actually retrieving it, in both Firebird and MSSQL (our app supports both): same query, same criteria system, etc., just no columns specified and a COUNT(*) instead, which works on both platforms.
In fact, a query that took 62 seconds on Firebird (8 seconds on SQL Express!) to query and populate the read-only collection takes only 1.5 seconds to get the count for the same query on Firebird, and a fraction of that on MSSQL.
So what I'm thinking is I'm going to add a global, user-configurable setting to the application that pops up a warning if the potential result is over X number of records. We'll default it to 1000, or whatever in the end seems appropriate, and then let the end user adjust it accordingly, or turn it off completely if desired.
This way they get a warning before a potentially time-consuming result set, no warning before a minimally time-consuming one, I don't need to worry about events, and they have full control over whether to accept or refine their criteria. As well, it's a very minor change to the data portal methods, as I can just add a count property to the criteria object.
If I had known just how efficient it is to count records on both platforms I would have gone with this from the start, it's potentially handy for other things as well.
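The count-then-warn flow described above is simple enough to sketch. Everything here is hypothetical plumbing (the table name, the `execute_scalar` callback standing in for the DAL, the `confirm` prompt standing in for the UI); only the shape of the check matters:

```python
WARN_THRESHOLD = 1000  # the user-configurable app setting

def count_matches(execute_scalar, where_clause):
    """Run the same criteria with COUNT(*) and no columns - cheap on
    both Firebird and MSSQL. `execute_scalar` stands in for the DAL."""
    return execute_scalar(f"SELECT COUNT(*) FROM Orders WHERE {where_clause}")

def should_fetch(execute_scalar, where_clause, confirm):
    """Warn before a potentially huge result set; `confirm` is the
    UI prompt and returns True if the user accepts the wait."""
    n = count_matches(execute_scalar, where_clause)
    return n <= WARN_THRESHOLD or confirm(n)

# Example: a fake scalar executor that says 5000 rows match.
big = lambda sql: 5000
print(should_fetch(big, "1=1", confirm=lambda n: False))  # False - user refines criteria
print(should_fetch(big, "1=1", confirm=lambda n: True))   # True - user accepts the wait
```

Small result sets skip the prompt entirely, so the common case pays only the cost of the count query.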
Cheers to all for your comments, I'm sure they will be helpful to others as well down the road.
JCardina,
Two comments:
1) Make sure you're testing everything using a remote data portal, so that you'll know if you're doing something bad (like expecting a direct response from inside DataPortal_Fetch).
The way you put it, you'll need to make two data portal calls instead of one.
2) Querying with COUNT(*) is faster because it doesn't return all the data, but be aware that if the WHERE clause you're using doesn't have an index that covers your query, it will read the whole table from start to end. (All of that is an oversimplification of a bigger issue that is not the point of this thread, but I bet a new thread would make a good discussion on the subject.) This stresses your server if you're working on big tables, and you'll be doing that operation twice regardless of the cost of returning the results, which the user will probably want anyway.
Now, on that "the user will probably want anyway" part: generally you have a rather static set of users, who use the app frequently and eventually learn what's good and what's not. So it will come to a point where the user is well aware of the time it takes the app to return a result set, and if it's big, you'll be forcing an unwanted overhead (of checking COUNT(*)). Besides, it adds a lot of repetitive coding which you will have to create and maintain.
If your user base is not stable (such as a web page of some sort), then _maybe_ it might be a better idea.
Anyway, in general it is best to do a good analysis and force better filtering before the user gets a chance to call the data portal for the lengthy tables.
Andrés
Now, on that "the user will probably want anyway" part: generally you have a rather static set of users, who use the app frequently and eventually learn what's good and what's not. So it will come to a point where the user is well aware of the time it takes the app to return a result set, and if it's big, you'll be forcing an unwanted overhead (of checking COUNT(*)). Besides, it adds a lot of repetitive coding which you will have to create and maintain.
That's a very good point, and the more I think about it the more it makes sense. A lot of repetitive coding is something I don't want to do right now, and you're right, of course: users quickly get a feel for what is a good idea and what isn't.
Lots of good stuff in this thread, thank you.
JHurrell: Is there a legitimate business case for allowing users to work with 1000 records?
Unfortunately yes: Reporting.
Our app is designed to report off what you see in the grid. It's an elegant solution: they filter, sort, whatever, and the report they get is the same exact data, which is very handy and intuitive.
Unfortunately for reporting purposes it's not unusual for users to want a *lot* more data than they would normally want to view on an ongoing basis.
It does give me some ideas, though. Since I'm moving the filtering code out of the built-in method in the Infragistics grid component into a separate form, so users can save filtered views and select them from a drop-down, one of the criteria I could add as an option to that view is a maximum-results field.
Or perhaps put a maximum-results option in the toolbar as a drop-down and default it low, so they could select 100, 1000, 5000, or all.
This "refactoring" of all my read-only collection grid forms into one is turning out to be very cool; now I can use some of these ideas quickly and easily, and if something new comes along, pop that in as well.
It's amazing how enslaved you can become to a design just because of sheer quantity alone.
Copyright (c) Marimer LLC