Caching large lists of data

Old forum URL: forums.lhotka.net/forums/t/4723.aspx


dtabako posted on Monday, April 21, 2008

I am trying to figure out the best way to handle large lists of data (in a ReadOnlyBase / ReadOnlyList setup). I have a record-selection setup in my app that I like from a functional perspective. I have a comboBox on the entry/edit form that contains a list of, say, customers. The user can switch the display value in the list between the customer Id and the customer name (since both values are in the CustomerInfo object, I just programmatically change the "DisplayValue" property on the comboBox based on a radioButton selection). I have also set up the auto-complete feature on the comboBox. So it all works very nicely.

The problem is my customer list could potentially be pretty large - we have clients with upwards of 10,000 customer records. And it gets worse with inventory (I'd like to use the same setup on my Inventory Items entry/edit form) - we have clients with 50,000 or even 100,000 inventory items.
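Roughly, that combo box setup looks like the sketch below (placeholder names only: CustomerList / CustomerInfo and the control names stand in for the real classes, the methods live inside the form class, and the standard WinForms property for switching the displayed value is DisplayMember):

    // Sketch only: CustomerList.GetList() and the Id/Name properties are
    // placeholders for the real read-only list classes.
    private void LoadCustomerCombo()
    {
        CustomerList customers = CustomerList.GetList();    // fetch the read-only list

        customerComboBox.DataSource = customers;
        customerComboBox.ValueMember = "Id";
        customerComboBox.DisplayMember = "Name";             // default display value

        // auto-complete against the items already loaded into the list
        customerComboBox.AutoCompleteMode = AutoCompleteMode.SuggestAppend;
        customerComboBox.AutoCompleteSource = AutoCompleteSource.ListItems;
    }

    // the radio buttons just switch which property the combo box displays
    private void displayByIdRadioButton_CheckedChanged(object sender, EventArgs e)
    {
        customerComboBox.DisplayMember =
            displayByIdRadioButton.Checked ? "Id" : "Name";
    }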

My question is, are these lists too long to reasonably cache? In the old app that I'm rewriting, we use an archaic search function. If these lists are too big to cache (or read in full in the form's OnLoad), then I can create a search dialog. I just like what I've done so far from a functional perspective and hate to have to abandon it. And I don't have the time to enter thousands of records of test data to check the performance (we will eventually have a data conversion process in place, but I probably won't have that piece for quite some time). I was interested to know if anyone else out there is handling very large lists of data and, if so:

a) Does it look like I'll be able to keep my current model?

b) If not, what have you done that's been successful?

Thanks a lot,

Dennis

sergeyb replied on Monday, April 21, 2008

I have tried this before as an experiment.  I tried to load a list of 20,000 items into memory.  It took about 6-7 seconds to load the list, although it was wider than yours, and the UI also took a hit to paint.  Because of that I abandoned that route and went with a search function similar to the one you describe.  This change was actually welcomed by the users, because none of them intended to look at thousands of items, usually just one or two.  The search function was pretty much instant even on 20,000 items, even without full-text search capabilities on the back end.

 

Sergey Barskiy

Senior Consultant

office: 678.405.0687 | mobile: 404.388.1899


Fintanv replied on Tuesday, April 22, 2008

A nice addition to Sergey's suggestion is the ability to persist the search criteria.  The users then have access to their list of favorite search queries and can execute them quickly and easily.  Most of the time they will be running the same handful of searches time and again.

-- Fintan

JoeFallon1 replied on Tuesday, April 22, 2008

I agree. I do keep the search criteria around for the session.

For report search criteria I save them to the DB and let users retrieve the named criteria they saved and re-use them. Very useful feature.

Joe

dtabako replied on Wednesday, April 23, 2008

Thanks for all the replies, everyone. It looks like a search function is probably the way to go. That will actually match the functionality of the old app (although the new search function will be much better). I liked what I had from a functional and design perspective, and it would have been a big improvement over the old functionality. But from a performance perspective it probably wouldn't have worked - at least for some of our clients.

Thanks for all your help and if anyone has any advice on implementation details for creating a generic search dialog that can be used for any ReadOnlyList, I'd be happy to hear it.

Thanks, Dennis

Justin replied on Wednesday, April 23, 2008

If you're using MS SQL Server, then why not use TOP, as in "SELECT TOP 100 * FROM Customers WHERE Name LIKE @NameCriteria + '%' ORDER BY Name"? We use this pattern for most of our searches, and it gives sub-second response times in combo boxes backed by tables with 3 million+ rows.
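A minimal ADO.NET sketch of that pattern (the table, columns, and connection string are placeholders, not actual schema):

    // Sketch of the TOP-N search pattern; column names and the connection
    // string are placeholders.
    using System.Data;
    using System.Data.SqlClient;

    public static class CustomerSearchDal
    {
        public static DataTable SearchByName(string nameCriteria)
        {
            const string sql =
                "SELECT TOP 100 Id, Name FROM Customers " +
                "WHERE Name LIKE @NameCriteria + '%' ORDER BY Name";

            using (var connection = new SqlConnection("...your connection string..."))
            using (var command = new SqlCommand(sql, connection))
            using (var adapter = new SqlDataAdapter(command))
            {
                command.Parameters.AddWithValue("@NameCriteria", nameCriteria);

                var results = new DataTable();
                adapter.Fill(results);   // Fill opens and closes the connection itself
                return results;
            }
        }
    }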

Justin 

chrisghardwick replied on Monday, April 21, 2008

Dennis,

I've been through this a few times. It depends a ton on a few factors, and typically comes down to what performs better in your environment.

In most cases I've opted for a hybrid approach when I need cached data, using the Patterns and Practices Enterprise Library caching.

I created a custom ReadOnlyListBaseCache object that inherited from ReadOnlyListBase, which would cache collections using the Enterprise Library Caching functionality.

On a new search, I would load the information from the database, and keep the result in memory (for 15 minutes or the life of the form) for the user in case they wanted to revisit the same list information again.
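In rough outline, the cache-or-fetch logic looks like this (a sketch only, assuming the Enterprise Library Caching Application Block; CustomerList, the key format, and the 15-minute window are illustrative, not the actual ReadOnlyListBaseCache code):

    // Sketch of a cache-or-fetch helper using the Enterprise Library
    // Caching Application Block; CustomerList and its GetList factory
    // are placeholder names.
    using System;
    using Microsoft.Practices.EnterpriseLibrary.Caching;
    using Microsoft.Practices.EnterpriseLibrary.Caching.Expirations;

    public static class CustomerListCache
    {
        public static CustomerList GetCachedList(string searchKey)
        {
            var cache = CacheFactory.GetCacheManager();
            string cacheKey = "CustomerList:" + searchKey;

            // return the cached copy if this same search ran recently
            var cached = cache.GetData(cacheKey) as CustomerList;
            if (cached != null)
                return cached;

            // otherwise hit the database and keep the result for 15 minutes
            CustomerList list = CustomerList.GetList(searchKey);
            cache.Add(cacheKey, list,
                      CacheItemPriority.Normal,
                      null,
                      new AbsoluteTime(TimeSpan.FromMinutes(15)));
            return list;
        }
    }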

Hope this helps.

Chris

 

JoeFallon1 replied on Tuesday, April 22, 2008

In my Web app I try to keep lists around for the life of the page (including postbacks). When the user makes a selection and navigates off the list page, I purge the list from the cache.

I also have a system option which allows the client to set the max number of records in the list. If the number of rows exceeds that value then the list is not returned and they have to perform a search instead.
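In rough form, that guard looks like the sketch below (SystemOptions.MaxListRows, CustomerDal, and CustomerList are hypothetical names for the pieces described, not the actual implementation):

    // Illustrative only: check the row count against the configured limit
    // before deciding whether to return the full list.
    public static class CustomerListLoader
    {
        public static CustomerList GetListOrRequireSearch()
        {
            // e.g. a SELECT COUNT(*) against the Customers table
            int rowCount = CustomerDal.GetRowCount();

            if (rowCount > SystemOptions.MaxListRows)
                return null;   // caller shows the search dialog instead of the full list

            return CustomerList.GetList();
        }
    }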

Joe

 

AaronErickson replied on Saturday, April 26, 2008

You should look and see if the indexing features of CSLA 3.5 would work, at least on the search side.

Cache it once, index the searches you are likely to do, and suddenly list size itself is pretty much just a memory consideration.

To do it, put [Indexable(IndexMode.Always)] on the field you expect to search by, and then use LINQ to Objects to do the search.
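For illustration, the two pieces involved look roughly like this; the Indexable attribute and IndexMode value are quoted from the post above (their exact namespaces vary by CSLA version, so check the CSLA 3.5 docs), and CustomerInfo plus the cached list are placeholder names:

    // Sketch only: the real CustomerInfo derives from ReadOnlyBase<CustomerInfo>,
    // and the index is used when the query runs against the CSLA list type.
    using System.Collections.Generic;
    using System.Linq;
    using Csla;   // assumed home of the Indexable attribute

    public class CustomerInfo
    {
        [Indexable(IndexMode.Always)]   // index the property the searches will hit
        public string Name { get; private set; }

        public int Id { get; private set; }
    }

    public static class CustomerSearch
    {
        // LINQ to Objects query over the cached list; with the index in
        // place this avoids scanning every item
        public static IEnumerable<CustomerInfo> FindByName(
            IEnumerable<CustomerInfo> cachedCustomers, string searchText)
        {
            return from c in cachedCustomers
                   where c.Name == searchText
                   select c;
        }
    }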

Copyright (c) Marimer LLC