CSLA Indexed LINQ Queries

Old forum URL: forums.lhotka.net/forums/t/6774.aspx


JoeFallon1 posted on Friday, April 10, 2009

There is a new feature in CSLA 3.6 which uses advanced search technology to speed up LINQ queries.
A standard LINQ query over 1 million records scans the list item by item, so it could require 1 million comparisons to locate a given record.

Using an indexed query works just like a database index. The underlying data structure is a Red-Black tree, and finding 1 record in a million takes on the order of 20 comparisons (log2 of 1,000,000 is about 20; the worst case for a Red-Black tree is roughly twice that). As you can see, it is significantly faster!
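To make the difference concrete, here is a minimal Python sketch that counts the work done by a linear scan and by a binary search over the same 1 million keys. The binary search over a sorted list is only a stand-in for the Red-Black tree (it is not the CSLA implementation), and the function names are hypothetical. Each "probe" in the binary search may involve up to two comparisons.

```python
def linear_search(items, target):
    """Scan front to back; return (index, comparisons made)."""
    comparisons = 0
    for i, value in enumerate(items):
        comparisons += 1
        if value == target:
            return i, comparisons
    return -1, comparisons


def binary_search(sorted_items, target):
    """Halve the search range on every probe; return (index, probes made)."""
    lo, hi = 0, len(sorted_items) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_items[mid] == target:
            return mid, probes
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes


keys = list(range(1_000_000))
_, scan_cost = linear_search(keys, 999_999)   # worst case: last item
_, tree_cost = binary_search(keys, 999_999)
print(scan_cost, tree_cost)
```

The scan touches every item before it finds the last key, while the halving search needs no more than 20 probes for a list of this size.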

Bottom Line:
Indexed queries are useless in a Web environment.

Details:
If you apply a custom attribute to a property in a class then that property can be used in an indexed search.
e.g.
Indexable(IndexModeEnum.IndexModeOnDemand) _
Public ReadOnly Property Key() As Integer

When the collection loops over the data reader (dr) and calls Add(item), the CSLA framework class BusinessListBase calls its InsertItem method.
This method does a lot of work but the key step related to indexed queries is the call to InsertIndexItem(item).

InsertIndexItem calls DeferredLoadIndexIfNotLoaded (which is called by most methods).
This checks if the internal field _indexSet is null or not. If it is null then it creates an instance of the IndexSet class using the item type as the T parameter.
The constructor for the IndexSet class loops over the list of PropertyInfo objects for T and looks for the Indexable attribute. If it finds one, it stores the property name and a BalancedTreeIndex in an internal Dictionary named _internalIndexSet.
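The discovery step described above can be sketched in Python. This is only a model of the idea, not CSLA code: the `indexable` marker, the `Line` class, and `build_index_set` are hypothetical names, and an empty dict stands in for each BalancedTreeIndex.

```python
def indexable(func):
    """Hypothetical marker playing the role of the Indexable attribute."""
    func.is_indexable = True
    return func


class Line:
    def __init__(self, key, text):
        self._key = key
        self._text = text

    @property
    @indexable
    def key(self):
        return self._key

    @property
    def text(self):
        return self._text


def build_index_set(item_type):
    """Map each marked property name to an empty index (a dict here)."""
    index_set = {}
    for name, member in vars(item_type).items():
        is_marked = isinstance(member, property) and getattr(
            member.fget, "is_indexable", False
        )
        if is_marked:
            index_set[name] = {}
    return index_set


print(build_index_set(Line))
```

Only `key` ends up in the resulting dictionary, because `text` does not carry the marker.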

The bottom line is that the BLB collection knows which child properties are indexed.

The BLB method InsertItem then calls base.InsertItem(index, Item) to actually insert the item into the collection.
So as items are added to the collection the index is also “kept up to date”. Ditto for Remove, etc.
Think of it as the exact same thing as a database index which is kept up to date when you Insert or Delete rows.
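A minimal Python sketch of that bookkeeping, assuming a single indexed field named `key`. The `IndexedList` class is hypothetical, and it uses a hash-based dict rather than a balanced tree, so it shows only the "update the index on every insert and remove" pattern.

```python
from collections import defaultdict


class IndexedList:
    """Toy list that keeps a key -> items lookup in step with its contents."""

    def __init__(self):
        self._items = []
        self._index = None  # created lazily on first use

    def _ensure_index(self):
        if self._index is None:
            self._index = defaultdict(list)
            for item in self._items:
                self._index[item["key"]].append(item)

    def add(self, item):
        self._ensure_index()
        self._index[item["key"]].append(item)  # update the index...
        self._items.append(item)               # ...then the list itself

    def remove(self, item):
        self._ensure_index()
        self._index[item["key"]].remove(item)
        self._items.remove(item)

    def find_by_key(self, key):
        self._ensure_index()
        return list(self._index.get(key, []))


lines = IndexedList()
first = {"key": 1, "text": "first"}
second = {"key": 2, "text": "second"}
lines.add(first)
lines.add(second)
lines.remove(first)
print(lines.find_by_key(1), lines.find_by_key(2))
```

After the removal, a lookup for key 1 returns nothing and a lookup for key 2 still returns its item, without scanning the list.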

Since it is “expensive” to build the index, it is only worth doing when you plan to run more than 1 search against the same collection.
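For a rough sense of how many searches it takes to recover the build cost, here is a comparison-count estimate. It rests on assumptions that are not from the CSLA source: building the index over n items costs about n log2 n comparisons, an indexed lookup costs about log2 n, and an unindexed query scans all n items. Under that model the index pays off after k queries when:

```latex
n\log_2 n + k\log_2 n < k\,n
\quad\Longrightarrow\quad
k > \frac{n\log_2 n}{n - \log_2 n} \approx \log_2 n
```

For n = 1,000,000 that is about 20 queries against the same in-memory index. Real costs depend on the implementation, so treat this as an order-of-magnitude figure only.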

A good example of this is a Master-Detail page that lists the lines for each Header as it is selected.

If I choose another header record then I need to filter the lines collection by the Key to get back just those lines.

The indexed query “works” but not as I hoped.
This is because the index is destroyed when the BO is serialized to Session state.

The problem is outlined in Chapter 14 of the 2008 book:

Serialization and Indexing
When a CSLA .NET collection is serialized, the indices within the collection do not pass the serialization boundary.
This would be possible in theory, but the performance implications of having not
only the objects themselves but all the indices on those objects get passed over the wire are not insignificant.
In addition, the indexing mechanism depends on hash-code generation of objects that is
only consistent at the scope of a physical machine—that is, hash codes are not guaranteed to be
equivalent on, say, a 64-bit operating system and a 32-bit operating system—so it would be impractical
to translate the index values during serialization in the absence of a generic hash code generator
that could guarantee durability across machine boundaries.
Although the index itself is not passed over the serialization boundary, the index is re-created
according to the options specified on the child class. This typically happens upon the first instance
of a query that utilizes the index on the other side of the serialization boundary.

As you can tell from the above text – the index is destroyed every time we leave the web page because the BO is serialized into Session state.

Therefore the penalty to re-build the index is incurred every time you postback the page.

So the “advantage” of building the index once and re-using it is never achieved!
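The effect can be modeled with a small Python sketch. It assumes a hypothetical `LineList` whose index is never written to the session payload, with JSON standing in for whatever serializer the out-of-process session store uses. The `rebuilds` counter is included in the payload only so the sketch can report how often the index was built.

```python
import json


class LineList:
    """Toy collection: the data is serialized, the index is not."""

    def __init__(self, lines, rebuilds=0):
        self.lines = lines          # [key, text] pairs
        self.rebuilds = rebuilds    # how many times the index was built
        self._index = None          # never serialized

    def to_session(self):
        return json.dumps({"lines": self.lines, "rebuilds": self.rebuilds})

    @classmethod
    def from_session(cls, payload):
        data = json.loads(payload)
        return cls(data["lines"], data["rebuilds"])

    def lines_for(self, key):
        if self._index is None:     # first query after deserialization
            self._index = {}
            for line in self.lines:
                self._index.setdefault(line[0], []).append(line)
            self.rebuilds += 1
        return self._index.get(key, [])


stored = LineList([[1, "a"], [1, "b"], [2, "c"]]).to_session()
for header_key in (1, 2, 1):        # three postbacks, one query each
    page_list = LineList.from_session(stored)
    page_list.lines_for(header_key)
    stored = page_list.to_session()

final = LineList.from_session(stored)
print(final.rebuilds)
```

Three postbacks produce three index builds for three queries, so the build cost is paid in full every time and never amortized.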

Therefore I conclude that it is better to not use this new feature of CSLA in a Web environment.

Joe

PS - this assumes that session is stored out of process using a State Server or SQL Server. IMO you should never run a Production app with in-process session state anyway. The appdomain recycles way too often and your users lose their session.

PPS - the attribute above is missing the angle brackets - this forum won't show them so I omitted them intentionally.

RockfordLhotka replied on Monday, April 13, 2009

Thank you Joe, well put.

I added a link to this post from the new FAQ.

Copyright (c) Marimer LLC