CSLA performance issues with large data sets

Old forum URL: forums.lhotka.net/forums/t/10275.aspx


JFernando posted on Monday, April 18, 2011

Hello folks,

Though CSLA works quite nicely with smaller data sets, I am seeing a performance issue when accessing larger ones. In particular, I have a CSLA read-only list object that in some cases can contain over 19K child elements. In these cases, I observe that Csla.ReadOnlyBase.LoadProperty consumes the majority of the time (in reflection) when building the list, and of course this significantly impacts performance!

 

{To the interested reader – the profiling data is given below}

 

Has anyone seen this performance issue before? 

Has this been fixed in the framework (I am currently using CSLA Ver 4.0.1.0) or is there a workaround?


Here’s the stack

| Function Name | Inclusive Samples | Exclusive Samples | Inclusive Samples % | Exclusive Samples % | Module Name |
|---|---|---|---|---|---|
| MiddleTier.Library.AMS360ReadOnlyBase`1.LoadProperties(object) | 6,562 | 1 | 96.01 | 0.01 | MiddleTier.Library.DLL |
| MiddleTier.Library.AMS360ReadOnlyBase`1.LoadProperties(object,string[]) | 6,561 | 32 | 95.99 | 0.47 | MiddleTier.Library.DLL |
| Csla.ReadOnlyBase`1.LoadProperty(class Csla.Core.IPropertyInfo,object) | 5,544 | 43 | 81.11 | 0.63 | Csla.DLL |
| System.RuntimeType.GetMethods(valuetype System.Reflection.BindingFlags) | 1,633 | 1,633 | 23.89 | 23.89 | mscorlib.ni.dll |
| System.Reflection.RuntimeMethodInfo.MakeGenericMethod(class System.Type[]) | 1,603 | 1,603 | 23.45 | 23.45 | mscorlib.ni.dll |
| System.Reflection.MethodBase.Invoke(object,object[]) | 1,164 | 908 | 17.03 | 13.28 | mscorlib.ni.dll |
| System.Linq.Enumerable.FirstOrDefault(class System.Collections.Generic.IEnumerable`1<!!0>) | 929 | 334 | 13.59 | 4.89 | System.Core.ni.dll |
| Csla.ReadOnlyBase`1.<LoadProperty>b__a(class System.Reflection.MethodInfo) | 595 | 222 | 8.71 | 3.25 | Csla.DLL |
| Csla.Reflection.MethodCaller.CallPropertyGetter(object,string) | 570 | 50 | 8.34 | 0.73 | Csla.DLL |
| Csla.Reflection.MethodCaller.GetCachedProperty(class System.Type,string) | 413 | 9 | 6.04 | 0.13 | Csla.DLL |

And the function that takes up time

    protected virtual void LoadProperty(IPropertyInfo propertyInfo, object newValue)
    {
      var t = this.GetType();
      var flags = System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance;
      var method = t.GetMethods(flags).Where(c => c.Name == "LoadProperty" && c.IsGenericMethod).FirstOrDefault();
      var gm = method.MakeGenericMethod(propertyInfo.Type);
      var p = new object[] { propertyInfo, newValue };
      gm.Invoke(this, p);
    }

 

thanks,

 

JonnyBee replied on Tuesday, April 19, 2011

Could you show us an example of your DataAccess code?

If your DAL code could be changed to call the generic LoadProperty<P> you would not have the cost of reflection.

Another option would be to expand Csla.Reflection.MethodCaller with a new CallGenericMethod and cache the reflected method.
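A minimal sketch of that second option, written outside CSLA itself: the hypothetical `GenericMethodCache` helper below finds the open generic method once per key and caches the closed `MethodInfo`, so repeated calls skip `GetMethods` and `MakeGenericMethod`. The `Demo` class and its `Load<P>` method are stand-ins for a business object and the generic `LoadProperty<P>`; none of these names exist in the framework, and this is not how CSLA implements it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Reflection;

// Hypothetical helper; NOT part of Csla.Reflection.MethodCaller.
public static class GenericMethodCache
{
    // Key: (target type, method name, generic argument) -> closed generic method.
    private static readonly ConcurrentDictionary<(Type, string, Type), MethodInfo> Cache =
        new ConcurrentDictionary<(Type, string, Type), MethodInfo>();

    public static object CallGenericMethod(object target, string name, Type typeArg, params object[] args)
    {
        var closed = Cache.GetOrAdd((target.GetType(), name, typeArg), key =>
        {
            const BindingFlags flags =
                BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance;
            // The expensive lookup and MakeGenericMethod run only on a cache miss.
            var open = key.Item1.GetMethods(flags)
                .First(m => m.Name == key.Item2 && m.IsGenericMethodDefinition);
            return open.MakeGenericMethod(key.Item3);
        });
        return closed.Invoke(target, args);
    }
}

// Stand-in for a business object with a generic loader method.
public class Demo
{
    public object LastValue;
    public Type LastType;

    protected void Load<P>(string name, P value)
    {
        LastValue = value;
        LastType = typeof(P);
    }
}
```

A call such as `GenericMethodCache.CallGenericMethod(demo, "Load", typeof(int), "DocID", 42)` would then pay the lookup cost only the first time for each type. Note that `MethodInfo.Invoke` itself still runs on every call, so a cache like this would address the `GetMethods` and `MakeGenericMethod` samples in the profile above but not the `Invoke` samples.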

 

 

 

 

tiago replied on Tuesday, April 19, 2011

Using

using (BypassPropertyChecks)
{
    DocID = dr.GetInt32("DocID");
    DocDate = dr.GetDateTime("Date").ToString();
    Subject = dr.GetString("Subject");
}

must be faster than

LoadProperty(DocIDProperty, dr.GetInt32("DocID"));
LoadProperty(DocDateProperty, dr.GetDateTime("Date"));
LoadProperty(SubjectProperty, dr.GetString("Subject"));

but I didn't test...

 [edit]

On second thought, the timings must be exactly the same...

[/edit]

JFernando replied on Tuesday, April 19, 2011

As I understand it, this codebase was initially set up by Magenic, and the data access code in the business object is as follows:

 

    private void Child_Fetch(AlertNotificationRecipientViewInfoDTO childData)
    {
        this.LoadProperties(childData);
    }

Which bubbles down to

    protected void LoadProperties(object sourceBusinessObject)
    {
        string[] ignorelist = new string[0];
        LoadProperties(sourceBusinessObject, ignorelist);
    }

And

    protected void LoadProperties(object sourceBusinessObject, params string[] ignoreList)
    {
        List<string> ignore = new List<string>(ignoreList);

        var listofRegisteredProperies = FieldManager.GetRegisteredProperties();
        foreach (var propertyName in GetPropertyNames(sourceBusinessObject.GetType()))
        {
            if (!ignore.Contains(propertyName))
            {
                object value = MethodCaller.CallPropertyGetter(sourceBusinessObject, propertyName);
                //This will avoid access to Modified Closure
                string name = propertyName;
                var pinfo = listofRegisteredProperies.Find(x => x.Name == name);
                LoadProperty(pinfo, value);
            }
        }
    }

 

And now note the LoadProperty call.

RockfordLhotka replied on Tuesday, April 19, 2011

Although this technique is fine for small object graphs, you can clearly see how it wouldn't be appropriate for large object graphs.

I am a firm believer in avoiding the anti-pattern of premature optimization - which is to say that I see nothing wrong with using a pragmatic solution like this code - when it works. The non-generic LoadProperty method exists for low-volume dynamic data scenarios, mostly centered around implementation of service interfaces or web UI postbacks. But if it works for DAL interaction in object graphs, then that's good - pragmatism is valuable.

But I'm also a firm believer in applying optimization when appropriate. Obviously a 19k element list is huge, and as I said in my earlier post, I'm not surprised that using the non-generic LoadProperty is causing an issue in such a scenario. When a performance problem is encountered, then fixing it is no longer premature optimization :)

There are three possible solutions.

  1. The best is to redesign the scenario to avoid retrieving such a huge list - but I understand that end users can be stubborn creatures, so this is often not possible :)
  2. The second best is to implement some type of paging - especially if this is a web app, where paging is expected. If it is a smart client app (WPF or SL in particular) I'd look at doing async background loading of everything after the first page. There's an example of this in the Samples download.
  3. The third option is to use the generic LoadProperty for this particular business class. I wouldn't give up the reusable productivity of the existing code (I assume it is widely used) for other types, just for this type.

Obviously these solutions can also be used in combination if necessary.
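As a rough illustration of the paging idea in option 2, the hypothetical `Paging.GetPage` helper below slices a sequence with LINQ `Skip` and `Take`. It is only a sketch with an assumed name and signature; in a real DAL the paging would normally be pushed into the database query so the full 19k rows are never loaded at once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of CSLA.
public static class Paging
{
    // Returns the zero-based page `pageIndex` containing at most `pageSize` items.
    public static List<T> GetPage<T>(IEnumerable<T> source, int pageIndex, int pageSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (pageIndex < 0) throw new ArgumentOutOfRangeException(nameof(pageIndex));
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        return source.Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }
}
```

With a page size of 50, the first page would be loaded for immediate display and the remaining pages fetched afterwards, for example on a background thread as suggested above.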

JFernando replied on Tuesday, April 19, 2011

Indeed, as indicated, the perf issue becomes acute with large data sets and is not necessarily visible with smaller ones. If possible, I would like to keep the coding pattern consistent, and this would be possible if the framework could implement a generic solution that scales by caching and reusing the dynamically created info. Thoughts?

Yes, though we reduce data transfer volumes by fine-tuning the query or via paging, there are scenarios where we have to access quite a bit of data, and unfortunately this is one of them.

From what I understand, the pattern set by Magenic in our codebase was to use only the non-generic LoadProperty overload. Do you have a sample that uses the generic LoadProperty overload? If so, can you share it?

RockfordLhotka replied on Tuesday, April 19, 2011

Nearly every sample in the Samples download, and all the samples provided in the Using CSLA 4: Data Access ebook use the generic LoadProperty overload - either directly or (more commonly) indirectly through the property setter with BypassPropertyChecks.

As I said in my previous post, looking at the code you provided, it makes sense how the non-generic LoadProperty is being used to implement a dynamic data loader. That approach is fine for smaller object graphs, but obviously won't work for a list of 19k items.

If the DAL implementation doesn't allow for plugging custom code for a  specific object type, you'll need to enhance the implementation so it has that flexibility.

appWPF replied on Monday, February 10, 2014

Hi, I am using WPF and I don't have Silverlight installed. Is there a WPF version of the paging sample you are talking about?

RockfordLhotka replied on Tuesday, April 19, 2011

fwiw, "enhancing the implementation" looks pretty easy. It looks to me like you can make the LoadProperties method virtual, so a specific business class can override the method to implement strong-typed behavior when needed to optimize for performance.

JFernando replied on Tuesday, April 19, 2011

Indeed, thanks all, this has been helpful. We see a ~80% perf improvement on this scenario.

tiago replied on Tuesday, April 19, 2011

RockfordLhotka

fwiw, "enhancing the implementation" looks pretty easy. It looks to me like you can make the LoadProperties method virtual, so a specific business class can override the method to implement strong-typed behavior when needed to optimize for performance.

Hi Rocky,

What do you mean by "enhancing the implementation"? I looked it up in the Using CSLA 4 ebooks, both Objects and Data Access, but found nothing like it.

RockfordLhotka replied on Wednesday, April 20, 2011

In the ebooks I show the direct techniques for implementing the interface to the DAL, using strong typing. I don't have a dynamic loader example, because (imo) that's an advanced scenario. So is creating a more abstract dynamic DAL.

The people on this project implemented a more abstract dynamic DAL, presumably to meet some client requirement(s). And (from the code posted in this thread) it looks like the abstraction was done in custom base classes - which is a recommended technique, so that's good.

It just looks like the DAL methods weren't virtual, and that'd prevent a specific subclass from overriding the dynamic implementation with a strongly typed implementation when necessary.

So "enhancing the implementation", in this case, probably (hopefully) just meant adding the virtual keyword, and overriding the method in the specific business class that was having the perf issue.
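The shape of that change can be sketched without CSLA. In the sketch below, `DynamicLoaderBase` is a hypothetical stand-in for the project's custom base class (the real one derives from a CSLA base type and calls LoadProperty), and `RecipientDto` and `RecipientInfo` are invented for illustration. The base class keeps its reflection-based loader but marks it `virtual`; the one class with the perf problem overrides it with plain typed assignments.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for the project's custom base class.
public abstract class DynamicLoaderBase
{
    public Dictionary<string, object> Fields { get; } = new Dictionary<string, object>();

    // Stand-in for Child_Fetch: hands the DTO to the loader.
    public void Fetch(object dto) { LoadProperties(dto); }

    // virtual is the only change needed in the base class.
    protected virtual void LoadProperties(object source)
    {
        // Dynamic, reflection-based copy: convenient, but slow in bulk.
        foreach (PropertyInfo p in source.GetType().GetProperties())
        {
            Fields[p.Name] = p.GetValue(source, null);
        }
    }
}

// Hypothetical DTO.
public class RecipientDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical business class that opts out of the dynamic loader.
public class RecipientInfo : DynamicLoaderBase
{
    public int Id { get; private set; }
    public string Name { get; private set; }

    protected override void LoadProperties(object source)
    {
        var dto = source as RecipientDto;
        if (dto == null)
        {
            // Unknown source type: fall back to the dynamic behavior.
            base.LoadProperties(source);
            return;
        }
        // Strongly typed copy, no reflection.
        Id = dto.Id;
        Name = dto.Name;
    }
}
```

Every other class that inherits from the base keeps the dynamic loader unchanged, which matches the advice earlier in the thread not to give up the reusable code for types that do not need the optimization.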

RockfordLhotka replied on Tuesday, April 19, 2011

Why is your DAL using the non-generic LoadProperty?

It is not a surprise at all that this method is comparatively slow, because it does use reflection. It isn't really designed or intended for bulk data loads, where you should have access to the static metadata field for each property and can therefore use the generic LoadProperty overload.

In fact, the use of reflection in this non-generic overload exists specifically so it can invoke the generic implementation - by dynamically creating the info necessary to invoke that generic implementation.

Copyright (c) Marimer LLC