Static constructor performance.

Old forum URL: forums.lhotka.net/forums/t/6578.aspx


markell posted on Sunday, March 08, 2009

Hi folks.
I would like to ask a question about the performance impact of having a static constructor. In the book Expert C# Business Objects, page 249 has a short discussion of the subject. The statement that a static constructor adversely affects performance seemed so interesting to me that I took the liberty of testing it.

Here is my test app:

using System;
using System.Diagnostics;

public static class C1
{
  public static int N;

  static C1() { N = (new Random()).Next(); }
  public static int f(int p) { return p + 1; }
}
public static class C2
{
  public static int f(int p) { return p + 1; }
}
class Program
{
  static void Main()
  {
    const int N = 1000000;
    int i, x = 0;
    Stopwatch sw = new Stopwatch();
    for (i = 0; i < N; ++i)
    {
      sw.Start();
      x = C1.f(x);
      sw.Stop();
    }

    // Make sure no code gets optimized away.
    Console.WriteLine("{0} iterations of C1.f took {1}ms (C1.N = {2})",
      x.ToString(), sw.ElapsedMilliseconds, C1.N);

    x = 0;
    sw.Reset();
    for (i = 0; i < N; ++i)
    {
      sw.Start();
      x = C2.f(x);
      sw.Stop();
    }

    // Make sure no code gets optimized away.
    Console.WriteLine("{0} iterations of C2.f took {1}ms",
      x.ToString(), sw.ElapsedMilliseconds);
  }
}

I have compiled it in the release build (VS 2008) and here are the results:
1000000 iterations of C1.f took 1764ms (C1.N = 1876561082)
1000000 iterations of C2.f took 1757ms

I am running on Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz

According to this simple test, there is almost no performance penalty in having a static constructor. So I wonder whether my test is wrong, or whether changes in the way the C# compiler generates code have eliminated the performance impact described in the book.
What do you think?

Thanks.

tmg4340 replied on Sunday, March 08, 2009

This is not how I typically see timing code written. I'm not saying it's wrong, but I'm betting you would get different results if you didn't start and stop the stopwatch around each method call. Typically, the stopwatch is started before the loop starts and stopped after it ends. You can argue that this is not a true timing, since code that is not germane to the test (the loop itself) factors into the overall time. But it's also closer to how your "real world" code is executed, and the extra instructions aren't likely to drastically affect the overall timing. Having the stopwatch code inside the loop interrupts the actual flow of the loop, which would tend to even out the results. It also adds quite a bit to the overall measured time.

I took your code, moved the "Start" and "Stop" calls outside the loops, built it in Release mode, and ran it. The first loop took 7ms, while the second loop took 2ms. I'm running on a P4 3GHz. Obviously, 7ms vs 2ms is not going to make anyone complain about performance, but the first loop still took three and a half times as long, which illustrates what I was talking about. Extrapolate that out to a class that does "real" work, and you'll see what Rocky was talking about.
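
The modified test is not posted in the thread, so the following is only a sketch of what moving "Start" and "Stop" outside the loops might look like. The class name LoopTimingSketch and the RunC1/RunC2 helpers are made up for illustration; C1 and C2 are copied from the original test. Timings will vary by machine and runtime, so none are shown.

```csharp
using System;
using System.Diagnostics;

public static class C1
{
    public static int N;

    // Explicit static constructor, as in the original test.
    static C1() { N = new Random().Next(); }
    public static int f(int p) { return p + 1; }
}

public static class C2
{
    public static int f(int p) { return p + 1; }
}

// Hypothetical harness: one Start/Stop pair per loop instead of per call.
public static class LoopTimingSketch
{
    public const int Iterations = 1000000;

    public static int RunC1(Stopwatch sw)
    {
        int x = 0;
        sw.Reset();
        sw.Start();
        for (int i = 0; i < Iterations; ++i)
        {
            x = C1.f(x);
        }
        sw.Stop();
        return x;
    }

    public static int RunC2(Stopwatch sw)
    {
        int x = 0;
        sw.Reset();
        sw.Start();
        for (int i = 0; i < Iterations; ++i)
        {
            x = C2.f(x);
        }
        sw.Stop();
        return x;
    }

    public static void Main()
    {
        Stopwatch sw = new Stopwatch();

        // Printing x keeps the loop result live so it is not optimized away.
        int x = RunC1(sw);
        Console.WriteLine("{0} iterations of C1.f took {1}ms (C1.N = {2})",
            x, sw.ElapsedMilliseconds, C1.N);

        x = RunC2(sw);
        Console.WriteLine("{0} iterations of C2.f took {1}ms",
            x, sw.ElapsedMilliseconds);
    }
}
```

Because the stopwatch overhead is now paid once per loop rather than once per call, whatever difference exists between the two calls is no longer buried under a million Start/Stop pairs.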

HTH

- Scott

markell replied on Sunday, March 15, 2009

Do you mean 7ms for 1,000,000 iterations, which is 7ns per iteration?
So it is 7ns vs 2ns when the functions are practically empty, meaning we measure just the method invocation time. If the functions did anything useful, the difference would be negligible.
I still do not see why I should bother about static constructors.
BTW, I have not noticed any difference between the two calls in Reflector. Can you explain it?
Thanks.

tmg4340 replied on Sunday, March 15, 2009

Your test is testing how long it takes to call static methods in a class, which in this (and almost any) case would be negligible.  Rocky's discussions concerning performance relate to calling instance methods in a class that also has a static constructor.  That is a different set of code to test, and I have to assume Rocky tested that rather extensively - otherwise it wouldn't have warranted mention in the book (or in the numerous forum discussions surrounding this issue.)

You won't see any differences in Reflector because your test code only has static methods. If your test classes were more similar to CSLA BOs (i.e. had instance methods, a static constructor, and test loops that called the instance methods), I would expect to see different code in Reflector, as well as different performance profiles.

HTH

- Scott

ajj3085 replied on Monday, March 16, 2009

Honestly, I'm not sure about the warning against using static constructors. At this point, we (the Csla community on 3.6) are using static field initializers, and as far as I can tell the compiler implements these by creating a static constructor for you.

Check it out sometime using MSIL.  I believe this code:
private static int x = 35;

Ends up being compiled to this:

private static int x;

static MyClass() {
    x = 35;
}

So if you have your own static ctor code, the x initialization will be done prior to your code.  So if you have this:

private static int x = 35;

static MyClass() {
   doSomething();
}

It will compile to:
private static int x;

static MyClass() {
    x = 35;
    doSomething();
}

If you find something different let me know... but I think static field initializers get the same performance hit as just using your own static ctor.
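
One difference that can be checked without reading IL: per the C# specification, a class that has static field initializers but no explicit static constructor is marked beforefieldinit, which gives the runtime latitude over when the initializers run. Writing your own static constructor removes that flag. The sketch below only reports the flag via reflection and measures nothing; the class names InitializerOnly, ExplicitCctor, and BeforeFieldInitSketch are made up for illustration.

```csharp
using System;
using System.Reflection;

// Static field initializer only, no explicit static constructor.
public class InitializerOnly
{
    public static int X = 35;
}

// Same initialization, written as an explicit static constructor.
public class ExplicitCctor
{
    public static int X;

    static ExplicitCctor() { X = 35; }
}

public static class BeforeFieldInitSketch
{
    // True when the type's metadata carries the beforefieldinit flag.
    public static bool HasBeforeFieldInit(Type t)
    {
        return (t.Attributes & TypeAttributes.BeforeFieldInit) != 0;
    }

    public static void Main()
    {
        Console.WriteLine("InitializerOnly: {0}",
            HasBeforeFieldInit(typeof(InitializerOnly)));
        Console.WriteLine("ExplicitCctor:   {0}",
            HasBeforeFieldInit(typeof(ExplicitCctor)));
    }
}
```

With the Microsoft C# compiler this should report the flag as set for InitializerOnly and not set for ExplicitCctor. Whether that translates into a measurable difference depends on the runtime and JIT version, so it is worth timing on your own setup.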

markell replied on Monday, March 16, 2009

Well, I have changed the test to measure instance methods. The results are consistent with the static calls. Invoking an instance method of a class with a static constructor takes 1 to 4 ns more on average, which amounts to 1 to 4 ms per 1,000,000 calls. In my opinion, that is negligible.

I have also compared the IL code in Reflector and found that the two loops differ in just one opcode: the one with the static constructor uses ldloc.3 where the other one uses ldloc.s c2 (c2 is the instance of the class without the static constructor). However, since I do not grok IL, that says absolutely nothing to me, except that probably ldloc.3 takes slightly more time to execute than ldloc.s c2.
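
The instance-method version of the test is not posted, so the following is only a guess at its shape. The names I1, I2, and InstanceTimingSketch are made up for illustration. On the opcode difference: ldloc.0 through ldloc.3 are single-byte short forms that load local variable slots 0 to 3, while ldloc.s takes a one-byte slot index as an operand, so the difference reflects which slot each local was assigned rather than anything about static constructors.

```csharp
using System;
using System.Diagnostics;

// Instance method on a class that has an explicit static constructor.
public class I1
{
    public static int N;

    static I1() { N = new Random().Next(); }
    public int f(int p) { return p + 1; }
}

// Instance method on a class without a static constructor.
public class I2
{
    public int f(int p) { return p + 1; }
}

public static class InstanceTimingSketch
{
    public const int Iterations = 1000000;

    public static int RunI1(I1 c1, Stopwatch sw)
    {
        int x = 0;
        sw.Reset();
        sw.Start();
        for (int i = 0; i < Iterations; ++i)
        {
            x = c1.f(x);
        }
        sw.Stop();
        return x;
    }

    public static int RunI2(I2 c2, Stopwatch sw)
    {
        int x = 0;
        sw.Reset();
        sw.Start();
        for (int i = 0; i < Iterations; ++i)
        {
            x = c2.f(x);
        }
        sw.Stop();
        return x;
    }

    public static void Main()
    {
        // Instances are created before timing starts, so construction
        // (and the static constructor run it triggers) is not measured.
        I1 c1 = new I1();
        I2 c2 = new I2();
        Stopwatch sw = new Stopwatch();

        int x = RunI1(c1, sw);
        Console.WriteLine("{0} iterations of I1.f took {1}ms (I1.N = {2})",
            x, sw.ElapsedMilliseconds, I1.N);

        x = RunI2(c2, sw);
        Console.WriteLine("{0} iterations of I2.f took {1}ms",
            x, sw.ElapsedMilliseconds);
    }
}
```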

Copyright (c) Marimer LLC