Bug in MethodCaller

Bug in MethodCaller

Old forum URL: forums.lhotka.net/forums/t/11363.aspx


tbaldridge posted on Wednesday, May 16, 2012

I've found a bug in the way the MethodInfo is cached in CSLA. I'm using the most recent release. The issue comes down to this code: 

The _methodCache in MethodCaller is not thread safe. This means that if one thread is making the first DataPortal call to an object while another thread is doing the same. One of the threads either hangs or we get an invalid argument exception. 

We can duplicate this with a common Dictionary<int, int> as follows:

 

 

            IDictionary<int, int> dict = new Dictionary<int,int>();

            while (true)

            {

                Console.Write(".");

                var x = Enumerable.Range(0, 20)

                                  .Select(t => Task.Factory.StartNew(() =>

                                  {

                                      dict[t] = t;

                                      return dict[t];

                                  }))

                                  .ToArray();

                Task.WaitAll(x);

            }

I've developed two solutions to this problem. 1) We can switch the _methodCache to use the ConcurrentDictionary, but then we're locked into .NET 4 +. Otherwise, we can wrap Dictionary into a new class called "SafeDictionary" and do something like this:

 

 public class SafeDictionary<TKey, TVal> : IDictionary<TKey, TVal>

    {

        private Dictionary<TKey, TVal> _d;

        public SafeDictionary()

        {

            _d = new Dictionary<TKey, TVal>();

        }

 

 

        public void Add(TKey key, TVal value)

        {

            lock (_d)

            {

                _d.Add(key, value);

            }

 

        }

 

        public bool ContainsKey(TKey key)

        {

            lock (_d)

            {

                return _d.ContainsKey(key);

            }

        }

 

 

etc.....

 

Or app is currently suffering form this bug. They are few and far between, but about once every few days our app just blows up due to an internal CLSA issue dealing with these global thread-unsafe dictionaries. 

RockfordLhotka replied on Thursday, May 17, 2012

Your post doesn't show the actual code used by CSLA. We don't use a raw Dictionary.

I'm not saying there might not be an issue, but I would appreciate it if you can try duplicating the issue with equivalent code.

I am not eager to assume the high cost of using a lock on every dictionary operation. The performance ramifications of all those locks will be non-trivial.

The actual code that is used to access the dictionary is this:

    private static DynamicMethodHandle GetCachedMethod(object obj, System.Reflection.MethodInfo info, params object[] parameters)
    {
      var key = new MethodCacheKey(obj.GetType().FullName, info.Name, GetParameterTypes(parameters));
      DynamicMethodHandle mh = null;
      if (!_methodCache.TryGetValue(key, out mh))
      {
        lock (_methodCache)
        {
          if (!_methodCache.TryGetValue(key, out mh))
          {
            mh = new DynamicMethodHandle(info, parameters);
            _methodCache.Add(key, mh);
          }
        }
      }
      return mh;
    }

The only issue I can see with the existing code is that the dictionary Add method might not be atomic. In other words, it may add the key first or value first, and perhaps that's where the conflict comes into play. One would hope that they add the value first, then the key, which should be quite safe - but that may not be the case... 

RockfordLhotka replied on Thursday, May 17, 2012

OK, I think I have a cheaper answer for the issue.

The conflict should be extremely rare, and would only occur during app initialization (basically) because once the dictionary is loaded there'd never be a conflict.

When a failure occurs it is an exception in the first TryGetValue call. Rather than forcing 100% use of expensive lock statements, I tried wrapping that first call in a try..catch block.

      public int GetValue(int key)
      {
        int result = -1;
        bool found = false;
        try
        {
          found = _dict.TryGetValue(key, out result);
        }
        catch
        { }
        if (!found)
        {
          lock (this)
          {
            if (!_dict.TryGetValue(key, out result))
            {
              result = key;
              _dict.Add(key, key);
            }
          }
        }
        return result;
      }

This appears to solve the problem. Are you willing to apply a similar change to the MethodCaller code to see if it solves your problem for real?

tbaldridge replied on Thursday, May 17, 2012

 

Line 229 in MethodCaller.cs defines the _memberCache as:

private static readonly Dictionary<MethodCacheKey, DynamicMemberHandle> _memberCache = new Dictionary<MethodCacheKey, DynamicMemberHandle>();

I'd love to provide the actual code causing the error, but this is a bug that only happens in very specific situations. For instance, it blows up in several cases in our project, but I re-order the dataportal calls, or connect a higher number (or smaller number) of clients then the code blows up less often.

The actual MSDN documentation on this  collection states:

 

Public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.

Dictionary<TKey, TValue>.KeyCollection can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. To guarantee thread safety during enumeration, you can lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.

 

" The performance ramifications of all those locks will be non-trivial."

 


Well, on my machine I'm doing a read and a write in the following times:

Dictionary<int, int> - 150ns
SafeDictionary<int, int> - 230ns
ConcurrentDcitonary<int, int> - 70ns

I wouldn't be so fast at writing alternative methods off without actually testing them. In fact, the ConcurrentDictionary is actually quite a bit faster than the standard dictionary when it comes to reads. So, on .NET 4.0+ this will actually be a speed improvement. So you could use locking on everything < .NET 4.0 and no locking on everything more than that.
 
" One would hope that they add the value first, then the key, which should be quite safe"

Not in the slightest! What's going to happen in the following situation?

1) Thread 1 adds the key
2) Thread 2 reads the dictionary and sees the key, and reads the value, which is null since Thread 1 hasn't inserted the value yet
3) Thread 1 adds the value

It's utterly impossible to have a thread-safe dictionary that supports concurrent writes without at least a CAS if not also a lock. The documentation for 
Not in the slightest! What's going to happen in the following situation?


1) Thread 1 adds the key
2) Thread 2 reads the dictionary and sees the key, and reads the value, which is null since Thread 1 hasn't inserted the value yet
3) Thread 1 adds the value


It's utterly impossible to have a thread-safe dictionary that supports concurrent writes without at least a CAS if not also a lock. The documentation for Dictionary states that it does neither, so that's the problem right there.
 

 

RockfordLhotka replied on Thursday, May 17, 2012

Sorry, I think you missed my point. I gave you the code used by methodcaller to interact with the dictionary in my post. I don't really care about your code at all - my point was that you provided an example against a raw dictionary, and that is not what is occurring within methodcaller.

The fact that your test was invalid doesn't really matter though, as I was able to duplicate the issue when running code comparable to the actual code used in methodcaller.

I believe the code I subsequently posted will resolve the issue - catching the rare scenario where there is a failure, and still only requiring a lock when the item isn't already present in the dictionary.

tbaldridge replied on Thursday, May 17, 2012

I thought your solution might work at first, but I'm still seeing problems with it. The reason is that the Dictionary can be in the middle of being updated when a reader comes along to read the value. The problem is that Dictionary does not have any atomic methods at all.

 

For example:

 

 

 

    class Program

    {

        public static IDictionary<int, int> dict = new Dictionary<int, int>();

        static void Main(string[] args)

        {

 

            while (true)

            {

                Console.Write(".");

                dict = new Dictionary<int, int>();

                var ts = DateTime.Now;

                var foo = new Foo();

                var mi = foo.GetType().GetMethod("MyMethod");

                var param = Expression.Parameter(typeof(int));

                var ld = Expression.Lambda<Func<int, int>>(Expression.Call(Expression.Constant(foo), mi, param), param).Compile();

                var x = Enumerable.Range(0, 10)

                                  .Select(t => Task.Factory.StartNew(() =>

                                  {

                                      int z = 1;

                                      for (int c = 0; c < 10000; c++)

                                      {

                                          try

                                          {

                                              if (!dict.ContainsKey(c)) dictCoffee = c;

                                          }

                                          catch

                                          {

                                              lock (dict)

                                              {

                                                  if (!dict.ContainsKey(c)) dictCoffee = c;

                                              }

                                          }

                                          z = GetValue(c);

                                      }

                                  }))

                                  .ToArray();

 

 

                Task.WaitAll(x);

                Console.WriteLine(DateTime.Now - ts);

 

            }

 

 

        }

 

        public static int GetValue(int key)

        {

            int result = -1;

            bool found = false;

            try

            {

                found = dict.TryGetValue(key, out result);

            }

            catch

            { }

            if (!found)

            {

                lock (dict)

                {

                    if (!dict.TryGetValue(key, out result))

                    {

                        result = key;

                        dict.Add(key, key);

                    }

                }

            }

            return result;

        }

 

 

The above code dies in the first four iterations (but runs fine with a single thread). So trying to selectively lock is not going to help. You can't have CAS-free mutable thread-safe data structures, it's just not possible. 

 

tbaldridge replied on Thursday, May 17, 2012

The other thing I'll point out is that this happening "during app initialization (basically)" is a bit misleading. This conflict can occur whenever there is a MethodCache cache miss. So this can happen whenever any method is invoked for the first time (see line 61 in MethodCaller.cs). 

Actually, we have global non-thread safe dictionaries in the following files:

FieldDataManger.cs - _consolidatedLists

PropertyInfoManager.cs - _propertyInfoCache

MethodCaller.cs -_methodCache

AuthorizationRuleManager.cs - _perTypeRules

BusinessRuleManager.cs - _perTypeRules

DataPortalMethodCache.cs - _cache

RockfordLhotka replied on Thursday, May 17, 2012

We aren't using comparable code.

MethodCaller doesn't use the dictionary methods you are using, nor does it do arbitrary interaction with the dictionary. It interacts with the dictionary in a very specific manner through a single method. That's generally true of all the dictionaries you listed in an earlier post.

The following code uses the same pattern as CSLA, but with the exception handling added. It doesn't fail on my 8 core machine, even when allowed to run for a very long time. Notice that each iteration blanks the "cache" so the odds of cache collisions are the same (high) for each outer iteration of the loop.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication1
{
  class Program
  {
    static void Main(string[] args)
    {

      while (true)
      {
        var dict = new Data();

        Console.Write(".");

        var x = Enumerable.Range(0, 10)

                          .Select(t => Task.Factory.StartNew(() =>
                          {
                            for (int i = 0; i < 10; i++)
                              dict.GetValue(i);
                          }))

                          .ToArray();

        Task.WaitAll(x);

      }
    }

    public class Data
    {
      private Dictionary<int, int> _dict = new Dictionary<int, int>();

      public int GetValue(int key)
      {
        int result = -1;
        bool found = false;
        try
        {
          found = _dict.TryGetValue(key, out result);
        }
        catch
        { }
        if (!found)
        {
          lock (this)
          {
            if (!_dict.TryGetValue(key, out result))
            {
              result = key;
              _dict.Add(key, key);
            }
          }
        }
        if (key != result)
          throw new Exception("value mismatch");
        return result;
      }
    }
  }
}

tbaldridge replied on Friday, May 18, 2012

That seems to fix it. I tried it in several situations, including with colliding hash keys, and it seems to work okay. Thanks!

RockfordLhotka replied on Friday, May 18, 2012

Good, thank you for the confirmation.

I plan to incorporate this change into the upcoming 4.3 bug fix release in the next few days, and of course into 4.5.

Copyright (c) Marimer LLC