MS-bug in OnDeserializedHandler?

Old forum URL: forums.lhotka.net/forums/t/831.aspx


olafb posted on Monday, August 07, 2006

Hi,

I noticed a strange bug affecting the OnDeserializedHandler (with the OnDeserializedAttribute): when this handler is called, the items of some collection or dictionary objects are not yet deserialized. I encountered this while playing with the new Shared Validation Rules: some rules in my BO depend on a Dictionary(Of String, String) object, which was not properly populated in the handler (.Count = 0). But inspecting the object after deserialization shows that all items are there. It looks like the OnDeserializedHandler is called too early.

Here is a simple console application to reproduce this bug:

Imports System
Imports System.Collections
Imports System.Collections.Generic
Imports System.IO
Imports System.Runtime.Serialization.Formatters.Binary
Imports System.Runtime.Serialization
Imports System.Collections.Specialized
Imports System.ComponentModel

Module Module1

    Sub Main()

        Dim obj1 As New MySerializableClass
        obj1.Populate()

        Using buffer As New MemoryStream()
            Dim formatter As New BinaryFormatter
            formatter.Serialize(buffer, obj1)
            buffer.Position = 0

            obj1.Print(" before Serialization")
            Dim obj2 As MySerializableClass = _
                        DirectCast(formatter.Deserialize(buffer), MySerializableClass)
            obj2.Print(" after Deserialization")

        End Using

    End Sub

End Module

<Serializable()> _
Public Class MySerializableClass
    Public genDic As Dictionary(Of String, String)
    Public lst As List(Of String)
    Public hashTbl As Hashtable
    Public hd, hdl As HybridDictionary
    Public sd As StringDictionary
    Public ld As ListDictionary
    Public sc As StringCollection
    Public sl As SortedList
    Public gensl As SortedList(Of String, String)
    Public gensd As Generic.SortedDictionary(Of String, String)
    Public bl As BindingList(Of String)

    Public Sub Populate()
        genDic = New Dictionary(Of String, String)
        genDic.Add("foo", "bar")
        lst = New List(Of String)
        lst.Add("foo")
        hashTbl = New Hashtable
        hashTbl.Add("foo", "bar")
        hd = New HybridDictionary
        hd.Add("foo", "bar")
        hdl = New HybridDictionary
        For i As Integer = 1 To 10
            hdl.Add(i, i)
        Next
        sd = New StringDictionary
        sd.Add("foo", "bar")
        ld = New ListDictionary
        ld.Add("foo", "bar")
        sc = New StringCollection
        sc.Add("foo")
        sl = New SortedList
        sl.Add("foo", "bar")
        gensl = New SortedList(Of String, String)
        gensl.Add("foo", "bar")
        gensd = New Generic.SortedDictionary(Of String, String)
        gensd.Add("foo", "bar")
        bl = New BindingList(Of String)
        bl.Add("foo")
    End Sub

    <OnDeserialized()> _
    Private Sub OnDeserializedHandler(ByVal context As StreamingContext)
        Me.Print(" in OnDeserializedHandler")
    End Sub

    Public Sub Print(ByVal location As String)
        Print("small HybridDictionary             : ", hd.Count, location)
        Print("ListDictionary                     : ", ld.Count, location)
        Print("StringCollection                   : ", sc.Count, location)
        Print("SortedList                         : ", sl.Count, location)
        Print("List(Of String)                    : ", lst.Count, location)
        Print("SortedList(Of String, String)      : ", gensl.Count, location)
        Print("BindingList(Of String)             : ", bl.Count, location)

        'these have no items in OnDeserializedHandler:
        Print("Hashtable                          : ", hashTbl.Count, location)
        Print("large HybridDictionary             : ", hdl.Count, location)
        Print("StringDictionary                   : ", sd.Count, location)
        Print("Dictionary(Of String, String)      : ", genDic.Count, location)
        Print("SortedDictionary(Of String, String): ", gensd.Count, location)

        Console.WriteLine()
    End Sub

    Private Sub Print(ByVal s As String, ByVal iCount As Integer, _
                         ByVal location As String)
        Console.WriteLine(s & iCount.ToString & location)
    End Sub

End Class

RockfordLhotka replied on Monday, August 07, 2006

Well that's ugly...

I would bet that the objects that aren't fully initialized all implement ISerializable.

Deserialization happens (conceptually at least) in 3 passes. It is my understanding that the following approach is taken:

In pass 1, objects that implement ISerializable are created in memory using a special constructor that can create an object instance without invoking any constructors.

In pass 2, objects that do not implement ISerializable are just directly created through the special serialization constructor in the first pass. They are created using a reverse dependency tree, so by the time an object is created, all objects it depends on exist (though in the case of ISerializable objects, the object might be totally uninitialized).

Then, in pass 3, the deserializer runs through and calls the constructors on those objects.


What I'd never tested, but you appear to have found, is that they call the OnDeserialized handler in pass 2 for simple objects, and in pass 3 for ISerializable objects - immediately after each individual object's full deserialization is complete.

This certainly restricts what you can do within the OnDeserialized handler, but I doubt Microsoft would consider it to be a bug.

AndrewCr replied on Monday, August 28, 2006

   I think a problem I'm seeing is a result of this behavior, and wanted to see if people agree or if I'm off base:

   Apparently, Dictionary(Of String, Object) falls into this not-available-in-OnDeserialized category.  This has the (very unfortunate) effect of preventing me from accessing business object properties from the OnDeserialized handler.  What seems to be happening is: the CanReadProperty() call in a property's Get method (after several levels of other calls) calls GetRolesForProperty, which tries to access the AuthorizationRules.Rules property, and fails with a NullReferenceException.

   This is a very subtle and non-intuitive problem.  (It's been driving me crazy for hours now.)

   Does this sound reasonable or am I missing something else?  Has anyone else run into this?

Thanks,
    Andy

JoeFallon1 replied on Monday, August 28, 2006

Andy,

I just read this thread. Wow. What a repro from olafb.

As far as your problem goes, can you skip the Property Get/Set and use the member variable directly instead?

===================================

Rocky,

You win the bet - all the objects do implement ISerializable.

Can you please elaborate on the implications of this with respect to Andy's problem and other things we should be careful about while handling OnDeserialized? Also, do the calls that you make to OnDeserialized in the framework work in all cases? Or are there other problems we should be aware of?

Joe

 

 

xal replied on Tuesday, August 29, 2006

I've also run into this same issue while doing some things with the active objects code.
There are other objects that seem to have this same issue (I think ArrayLists and Hashtables do). I don't know if this works in all cases, but for Dictionary(Of TKey, TValue) you can do this:

    <OnDeserialized()> _
    Protected Sub OnDeserialized(ByVal context As StreamingContext)
        If myDictionary IsNot Nothing Then
            ' Force the dictionary to load its items now.
            myDictionary.OnDeserialization(Nothing)
            For Each key As String In myDictionary.Keys
                Console.WriteLine(myDictionary(key).ToString())
            Next
        End If
    End Sub



The myDictionary.OnDeserialization(Nothing) line (bold in the original post) forces deserialization. Don't ask me why. Perhaps someone could contact somebody in Redmond who can shed some light on the matter....

Andrés

RockfordLhotka replied on Tuesday, August 29, 2006

Here's my understanding on how deserialization works (I don't know that all the details are correct, but this is pretty close):

  1. The byte stream contains n objects that form an object graph
  2. The formatter loops through, instantiating each of the n objects
    1. If the object is ISerializable it is created WITHOUT running a ctor
      1. The object's state is put into a dictionary
      2. Entries in this dictionary pointing to objects that aren't yet deserialized are put in a fix-up list for later resolution
    2. If the object does not implement ISerializable it is created without running a ctor
      1. The object's fields are loaded with values
      2. If a field references another object in the graph, and that other object hasn't yet been deserialized, then the field is put into a fix-up list for later resolution
  3. A fix-up process runs to fix any missing references
  4. ISerializable objects have their special serialization ctor invoked
This is all to handle circular (direct or indirect) references within the object graph being deserialized.

I don't know when the OnDeserialized method is called in this process, but we can surely speculate that it happens in a different place for ISerializable objects as opposed to other objects.

It also may be the case that step 4 is actually 2.1.3, and that an ISerializable object is expected to merely hold that dictionary until OnDeserialized is called - but I don't recall seeing that in the docs, and so I doubt that's accurate - this is why I think the order is as shown above.

AndrewCr replied on Tuesday, August 29, 2006

  Thanks everyone for the Info.  For my current case, I'm now using a friend property that doesn't check rules.  (I didn't want to expose the field directly.)

  Rocky, could you elaborate on your statement that "by the time an object is created, all objects it depends on exist", please?  Does this mean that when the OnDeserialized method of any object in the graph is called, all objects in the graph will at least exist, if not be fully initialized?

  I ask because I have a model where some great-great-grandchildren of the base object have an interface reference to the base object.  (It provides some services to these children.)  For this to work if/when I clone all or part of the tree, each of the children needs to keep a reference to this interface, for instantiating their children, and so on down to the bottom, where the great-great grandchildren consume the interface.  (This is loosely based on the CSLA Parent references.)  My question is how I should propagate this down the tree.

  I see two options:  First, I could have each child have its own OnDeserialized method that updates its children. But do I know that the child's parent has updated the child yet?  The other option seems to be to have each child's OnDeserialized call the base object and let it walk through its children, who in turn walk their own children, etc.  But this seems like a lot of redundant calling.

Thanks,
  Andy

RockfordLhotka replied on Tuesday, August 29, 2006

I think the issue here is entirely with objects that implement ISerializable. They are handled in a special way, because of how they are deserialized. Normal business objects do not implement this interface, and so follow a simpler deserialization path.

The whole challenge here is circular references. A->B, B->C and C->A - or any variation on that theme.

Serializing this is easy - you just flatten it into an array, replacing the references with "pointers" to the array index.

Deserializing is harder, because what do you deserialize first? A? You can't do that without B. B? You can't do that without C. C? You can't do that without A. Damn.

So what they do is a "fix-up". They deserialize A and make note that the B reference (field) is wrong. Then they deserialize B and note that the C ref is wrong. Then they deserialize C and it is just fine. Then there's the fix-up phase, where the bad A->B and B->C references are set - they literally just set those fields to the right references.
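
The fix-up idea described above can be illustrated with a toy example. This is only a sketch of the concept, not the BinaryFormatter's actual implementation: the Node class and the Flatten and Rebuild functions are hypothetical names invented here, and every node is assumed to reference exactly one other node in the list.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical node type: each node references exactly one other node.
Public Class Node
    Public Name As String
    Public NextNode As Node
End Class

Module FixUpSketch

    ' "Serialize": replace each reference with the index of its target.
    Function Flatten(ByVal nodes As List(Of Node)) As Integer()
        Dim targets(nodes.Count - 1) As Integer
        For i As Integer = 0 To nodes.Count - 1
            targets(i) = nodes.IndexOf(nodes(i).NextNode)
        Next
        Return targets
    End Function

    ' "Deserialize": create nodes in order, deferring references to
    ' nodes that do not exist yet, then resolve them in a fix-up pass.
    Function Rebuild(ByVal names As String(), ByVal targets As Integer()) As List(Of Node)
        Dim result As New List(Of Node)
        Dim pending As New List(Of Integer)
        For i As Integer = 0 To names.Length - 1
            Dim n As New Node
            n.Name = names(i)
            result.Add(n)
            If targets(i) < result.Count Then
                n.NextNode = result(targets(i))   ' target already exists
            Else
                pending.Add(i)                    ' note the fix-up
            End If
        Next
        For Each index As Integer In pending      ' fix-up phase
            result(index).NextNode = result(targets(index))
        Next
        Console.WriteLine("Fix-ups needed: " & pending.Count.ToString())
        Return result
    End Function

    Sub Main()
        ' Build the circular graph A->B, B->C, C->A.
        Dim a As New Node, b As New Node, c As New Node
        a.Name = "A" : b.Name = "B" : c.Name = "C"
        a.NextNode = b : b.NextNode = c : c.NextNode = a
        Dim graph As New List(Of Node)
        graph.Add(a) : graph.Add(b) : graph.Add(c)

        Dim names As String() = {"A", "B", "C"}
        Dim copy As List(Of Node) = Rebuild(names, Flatten(graph))
        For Each n As Node In copy
            Console.WriteLine(n.Name & " -> " & n.NextNode.Name)
        Next
    End Sub

End Module
```

For the A, B, C cycle this reports two fix-ups (A->B and B->C) and then prints A -> B, B -> C, C -> A, matching the walk-through: C needs no fix-up because A already exists when C is created.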

ISerializable is harder though, because the formatter doesn't know which fields will contain the values. So instead, it has to keep the dictionary (propertybag) of values until after the fix-up process, and THEN have the ISerializable objects deserialize themselves.

So the process is the same as above, except that suppose A is ISerializable. What happens is that they create an uninitialized instance of A, and they keep its dictionary too, noting the required fix-up for the B ref. They create B (noting the fix-up for C), and then create C, which points to the uninitialized (but technically valid) A. Then during the fix-up phase, they put the valid B ref into the A dictionary and THEN give the dictionary to the A instance through the special constructor. They also fix the B->C ref of course.

What I don't know is exactly when OnDeserialized is called on each object during this process. From what you guys have found, it is quite clear that it happens before ISerializable objects are fully initialized.

What you could do is create such an A,B,C object graph and add Debug.WriteLine statements so you can see the objects as they get serialized and deserialized and find out - I don't have time to do that just now or I'd do it, because it sounds like an interesting exercise. Especially given that I've been told there are subtle differences in how WCF's NetDataContractSerializer makes its OnDeserialized calls - and so the behavior will change when moving to WCF.

The clear recommendation (if not rule) is that an object should not use any references to other objects in the object graph during OnDeserialized, because they are not guaranteed to exist and/or be properly initialized when OnDeserialized is invoked.
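
A minimal sketch of the A, B, C experiment suggested above might look like the following. The NodeA, NodeB and NodeC types and their Target fields are hypothetical names invented for this illustration, and the sketch assumes the .NET Framework BinaryFormatter used elsewhere in this thread. No output is shown, because the order of the calls is exactly what the experiment is meant to reveal.

```vbnet
Imports System
Imports System.IO
Imports System.Runtime.Serialization
Imports System.Runtime.Serialization.Formatters.Binary

' NodeA implements ISerializable; NodeB and NodeC are plain serializable types.
<Serializable()> _
Public Class NodeA
    Implements ISerializable

    Public Target As NodeB

    Public Sub New()
    End Sub

    ' Special serialization constructor.
    Protected Sub New(ByVal info As SerializationInfo, ByVal context As StreamingContext)
        Console.WriteLine("NodeA: serialization constructor")
        Target = DirectCast(info.GetValue("Target", GetType(NodeB)), NodeB)
    End Sub

    Public Sub GetObjectData(ByVal info As SerializationInfo, ByVal context As StreamingContext) _
            Implements ISerializable.GetObjectData
        info.AddValue("Target", Target)
    End Sub

    <OnDeserialized()> _
    Private Sub OnDeserializedHandler(ByVal context As StreamingContext)
        Console.WriteLine("NodeA: OnDeserialized, Target set = " & (Target IsNot Nothing).ToString())
    End Sub
End Class

<Serializable()> _
Public Class NodeB
    Public Target As NodeC

    <OnDeserialized()> _
    Private Sub OnDeserializedHandler(ByVal context As StreamingContext)
        Console.WriteLine("NodeB: OnDeserialized, Target set = " & (Target IsNot Nothing).ToString())
    End Sub
End Class

<Serializable()> _
Public Class NodeC
    Public Target As NodeA

    <OnDeserialized()> _
    Private Sub OnDeserializedHandler(ByVal context As StreamingContext)
        Console.WriteLine("NodeC: OnDeserialized, Target set = " & (Target IsNot Nothing).ToString())
    End Sub
End Class

Module OrderExperiment
    Sub Main()
        ' Build the circular graph A->B, B->C, C->A.
        Dim a As New NodeA
        Dim b As New NodeB
        Dim c As New NodeC
        a.Target = b
        b.Target = c
        c.Target = a

        Using buffer As New MemoryStream()
            Dim formatter As New BinaryFormatter
            formatter.Serialize(buffer, a)
            buffer.Position = 0
            Console.WriteLine("Deserializing...")
            Dim copy As NodeA = DirectCast(formatter.Deserialize(buffer), NodeA)
            Console.WriteLine("Done, copy.Target set = " & (copy.Target IsNot Nothing).ToString())
        End Using
    End Sub
End Module
```

Changing which of the three classes implements ISerializable, or which object is passed to Serialize, should show whether the call order depends on the position of the ISerializable object in the graph.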

AndrewCr replied on Tuesday, August 29, 2006

  Okay, here's what I decided:  I'm overloading the Clone method in each of my child objects, so it calls the base Clone and then calls a SetInterface method on the clone, which in turn calls its children etc.  That way I ensure that each of the child objects will exist.

  This works fine in a 1-machine scenario, but I suspect I will have trouble if/when I start to use remoting, since the serialization/deserialization won't be done as a result of the Clone method call.  I'm under some time pressure, so I guess I'll have to cross that bridge when I come to it.  I'll keep you posted if I learn anything or have any brilliant insights.

Thanks for the help,
  Andy

AndrewCr replied on Wednesday, August 30, 2006

One thing I forgot to mention.  To override the Clone method, I had to change the Clone function signature in BusinessListBase.vb from:

  Public Overloads Function Clone() As T

to

    Public Overridable Function Clone() As T


Since this now matches the signature from BusinessBase.vb, is it safe to assume that the original signature was incorrect, or am I missing something?

Thanks,
  Andy

RockfordLhotka replied on Wednesday, August 30, 2006

Oh, that's really messed up!
 
You are supposed to override GetClone(), not Clone(). And Clone() shouldn't be virtual/Overridable anywhere.
 
I wonder how that didn't get caught before... I'll fix it in 2.1 - so please override GetClone() so you stay in line with the original plan.
 
Rocky
 






AndrewCr replied on Wednesday, August 30, 2006

Will do.

Thanks,
  Andy

xal replied on Wednesday, August 30, 2006

Andy, have you tried calling OnDeserialization() in your dictionary??
As I said in a previous post, that forces deserialization, and after that, you can use it inside your OnDeserialized()...


Andrés

RockfordLhotka replied on Wednesday, August 30, 2006

While this is surely true (I trust your observations), I am skeptical that it is a reliable solution for any and all ISerializable objects.
 
It sounds like the dictionary class is implemented such that its deserialization constructor merely holds the propertybag until OnDeserialized is called, and then it does the work of actually reloading itself with data.
 
I can see two possible problems here. First, I am sure that not all ISerializable objects work this way. I've written several, and have seen several others, and none have taken that approach. Second, I am sure the reason they are taking that approach is because the dictionary doesn't want to actually deserialize until all the other objects in the graph have deserialized, and there must be a reason for that. By forcing it to deserialize earlier in the process (at an indeterminate point actually), you could cause unforeseen issues. Given the indeterminate nature of this process, it may be very hard to test - kind of like with threading, the code may run 99 times out of 100 and never fail in test, but then fail in production.
 
In short, while I am sure you have found this solution to work, it is the kind of hack that makes me _very_ nervous.
 
Rocky
 





xal replied on Wednesday, August 30, 2006

Yes, I'm aware of the hackiness of that. It's more of a workaround to get things done.
Maybe the only real solution is to not use hashtables / dictionaries if you rely on those objects being OK at the time OnDeserialized is called.

Andrés

Bryan Dougherty replied on Friday, September 08, 2006

I had this same issue and came up with another workaround.  Basically, I'm serializing the Dictionary's keys and values separately and rebuilding it during deserialization.

Here's more info:

http://blogs.claritycon.com/blogs/bryan_dougherty/archive/2006/09/08/1775.aspx
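
A minimal sketch of this kind of workaround is shown below. It is not taken from the linked post: the LookupHolder class and its field names are hypothetical, and it assumes that plain arrays are filled in before the OnDeserialized handler runs, which is consistent with the List(Of String) result in olafb's repro but is not otherwise verified here.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Runtime.Serialization
Imports System.Runtime.Serialization.Formatters.Binary

' Hypothetical class that keeps its dictionary out of the stream.
<Serializable()> _
Public Class LookupHolder
    <NonSerialized()> _
    Private _items As New Dictionary(Of String, String)

    ' Plain arrays that travel in the stream instead of the dictionary.
    Private _keys As String()
    Private _values As String()

    Public ReadOnly Property Items() As Dictionary(Of String, String)
        Get
            Return _items
        End Get
    End Property

    ' Copy the dictionary content into the arrays just before serialization.
    <OnSerializing()> _
    Private Sub OnSerializingHandler(ByVal context As StreamingContext)
        _keys = New String(_items.Count - 1) {}
        _values = New String(_items.Count - 1) {}
        Dim i As Integer = 0
        For Each pair As KeyValuePair(Of String, String) In _items
            _keys(i) = pair.Key
            _values(i) = pair.Value
            i += 1
        Next
    End Sub

    ' Rebuild the dictionary from the arrays; field initializers do not
    ' run during deserialization, so _items is Nothing at this point.
    <OnDeserialized()> _
    Private Sub OnDeserializedHandler(ByVal context As StreamingContext)
        _items = New Dictionary(Of String, String)
        For i As Integer = 0 To _keys.Length - 1
            _items.Add(_keys(i), _values(i))
        Next
        Console.WriteLine("Count in OnDeserialized handler: " & _items.Count.ToString())
    End Sub
End Class

Module WorkaroundSketch
    Sub Main()
        Dim original As New LookupHolder
        original.Items.Add("foo", "bar")

        Using buffer As New MemoryStream()
            Dim formatter As New BinaryFormatter
            formatter.Serialize(buffer, original)
            buffer.Position = 0
            Dim copy As LookupHolder = DirectCast(formatter.Deserialize(buffer), LookupHolder)
            Console.WriteLine("Count after deserialization: " & copy.Items.Count.ToString())
        End Using
    End Sub
End Module
```

The trade-off is that the keys and values are held twice in memory after serialization, once in the dictionary and once in the arrays.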

 

Wilfred replied on Saturday, December 16, 2006

I am using ActiveObjects with custom active lookups. When debugging, VS complains about the OnDeserialized method when getting a value from the active lookup.

What should I do? The active lookups inherit from NameValueListBase.

Thanks.

Willem

xal replied on Saturday, December 16, 2006

The latest release of active objects has a couple of bugs when using remoting. I tried talking to Petar, but I can't seem to reach him lately.
I have the fixes for those issues and others and I wouldn't mind sending that to anyone who needs it until Petar reappears.


Andrés

Wilfred replied on Saturday, December 16, 2006

That would be nice, cause in csla /activeobjects 1.5, there were no such issues.

 

My email adress is :   wilfredvancasteren@hotmail.com 

 

thank you

xal replied on Saturday, December 16, 2006

I've attached my version here so that anybody can get it...


Cheers,

Andrés

Wilfred replied on Saturday, December 16, 2006

Thanks

Wilfred replied on Sunday, December 17, 2006

I have a problem creating a custom active lookup: should I use T as below, or use (Of K, V), or something else? It's just not easy using templates.

Public Class ActiveDatatypeLookup(Of T)
    Inherits NameValueListBase(Of String, String)

    .....

End Class

xal replied on Sunday, December 17, 2006

I'm not sure what you're trying to do there, but if all you need is an active lookup where both parameters are strings, just do:

Public Class ActiveDatatypeLookup
    Inherits NameValueListBase(Of String, String)

    .....

End Class


Andrés

Copyright (c) Marimer LLC