Thread Safe Temporary Integer Object IDs

Old forum URL: forums.lhotka.net/forums/t/2546.aspx


JabbaTheNut posted on Saturday, March 17, 2007

I am very new to CSLA and am working my way through a number of basic issues for my business objects.  One issue for me is the management of temporary object ids. 

I have been reviewing the posts on this site regarding Guids, CombGuids, Integers, etc.  The debate is clearly lively and very interesting.  In my current situation, however, I must use integer ids.  Given this requirement, I am trying to find a thread-safe way of managing temporary integer object ids prior to saving new objects to the database.

Below is my attempt at doing this.  I have only included the elements of a class that pertain to managing the temporary object ids.  Am I missing anything important?

using System;
using Csla;

[Serializable()]
public class MyObject : Csla.BusinessBase<MyObject>
{
      // This is the static field shared by all instances of this class.
      private static int _lastTempId = 0;

      // This is the object id of the instance.
      private int _id;

      protected override object GetIdValue()
      {
            // Return the object id of the instance.
            return _id;
      }

      protected override void DataPortal_Create()
      {
            // Set the object id of the instance to _lastTempId - 1.  If, by some strange circumstance,
            // there are enough objects of this type instantiated such that _lastTempId reaches the
            // minimum possible value for Int32, then _lastTempId is reset to 0.  If this is not done,
            // then Interlocked.Decrement will cycle from -2147483648 to 2147483647.  I want
            // my object ids to be negative at all times.
            if (_lastTempId == -2147483648)
            {
                  System.Threading.Interlocked.Exchange(ref _lastTempId, 0);
            }
            _id = System.Threading.Interlocked.Decrement(ref _lastTempId);
      }

      protected override void DataPortal_Insert()
      {
            // Code to insert data into database contains an output parameter for the ScopeIdentity.
            // The object id is set to this ScopeIdentity, overriding the previous temporary id.
            _id = (int)myCommand.Parameters["@newId"].Value;
      }
}

 

Any comments would be appreciated.

Bayu replied on Sunday, March 18, 2007

Hey,

There is a lot with your code that I wish to comment on, most noteworthy:
- compare against Integer.MinValue instead of that hard-coded value, prevents errors and is much more readable
- this static variable of yours (_lastTempID) is definitely going to cause trouble. As soon as your webserver (hosting your RemotingPortal or WebServicePortal) reboots it will be reset and you could have multiple clients with the same ID
- Decrementing and resetting the value is thread-safe (by using the Interlocked method) but the DataPortal_Create method is NOT (!) What could happen now is that two threads inspect the value and see that it equals Integer.MinValue, so they both enter the inner-block of the if-condition and you are in trouble. The resetting will occur thread-safe, also decrementing it will occur thread-safe, but their relative ordering is NOT threadsafe, consider this sequence:
    - thread 1: enters if-block (because it observes a MinValue)
    - thread 2: does the same (because it also observes the same MinValue; remember that dp_create is not thread-safe, so this could happen)
    - thread 2 is lucky and gets to reset the value AND it also decrements it to -1
    - thread 2 thus ends up with an object with ID = -1
    - thread 1: was still in the if-block and finally gets to reset the lastTempID ......... and ends up with an object with id equal to -1 too!

Have a look at the lock statement in C# (VB: SyncLock); this will help you solve the above scenario.

Using the lock statement, you would end up with a thread-safe autonumbering system for one app. Note that this would still not be robust against web-server reboots. And even then, you never know for sure that your app is the only app that will ever access the DB. So why not use an autonumbering field in your DB for this purpose? This will be thread-safe and robust, and will work across all users/apps that access the DB ...

In addition, if you somehow still really need to manage this autonumbering thing in your code: then better make it a persisted value in your DB which is locked, accessed and decremented using transactions.

Regards,
Bayu
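
The check-then-reset race described in this reply can also be closed without a lock, by folding the wraparound test and the decrement into a single compare-and-swap loop. The following is only a minimal sketch: the class name TempIdGenerator is hypothetical and not part of CSLA, and Volatile.Read assumes .NET 4.5 or later.

```csharp
using System.Threading;

// Hypothetical helper (not part of CSLA): hands out negative temporary IDs.
public static class TempIdGenerator
{
    private static int _lastTempId = 0;

    public static int Next()
    {
        while (true)
        {
            int current = Volatile.Read(ref _lastTempId);

            // Wrap back to -1 instead of overflowing past int.MinValue.
            int next = (current == int.MinValue) ? -1 : current - 1;

            // Publish 'next' only if no other thread changed the counter
            // since we read it; otherwise loop and try again.
            if (Interlocked.CompareExchange(ref _lastTempId, next, current) == current)
            {
                return next;
            }
        }
    }
}
```

Because the decision to wrap and the update happen in one atomic step, two threads can no longer both observe int.MinValue and both come away with -1.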


Brian Criswell replied on Sunday, March 18, 2007

You should subclass the CSLA base objects that you are going to use so that you may add behaviour and share that behaviour among all of your business objects.  So your example would look a bit like this (off the top of my head, so syntax may not be correct).

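public abstract class MyObjectBase<T> : Csla.BusinessBase<T>
    where T : MyObjectBase<T>
{
    // Note: each closed generic type (each T) gets its own copy of these
    // static fields, so every business class has its own counter.
    private static int _tempId = 0;
    private static readonly object _lockObject = new object();

    // Static so that it can be called from a field initializer in a subclass.
    protected static int GetTempId()
    {
       // The check and the decrement happen under one lock, so they are atomic together.
       lock (_lockObject)
       {
          if (_tempId == int.MinValue)
          {
             _tempId = 0;
          }

          return --_tempId;
       }
    }
}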

Your business object then looks like this:
public class MyObject : MyObjectBase<MyObject>
{
    private int _id = GetTempId();
}

JabbaTheNut replied on Sunday, March 18, 2007

Thank you Bayu and Brian for your quick responses.

I can see that I have more to learn :)

Both of your points are well taken.  I believe that Brian's code example resolves the thread-safety issues that Bayu raised.  Additionally, I like the idea of subclassing BusinessBase to provide temporary ids across all business objects.  However, the issue that is most glaring to me now is the reboot and reset issue, which can cause objects of the same class to have the same ids.  I realize that ids can be managed through the database.  However, I was hoping to come up with a solution that would not require a hit to the database for every object created, regardless of whether or not it will be saved to the database.  Maybe this is not such a big deal. 

What are your thoughts?

I am seriously contemplating moving to CombGuids.  However, I am particularly concerned about providing data to 3rd party vendors.  I am not yet comfortable that providing a CombGuid will be practical.

RockfordLhotka replied on Sunday, March 18, 2007

It isn't that hard though!

Interlocked.Decrement returns the resulting value, and it is threadsafe.

DataPortal_Create() is threadsafe, because it is running in an instance on a single thread. While other objects may be created at the same time, they'll run their own DP_Create() methods.

So you only need to worry about that one bit of shared state, and you can use Interlocked.Decrement() to use it:

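// Requires: using System.Threading;
private static int _lastId;

private void DataPortal_Create(Criteria criteria)
{
   // Interlocked.Decrement takes the field by reference and returns the decremented value.
   _id = Interlocked.Decrement(ref _lastId);
}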

No global locking required or anything fancy at all.
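
To make the claim concrete, here is a small self-contained sketch that hammers a shared counter from several parallel workers and counts how many distinct IDs come back. The class and method names (InterlockedDemo, CountDistinctIds) are made up for illustration, and the code assumes .NET 4.0 or later for Parallel.For and ConcurrentDictionary.

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class, not part of CSLA.
public static class InterlockedDemo
{
    private static int _lastId;

    // Runs 'workers' parallel loops that each take 'perWorker' IDs and
    // returns how many distinct IDs were handed out in total.
    public static int CountDistinctIds(int workers, int perWorker)
    {
        var seen = new ConcurrentDictionary<int, bool>();

        Parallel.For(0, workers, _ =>
        {
            for (int i = 0; i < perWorker; i++)
            {
                // Atomic decrement: every caller gets a different value.
                int id = Interlocked.Decrement(ref _lastId);
                seen.TryAdd(id, true);
            }
        });

        return seen.Count;
    }
}
```

Calling InterlockedDemo.CountDistinctIds(8, 10000) should report 80000 distinct IDs, since no two decrements can return the same value short of wrapping around the whole Int32 range.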

RockfordLhotka replied on Sunday, March 18, 2007

Sorry, I see that you are worried about the possibility that your appdomain will continue to exist long enough to create a couple of billion objects of this particular type... I missed that on the first pass reading the question.

My response: are you serious? :D

I realize that such a thing is hypothetically possible. And perhaps your app really could cause this to happen. But you really need to evaluate whether it is possible in reality for the app to try and create a couple of billion Customer objects (for example) within the lifetime of a single appdomain.

If so, then do the extra work, and take the extra perf hit, to avoid the problem.

But 99.999% of the time this is impossible in any real sense. So how can you justify the complexity cost and perf hit to avoid something that can't actually happen?

JabbaTheNut replied on Sunday, March 18, 2007

Hi Rocky,

I agree with you.  I am not concerned about the application being in existence long enough to reassign a duplicate ID.  I was contemplating the possibility of reboots/resets, as posed by Bayu.  In these cases, the application would not necessarily have to completely cycle through to assign a duplicate id.  Although, I would concede that the probability of reassignment of a duplicate id due to reboot/reset is no doubt very low.

BTW,  I think it is so cool that you participate in these forums.  The fact that you take the time to address Newbie questions (like mine) is a testament to your commitment to the product.  I was initially exposed to CSLA at version 1.5.  I found it to be quite impressive and have been waiting for an opportunity to implement it.  Finally, at version 2.1.4, I have that opportunity.

RockfordLhotka replied on Sunday, March 18, 2007

The duplicate id issue can only happen in limited scenarios – probably not ones you will encounter – or at least ones you can avoid.

 

Remember that the id only needs to be unique within a given object graph – not application-wide. Certainly not cross-user, and not even within the context of a single user, except in ONE SPECIFIC COLLECTION.

 

So the only time you could get into trouble is if you have an object graph to which you are adding new items, and somehow you end up adding two items to one specific collection in that object graph with duplicate ID values.

 

This can’t happen if you are generating the temporary IDs on the client machine (smart client or web server), because any failure on that machine or appdomain would cause all those objects to have been lost too. In other words, the loss of _lastId automatically goes hand-in-hand with the loss of the whole object graph.

 

It CAN happen if you are generating the temporary IDs on a shared application server. In other words, your DataPortal.Create() call actually goes across the network to an app server to create and initialize the new object. In that case, were the app server to fail and reload during the time a user is interacting with an object graph on the client, it would be possible to get a duplicate id within that object graph.

 

The only way to avoid that is to use an ID allocation scheme that relies on a persistent store. Remember that if that shared app server resets, anything in memory is gone. The only way to know the REAL _lastId value is if it was persisted somewhere. It is almost never performant to load one integer just to increment it, so what ID allocation algorithms have done for decades is to allocate a block of ID values.

 

In other words, the app server does a db call to allocate, say, 100 IDs. The db now has a “lastId” value of -100, but the app server, in memory, has a _lastId of 0. The app server keeps decrementing that in-memory value until it hits -100, then it goes back to the db to get another block. At that point the db has a “lastId” of -200 and the process repeats.

 

If the app server crashes, or even is shut down normally, when it restarts it goes to the db and asks for another allocation. This typically leaves holes, because the only “safe” lastId probably isn’t the REAL lastId. But for temp IDs no one cares really.
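
The block allocation scheme described above might look roughly like the sketch below. Everything here is an assumption made for illustration: BlockIdAllocator is a hypothetical class, and the reserveBlock delegate stands in for a database call that atomically moves the persisted "lastId" down by the block size and returns the new value.

```csharp
using System;

// Hypothetical sketch of block-based ID allocation (not part of CSLA).
public sealed class BlockIdAllocator
{
    private readonly object _sync = new object();
    private readonly Func<int, int> _reserveBlock; // stand-in for the db call
    private readonly int _blockSize;
    private int _lastId;     // last ID handed out from the current block
    private int _blockFloor; // lowest ID in the current block

    public BlockIdAllocator(Func<int, int> reserveBlock, int blockSize)
    {
        _reserveBlock = reserveBlock;
        _blockSize = blockSize;
        // _lastId == _blockFloor == 0 means "no block yet", so the first
        // call to Next() goes to the persistent store.
    }

    public int Next()
    {
        lock (_sync)
        {
            if (_lastId == _blockFloor)
            {
                // Block exhausted: reserve another one from the persistent store.
                _blockFloor = _reserveBlock(_blockSize); // e.g. -100, then -200
                _lastId = _blockFloor + _blockSize;      // e.g. 0, then -100
            }
            return --_lastId;
        }
    }
}
```

A fresh allocator always starts by reserving a new block, so after a restart it skips whatever was left of the old block. That is where the holes come from, and it is also why a restart cannot hand out an ID that was already issued.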

 

None of that really deals with threading – threading is the smallest part of the issue really. And honestly, most apps don’t encounter this issue, because they allocate the temp ID values on the client and avoid the whole issue.

 

Even if you are initializing new objects on an app server, you can allocate the temp ID on the client (in your factory method) and pass it to the app server through the criteria object. That is typically far simpler than implementing a persisted ID allocation algorithm.
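
As a rough illustration of that last suggestion, the sketch below allocates the temporary ID in the factory method and hands it over in a criteria object. It is deliberately CSLA-free so that it runs on its own: LineItem, its nested Criteria class and the private Create method are all hypothetical, and Create merely stands in for the data portal round trip (in real CSLA code the factory would pass the criteria to DataPortal.Create, and DataPortal_Create(Criteria) would read the ID from it).

```csharp
using System;
using System.Threading;

// Hypothetical business class used only for illustration.
[Serializable]
public class LineItem
{
    // Client-side counter: it lives and dies with the client's object graph.
    private static int _lastTempId;

    private int _id;
    public int Id { get { return _id; } }

    // Hypothetical criteria type carrying the pre-allocated temporary ID.
    [Serializable]
    public class Criteria
    {
        public readonly int TempId;
        public Criteria(int tempId) { TempId = tempId; }
    }

    private LineItem() { }

    // Factory method: runs on the client, so the ID is allocated there.
    public static LineItem NewLineItem()
    {
        var criteria = new Criteria(Interlocked.Decrement(ref _lastTempId));
        return Create(criteria);
    }

    // Stand-in for the server-side create step: it only copies the ID
    // it was given and never touches a server-side counter.
    private static LineItem Create(Criteria criteria)
    {
        var item = new LineItem();
        item._id = criteria.TempId;
        return item;
    }
}
```

With this arrangement an app server restart has no effect on the temporary IDs, because the server never owns the counter in the first place.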

 

Rocky

Bayu replied on Monday, March 19, 2007

RockfordLhotka:

The duplicate id issue can only happen in limited scenarios – probably not ones you will encounter – or at least ones you can avoid.

 

Remember that the id only needs to be unique within a given object graph – not application-wide. Certainly not cross-user, and not even within the context of a single user, except in ONE SPECIFIC COLLECTION.



Huh?

You mean that ID's only need to be unique within the scope that they are used by the app?
In that case I agree.

But ultimately (i.e. when objects are persisted in the DB) the requirements could very well be different, right? Then an object might at least require an ID that is unique within the table, perhaps even a globally unique ID.

I am not sure if you already hinted at this solution, but I guess the following could work wonderfully:
- you use a client-side ID assignment system (e.g. using Brian's base class example but then the variation suggested by Rocky where you assign IDs in the factory method, so all is client-side)
- this reaps all benefits indicated by Rocky
- however, as soon as an object is saved it gets assigned an (new) ID that satisfies the constraints posed on the DB schema

This would work very much like an auto-increment integer field in your DB, where you only receive the ID upon actual insertion of the object into the DB.

One potential issue would be with associated objects that need to store referential IDs to this particular object, but that is no different from when you would work with autoincrement fields either. If you would model your BOs after behavior, then I think this is even a non-issue.

Perhaps I am heading in the wrong direction here, then I guess I did not fully understand Rocky's last post. Let me know. ;-)

Regards,
Bayu


RockfordLhotka replied on Monday, March 19, 2007

As long as we're talking about temporary IDs then yes, the ID only needs to be unique within a very limited scope.

It is a set problem. Let O be the set of old objects (IsNew=false) loaded from a database. All the objects in O have real IDs that are (a) unique in the database and (b) unique in the client memory space.

Then let N be the set of new objects created on this client (IsNew=true). Each client has its own N, but those sets can never interact or conflict, because they exist in different scopes. By the time objects in N' could interact with those in N'', they've become members of O and thus have permanent unique IDs that are unique in the database.

But it gets even better, in most applications anyway, because the objects in N don't randomly interact with each other. For example, OrderLineItem objects typically only interact with ProductInfo objects that are members of O. The OrderLineItems collection interacts with OrderLineItem objects, and requires that they all have unique IDs, but those IDs only need to be unique within that specific collection.

In other words, in almost every case the temporary IDs only need to be unique within a very limited scope (like a collection). Even in a worst case scenario they only need to be unique within N, which is scoped by a given client appdomain.

Copyright (c) Marimer LLC