Someone here who works with db4o?

Old forum URL: forums.lhotka.net/forums/t/2444.aspx


RichardETVS posted on Wednesday, February 28, 2007

db4o is an object database: no rows, no tables, no SQL, and no need to convert between objects and data. The objects are saved in the database as objects (http://www.db4o.com).

 

I would like to know whether anyone here uses it in a production environment with the CSLA framework, and what advice he or she could give.

 

Thanks :)

 

Cordially

 

Richard

phucphlq replied on Wednesday, February 28, 2007

I posted an example here

RichardETVS replied on Thursday, March 01, 2007

I am studying it right now ;) .

Do you use both (CSLA and db4o) in a production environment?

 

phucphlq replied on Thursday, March 01, 2007

I like db4o, but it is slow and small, and it lacks exceptions and unique fields. I have only written an example with it; I have never used it in a production environment.

RichardETVS replied on Friday, March 02, 2007

ok thanks for the answer.

DansDreams replied on Monday, March 05, 2007

On the old forum there was quite a bit of discussion about db4o, so you might want to try to find it.  Keep in mind that the discussion was a few years ago, but at that time there were a few features still lacking, like a completely robust concurrency model (as the developers admitted).

The bottom line BACK THEN AT LEAST was that it was a fascinating concept but not really quite ready for prime time in an enterprise environment.

I would be really interested if you find out those issues have been resolved.

Justin replied on Monday, March 05, 2007

From what I can tell, they have resolved most of those issues and then some. One feature that stands out is transparent activation. It looks like they can instrument our objects so that field data isn't loaded until it is accessed. If it works as advertised, it could eliminate the need to stay away from many data centric designs because of performance and scalability issues.

As an example that was discussed at length in another thread, an Order object could hold a full Customer object reference instead of borrowing fields into the Order using SQL joins underneath, because not all the Customer fields would be loaded from the db when you load the Order.
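
A minimal sketch of the idea, assuming nothing about db4o's real API: the Order keeps a reference that loads the Customer only on first access. The names `LazyRef`, `load_customer` and `CUSTOMER_STORE` are hypothetical, and the laziness here is per reference rather than per field, which is coarser than what transparent activation is described as doing.

```python
class LazyRef:
    """Holds an id and a loader; the loader runs only on first access."""

    def __init__(self, object_id, loader):
        self._object_id = object_id
        self._loader = loader
        self._target = None
        self.loaded = False

    def get(self):
        if not self.loaded:
            self._target = self._loader(self._object_id)
            self.loaded = True
        return self._target


class Customer:
    def __init__(self, customer_id, name, address):
        self.customer_id = customer_id
        self.name = name
        self.address = address


class Order:
    def __init__(self, order_id, customer_ref):
        self.order_id = order_id
        self._customer_ref = customer_ref

    @property
    def customer(self):
        # The full Customer is fetched here, not when the Order is built.
        return self._customer_ref.get()


# Hypothetical stand-in for the database.
CUSTOMER_STORE = {7: Customer(7, "Acme Ltd", "1 Main St")}
loads = []  # records every trip to the "database"


def load_customer(customer_id):
    loads.append(customer_id)
    return CUSTOMER_STORE[customer_id]


order = Order(1001, LazyRef(7, load_customer))
```

Building the Order costs no Customer load; reading `order.customer.name` triggers one load, and later reads reuse it.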

It also supports transactions, online backup, a few different query languages, a "server" mode so it doesn't have to be in-proc, and even a GUI "enterprise manager" now.

Too bad they don't support a royalty-free license, which pretty much excludes it for my development purposes at this point; otherwise I would spend more time seeing whether it can actually do some of these things properly.

Justin

ajj3085 replied on Tuesday, March 06, 2007

Justin:
If it works as advertised, it could eliminate the need to stay away from many data centric designs because of performance and scalability issues.

Performance and scalability issues haven't been listed as reasons to stay away from data centric designs... at least not here.  :)

Justin replied on Tuesday, March 06, 2007

ajj3085:

Performance and scalability issues haven't been listed as reasons to stay away from data centric designs... at least not here.  :)

Hmm, sure didn't sound like Rocky was describing it as poor design but instead impractical because of physical architecture issues, but maybe I misinterpreted this? (from thread http://forums.lhotka.net/forums/3/12361/ShowThread.aspx):

RockfordLhotka:

This is where, with real OO applications, people get into trouble.

 

Are you really proposing that, to get one field of data on Order, you are going to load the entire Customer object?

 

I’ve been down that road, early in my exploration of OO. It sounds so sexy. Total reuse. One object with one set of data. One object to rule them all … and in the darkness bind them. (sorry, I’m a hopeless LoTR geek…)

 

But in reality it is bunk. It simply doesn’t work for anything beyond a handful of users, if that. The performance and scalability ramifications are non-trivial, especially in any sort of distributed environment.

 

You can get away with a lot if the database is on the user’s workstation and so, by definition, there’s one user. But as soon as you start trying to load entire Customer objects to get one field of data for dozens or hundreds of users, and that data must travel even from the db to the app server, you are done for.

 

Rocky

 

 

ajj3085 replied on Tuesday, March 06, 2007

It seems to me that comment is why you wouldn't want to use an OODB vs. a traditional relational db, which is a different (although somewhat related) topic than the 'build business objects based on behavior, not data.'   Regardless of your datastore, you'd still want to avoid data centric designs for your business objects.

Justin replied on Tuesday, March 06, 2007

ajj3085:
It seems to me that comment is why you wouldn't want to use an OODB vs. a traditional relational db, which is a different (although somewhat related) topic than the 'build business objects based on behavior, not data.'   Regardless of your datastore, you'd still want to avoid data centric designs for your business objects.

Where exactly does that mention an OO vs. relational DB? Now please explain why we should avoid a data centric design, borrow fields from the Customer, and put them on an Order if performance and scalability were the only issue. Do you do this at the table level too, replicating customer fields on the order table and keeping them in sync with the master customer table?

Almost every argument against a data centric design is based on performance issues, which is understandable and exactly why I still use behavior based design. To me, behavior based design is a compromise in most situations; it breaks encapsulation and reuse, but it must be done to implement a working and performant system today.

A system like db4o with transparent activation could remove many of the performance implications of a data centric design, leaving you free to choose the best model for your particular application.

 

ajj3085 replied on Tuesday, March 06, 2007

Justin:
Where exactly does that mention an OO vs. relational DB? Now please explain why we should avoid a data centric design, borrow fields from the Customer, and put them on an Order if performance and scalability were the only issue. Do you do this at the table level too, replicating customer fields on the order table and keeping them in sync with the master customer table?

http://forums.lhotka.net/forums/permalink/12361/12659/ShowThread.aspx#12659

At that point, the discussion changes to using an OODB vs. a relational db.  The post you quoted was part of that thread of discussion.  Data centric design is perfectly fine at the database level.  You do things there you wouldn't do when doing behavior based design on the business layer.  The reverse is also true.

Justin:
Almost every argument against a data centric design is based on performance issues, which is understandable and exactly why I still use behavior based design. To me, behavior based design is a compromise in most situations; it breaks encapsulation and reuse, but it must be done to implement a working and performant system today.

Those aren't the arguments for behavior based design that I've seen here; the arguments are that the behaviors (which are dictated by use cases) are where the maintainability problems come in, and behavior based design is meant to help with maintainability issues (as is OOD).  The desire for reuse is counter to maintainability.  In the thread linked above, you will find Rocky explaining how we want reuse, but at the same time reuse is bad.  Behavior based design doesn't break encapsulation, because the behaviors are what is being encapsulated.  There's only one object that gives the behavior you want.

Justin:
A system like db4o with transparent activation could remove many of the performance implications of a data centric design, leaving you free to choose the best model for your particular application.


Again, I don't think behavior based design is an attempt to gain performance.

Justin replied on Tuesday, March 06, 2007

ajj3085:
http://forums.lhotka.net/forums/permalink/12361/12659/ShowThread.aspx#12659

At that point, the discussion changes to using an OODB vs. a relational db.  The post you quoted was part of that thread of discussion.  Data centric design is perfectly fine at the database level.  You do things there you wouldn't do when doing behavior based design on the business layer.  The reverse is also true.

In the post I quoted, Rocky was replying to this post: http://forums.lhotka.net/forums/permalink/12361/12663/ShowThread.aspx#12663 which was purely a question about OO logical design, not about which persistence technology it would be implemented on. Rocky's response characterizes this as "sexy" but impractical because of performance, and you are somehow interpreting this as an argument against OODBMSs? It seems to me it is describing limitations of current RDBMS technology backing OO designs, but perhaps Rocky would need to clarify.


ajj3085:

Those aren't the arguments for behavior based design that I've seen here; the arguments are that the behaviors (which are dictated by use cases) are where the maintainability problems come in, and behavior based design is meant to help with maintainability issues (as is OOD).  The desire for reuse is counter to maintainability.  In the thread linked above, you will find Rocky explaining how we want reuse, but at the same time reuse is bad.  Behavior based design doesn't break encapsulation, because the behaviors are what is being encapsulated.  There's only one object that gives the behavior you want.

Again, in Rocky's response the only argument against the "sexy" design was performance issues, but even he describes it as "Total reuse", which has a direct impact on maintainability, no?

You're still dodging the question: how does borrowing fields at a SQL level in the Order-Customer example provide better maintainability, or is it only for performance reasons?

I could link many sources that point to data centric designs being a better philosophy, just as you could about behavior centric ones, but since it seems to be an open debate, I would rather hear your arguments as to why behavior centric is always a better solution, especially how replicating the same properties across multiple classes improves maintainability.

ajj3085:

Again, I don't think behavior based design is an attempt to gain performance.

So now it has nothing to do with performance? Again, how can you argue that replicating properties leads to less maintenance and has nothing to do with performance?

pelinville replied on Tuesday, March 06, 2007

I don't think the idea of "borrowed fields" or breaking data encapsulation has anything at all to do with one design philosophy over another. In the cases that you both point out this is done only for one reason, performance. Just because you are doing the "Behavior" thing doesn't mean you should ignore traditional data encapsulation.
 
There are only two reasons, I believe, why CustomerName would be a field in Order.
 
1. Because it should be based upon business rules.
2. Because it is more efficient/faster to retrieve that one piece of data than it is to retrieve the customer data as a whole and use only one piece of that data.
 
Those are the only two reasons I can think of doing it. The first one is valid and there isn't any extra risk outside the risk imposed by the rules themselves.
 
The second one has some risk associated with it and that risk has to be acknowledged and understood.  And this is done all the time in both Data and Behavior centered design.
 
I think the question is: if the performance lost by doing the “right” thing could be either eliminated or ignored, which way should/would you design the objects?
 
I personally think that to “borrow” a field when there is no performance gain (or the gain is unnoticeable) is just silly.
 
And I know that using a customer increases the coupling of the objects.  Just insert the interfaces needed to make the coupling as loosey-goosey as your heart desires. That still doesn’t mean that “borrowing” fields is the best thing to do, design wise.
 
But it is necessary many times so I don't think it is a bad thing.  I just don't like seeing it justified by a design credo.
 
And I have gotten into arguments about this before. In those arguments I didn't really understand what others were saying. My point then was that just about everybody does data centric design whether they admitted to it or not.
 
What I should have said is that "Just about everybody lets data affect their design."
 
So I agree with ajj that the support of behavior oriented design is not for performance reasons. But those that do push the behavior oriented design do say that the data in the object is only there to support the behavior and nothing else.
 
And that is where things get confusing, I believe. The fact that they focus on the behavior of a class means that having a few extra fields that also appear in other classes does not seem to bother them.  They also take for granted that using real, honest to god aggregation/composition and enforcing strict data encapsulation is a performance killer. And they understand that in reality breaking encapsulation often does not hurt maintainability of the application. Especially since in almost all cases the data storage is a RDBMS and THAT is what will ensure data integrity.  After all one of the main points of data encapsulation is to maintain data integrity.
 
The db4o OODBMS does NOT have the built-in data integrity goals of an RDBMS. It has object integrity goals. So borrowing fields is a bit of crazy talk. I suppose you can do it, but with my admittedly small experience, the idea scares me. In db4o there is no way to get varCustomerName from the customer object without loading the entire customer object. It is relatively easy to retrieve the Customer object at the same time Order is retrieved. All you have to do is set the activation depth correctly, and the customer object will be retrieved when you retrieve the order object.
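
A rough sketch of what "activation depth" means, using a toy object graph rather than db4o itself. `Stored` and `activate` are hypothetical names; the only assumption is that depth counts how many levels of references get their fields loaded.

```python
class Stored:
    """Toy stored object; 'activated' means its fields have been loaded."""

    def __init__(self, **fields):
        self._fields = fields   # what sits "on disk"
        self.values = {}        # what has been loaded into memory
        self.activated = False


def activate(obj, depth):
    """Load obj's fields and follow references while depth remains."""
    if obj is None or depth <= 0:
        return
    obj.activated = True
    obj.values = dict(obj._fields)
    for value in obj._fields.values():
        if isinstance(value, Stored):
            activate(value, depth - 1)


customer = Stored(name="Acme Ltd")
order = Stored(number=1001, customer=customer)

activate(order, depth=1)
shallow = (order.activated, customer.activated)  # Order only

activate(order, depth=2)
deep = (order.activated, customer.activated)     # Order and its Customer
```

With depth 1 only the Order is loaded; with depth 2 the referenced Customer comes along in the same retrieval.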
So my advice to the original poster is this.  First, db4o will hurt your brain at first if you are used to an RDBMS.  With db4o the code, and the code alone, is responsible for data integrity.  This fact can kill you.
 
Second problem: don't even hope to have an efficient ad-hoc reporting solution.  I shudder just thinking about trying to do that. And having an efficient reporting solution, even when you know exactly what needs to be looked at, takes a lot of up-front effort. And remember, you cannot easily update things like zip or area codes because there are no set operations. If you are used to the powers of an RDBMS, you are going to cry at times at what is lost with an OODBMS.
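
To make the "no set operations" point concrete, here is a small comparison using Python's built-in `sqlite3` for the relational side and a plain list of objects as a stand-in for an object store. The `Contact` class and the zip codes are made up for illustration.

```python
import sqlite3

# Relational side: one set-based statement changes every matching row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (name TEXT, zip TEXT)")
conn.executemany(
    "INSERT INTO contact VALUES (?, ?)",
    [("Ann", "12345"), ("Bob", "12345"), ("Cy", "99999")],
)
cursor = conn.execute(
    "UPDATE contact SET zip = ? WHERE zip = ?", ("54321", "12345")
)
rows_changed = cursor.rowcount


# Object side: every candidate has to be loaded, inspected and changed
# one at a time (and, in a real object store, saved back one at a time).
class Contact:
    def __init__(self, name, zip_code):
        self.name = name
        self.zip_code = zip_code


contacts = [Contact("Ann", "12345"), Contact("Bob", "12345"), Contact("Cy", "99999")]
objects_touched = 0
for contact in contacts:
    if contact.zip_code == "12345":
        contact.zip_code = "54321"
        objects_touched += 1
```

Both sides end up changing the same two records; the difference is that the object side does the work in application code, one object at a time.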
 
But there will also be many times that you won't be able to wipe the smile off your face because everything just feels "right in the eyes of god".

Igor replied on Tuesday, March 06, 2007

IMO behavioural design is not about performance, and maintainability is not about data integrity.

Behavioural (= single responsibility) objects are simpler, the code is easier to understand (and THEREFORE to modify=maintain).
 
Borrowing fields and duplicating data do not constitute significant maintenance issues; convoluted and/or duplicated procedures do.

 

RichardETVS replied on Wednesday, March 07, 2007

pelinville:

If you are used to the powers of an RDBMS, you are going to cry at times at what is lost with an OODBMS.

I am not ;) . The most complex application I worked on used a homemade system for its data, which was too unstructured for a SQL database. Well, in fact, we used one, but all the real data was in blob fields, and we developed a proprietary system to set, get and update the data. So managing data with code is something I am familiar with. Years ago, I got a lot of experience with MS Access, but it is not a real database.

Now, I am taking some precautions here. That is the main reason why I want a separate DAL: if need be, I want to be able to switch easily from db4o to an RDBMS.

 

pelinville replied on Wednesday, March 07, 2007

RichardETVS:

pelinville:

If you are used to the powers of an RDBMS, you are going to cry at times at what is lost with an OODBMS.

I am not ;) . The most complex application I worked on used a homemade system for its data, which was too unstructured for a SQL database. Well, in fact, we used one, but all the real data was in blob fields, and we developed a proprietary system to set, get and update the data. So managing data with code is something I am familiar with. Years ago, I got a lot of experience with MS Access, but it is not a real database.

Now, I am taking some precautions here. That is the main reason why I want a separate DAL: if need be, I want to be able to switch easily from db4o to an RDBMS.

 

 

You are absolutely correct in that.  There are places where an OODBMS seems to shine and makes an RDBMS look archaic. It sounds like you have found one of those situations. They tend to be systems where there is a lot of “doing” and not so much data manipulation, or where the data doesn’t have a natural structure.

 

What I meant by that is if you are doing a typical business app where there needs to be a lot of reporting, especially ad hoc type reports, or there is some kind of data mining, then avoid object databases. 

 

Another BIG problem is comparing and/or combining the data in an OODBMS with another system, even another OODBMS.  There might be a way to do it, but it sure doesn’t seem as easy as importing the tables, creating some new keys and joining (or using OLAP) to your little heart’s content.

 

But if you don’t need that kind of thing?  Cool.

pelinville replied on Wednesday, March 07, 2007

 

RichardETVS:

Now, I am taking some precautions here. That is the main reason why I want a separate DAL: if need be, I want to be able to switch easily from db4o to an RDBMS.

I forgot to add that using a DAL with it is a bit problematic, and it is easy to understand why.

db4o returns objects.  A DAL expects to return, well, data.  I couldn't get my DAL to work with db4o without adding a bunch of overhead: basically getting the object, getting its data by accessing the private members, populating a datatable or some other kind of structure, and then populating the object from said data structure! See any problem with that? :D

And think about this.  db4o returns objects. CSLA kind of expects you to populate the object in the DataPortal_Fetch method of an object that has already been created.  So out of the box, the best you can do is retrieve the object from db4o and then populate the newly created object from it. Not very efficient either.

That second one isn't too hard to fix: you just have to change how DataPortal.Fetch operates so that it queries the OODBMS directly and returns the result of the query.  This does have a few advantages. Since you can change the activation level, you can relatively easily change how much of the object graph you retrieve.  No more having to create whole new sp's/queries and looping through result sets.  Just add an activation level to the criteria.

Oh, and the criteria object becomes pretty important also. If you use query by example the criteria can just contain the example of the object you want to retrieve.
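
A toy illustration of query by example, not db4o's implementation: the criteria is just a partly filled-in object, and fields left unset are ignored when matching. Treating `None` as "unset" is an assumption made for this sketch, and `Ship` and `query_by_example` are hypothetical names.

```python
class Ship:
    def __init__(self, name=None, ship_class=None, phasers=None):
        self.name = name
        self.ship_class = ship_class
        self.phasers = phasers


def query_by_example(objects, example):
    """Return objects of the example's type matching all of its set fields."""
    wanted = {
        field: value
        for field, value in vars(example).items()
        if value is not None  # unset fields place no constraint
    }
    return [
        obj
        for obj in objects
        if type(obj) is type(example)
        and all(getattr(obj, field, None) == value for field, value in wanted.items())
    ]


stored = [
    Ship("Hood", "cruiser", 6),
    Ship("Kongo", "cruiser", 8),
    Ship("Burke", "frigate", 3),
]

# The criteria object simply carries the example to match against.
cruisers = query_by_example(stored, Ship(ship_class="cruiser"))
names = [ship.name for ship in cruisers]
```

A CSLA-style criteria object could hold such an example instance plus, if wanted, an activation level.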

Updating, however, is a royal pain.  Even though you have this object in all its changed glory, you can't just give db4o this "dirty" object and save it. If you do, db4o will simply save a new object.  What you have to do is find the current object in db4o and then update that retrieved object with the object you have changed. What this ultimately means is that you either have to keep the criteria object you used to get it in the first place and use it again to find and update the stored object, or you have to have an unchanging ID (I used GUIDs) that you can use to find the stored object.

To update the old object with the new object I used code generation and made sure each object had a Mirror(o as BO) method.  All it did was update the private fields to match the values of the BO passed.
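
A sketch of the find-by-ID-then-mirror update described above, written by hand rather than generated. `BusinessObject`, `ToyStore` and the field names are hypothetical; the store is a plain list standing in for the object database.

```python
import copy
import uuid


class BusinessObject:
    def __init__(self, name, zip_code):
        self.object_id = str(uuid.uuid4())  # unchanging ID used for lookup
        self._name = name
        self._zip_code = zip_code

    def mirror(self, other):
        """Copy other's fields onto self, keeping self's identity and ID."""
        for field, value in vars(other).items():
            if field != "object_id":
                setattr(self, field, value)


class ToyStore:
    """Stand-in for the object database."""

    def __init__(self):
        self.objects = []

    def insert(self, obj):
        self.objects.append(obj)

    def find_by_id(self, object_id):
        for obj in self.objects:
            if obj.object_id == object_id:
                return obj
        return None

    def update(self, changed):
        # Find the stored instance first, then mirror the changes onto it.
        stored = self.find_by_id(changed.object_id)
        if stored is None:
            raise KeyError(changed.object_id)
        stored.mirror(changed)
        return stored


store = ToyStore()
original = BusinessObject("Ann", "12345")
store.insert(original)

# A detached, edited copy, such as one that travelled through another tier.
edited = copy.deepcopy(original)
edited._zip_code = "54321"
updated = store.update(edited)
```

The stored instance keeps its identity and picks up the new values, so no second object appears in the store.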

With updating you also have to set the depth into the object graph you want to update.

Inserting is about as easy as it gets. Simply call db.Set(newObj) and you are done.

Deleting is like updating, in that you must first retrieve the original object from db4o and then call db.Delete(objRetrieved).

Needless to say you are going to have to modify the dataportal a great deal.

Collections I never really got the hang of.  Luckily the default behavior handled my simple cases well.  But I recognized that there could be problems for anyone used to set-based operations that limit the number of records returned.

I don't think that db4o is a good replacement for the majority of users of CSLA.  In fact, I think it is a bad replacement for a rather large number of people. The main reason is that the vast majority of programs I see being discussed here are primarily concerned with data. All this "behavior" talk is really just trying to make data centric applications more maintainable by using the principles of OOD/OOP.  But the fact remains that what is really important to the customer is the data. All the functionality in the world is for naught if they cannot use the data to better their business. And the functionality is not really that different among most CSLA applications, because most businesses behave basically the same.

For example, I have found that db4o is excellent for an application that I am currently writing. It is a playing aid for the "Star Fleet Battles" board game.  Now there is a butt-ton of data for this app.  It has boggled my mind.  I initially started writing it with the typical CRUD and CSLA solution I am so familiar with.  But I soon gave up on it because of the very rigid constraints posed by the RDBMS.  It just got unwieldy.

A couple of months ago (this is a hobby project) I switched to db4o, and in those two months I have gotten further than I did with the RDBMS in two years. The main reason is that I can apply real object oriented techniques (composition/aggregation, encapsulation, polymorphism) so much more easily when all I have to worry about is the object design.  Being able to ignore the db schemas and the need to write queries to get the data I need makes this particular type of app so much easier to write.  Of course I don't have to worry about requests like "I need to see all cruiser class ships that have class I phasers in the front and rear firing arc". Actually I could do this pretty easily, but you get the idea. Actually, I can't really put my finger on exactly why this type of app is easier with an OODBMS than it is with a CRUD based system.

 

OK, this has gone on long enough.

RichardETVS replied on Thursday, March 08, 2007

Ajj, there is something that disturbs me in this “behavior centric” approach. When I read “I have a case now where I have two objects which populate same tables, with one exception.” I have the feeling that your approach imposes a relational database. It makes no sense with an object database, or I fail to see it. You need tables, rows…

 

I think it is strange that in an OO world that is labeled behavior centric, the data are a keystone and must have a certain formalism. But again, maybe it is me who fails to understand you.

 

Let’s try another example. Let’s say I would model a world for a Massively Multiplayer Online Role-Playing Game (wow, long to write; I usually type MMORPG, but not everybody gets it ;) ). My first choice would be a multi-agent system, where each entity in the world (like non-player characters, monsters, deities, etc.) would be… entities. They would have state and behaviors. Now, granted, I would use patterns such as “state” or “strategy” to give those entities flexible and easy-to-maintain behavior. But it is the use cases, it seems to me, that impose an entity design. Is that a “data centric” design? I do not think so. We do not care only about data, but about an entity with both its state (like emotions, friends / enemies, amount of money) and its behavior (being able to set objectives, cooperate with other entities, fight, sell, etc.). For me it is not data centric, even if it is entity centric.

 

Pelinville, I appreciate the details of your answer.

 

For the DAL, I want a DAL that takes BOs and sends back BOs, not data. It can take a BO as a parameter (like the Set method of db4o) and populate it with data or save it to the database. If the database is db4o, I saw how to do that in the example from phucphlq, which is in a previous message of this thread. He pretty much does the things you gave as solutions. It does not use code generation (“To update the old object with the new object I used code generation and made sure each object had a Mirror(o as BO) method.  All it did was update the private fields to match the values of the BO passed.”). It would be very kind of you if you could give a code example of that ;) .

 

When you say “Needless to say you are going to have to modify the dataportal a great deal.”, do you mean the CSLA classes themselves? Why should I modify them?

 

I intend to use the DataPortal_XYZ methods in the BO to communicate with the DAL.

 

 

Cordially

 

Richard

ajj3085 replied on Thursday, March 08, 2007

RichardETVS:
Ajj, there is something that disturbs me in this “behavior centric” approach. When I read “I have a case now where I have two objects which populate same tables, with one exception.” I have the feeling that your approach imposes a relational database. It makes no sense with an object database, or I fail to see it. You need tables, rows…


I suppose it does.  I'm not sure how an oodbms would handle this scenario at all though.  What needs to end up happening when the Registration 'saves' itself to the database is really that 1) an existing SerialNumber object would need to update its state and 2) a new Contact object may have to be created and stored.  It would seem to me that you still need some sort of ORM (or maybe Object-Object Mapping?)...

 

RichardETVS:
I think it is strange that in an OO world that is labeled behavior centric, the data are a keystone and must have a certain formalism. But again, maybe it is me who fails to understand you.

Well, the point is that data isn't the keystone; data is important, it just does not drive the design of the objects.  Even in an OODBMS, do you not need to formalize the objects?  If you ask for a Customer data object, it's the same structure as any other customer object, I would imagine.  Objects are very formalized concepts as well.

 

RichardETVS:
Let’s try another example. Let’s say I would model a world for a Massively Multiplayer Online Role-Playing Game (wow, long to write; I usually type MMORPG, but not everybody gets it ;) ). My first choice would be a multi-agent system, where each entity in the world (like non-player characters, monsters, deities, etc.) would be… entities. They would have state and behaviors. Now, granted, I would use patterns such as “state” or “strategy” to give those entities flexible and easy-to-maintain behavior. But it is the use cases, it seems to me, that impose an entity design. Is that a “data centric” design? I do not think so. We do not care only about data, but about an entity with both its state (like emotions, friends / enemies, amount of money) and its behavior (being able to set objectives, cooperate with other entities, fight, sell, etc.). For me it is not data centric, even if it is entity centric.

That sounds about right.  Behavior based design does not mean you don't care about state; you do, because without state you don't have very useful objects.  What would the behaviors even act upon?  It's just that you design objects based on what they do, not what they will contain.  Two objects may contain the exact same state (data) but perform completely different activities.
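
A small sketch of that last point: two objects built over identical state, each designed around one activity. The classes and numbers are invented for illustration and are not taken from CSLA.

```python
from dataclasses import dataclass


@dataclass
class AccountState:
    name: str
    balance: float
    credit_limit: float


class CreditChecker:
    """Use case: decide whether a purchase may go through."""

    def __init__(self, state):
        self._state = state

    def can_purchase(self, amount):
        return self._state.balance + amount <= self._state.credit_limit


class StatementPrinter:
    """Use case: format a line for a statement."""

    def __init__(self, state):
        self._state = state

    def line(self):
        return f"{self._state.name}: {self._state.balance:.2f}"


state = AccountState("Acme Ltd", 400.0, 500.0)
checker = CreditChecker(state)
printer = StatementPrinter(state)
```

Both objects hold the same data, yet each exposes only the behavior its use case needs.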


RichardETVS:
I intend to use the DataPortal_XYZ methods in the BO to communicate with the DAL.


That's the intended use of the DataPortal_XYZ methods.

Andy

Justin replied on Thursday, March 08, 2007

ajj3085:

I suppose it does.  I'm not sure how an oodbms would handle this scenario at all though.  What needs to end up happening when the Registration 'saves' itself to the database is really that 1) an existing SerialNumber object would need to update its state and 2) a new Contact object may have to be created and stored.  It would seem to me that you still need some sort of ORM (or maybe Object-Object Mapping?)...

 

It doesn't handle it, just like an ORM wouldn't magically map multiple objects that actually have properties backed by the same columns in one table. This is the complexity that is introduced in behavior based designs, since those use-case specific objects now have to be re-normalized to a core entity storage mechanism (a table or a core data object) using application code you have to maintain.

 

If you choose a data centric design with, say, db4o, however, you would simply get the proper Registration instance, set the proper SerialNumber instance (from a repository of SerialNumbers, or generated?), and set the proper Contact instance retrieved from the Contact store (or a new Contact instance). Then you take your Registration object and do a set with it, and if cascading updates are enabled, the entire Registration and the two referenced objects are persisted. Db4o does not distinguish between insert and update: if the in-memory instance was retrieved with a previous get, then the set updates; otherwise it inserts.
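
A toy model of that insert-or-update rule, assuming only what is described in this thread: the container remembers which in-memory instances it already knows, and anything else is treated as new. `IdentityStore` is a hypothetical stand-in, not db4o's object container.

```python
import copy


class IdentityStore:
    """Toy container that tells insert from update by object identity."""

    def __init__(self):
        self._records = {}    # slot number -> saved field snapshot
        self._known = {}      # id(instance) -> slot number
        self._instances = []  # keeps instances alive so id() stays unique
        self._next_slot = 1

    def set(self, obj):
        slot = self._known.get(id(obj))
        action = "update"
        if slot is None:
            # Unknown instance: give it a new slot, i.e. an insert.
            slot = self._next_slot
            self._next_slot += 1
            self._known[id(obj)] = slot
            self._instances.append(obj)
            action = "insert"
        self._records[slot] = copy.deepcopy(vars(obj))
        return action

    def count(self):
        return len(self._records)


class Contact:
    def __init__(self, name):
        self.name = name


store = IdentityStore()
contact = Contact("Ann")
first = store.set(contact)        # instance not seen before
contact.name = "Ann Smith"
second = store.set(contact)       # same instance again
clone = copy.deepcopy(contact)    # equal data, different instance
third = store.set(clone)
```

The clone is stored as a second record, which matches the earlier warning that a "dirty" object handed over from elsewhere gets saved as a new object.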

 

ajj3085:
Well, the point is that data isn't the keystone; data is important, it just does not drive the design of the objects.  Even in an OODBMS, do you not need to formalize the objects?  If you ask for a Customer data object, it's the same structure as any other customer object, I would imagine.  Objects are very formalized concepts as well.

 

It does drive the design of objects if they are used as the persistent store, just like it drives table design in an RDBMS. Now, do you apply constraints to your tables, such as a ContactName on the Contact table that cannot be null and cannot be more than 50 characters, for instance, or a Registration that must have a valid ContactID FK? Are those things not really just business rules expressed in relational code and re-expressed in CSLA rules a second time? Why duplicate that logic? With an OODB you create a Contact object and those rules are expressed once for all to use, as all Contacts should share a common set of behaviors or rules, correct?

 

Justin

 


 

 

ajj3085 replied on Thursday, March 08, 2007

Justin:
It doesn't handle it, just like an ORM wouldn't magically map multiple objects that actually have properties backed by the same columns in one table. This is the complexity that is introduced in behavior based designs, since those use-case specific objects now have to be re-normalized to a core entity storage mechanism (a table or a core data object) using application code you have to maintain.

ORM isn't really that complex; it's a matter of using a few objects and setting properties.  Compare it to the complexity of needing to keep track of extra, unneeded state and writing logic to selectively apply rules based on that state, and it seems pretty simple.

 

Justin:
If you choose a data centric design however with say db4o  you would simply get the proper Registration instance set the proper SerialNumber instance(from a repository of  SerialNumbers or generated?) and set the proper Contact instance retieved from the Contact store( or a new Contact instance), then take your Registration object and do a set with it, and if cascading updates are enabled the entire Registration and two referenced objects are persisted. Db4o does not distiguish between insert or update, if the in memory instance was retieved with a previous get then the set updates other wise it inserts.


In a data centric design I'd simply load the serial number row and set the contact id column (or contact reference). Registration would not exist... it doesn't exist in my database after all. Of course now I don't have any way to restrict RegistrationStaff from creating contacts willy-nilly. Nor can I prevent them from adding addresses to contacts whenever they please, nor can I stop them from creating companies whenever they want.

 

Justin:
It does drive the design of objects if they are used as the persistent store, just as it drives table design in an RDBMS.


That would be the definition of data driven design, so that only makes sense.


Justin:
Now, do you apply constraints to your tables, such that ContactName on the Contact table cannot be null and cannot be more than 50 characters, for instance, or that a Registration must have a valid ContactID FK? Are those things not really just business rules expressed in relational code and then re-expressed in CSLA rules a second time?


Yes, I don't abandon good database design.  As I said, there is no registration table; only a contact table and a serial number table, which has a nullable FK reference to contact. 


The relationships and indexes are there for information and performance reasons, not just to prevent invalid entries. There are plenty of fields, though, that are nullable but which my business rules enforce as required fields. That's because use cases can change, and what's required today may not be required tomorrow. Primary keys aren't nullable because the database engine specifies so. FK references tell us how the data is related, but don't tell us how it's used.

Justin:
Why duplicate that logic? With an OODB you create a Contact object and those rules are expressed once for all to use, as all Contacts should share a common set of behaviors or rules, correct?


If you were using an oodb, your object would be your database and you couldn't duplicate the rules even if you wanted to. 

Justin replied on Thursday, March 08, 2007

ajj3085:

ORM isn't really that complex; it's a matter of using a few objects and setting properties. Compare it to the complexity of needing to keep track of extra, unneeded state and writing logic to selectively apply rules based on that state, and it seems pretty simple.

Oh really? How exactly does an ORM know to store your two different Contact objects in one contact table? I am sure you have to build this map somewhere and in some syntax. It seems a lot simpler to me to just create a Contact, store a Contact, and specify all constraints for a contact in one language, one time.

ajj3085:

In a data centric design I'd simply load the serial number row and set the contact id column (or contact reference). Registration would not exist... it doesn't exist in my database after all. Of course now I don't have any way to restrict RegistrationStaff from creating contacts willy-nilly. Nor can I prevent them from adding addresses to contacts whenever they please, nor can I stop them from creating companies whenever they want.

 

If Registration doesn't exist in your database, where do you store it? Data centric OO designs have nothing to do with rows or columns; that's relational design. Why would RegistrationStaff be able to save if that's your business rule? You would just check your user context and make sure they're in the right role before the Contact would allow a save.
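A minimal sketch of that role check, in Java. Everything here is an assumption for illustration: `UserContext`, the "Sales" role name, and the `Runnable` that stands in for whatever actually persists the object. It is not CSLA's authorization API or db4o's.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the current user's security context.
class UserContext {
    private final Set<String> roles;

    UserContext(Set<String> roles) {
        this.roles = new HashSet<>(roles);
    }

    boolean isInRole(String role) {
        return roles.contains(role);
    }
}

class Contact {
    // Assumed role name; the thread only says RegistrationStaff must not create contacts.
    static final String SAVE_ROLE = "Sales";

    private final String name;

    Contact(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Refuses to save unless the caller holds the required role.
    // The Runnable stands in for the real persistence call.
    void save(UserContext user, Runnable persist) {
        if (!user.isInRole(SAVE_ROLE)) {
            throw new SecurityException("User is not allowed to save contacts");
        }
        persist.run();
    }
}
```

With this shape the Contact itself rejects a save from a user who lacks the role, whichever use case triggered it.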

ajj3085:

That would be the definition of data driven design, so that only makes sense.

 

I know it makes sense; you're doing it too, except you're creating tables with some business constraints in SQL instead of data centric objects.

ajj3085:

Yes, I don't abandon good database design.  As I said, there is no registration table; only a contact table and a serial number table, which has a nullable FK reference to contact. 

 

That's good RELATIONAL database design; the problem is you also have to duplicate those rules in OO code a second time, plus additional rules that relational databases don't express very well.

ajj3085:

The relationships and indexes are there for information and performance reasons, not just to prevent invalid entries. There are plenty of fields, though, that are nullable but which my business rules enforce as required fields. That's because use cases can change, and what's required today may not be required tomorrow. Primary keys aren't nullable because the database engine specifies so. FK references tell us how the data is related, but don't tell us how it's used.

I didn't mention indexes; they are part of the physical RDBMS implementation and independent of foreign keys and other constraints in the logical domain. Why do you leave some fields nullable if they are not? I thought you followed good database design, or are you making compromises to speed schema change maintenance?

ajj3085:

If you were using an oodb, your object would be your database and you couldn't duplicate the rules even if you wanted to. 

Yes I could. I can use composition with an OODB too, and layer objects such that the data centric objects enforce a few rules and the behavior centric objects that contain those private data objects have their own duplicate versions of those rules. That's all you're doing by implementing rules in the RDBMS and in the objects that use them, then writing glue code to map tables to objects.

Justin

ajj3085 replied on Friday, March 09, 2007

Justin:
Oh really? How exactly does an ORM know to store your two different Contact objects in one contact table? I am sure you have to build this map somewhere and in some syntax. It seems a lot simpler to me to just create a Contact, store a Contact, and specify all constraints for a contact in one language, one time.


Yes, I have to build it.  The code to actually do the mapping is trivial.  I know how the data maps, and I just use some DAL objects which represent the tables.

Justin:
If Registration doesn't exist in your database, where do you store it? Data centric OO designs have nothing to do with rows or columns; that's relational design. Why would RegistrationStaff be able to save if that's your business rule? You would just check your user context and make sure they're in the right role before the Contact would allow a save.


I told you, the registration is done by linking a contact to a serial number. OODBMS or relational is irrelevant; in relational, the SerialNumber table has a FK reference to Contact. In an OODBMS, the SerialNumber class would have a reference to a Contact.

Justin:
I know it makes sense; you're doing it too, except you're creating tables with some business constraints in SQL instead of data centric objects.


You miss the point of my comment: if you design your business objects to be business objects AND the storage mechanism, then you're doing data driven design. That's not how I'm approaching my object design, although sometimes I fall back on old habits, as does everyone who is making the shift.


Justin:
That's good RELATIONAL database design; the problem is you also have to duplicate those rules in OO code a second time, plus additional rules that relational databases don't express very well.


I don't recall this discussion being about switching to OODBs. Again you miss the point; most of the rules aren't put into the relational db model; I could remove all the FK constraints and it wouldn't affect my application. It would make the design of the db a bit more unclear though, and you'd lose some of the performance gains. Sometimes it's necessary to duplicate some rules as well; you have to do this on the web quite a bit, because having the user post back every time and then see errors is a poor user experience. I guess though, since you want to avoid duplication of rules, you would simply always code them in the UI, and I'm sure you never create constraints on your database either, and you always have to hit the DB before you discover the key given was invalid.


Justin:
I didn't mention indexes; they are part of the physical RDBMS implementation and independent of foreign keys and other constraints in the logical domain. Why do you leave some fields nullable if they are not? I thought you followed good database design, or are you making compromises to speed schema change maintenance?

Because whether or not a particular field is nullable changes based on the use case. When sales creates a contact, they are free to enter as much or as little of the address as they choose. Registration staff are not allowed to do so, and must follow up with the customer to get a minimum of address details.

That's part of the point you are missing; the use case dictates what is required and what is not, NOT the database design. If I relied only on the database and data centric design, I would not be able to enforce such rules. The address class would not be able to know why the user is creating it, and it would not be able to enforce the rules properly.

Justin:
Yes I could. I can use composition with an OODB too, and layer objects such that the data centric objects enforce a few rules and the behavior centric objects that contain those private data objects have their own duplicate versions of those rules. That's all you're doing by implementing rules in the RDBMS and in the objects that use them, then writing glue code to map tables to objects.

I don't view myself as implementing rules in the database, because I'm not. If a contact can have only one address, you'd probably have an AddressId in your contact table. You could enforce contacts ALWAYS having an address by making that field not nullable. If a contact can suddenly have more than one address, you've lost any ability to enforce that a contact must have at least one address. Yes, you can write a trigger and put the rule in the database, but that's not really the best place for it. Rocky has some reasons why in his book, and I'm sure you can find other reasons why that would be a bad idea.


You skipped my questions about how I could enforce security too. Given a data driven design, how do I stop registration staff from just creating all the contacts they want? How does the object know how and why it's being used? Add an internal method on contact that the registration object calls to tell it how it's being used? Then I need to add such a method any time I have a similar requirement. Maintaining that class quickly becomes very difficult. I know, I've been down that road already.

Justin replied on Friday, March 09, 2007

ajj3085:

Yes, I have to build it.  The code to actually do the mapping is trivial.  I know how the data maps, and I just use some DAL objects which represent the tables.

Trivial is your opinion, but it is complexity that does not need to exist at all in a data centric design with an OODBMS. Is designing tables, indexes and SQL trivial as well, before you can even set up the map?

ajj3085:

I told you, the registration is done by linking a contact to a serial number. OODBMS or relational is irrelevant; in relational, the SerialNumber table has a FK reference to Contact. In an OODBMS, the SerialNumber class would have a reference to a Contact.

Right, so why does Registration even exist if a SerialNumber can have just one Contact? It seems to be a superfluous object representing part of a contact and a serial number, but it is stored as just a property of a SerialNumber.

ajj3085:

You miss the point of my comment: if you design your business objects to be business objects AND the storage mechanism, then you're doing data driven design. That's not how I'm approaching my object design, although sometimes I fall back on old habits, as does everyone who is making the shift.

I also think you are missing my point, which is that you are doing data driven design, just in a non-OO system. Why should we as application developers have to know two languages (OO and SQL) and then write a bunch of glue code to translate between the two? It seems like a waste of time, plus there is more machinery running that is unnecessary with an OODB.

ajj3085:

I don't recall this discussion being about switching to OODBs. Again you miss the point; most of the rules aren't put into the relational db model; I could remove all the FK constraints and it wouldn't affect my application. It would make the design of the db a bit more unclear though, and you'd lose some of the performance gains. Sometimes it's necessary to duplicate some rules as well; you have to do this on the web quite a bit, because having the user post back every time and then see errors is a poor user experience. I guess though, since you want to avoid duplication of rules, you would simply always code them in the UI, and I'm sure you never create constraints on your database either, and you always have to hit the DB before you discover the key given was invalid.

The title of the thread is "Someone here who works with db4o?" I think it is very much about switching to an OODBMS; you're the one who came in here talking about RDBMSs.

FKs and constraints in an RDBMS do not improve performance; they actually decrease performance, as they are enforcing rules that would otherwise not be checked. I think you are confusing indexes with constraints.

No, I would rather not duplicate rules in multiple layers. I might have to because of technological limitations, such as using an RDBMS or your example of a web based client, but that is a compromise to get the job done. As for postbacks to report broken rules in a web client, that can be mitigated with Ajax (a form field change causes an XMLHttpRequest to the server to check the CSLA object), or by implementing common rules such as required, max length and data type as attributes that both the CSLA rules engine and a UI can enforce while defining them only once.
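To make the "attributes defined once" idea concrete, here is a hedged sketch using Java annotations and reflection (the Java analogue of .NET attributes). `Required`, `MaxLength` and `RuleChecker` are hypothetical names invented for this example; they are not part of CSLA or db4o.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical annotations: each rule is declared once, on the field.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Required {}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface MaxLength {
    int value();
}

class Contact {
    @Required
    @MaxLength(50)
    String name;
}

// One checker that a rules engine and a UI layer could both call.
class RuleChecker {
    static List<String> check(Object target) {
        List<String> broken = new ArrayList<>();
        for (Field field : target.getClass().getDeclaredFields()) {
            field.setAccessible(true);
            Object value;
            try {
                value = field.get(target);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            // Required: the field must not be null.
            if (field.isAnnotationPresent(Required.class) && value == null) {
                broken.add(field.getName() + " is required");
            }
            // MaxLength: applies only to non-null String values.
            MaxLength max = field.getAnnotation(MaxLength.class);
            if (max != null && value instanceof String
                    && ((String) value).length() > max.value()) {
                broken.add(field.getName() + " exceeds " + max.value() + " characters");
            }
        }
        return broken;
    }
}
```

A UI could read the same annotations to set a field's maximum length or mark it as required, so the rule is written only on the class.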

ajj3085:
Because whether or not a particular field is nullable changes based on the use case. When sales creates a contact, they are free to enter as much or as little of the address as they choose. Registration staff are not allowed to do so, and must follow up with the customer to get a minimum of address details.

Well, with that design you're really just compensating for the difficulty of implementing that rule in an RDBMS by not bothering to code it there. Although if you were to go into comp.databases.theory you would be swiftly scolded by the relational purists for designing your table incorrectly, as a Pre Sales Contact and a Sales Customer are in fact two different entities that you're trying to mash into one table just because they share some properties. Your case would be better handled with inheritance, since a Customer is a Contact with more rules and possibly more data associated with it. RDBMSs do not handle inheritance, and therefore you're stuck with replicating data and rules, just as with your behavior based objects. You're just fudging it at the table level by ignoring those rules there.
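A small Java sketch of the inheritance idea, under the assumption that the extra rule is "a customer must have a street"; the field names and rule text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Base entity: a pre-sales contact needs only a name.
class Contact {
    String name;
    String street; // optional for a plain contact

    List<String> brokenRules() {
        List<String> broken = new ArrayList<>();
        if (name == null) {
            broken.add("name is required");
        }
        return broken;
    }
}

// A Customer is a Contact with stricter rules layered on top.
class Customer extends Contact {
    @Override
    List<String> brokenRules() {
        List<String> broken = super.brokenRules(); // inherit the Contact rules
        if (street == null) {
            broken.add("street is required for a customer");
        }
        return broken;
    }
}
```

The Contact rules are written once and the Customer adds to them rather than restating them.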

ajj3085:
That's part of the point you are missing; the use case dictates what is required and what is not, NOT the database design. If I relied only on the database and data centric design, I would not be able to enforce such rules. The address class would not be able to know why the user is creating it, and it would not be able to enforce the rules properly.

The address shouldn't know why it's being edited; addresses follow rules for addresses. If the Registration requires something on its referenced Address, that should be a rule on the Registration, not the Address. That doesn't mean you can't use an Address reference instead of just putting Address fields on the Registration.
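Sketched in Java, with hypothetical fields and an invented address-level rule (a city, if given, must not be blank), that split could look like this:

```java
// Address knows only address rules and nothing about who is editing it.
class Address {
    String street;
    String city;

    // Hypothetical address-level rule: a city, if present, must not be blank.
    boolean isValid() {
        return city == null || !city.trim().isEmpty();
    }
}

// The use-case rule lives on Registration, which holds a reference to the Address.
class Registration {
    private final Address address;

    Registration(Address address) {
        this.address = address;
    }

    // A registration requires a minimum of address details: street and city.
    boolean isValid() {
        return address != null
                && address.isValid()
                && address.street != null
                && address.city != null;
    }
}
```

The same Address instance can be perfectly valid for sales and still fail the Registration rule, without Address knowing about either use case.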

ajj3085:
I don't view myself as implementing rules in the database, because I'm not. If a contact can have only one address, you'd probably have an AddressId in your contact table. You could enforce contacts ALWAYS having an address by making that field not nullable. If a contact can suddenly have more than one address, you've lost any ability to enforce that a contact must have at least one address. Yes, you can write a trigger and put the rule in the database, but that's not really the best place for it. Rocky has some reasons why in his book, and I'm sure you can find other reasons why that would be a bad idea.

So data type, max length, nullability and FK constraints are not rules? I guess I should assume they are not in your eyes, since you use them yet you say you don't implement any rules in the db. Why isn't a trigger just as good a place to put a rule as OO code? Is it perhaps because it is easier to implement in OO? Why then do you code any SQL at all? Is it because you have to? You don't if you use an OODB like db4o; you do have to make data centric objects, even if you still use behavior centric ones on top of them, but all your code is now in the same language.

ajj3085:
 
You skipped my questions about how I could enforce security too. Given a data driven design, how do I stop registration staff from just creating all the contacts they want? How does the object know how and why it's being used? Add an internal method on contact that the registration object calls to tell it how it's being used? Then I need to add such a method any time I have a similar requirement. Maintaining that class quickly becomes very difficult. I know, I've been down that road already.

Did you just not read the second part of my response here?:

Justin:
If Registration doesn't exist in your database, where do you store it? Data centric OO designs have nothing to do with rows or columns; that's relational design. Why would RegistrationStaff be able to save if that's your business rule? You would just check your user context and make sure they're in the right role before the Contact would allow a save.

Justin replied on Thursday, March 08, 2007

pelinville:
 

I forgot to add that using a DAL with it is a bit problematic. And it's easy to understand why.

db4o returns objects. A DAL expects to return, well, data. I couldn't get my DAL to work with db4o without adding a bunch of overhead: basically getting the object, getting its data by accessing the private members, then populating a datatable or some other kind of structure, then populating the object with said data structure! See any problem with that? :D

And think about this. db4o returns objects. CSLA kind of expects you to populate the object in the DataPortal_Fetch method of an object already created. So out of the box the best you can do is retrieve the object from db4o, then populate the object CSLA created from the object db4o returned. Not very efficient either.

 

CSLA's base DataPortal_XYZ methods are designed around the disparity between objects and an underlying storage engine that does not understand objects or return them. As you said, you can still emulate this by retrieving one instance and copying the fields to the new instance, and I would say this is no less efficient, and could technically be more efficient, than copying data values from untyped datasets that were populated underneath from SQL result sets; there are in fact many more levels of abstraction to go through in ADO to SQL Server.

 

But if the DataPortal_XYZ methods were changed slightly to behave more like their DataPortal.XYZ counterparts, that could be eliminated as well.

 

pelinville:
Updating, however, is a royal pain. Even though you have this object in all its changed glory, you can't just give db4o this "dirty" object and save it. If you do, db4o will simply save a new object. What you have to do is somehow find the current object in db4o and then update the retrieved object with the one you have changed. What this ultimately means is that you either have to keep the criteria object you used to get it in the first place and use it again to find and update the stored object, or you have to have an unchanging ID (I used GUIDs) that you can use to find the stored object.

To update the old object with the new object I used code generation and made sure each object had a Mirror(o as BO) method.  All it did was update the private fields to match the values of the BO passed.

With updating you also have to set the depth into the object graph you want to update.

 

True, in order to update without modifying CSLA to return the instances from the db instead of clones, you would need a unique id (which you are probably going to need anyway for user criteria and perhaps for interaction with other systems like web services). Then in the update, wouldn't you just do the reverse of the fetch: use QBE to get the stored instance, copy the updated values to it and .Set it? Again though, if CSLA had actually given you the stored instance on fetch, you could have just .Set that instance back with no copy.
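The find-by-id, copy, then store sequence can be sketched in Java as follows. `ProjectStore` is an in-memory stand-in invented for this example; it is not db4o's API, and the `mirror` method only echoes the Mirror method pelinville describes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

class Project {
    final UUID id; // unchanging identifier used to find the stored instance
    String name;

    Project(UUID id, String name) {
        this.id = id;
        this.name = name;
    }

    // Copies editable state from another instance onto this one.
    void mirror(Project other) {
        this.name = other.name;
    }
}

// In-memory stand-in for the object store; NOT the db4o API.
class ProjectStore {
    private final Map<UUID, Project> stored = new HashMap<>();

    void insert(Project project) {
        stored.put(project.id, project);
    }

    Project findById(UUID id) {
        return stored.get(id);
    }

    // Finds the stored instance by id, mirrors the edited copy onto it,
    // and returns the stored instance (the one that would be passed to set).
    Project update(Project edited) {
        Project original = stored.get(edited.id);
        if (original == null) {
            throw new IllegalArgumentException("No stored object with id " + edited.id);
        }
        original.mirror(edited);
        return original;
    }
}
```

If the business layer handed back the stored instance itself rather than a clone, the `mirror` step would disappear and the instance could be stored directly.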

 

 

pelinville:
Needless to say you are going to have to modify the dataportal a great deal.

 

I would say you don't, if you create data centric objects that no one uses directly and that basically mirror SQL tables; then all the BOs sit on top of them (just as they do with tables) and DataPortal_XYZ copies between them as it does with ADO datasets and parameters now. I would say that would still buy you fewer layers of abstraction and, by consequence, less CRUD code and some performance too. The real power would be in not hiding those underlying objects, thereby removing yet another layer of abstraction, but as you say that would require some rework of the DataPortal and maybe more in CSLA.

 

I may actually experiment with this and convert ProjectTracker to db4o, first by just doing the "hidden data objects" approach, then by modifying CSLA to work with the persisted instances, and see where it goes and where db4o is at.

ajj3085 replied on Wednesday, March 07, 2007

Justin:
In the post I quoted Rocky was replying to this post: http://forums.lhotka.net/forums/permalink/12361/12663/ShowThread.aspx#12663 which was purely a question about OO logical design not about which persistence technology it would be implemented on.


It looks to me like a response to the problem with OODBMS, if you read the post to which 12663 was a reply.  Keep going up the thread, you'll see the question changed to basically 'what about using db4o to store business objects?'

Justin:
Rocky's response characterizes this as "sexy" but impractical because of performance, and you are somehow interpreting this as an argument against OODBMSs? It seems to me it is describing limitations of current RDBMS technology backing OO designs, but perhaps Rocky would need to clarify.


The side discussion on oodbms makes the thread a bit more complicated to read.  It would appear that Rocky's response goes back to another question, namely 'should my Order object have just a CustomerName field, or to use a CustomerInfo object?'  The answer I think depends on the use case; if the order should just show a customer name typically, then no it doesn't make sense to only have a customer info property on order which holds all the other customer data.  I would think though that if the use case called for that other information always being available then it would make sense.

Justin:
Again, in Rocky's response the only argument against the "sexy" design was performance issues, but even he describes it as "Total reuse", which has a direct impact on maintainability, no?

If you go beyond that, however, and back to the use case, you can see that if the use case dictated that all the other customer information always be available, the 'sexy' design would now be appropriate.  In the use case given, though, it seems only CustomerName is needed, and given that, performance outweighs the reuse "benefits."  Also remember that reuse is coupling, and so reuse must not be the only goal.  Rocky argues it shouldn't be a goal at all, because coupling reduces maintainability.  All of the OO design patterns are meant to increase maintainability by reducing coupling.

Justin:
You're still dodging the question: how does borrowing fields at a SQL level in the Order-Customer example provide better maintainability, or is it only for performance reasons?

In that case the argument is for a performance reason, but I'm sure we can come up with a scenario where maintenance is also a factor. Let's say Customer is for a reporting use case, and the context in which a customer can be created changes. It may be that order can no longer create the customer object because it does not now know enough about the reporting use case to do so (nor should it). Maintainability suffers because you've built order to rely on customer, and now that customer has changed, you're forced to revisit order.

Justin:
I could link many sources that point to data centric designs being a better philosophy, just as you could for behavior centric, but since it seems to be an open debate I would rather hear your arguments as to why behavior centric is always a better solution, especially how replicating the same properties across multiple classes improves maintainability.

My arguments would be the same as anyone else's advocating behavior based design, just as your arguments are the same as those advocating data centric design. You seem to expect me to come up with a new argument supporting behavior based design, yet you've offered no new argument for data based design.

Behavior based design is not 'replicating the same properties across multiple classes.' It's not focusing on the properties at all. I've also already told you why replicating properties is easy and thus has no value.

Every single one of my properties, in every single one of my business objects, can easily be regenerated. I have four code snippets which allow me to do exactly that. I spend hardly any time writing the property code; almost all of my time is spent dealing with business rules, the rest with ORM code.

In data centric design, you have one place to deal with customer data. Changing that customer class has a ripple effect on every class that uses it. Changes to it almost always cause clients of it to break. I did that design; our 'business objects' were no more than entity objects, with some behavior stuffed in. And it did lead to a fragile application. Changes did ripple out and break other functionality. In the end, we did get it to work, but it was ugly, and moving forward, changes again caused massive problems. I knew then that data based design was flawed, but at the time I didn't know the solution.

Since then, I have learned about behavior based design, and it HAS made my applications more resilient to changes.  Database schema changes do break code in certain places, but the fixes are trivial.  The impact on other classes is greatly reduced.

Justin:
So now it has nothing to do with performance? Again, how can you argue that replicating properties leads to less maintenance and has nothing to do with performance?

Behavior based design has never been about performance. That particular design recommendation Rocky made was specific to that use case, and yes, performance was a factor there. Improving performance of an application is not the goal of behavior based design; the goal is improved maintainability.

I never said it was a silver bullet, or that you can't make compromises when other factors outweigh maintainability. I never said it was always appropriate in every situation either. When building business applications, however, it's a very good idea to use behavior based design to achieve good maintainability, because business applications do have a habit of changing quite a bit.

Justin replied on Wednesday, March 07, 2007

ajj3085:
It looks to me like a response to the problem with OODBMS, if you read the post to which 12663 was a reply.  Keep going up the thread, you'll see the question changed to basically 'what about using db4o to store business objects?'


ajj3085:
The side discussion on oodbms makes the thread a bit more complicated to read.  It would appear that Rocky's response goes back to another question, namely 'should my Order object have just a CustomerName field, or to use a CustomerInfo object?'  The answer I think depends on the use case; if the order should just show a customer name typically, then no it doesn't make sense to only have a customer info property on order which holds all the other customer data.  I would think though that if the use case called for that other information always being available then it would make sense.

So I am confused here. Are you saying Rocky's response IS about problems using an OODBMS? Yet right afterward you are saying it goes back to 'should my Order object have just a CustomerName field, or to use a CustomerInfo object?' and is not about OODBMS problems. Which is it? Yes, I know the thread got into a discussion of OODBMSs, but that response came before that occurred and made no mention of OODBMSs.

You would think you would always need a "CustomerInfo" object with an ID and Name, unless you didn't care about uniquely identifying a Customer on the order in cases of non-unique customer names. Of course my argument is: why then use a CustomerInfo and not just the full Customer object? The only argument against that thus far is performance, which is usually the driving reason for even creating XxxInfo objects.

ajj3085:
If you go beyond that, however, and back to the use case, you can see that if the use case dictated that all the other customer information always be available, the 'sexy' design would now be appropriate.  In the use case given, though, it seems only CustomerName is needed, and given that, performance outweighs the reuse "benefits."  Also remember that reuse is coupling, and so reuse must not be the only goal.  Rocky argues it shouldn't be a goal at all, because coupling reduces maintainability.  All of the OO design patterns are meant to increase maintainability by reducing coupling.

So is performance an advantage of behavior centric designs over data centric designs, yes or no? How many comparisons can you find between the two that do not mention performance?

ajj3085:
In that case the argument is for a performance reason, but I'm sure we can come up with a scenario where maintenance is also a factor. Let's say Customer is for a reporting use case, and the context in which a customer can be created changes. It may be that order can no longer create the customer object because it does not now know enough about the reporting use case to do so (nor should it). Maintainability suffers because you've built order to rely on customer, and now that customer has changed, you're forced to revisit order.

Please be more detailed: in what situation is a Customer not exactly a Customer? Why would you want to duplicate some of the Customer's properties on a Customer report object separate from an Order.Customer object, if not purely for performance reasons? Is it because I can't edit a Customer object on a report? Then I guess we should make two different SmartDates too, because you're not allowed to edit an Order date on a report.

ajj3085:
My arguments would be the same as anyone else's advocating behavior based design, just as your arguments are the same as those advocating data centric design. You seem to expect me to come up with a new argument supporting behavior based design, yet you've offered no new argument for data based design.

No, my point in this thread was that db4o could potentially eliminate ONE criterion for choosing data vs behavior centric, which is the performance implication of reusing full data objects instead of use-case-specific composite or abbreviated objects. Yet you seem to hold steadfast that performance is NEVER a consideration or reason, even though Rocky's reply gave it as the main reason in that use case?

ajj3085:
Behavior based design is not 'replicating the same properties across multiple classes.' It's not focusing on the properties at all. I've also already told you why replicating properties is easy and thus has no value.

Every single one of my properties, in every single one of my business objects, can easily be regenerated. I have four code snippets which allow me to do exactly that. I spend hardly any time writing the property code; almost all of my time is spent dealing with business rules, the rest with ORM code.

In data centric design, you have one place to deal with customer data.  Changing that customer class has a ripple effect on every class that uses it.  Changes to it almost always cause clients of it to break.  I did that design; our 'business objects' were no more than entity objects, which some behavior stuffed in.  And it did lead to a fragile application.  Changes did ripple out and break other functionality.  In the end, we did get it to work, but it was ugly and moving forward changes again caused massive problems.  I knew then that data based design was flawed, but at the time I didn't know the solution.

Since then, I have learned about behavior based design, and it HAS made my applications more resilent to changes.  Database schema changes do break code in certain places, but the fixes are trivial.  The impact on other classes is greatly reduced.

Just as data centric design can lead to coupling complexity, behavior centric design can lead to denormalization of entities and less reuse, which also adds complexity. Unlike you, I believe there is a place for both, and the reasons for choosing one over the other are not completely decoupled from current persistence technology.

I would argue that it is a good thing that changes in a Customer schema can break consumers of the Customer if they referenced something that has changed. All consumers of a Customer should know how to use a Customer properly; otherwise they really didn't want to use a Customer in the first place, or you added something to Customer that wasn't really part of a Customer.

Of course it is easy for you to change your properties. It sounds as though you have decomposed all the BL of, say, a Customer, separating all its data from the enforcement of the state of that data, basically going back to the good old days of the pre-OO era when data and code were two different things in two different places. Again, functional code will enforce this design for you without the OO overhead you are not using.

ajj3085:
Behavior based design has never been about performance.  That particular design recommendation Rocky made was specific to that use case and yes performance was a factor there.  Improving performance of an application is not the goal of behavior based design; the goal is improved maintainability. 

I never said it was a silver bullet, and that sometimes you can't make compromises when other factors outweigh maintainability.  I never said it was always appropriate in every situation either.  When building business applications however, it's a very good idea to use behavior based design to achieve good maintainability, because business applications do have a habit of changing quite a bit.

Wait, again you are contradicting yourself: it is never about performance, yet in this situation it was a factor? Is it a factor in some cases or is it not? If it is, then the point I was trying to make here is valid: db4o's transparent activation could potentially eliminate that as a factor in some cases, yes?

I completely disagree that any type of design is not about both performance and maintainability. If your behavior centric design is easily maintainable but unusably slow, you have failed to make a good design. This is BTW why many data centric designs have failed: beautiful to look at but unusable in the real world.

In case you didn't know, the whole reason why activation levels and transparent activation are even features of db4o is to address the performance considerations of using object graphs instead of tables and join syntax.

Justin

ajj3085 replied on Wednesday, March 07, 2007

Justin:
So I am confused here are you saying Rocky's response IS about problems using an OODBMS, yet right afterward you are saying it goes back to 'should my Order object have just a CustomerName field, or to use a CustomerInfo object?' and not about an OODBMS problems which is it? Yes I know the thread got into discussion on OODBMS's but that response was before that occurred and had no mention of OODBMS's.


Check your links; they go to two different posts.  The latter post seems to move out of the oodbms discussion, the former most certainly is discussing oodbms.

Justin:
You would think you would always need a "CustomerInfo" object with an ID and Name, unless you didn't care about uniquely identifying a Customer on the order in cases of non unique customer names. Of course my argument is why then use a CustomerInfo and not just the full Customer object, and the only argument against that thus far is performance considerations which is usually the driving reason for even creating XxxInfo objects.

Why would you always need a customer info object, as long as the order knows the unique id for the customer?   There's no requirement at all that the Order object contain any other customer information.

The info objects aren't created for performance reasons.  If loading the same data in both Customer and CustomerInfo, I doubt that the actual load from the db is much different in either.  Instead, the info objects exist because the editing behaviors of the BusinessBase subclasses aren't needed.  Take INotifyPropertyChanged; it defines a behavior which is only used on BusinessBase, because readonly objects don't need such behavior.
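
A minimal Python sketch of that distinction, not CSLA code: `CustomerInfo`, `Customer` and `on_property_changed` are hypothetical names chosen for illustration, and the listener list only stands in for the role INotifyPropertyChanged plays on an editable object.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class CustomerInfo:
    """Read-only view: no setters, no dirty tracking, no change notification."""
    id: int
    name: str


class Customer:
    """Editable object: tracks dirty state and notifies listeners on change."""

    def __init__(self, id, name):
        self.id = id
        self._name = name
        self.is_dirty = False
        self._listeners = []

    def on_property_changed(self, listener):
        # Hypothetical hook, playing the part of a PropertyChanged event.
        self._listeners.append(listener)

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if value != self._name:
            self._name = value
            self.is_dirty = True
            for listener in self._listeners:
                listener("name")


changed = []
customer = Customer(1, "Acme")
customer.on_property_changed(changed.append)
customer.name = "Acme Ltd"          # editable: marks dirty and notifies

info = CustomerInfo(1, "Acme Ltd")
try:
    info.name = "Other"             # read-only: assignment is rejected
    info_was_editable = True
except FrozenInstanceError:
    info_was_editable = False
```

Both classes could be loaded from the same row; what differs is the behavior each one carries.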

Justin:
So is performance an advantage of behavior centric designs over data centric designs, yes or no? How many comparisons can you find between the two that do not mention performance?

Your first question is absurd.  Following one vs. the other won't yield results where one design always leads to more performant code.  I'm sure some arguments are out there that will try to tell you that, but those arguments would be absurd as well.

Justin:
Please be more specific: in what situation is a Customer not exactly a Customer? Why would you want to duplicate some of the Customer's properties on a separate Customer report object, rather than use an Order.Customer object, if not purely for performance reasons?

I have a case now where I have two objects which populate the same tables, with one exception.  I have a business object for Contact.  Only sales people may use that object, and they can create, modify and merge contacts.  A first and last name is required as well as a company / department.

I also have a SoftwareRegistration.  The data is the same; contact details, including first and last name, department and address.  Only SoftwareRegistration users may create such an object; loading / editing is not allowed.  The registration links a software serial number to a contact.  That contact may not exist at all, and the SoftwareRegistration object (plus a few others) allows registration staff to create a contact if it doesn't exist, or to add an address to a contact if they find the user already exists, but the address on the registration form is not one of the contact's addresses.

Justin:
Is it because I can't edit a Customer object on a report? Then I guess we should make two different smartdates too, because you're not allowed to edit an Order date on a report.

Yup, no need to use a customer object, which has all the code for editing, when you only want to report on a customer.  Why would you need two smartdates?  You'd just make the property readonly if you wanted to prevent edits.  Not every property on an editable object is necessarily editable.

Justin:
No, my point in this thread was that db4o could potentially eliminate ONE criterion for choosing data vs behavior centric design: the performance implication of reusing full data objects instead of use case specific composite or abbreviated objects.

I don't think so; db4o could eliminate the need for ORM when going between the data layer and business layer, but I still think you'd design your business layer based on behaviors (use cases).

Justin:
Yet you seem to keep steadfast that performance is NEVER a consideration or reason, yet Rocky's reply  gave it as main reason in that use case?

Where did I say never?  Performance considerations may lead you to alter your design somewhat, but this would usually be done after the initial design proves to be inefficient.  Technically CustomerName might not even belong on Order, since that object doesn't need it for any behaviors which Order would implement.  You may add it though as a convenience, so that if you need to show the CustomerName on the screen you don't need to load a full blown customer object. 

Justin:
Just as data centric design can lead to coupling complexity, behavior centric design can lead to denormalization of entities and less reuse, which also adds complexity. Unlike you, I believe there is a place for both, and the reasons for choosing one over the other are not completely decoupled from current persistence technology.

I've said before there are certainly places for data centric designs.  You really don't know what I believe, and I think you should refrain from attempting to tell me what I do or do not believe.  More classes does not necessarily mean more complexity; a few small simple objects can be simpler than one big, complex object.  That bigger object is also typically harder to maintain.

Justin:
I would argue that it is a good thing that Changes in a Customer schema could result in consumers of the customer breaking if they referenced something that has changed, since all consumers of a Customer should know how to use a Customer properly otherwise they really didn't want to use a Customer in the first place or you added something to Customer that wasn't really part of a Customer.

I'm not sure that Order should break because of some database changes that have to do with customers.  Order should still be able to perform its tasks.  If order starts worrying about customer specific behaviors it will break, and you'll need to change it.  A better solution is that Order doesn't worry about those.. less code breaking means less chances of introducing new bugs due to change.

Justin:
Of course it is easy for you to change your properties. It sounds as though you have decomposed all the BL of, say, a Customer, separating all its data from the enforcement of the state of that data, basically going back to the good old days of the pre-OO era when data and code were two different things in two different places. Again, functional code will enforce this design for you without the OO overhead you are not using.

Are you for real?  The data is part of the object's state; the data is kept with the rules, that's kinda the point.  How do you go from being able to code gen property declarations to keeping the data in a whole separate object?

Justin:
Wait, again you are contradicting yourself: it is never about performance, yet in this situation it was a factor? Is it a factor in some cases or is it not? If it is, then the point I was trying to make here is valid: db4o's transparent activation could potentially eliminate that as a factor in some cases, yes?

Again, I never said performance is never an issue.  It's just not the driving force behind the design.  In this situation, there is some need for a CustomerName property, and rather than load all customer details, just the name is retrieved and added to the Order object for the sake of convenience.  It probably wouldn't be part of an initial design, because Order has no need for the CustomerName field at all; the order would likely only need to know the Customer's id (and may or may not expose it, depending on the use case).  It's likely that the CustomerName does need to be displayed though, and is added for the sake of convenience and performance.
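
To make the convenience-field idea concrete, here is a rough sketch using Python's built-in sqlite3 module. The table layout, the `Order` class and `fetch_order` are all assumptions for illustration, not anyone's actual schema; the point is only that the join brings back the customer's name and nothing else from the customer row.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Order:
    id: int
    customer_id: int      # all Order needs for its own behavior
    customer_name: str    # display-only convenience, filled by the join


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id)
    );
    INSERT INTO customer VALUES (1, 'Acme Ltd', '1 Main St');
    INSERT INTO orders VALUES (1001, 1);
""")


def fetch_order(order_id):
    # Join in only the one customer column the screen needs.
    row = conn.execute(
        "SELECT o.id, o.customer_id, c.name "
        "FROM orders o JOIN customer c ON c.id = o.customer_id "
        "WHERE o.id = ?",
        (order_id,),
    ).fetchone()
    return Order(*row)


order = fetch_order(1001)
```

The address never leaves the database here, which is the performance half of the trade; the cost is one more column list to maintain by hand.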

Justin:
I completely disagree that any type of design is not about both performance and maintainability. If your behavior centric design is easily maintainable but unusably slow, you have failed to make a good design. This is BTW why many data centric designs have failed: beautiful to look at but unusable in the real world.

Performance is usually taken into account after an initial design (regardless of how you came to that design).  Performance is one of those things that's best addressed if it is a problem.  Object design is a process; of course it may need to be modified if it doesn't perform well.

Justin:
In case you didn't know the whole reason why activation levels and transparent activation are even features of db4o is to address the performance considerations of using object graphs instead of tables and join syntax.

I honestly don't know much about db4o; if it works as advertised, I don't think it will have an impact on what method to use to design business objects, I think it would make the persistence part of the business objects easier because it would eliminate the need for ORM.

DansDreams replied on Wednesday, March 07, 2007

ajj3085:

Justin:
In case you didn't know the whole reason why activation levels and transparent activation are even features of db4o is to address the performance considerations of using object graphs instead of tables and join syntax.

I honestly don't know much about db4o; if it works as advertised, I don't think it will have an impact on what method to use to design business objects, I think it would make the persistence part of the business objects easier because it would eliminate the need for ORM.

Justin, I think by "performance considerations of using object graphs" you would have to be talking about the inherent horrible inefficiencies of loading and persisting way more data than was necessary that previously plagued products like this.  This isn't an advantage of  this product, it's mitigation of a previous disadvantage.  We're on the same page with that, right?

I think it's probably mostly silliness to compare the two technologies in terms of performance.  The db4o I looked at a couple years ago was drastically inferior to even SQL 6.5 in several key issues regarding data integrity and performance.  Maybe they've "caught up" to SQL 2000 or 2005, maybe they haven't.

But I agree with ajj that it's not usually wise to have performance as a top initial consideration anyway unless there is a drastic difference at no additional cost, because it almost always will lead to only Premature Optimization.  And I think honestly you'd be hard pressed to make much of an argument of one technology being superior to the other as a general rule without extensive testing of a hundred different scenarios.

I agree the big question is really whether it makes the code significantly easier to write and maintain.  I think you could make a good argument that the elimination of ORM for CRUD functionality holds great promise for accomplishing that, but if you have to jump through hoops for concurrency, reporting, XML transformation, or whatever else that may be provided by a more mature relational DB, maybe it's not worth it. 

And in Orcas we'll have LINQ against SQL Server... what does that do to the balance even for the coding concerns?

Justin replied on Wednesday, March 07, 2007

DansDreams:

Justin, I think by "performance considerations of using object graphs" you would have to be talking about the inherent horrible inefficiencies of loading and persisting way more data than was necessary that previously plagued products like this.  This isn't an advantage of  this product, it's mitigation of a previous disadvantage.  We're on the same page with that, right?

I think it's probably mostly silliness to compare the two technologies in terms of performance.  The db4o I looked at a couple years ago was drastically inferior to even SQL 6.5 in several key issues regarding data integrity and performance.  Maybe they've "caught up" to SQL 2000 or 2005, maybe they haven't.

But I agree with ajj that it's not usually wise to have performance as a top initial consideration anyway unless there is a drastic difference at no additional cost, because it almost always will lead to only Premature Optimization.  And I think honestly you'd be hard pressed to make much of an argument of one technology being superior to the other as a general rule without extensive testing of a hundred different scenarios.

I agree the big question is really whether it makes the code significantly easier to write and maintain.  I think you could make a good argument that the elimination of ORM for CRUD functionality holds great promise for accomplishing that, but if you have to jump through hoops for concurrency, reporting, XML transformation, or whatever else that may be provided by a more mature relational DB, maybe it's not worth it. 

And in Orcas we'll have LINQ against SQL Server... what does that do to the balance even for the coding concerns?

I was referring to the inefficiencies of trying to persist an object graph to disk with previous products. I am not sure what you're getting at with "mitigation of a previous disadvantage", other than it being just another way to say they solved the problem.

Why are these issues not a problem with an in-memory object model? They simply approached the problem with a similar solution: they don't load the data from memory until it's accessed, but this memory happens to be disk based instead of volatile RAM. Does it work as advertised? I don't know, but the design is sound.

As far as persisting too much data, well, that's entirely up to your object model; maybe a more specific example could help me understand where you're coming from on that.

They seem to have improved performance overall dramatically, including the fact that they have a query optimizer now. Some operations they have shown to be many times faster than a traditional db, but I doubt they are faster in all areas of course.

I feel I could have some use for this today to replace things like SQL Everywhere Edition or Jet, and perhaps full SQL in the future.

Justin 

 

DansDreams replied on Thursday, March 08, 2007

Justin:

I was referring to the inefficiencies of trying to persist an object graph to disk with previous products. I am not sure what you're getting at with "mitigation of a previous disadvantage", other than it being just another way to say they solved the problem.

Why are these issues not a problem with an in-memory object model? They simply approached the problem with a similar solution: they don't load the data from memory until it's accessed, but this memory happens to be disk based instead of volatile RAM. Does it work as advertised? I don't know, but the design is sound.

We're saying the same thing really.  My point was just that reading their material I would imagine you could come to the conclusion that they've come up with some radical improvement in performance that makes the product superior to the alternative method we've all been using.  In reality, it may be simpler to code, but it's likely no more performant.

Do you know how the "until it's accessed" idea works?  Say I load an Order and the related Customer, but db4o is smart enough to know I really only need Customer.Name and only loads that from the database.  Now my UI allows the user to edit the Customer from the order screen, so I need the Customer more fully hydrated.  When and how does db4o do that?  Does it hydrate the existing object more fully or do I have to retrieve another reference to the same Customer?

That seems to have really powerful potential, but there are a lot of potential "gotchas".

Justin replied on Thursday, March 08, 2007

DansDreams:

We're saying the same thing really.  My point was just that reading their material I would imagine you could come to the conclusion that they've come up with some radical improvement in performance that makes the product superior to the alternative method we've all been using.  In reality, it may be simpler to code, but it's likely no more performant.

Do you know how the "until it's accessed" idea works?  Say I load an Order and the related Customer, but db4o is smart enough to know I really only need Customer.Name and only loads that from the database.  Now my UI allows the user to edit the Customer from the order screen, so I need the Customer more fully hydrated.  When and how does db4o do that?  Does it hydrate the existing object more fully or do I have to retrieve another reference to the same Customer?

That seems to have really powerful potential, but there are a lot of potential "gotchas".

The radical performance improvements they claim in certain queries have more to do with OODBs being inherently better at certain operations, usually ones that deal with deep hierarchies (this is nothing new and is why OODBs have been used for many years for certain applications).

They have documentation on Transparent Activation, which is the "until it's accessed" idea. It seems it is currently NOT implemented as I had thought, but is still in the design phase. From what I can tell, they would actually instrument our objects (like a profiler would), injecting their own code into the field getters, so that instead of the get trying to retrieve the field value from a memory location (since it's not there yet), it is redirected to db4o to retrieve the value from its datastore; once loaded, the value would come from the heap.

So until that is implemented you have to set the activation depth yourself, such that an Order object would be retrieved but its Customer reference would not be activated. You would need to explicitly call activate on it to get the member data.

So until that is actually implemented you could still see a performance disadvantage with using a full Customer object reference. Time will tell if they can actually implement it and eliminate that as a problem; you can get around it now with about as much work as doing a SQL join and loading into a composite behavior based object (Order.CustomerInfo or just Order.CustomerName).
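
A rough Python sketch of the difference between explicit and transparent activation. This is not db4o's API: `Store`, `activate` and the classes below are hypothetical stand-ins, and the property getter only imitates what instrumented field access might do.

```python
class Store:
    """Hypothetical stand-in for an object store; counts how often it is read."""

    def __init__(self, records):
        self._records = records
        self.reads = 0

    def load(self, object_id):
        self.reads += 1
        return dict(self._records[object_id])


class Customer:
    """Starts as an unactivated stub: it knows its id but holds no field data."""

    def __init__(self, store, object_id):
        self._store = store
        self._id = object_id
        self._fields = None  # None means "not activated yet"

    def activate(self):
        # Explicit activation: the call you would have to make by hand.
        if self._fields is None:
            self._fields = self._store.load(self._id)

    @property
    def name(self):
        # Imitation of transparent activation: the getter loads on first access.
        self.activate()
        return self._fields["name"]


class Order:
    def __init__(self, number, customer):
        self.number = number
        self.customer = customer  # the stub is later filled in place


store = Store({1: {"name": "Acme Ltd"}})
order = Order(1001, Customer(store, 1))
reads_before = store.reads          # nothing loaded for the customer yet
first_name = order.customer.name    # first access hits the store
second_name = order.customer.name   # second access is served from memory
```

In this sketch the existing Customer instance is filled in place, so the Order keeps the same reference. Whether db4o hydrates the same instance or hands back a new one is something its documentation would have to confirm.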

Justin

 

Justin replied on Wednesday, March 07, 2007


http://forums.lhotka.net/forums/permalink/12361/12671/ShowThread.aspx#12671 in reply to http://forums.lhotka.net/forums/permalink/12361/12663/ShowThread.aspx#12663, neither of which mentions a database technology, but instead what would happen if you had an Order class with a property that was a reference to a Customer class vs borrowing some Customer properties and putting them on the order. I guess we will have to disagree, and Rocky and Richard will need to clarify what they meant, although it is hard to see how Rocky's response is not directed at the class design based on how the objects will have to be loaded from the datastore, regardless of whether it is an OODBMS or RDBMS. Either way Rocky is pretty clear that it is a poor choice for performance reasons alone, and since that would hold true on an RDBMS, how could that be an argument for an RDBMS and against an OODBMS?

ajj3085:
Why would you always need a customer info object, as long as the order knows the unique id for the customer?   There's no requirement at all that the Order object contain any other customer information.

Why would you put a Customer ID property on an Order? That is a property of a Customer, not an Order. Shouldn't the Order instead have a property that is a reference to a Customer instance? A customer places an order, not a customer id; the id is only a way for users to uniquely identify the customer and, in the case of an RDBMS, a way for it to replicate an object reference. Do you generate surrogate keys in your databases? Why do you think so many use identity columns and GUIDs as PKs? Because RDBMSs don't support instance references directly, you have to do that yourself by coding a join and hand coding which columns of the customer should be joined in for performance reasons; otherwise you would just do select *, no?
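
The contrast can be sketched in a few lines of Python. Everything here is hypothetical (`OrderByKey`, `OrderByReference`, and the dictionary playing the role of a customer table); it only illustrates a key that must be resolved by lookup versus a direct instance reference.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str


@dataclass
class OrderByKey:
    number: int
    customer_id: int    # surrogate key, resolved with a lookup (the "join")


@dataclass
class OrderByReference:
    number: int
    customer: Customer  # direct instance reference, no key needed


# Stand-in for a customer table keyed by a surrogate id.
customers_by_id = {7: Customer("Acme Ltd")}

keyed = OrderByKey(1001, customer_id=7)
keyed_name = customers_by_id[keyed.customer_id].name

referenced = OrderByReference(1002, customer=customers_by_id[7])
referenced_name = referenced.customer.name
same_instance = referenced.customer is customers_by_id[7]
```

With the reference, the Order never mentions an id at all; with the key, every consumer has to know where and how to resolve it.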

ajj3085:
The info objects aren't created for performance reasons.  If loading the same data in both Customer and CustomerInfo, I doubt that the actual load from the db is much different in either.  Instead, the info objects exist because the editing behaviors of the BusinessBase subclasses aren't needed.  Take INotifyPropertyChanged; it defines a behavior which is only used on BusinessBase, because readonly objects don't need such behavior.

If they aren't created for performance reasons, then you could put all the Customer properties on the CustomerInfo and this would be a good design? If I used the full customer object, how would this affect the code that displays and manipulates the Order in any way whatsoever except performance? How would that affect maintenance in any way? Why wouldn't I want INotifyPropertyChanged on a read only object if the underlying datastore could notify me when the data changed in the persistent store while I had it loaded (maybe another user edited it)? Why would you remove it, if only to save on the overhead of it being difficult to implement on current RDBMSs?

ajj3085:
Your first question is absurd.  Following one vs. the other won't yield results where one design always leads to more performant code.  I'm sure some arguments are out there that will try to tell you that, but those arguments would be absurd as well.

Why is it absurd, or do you simply want to keep dodging it? I never said it would ALWAYS lead to more performance, only that it is one factor that is a reality when trying to implement a data centric design, and Rocky seemed to express that as well.

ajj3085:
I have a case now where I have two objects which populate the same tables, with one exception.  I have a business object for Contact.  Only sales people may use that object, and they can create, modify and merge contacts.  A first and last name is required as well as a company / department.

I also have a SoftwareRegistration.  The data is the same; contact details, including first and last name, department and address.  Only SoftwareRegistration users may create such an object; loading / editing is not allowed.  The registration links a software serial number to a contact.  That contact may not exist at all, and the SoftwareRegistration object (plus a few others) allows registration staff to create a contact if it doesn't exist, or to add an address to a contact if they find the user already exists, but the address on the registration form is not one of the contact's addresses.

OK, so if I interpret correctly, SoftwareRegistration is equivalent to an Order and Contact is equivalent to a Customer? So why aren't you reusing the Contact? Whether editing is allowed is a matter of security based on user context, not of which object references the Contact instance. What happens if you need to fix a Contact on a SoftwareRegistration because of a misspelled name: do the sales staff have to find it by name, open it and edit it? If so, why can't they open the SoftwareRegistration and edit the referenced Contact there? Is address a separate object with its own table, or are the address fields denormalized into both the SoftwareRegistration and Contact objects and tables?

ajj3085:
Yup, no need to use a customer object, which has all the code for editing, when you only want to report on a customer.  Why would you need two smartdates?  You'd just make the property readonly if you wanted to prevent edits.  Not every property on an editable object is necessarily editable.

What exactly does it change in the report if the code for editing is there but unused? If smartdate wasn't a struct, making the property read only would not prevent it from being edited, only from having its reference changed. I could make my Customer a struct too; then I wouldn't need a whole different class just to prevent edits.

ajj3085:
I don't think so; db4o could eliminate the need for ORM when going between the data layer and business layer, but I still think you'd design your business layer based on behaviors (use cases).

Well, that's your opinion, but I still haven't heard a specific argument against the Order-Customer scenario that didn't involve performance. You still haven't given any specifics on how using a full editable Customer object reference on an Order object has any maintenance disadvantages.

ajj3085:
Where did I say never?  Performance considerations may lead you to alter your design somewhat, but this would usually be done after the initial design proves to be inefficient.  Technically CustomerName might not even belong on Order, since that object doesn't need it for any behaviors which Order would implement.  You may add it though as a convenience, so that if you need to show the CustomerName on the screen you don't need to load a full blown customer object.
 

Here:

ajj3085:
Behavior based design has never been about performance.

So why would you add CustomerName to an Order? Oh, it's convenience but not performance? To me it is more convenient to add a reference to a Customer object to an Order than to copy a property from a Customer and add SQL code to join it from the Customer table, but it is less performant using an RDBMS.

ajj3085:
I've said before there are certainly places for data centric designs.  You really don't know what I believe, and I think you should refrain from attempting to tell me what I do or do not believe.  More classes does not necessarily mean more complexity; a few small simple objects can be simpler than one big, complex object.  That bigger object is also typically harder to maintain.

Sorry to assume too much; I should have phrased it differently perhaps, but when you said 

ajj3085:
Regardless of your datastore, you'd still want to avoid data centric designs for your business objects.

I took it to mean that data centric designs have no merit in your opinion.

ajj3085:
I'm not sure that Order should break because of some database changes that have to do with customers.  Order should still be able to perform its tasks.  If order starts worrying about customer specific behaviors it will break, and you'll need to change it.  A better solution is that Order doesn't worry about those.. less code breaking means less chances of introducing new bugs due to change.

It should break if a new requirement is added to a customer, such as last name and company/dept being required when you place the order. I guess in your case you would now have to add the required rules to the Order object too, unless Customers don't really always have those requirements; then is it really a Customer, or in fact two different entities?

ajj3085:
Are you for real?  The data is part of the object's state; the data is kept with the rules, that's kinda the point.  How do you go from being able to code gen property declarations to keeping the data in a whole separate object?

Putting a CustomerName property on an Order object is separating data from rules: any rules the CustomerName must follow can no longer be contained in just the Customer class; they must be copied or perhaps put in a third class. If you gen your properties from a base schema (the db table?) then you have obviously separated the rules that apply to those properties from the properties.
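
A small Python sketch of the "third class" option, with a plain function standing in for it. The rule itself (a name must be non-blank) and all the names are made up for illustration.

```python
def validate_customer_name(name):
    """Hypothetical shared rule: a customer name must be non-blank."""
    return bool(name and name.strip())


class Customer:
    def __init__(self, name):
        self.name = name

    def is_valid(self):
        return validate_customer_name(self.name)


class Order:
    def __init__(self, number, customer_name):
        self.number = number
        self.customer_name = customer_name  # copied from Customer

    def is_valid(self):
        # Order has to call the same rule, or its copy can drift from Customer's.
        return self.number > 0 and validate_customer_name(self.customer_name)


valid_order = Order(1001, "Acme Ltd").is_valid()
blank_order = Order(1002, "   ").is_valid()
```

If Order held a reference to the Customer instead, the rule would only ever need to be reachable from one class.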

ajj3085:
Again, I never said performance is never an issue.  It's just not the driving force behind the design.  In this situation, there is some need for a CustomerName property, and rather than load all customer details, just the name is retrieved and added to the Order object for the sake of convenience.  It probably wouldn't be part of an initial design, because Order has no need for the CustomerName field at all; the order would likely only need to know the Customer's id (and may or may not expose it, depending on the use case).  It's likely that the CustomerName does need to be displayed though, and is added for the sake of convenience and performance.

You came in this thread with:

ajj3085:
Performance and scalability issues haven't been listed as reasons to stay away from data centric designs.. at least not here.

Does "haven't" not mean never in the context of "here" (this forum)? Yet performance and scalability have been listed, and you still do not concede they are considerations for a design?

ajj3085:
Performance is usually taken into account after an initial design (regardless of how you came to that design).  Performance is one of those things that's best addressed if it is a problem.  Object design is a process; of course it may need to be modified if it doesn't perform well.

I again disagree: performance must be taken into account from the beginning; it is an integral part of a good design. In fact, wouldn't you say a successful design is one that has a balance between performance and ease of maintenance? If I code my business apps in pure assembler, it will probably be faster than the .NET version, but the cost in maintainability would be too great. The reason performance may not be a problem for us is the years of accumulated knowledge we all enjoy from our predecessors developing fast and easy to maintain code abstractions and patterns.

ajj3085:
I honestly don't know much about db4o; if it works as advertised, I don't think it will have an impact on what method to use to design business objects, I think it would make the persistence part of the business objects easier because it would eliminate the need for ORM.

Well, it also eliminates the need for surrogate PKs and FKs (there goes the Order.CustomerID property) and potentially the performance issues that come with trying to use full data centric objects, which is where ORMs fall down. You should look into it; check out the posts by the founder, Carl Rosenberger, on comp.databases.theory. The relational guys don't like it too much and have their own arguments.

Justin

Copyright (c) Marimer LLC