OT: Design question for db

OT: Design question for db

Old forum URL: forums.lhotka.net/forums/t/907.aspx


ajj3085 posted on Tuesday, August 15, 2006

Hi,

I have a table which will store different kinds of documents (which share much the same data).  Invoices, quotes, RMAs, and orders.  Its likely this list won't change.

So, should I have a simple char column, Q for quote, etc.  An int (which the BO knows)?  In the business layer, there doesn't seem a need to expose publicly the document type identifier... but at the same time if feels odd to hard code some values (although on the other hand they really shouldn't change ever).

Any comments?
Thanks
Andy

david.wendelken replied on Tuesday, August 15, 2006

Well, you're going to have to hard-wire something somewhere... :)

For something this simple, you might as well make it simple!

You might want to consider using an enumeration to define/store/document the codes.

(At least for non-database code.)

That gives you one stop shopping on defining what code means what.  The enum label.ToString() can be used to pass to the database, and doing it this way gives you a compile-time error instead of a run--time error if you mis-remember the code somewhere.

 

 

ajj3085 replied on Wednesday, August 16, 2006

Yes, I'd likely have an internal enum or something like that.

Thanks for the opinion.

JHurrell replied on Wednesday, August 16, 2006

My vote would be for an int column that specifies the ID of the document type with a foreign key to a document types table.

I think you should keep it in the database this way instead of using an enum in the BO.

The precedent for this has already been set by using a Roles table. We don't have any trouble hard coding things like IsInRole("Administrator") so I think that hard coding logic that identifies the type of document is okay.

- John


ajj3085 replied on Wednesday, August 16, 2006

John,

Normally I'd agree with you.  The role list can change over time, and thus its better to store said list in the db. 

In this particular case though, I'd be hard pressed to say that I think the list will ever change.  Document will be quote, order, invoice, rma, and credit memo.  That's it.   So my question, why have a DocumentType table in the db at all?

JHurrell replied on Wednesday, August 16, 2006

I can think of a couple reasons.

First off, while the document types will most likely never change, the things that you might want to do with those types might change. You may decide later to offer reporting or advanced modeling or forecasting. If you use a DocumentType table, you can better leverage this because you have a link between the documents and their type.

If you go down the enum route, you instead have a column or columns that identify the type. Those column(s) are really properties of a document.

This ties into the second reason in that it is possible that those enum type columns could be changed to an invalid type. While I agree it's not likely, let's say that a type is changed from "quote" to "abc". It's no longer really a quote. It's not anything. While your BOs should protect from this kind of change, they cannot protect against direct changes in the database.

If instead, you had a type column, the DocumentType parent table will not let you change a type to one that doesn't exist. It's just another level of control and security.

- John

SonOfPirate replied on Wednesday, August 16, 2006

I agree with John.  Having the foreign key constraint adds another level of protection to ensure data integrity.

Something else to consider.  Even though you feel that the list will change infrequently, if ever, think about what would be required to make such a change if you used the enumeration approach.  Re-compile and re-deploy!  Whereas, having the types in a lookup table means all you have to change is the database with no collateral changes to the application itself.

One other point that tends to come up in discussions like this - if you create a DocumentTypes table and go that route, do NOT create an enum to handle the coding requirements as this undermines the whole point.  That may seem obvious to most of us but I can't tell you how many times I've seen that done.

Hope that helps...

 

ajj3085 replied on Wednesday, August 16, 2006

SonOfPirate:
I agree with John.  Having the foreign key constraint adds another level of protection to ensure data integrity.


I could use Rules / UDTs in Sql server to the same effect.

SonOfPirate:
Something else to consider.  Even though you feel that the list will change infrequently, if ever, think about what would be required to make such a change if you used the enumeration approach.  Re-compile and re-deploy!  Whereas, having the types in a lookup table means all you have to change is the database with no collateral changes to the application itself.

If the list changes, I'll have to recompile anyway.  Invoice, Quote, etc. are all concrete classes in my business layer as well.  For roles I find this reasoning acceptable; add a new row, the existing system can handle it since you're just checking one piece of data against another.  But a new document type would almost certainly have new behavior, which cannot be acomplished just by dropping a row in the database.

SonOfPirate:
One other point that tends to come up in discussions like this - if you create a DocumentTypes table and go that route, do NOT create an enum to handle the coding requirements as this undermines the whole point.  That may seem obvious to most of us but I can't tell you how many times I've seen that done.

Agreed.  If I have a table, there's no need for the enum.  If i don't have the table, I have a need for an enum.. (although maybe not..)

Just to be clear, I'm not decided yet, just playing devils advocate to get some good reasons one way or another.

Thanks!
Andy

ajj3085 replied on Wednesday, August 16, 2006

JHurrell:
First off, while the document types will most likely never change, the things that you might want to do with those types might change. You may decide later to offer reporting or advanced modeling or forecasting. If you use a DocumentType table, you can better leverage this because you have a link between the documents and their type.


Not sure I follow; you'd be able to tell looking at the column what the type of document is.

JHurrell:
If you go down the enum route, you instead have a column or columns that identify the type. Those column(s) are really properties of a document.


How is this fundamentally different than having a DocumentTypeId column that identifies the document type?

JHurrell:
This ties into the second reason in that it is possible that those enum type columns could be changed to an invalid type. While I agree it's not likely, let's say that a type is changed from "quote" to "abc". It's no longer really a quote. It's not anything. While your BOs should protect from this kind of change, they cannot protect against direct changes in the database.


This is a valid point; I could counter by creating a UDT that only accepts the predefined values.

JHurrell:
If instead, you had a type column, the DocumentType parent table will not let you change a type to one that doesn't exist. It's just another level of control and security.


Hmmm. yes and no.  If you can change the type id, you can probably just insert a new row in DocumentType.  Also, the UDT could prevent invalid values as well.

JHurrell replied on Thursday, August 17, 2006

ajj3085:
Not sure I follow; you'd be able to tell looking at the column what the type of document is.


What I meant by that is there is nothing that would prevent a person from directly editing the database column and entering "abc" or any other text. Since there's no foreign key consrtaint, the column is essentially just another value. There's nothing about it that MAKES it a document type beyond whatever logic is contained in your BO.

ajj3085:
How is this fundamentally different than having a DocumentTypeId column that identifies the document type?


In one case, you have a database column that can contain a value which has no relation to any other objects in the database. In order to make sense of it, you'd need to have knowledge of the BOs.

In the case of a foreign key, you can JOIN the tables and you now have potentially valuable business knowledge that can be shared by other systems.

Granted, you could provide documentation that says, when DocumentType is 1 then it's a quote, when 2 then it's an order... but that's not elegant.

ajj3085:
This is a valid point; I could counter by creating a UDT that only accepts the predefined values.


So now you're encoding business logic/validations in two places. We all know that's not ideal.

ajj3085:
Hmmm. yes and no.  If you can change the type id, you can probably just insert a new row in DocumentType.  Also, the UDT could prevent invalid values as well.


A person COULD just enter a row in the DocumentType table. However, SQL Server does provide object-level security to restrict access to procedures and CRUD operations on tables. I think that these mechanisms are a better means of limiting input than constraints via BO and UDT.

- John

ajj3085 replied on Thursday, August 17, 2006

John,

Thanks for your continued participation in this thread.  Continuing my role as devil's advocate...

JHurrell:
In the case of a foreign key, you can JOIN the tables and you now have potentially valuable business knowledge that can be shared by other systems.

Granted, you could provide documentation that says, when DocumentType is 1 then it's a quote, when 2 then it's an order... but that's not elegant.


I would go with O for Order, Q for quote, etc.  I can certainly appricate where you're coming from, but I'm not sure I see how this is different than the external app looking for 'Quote' in the DocumentType  table (and remembering the DocTypeId) instead of Q in the Document table.

JHurrell:

ajj3085:
This is a valid point; I could counter by creating a UDT that only accepts the predefined values.


So now you're encoding business logic/validations in two places. We all know that's not ideal.


Again, for the sake of being a devil's advocate.. For yes / no fields, do we not usually choose the bit datatype?  For straight integer types, do we not pick int / smallint / bigint?  Would you choose a smallint if the application would NEVER exceed the storage capacity?  Is it a business rule that you only have one character out of a set of allowed characters or can you think of that as a data rule?  Is the fact that we pick an int (which disallows character data) a business rule, or just part of the data model?

JHurrell:

ajj3085:
Hmmm. yes and no.  If you can change the type id, you can probably just insert a new row in DocumentType.  Also, the UDT could prevent invalid values as well.


A person COULD just enter a row in the DocumentType table. However, SQL Server does provide object-level security to restrict access to procedures and CRUD operations on tables. I think that these mechanisms are a better means of limiting input than constraints via BO and UDT.


A UDT would be protected as well, just as the document table is.  Its still a data type, just a more specialized one, like the money datatype or varchar vs. nvarchar or bit.

Thanks again for the discussion.
Andy

SonOfPirate replied on Thursday, August 17, 2006

Maybe this will help with the thought process:

Where and how do documents get added to the database?  Presumably, these records are added by your business objects in your application via AddQuote or NewQuote and some sort of Apply or Save method.  How does the business object that is writing the object to the database know what letter to use for the DocType?

If you go the route of a UDT in the database, then you've defined the list of valid values in the database and presumably hard-coded them somewhere into your application.  Not good practice.

However, using a lookup table with a foreign key reference makes it so the list of DocTypes is in one place.

I am assuming that your object model is designed so that each of your "document" BOs inherits from a common base class.  In other words, you have declarations similar to:

public class Invoice : Document

public class Quote : Document

etc. Since they are all coming from the same table, this would allow you to simplify the data access code and common methods and properties to them all (presumably matching the table columns).

One of the properties in the base Document class would be an abstract (MustOverride in VB) read-only property name something like "DocumentType".  In each of your derived classes, you would provide the actual type.  This property would then be called by your data access code to establish the correct doc type when the item is added to the database.  For example:

public class Invoice : Document
{
    public override DocumentType DocumentType
    {
        get
        {
            return DocumentTypes["Invoice"];
        }
    }
}

The DocumentType class would expose an ID or Code property (or whatever) which is used as the primary key field in the database and establishes the foreign key relationship.

Again, the point I was trying to make had (mostly) to do with where the DocType calues would be defined and only defining them in one location rather than in both your application code and database.  You could certainly make the argument that it would be easier to implement DocumentTypes as an enumeration in code and return DocumentTypes.Invoice in the above example, but then we are back to the earlier posts or looking at implementing a UDT that still has to have knowledge of these values in order to function.

Hope that make sense.

SonOfPirate replied on Thursday, August 17, 2006

Another approach to the PhoneType question is to add a Public bit column that is true (1) if the record is to be shown to users and false (0) if not.  Then, your views and sproc can filter the returned resultset based on the particular function being performed.

ajj3085 replied on Thursday, August 17, 2006

I thought about this approach too; I was trying to plan for the future, but I suppose other 'non public' types needn't be seperated into their own category.  I think I'll go this route because 'YAGNI'.

Andy

SonOfPirate replied on Thursday, August 17, 2006

Yea, I'd thought about having "Invoice" as the key for the collection when I typed it.  But, at some point you are going to have some way of identifying the doc type in code.  To me, the more flexible and robust solution which I have used many times (because most of our apps are extensible) would be to use the type name rather than a hard-coded string.  In other words, in code, your DocType is the type of the object you are dealing with.  In your DB's DocTypes table, you'd have a column for ObjectType and another for the primary key (ID) field referenced from your Documents table.  In your data access code, use this.GetType().FullName (Me.GetType().FullName in VB) to set the value of the parameter.

Then, if you add another doc type to your application, as you said, you are going to create a new class in code then add a new record with the new class' type name in the DocTypes table and you're good to go.

What I find unique about this approach is that we would typically have a separate table for each document type you've described as each contains unique data.  So, this is a bit of a different scenario although the logic behind lookup table vs. enumeration vs. UDT is more generic and common outside of this application.  I think ultimately it is going to come down to your overall data access strategy.  I've had arguments, I mean discussions, with many other developers about what should and shouldn't be in a stored procedure.  You can debate what logic goes where until the cows come home because everyone has their own way of doing things and opinions on what is right.  What it boils down to is what you are doing elsewhere and what works best for you.  I think if you considered using a lookup table for PhoneTypes then there is a possiblity that you already have other lookup tables and going that route would maintain the consistency of the approach throughout the application.  If you are more inclined to use a UDT or stuff all kinds of logic in your stored procedures, then go that route.  I think consistency is sometimes more important than the approach we chose for an individual task.

But, I deal with a lot of contract developers and external customers so the training curve for new developers, maintainance and upgrade time and effort, etc. are all considerations to our designs.  Consistency and standardization are big factors in those areas.  Plus, recognizing the abilities of (or lack there of) our developers also plays a role.  Not everyone knows what a UDT is let alone how to program and/or make use of one.

Just some things to think about.

ajj3085 replied on Thursday, August 17, 2006

Pirate,

Some great points.  You've hit across the reason I decided to post this question here.  Creating PhoneTypes for me is a no-brainer. 

What goes in that list has really no bearing on business rules anywhere (except that a phone must have a type), and its common for the user's to want more types of phone numbers.  For example, I had started with Fax, Cell, Home and Office.  The users said 'oh, add a Lab phone number as well.'  Lists like that, which have little impact on business rules, and / or are likely to be updated frequently or by the users, I have no problem with using a lookup table.

For something which users will never change, is only needed by code, and which are likely to never, ever charge, well I started to question the idea of making a seperate table for those kinds of things.  I honestly don't think we'll ever have more than five different document types.

To talk about why I have a Document table vs. one table for each type.. the users currently use QuoteWerks, and really the only difference between a Quote, Order, or Invoice is a label saying what the document is, and how the document numbers are generated.  This part of the behavior works just fine for them, which means that mostly that the documents in the new system will behave almost identically.  For the 'extra' data that isn't common, I'll likely have the seperate table, Invoice, but it would only have an FK to the Document table, plus the 'extra' data.

Andy

ajj3085 replied on Thursday, August 17, 2006

Pirate,

First, yes, you're right about my object model.  Invoice inherits from Document.  New invoices are created by factory methods, etc.

SonOfPirate:
If you go the route of a UDT in the database, then you've defined the list of valid values in the database and presumably hard-coded them somewhere into your application.  Not good practice.

However, using a lookup table with a foreign key reference makes it so the list of DocTypes is in one place.

followed by

SonOfPirate:
public class Invoice : Document
{
    public override DocumentType DocumentType
    {
        get
        {
            return DocumentTypes["Invoice"];
        }
    }
}

See, this is what i'm getting at; "Invoice" is harded coded in the property getter anyway... so what is the purpose of getting an int ID value instead of just returning "Invoice"?

One said of me says that the UDT approach is similar to choosing the appropiate data type.  Its just like smallint, which has a limit on the set of integers which the datatype can contain, the DocumentTypeUDT is a char, which limits the set of alphabetic characters which it can contain.

ajj3085 replied on Thursday, August 17, 2006

Just wanted to add another question to this..

I have a table, PhoneType, which contains the data about the type of phone number (cell, lab, office, etc.).  I'm going to add more types, ShipTo, SoldTo, BillTo. 

The catch is that the new ones are used ONLY internally, while the existing ones are exposed to the user.  The user, when entering phone numbers, selects the type from a drop down. 

I don't want the new types to be in that list.  The new ones are used internally only, and are fixed (there's a SoldTo tab, for example, and the number there is automatically marked as a SoldTo phone number).

My thought is to create a PhoneTypeGroup table, and adding a PhoneTypeGroupId FK to the PhoneType table.  The PhoneTypeList business object would always filter the list to those acceptable for the user to select; my other BOs would get a list specific and sort out exactly which type is the bill type.  My gut doesn't really like that idea (looking for a certain string from a list of items, to find the id) but my brain says to trust the database (because I control who can change what Smile [:)]).

Copyright (c) Marimer LLC