newbie questions

newbie questions

Old forum URL: forums.lhotka.net/forums/t/986.aspx


yipchunyu posted on Wednesday, August 23, 2006

Just in the very beginning stage of our new content managment sys.
some questions for all of you.
1. I tired to build a class diagram inside visual studio 2005.  (just as the case in page 342 of the book" Expert VB 2005 Business object").  in the book, the solid-line represent collaboration between objects and dashed lines seems to represents possible to navigate.  However, in VS2005, it represents different things ie. inheritance and association.  So, any advice on this issue?
2. I don't truely understand the best practice to handle attachments in our cases.  Should we stored in DB and act as a property in the business object (the whole attachment)?  or should it just a pointer to a file system?

RockfordLhotka replied on Wednesday, August 23, 2006

I used the class diagram tool to create the basic class diagrams, which are in the code download and available from www.lhotka.net/cslanet/download.aspx.

However, in the book I merely used a similar notation. It turns out that the class diagram in VS is incapable of illustrating all the relationships and concepts I needed to show in the book, and so the book's diagrams are "inspired" by the class diagram in terms of graphics, but are not created by the class diagram tool in VS.

yipchunyu replied on Wednesday, August 23, 2006

Hi Rocky,
If i use Visio to build the object model.
should i use "generation", "Binary assosication", "Composition", "link", "dependency" ... to represent the relationship presented the chart shown in page 342?

RockfordLhotka:
I used the class diagram tool to create the basic class diagrams, which are in the code download and available from www.lhotka.net/cslanet/download.aspx.

However, in the book I merely used a similar notation. It turns out that the class diagram in VS is incapable of illustrating all the relationships and concepts I needed to show in the book, and so the book's diagrams are "inspired" by the class diagram in terms of graphics, but are not created by the class diagram tool in VS.

ajj3085 replied on Wednesday, August 23, 2006

Well, there are advantages and disadvantages to storing attachments in the database vs. on the file system.  I opted to store attachments in the database to easy backup / restore requirements.  All the data is in one place. 

yipchunyu replied on Wednesday, August 23, 2006

ajj3085:
Well, there are advantages and disadvantages to storing attachments in the database vs. on the file system.  I opted to store attachments in the database to easy backup / restore requirements.  All the data is in one place. 

but will not be an issue if the file size is not small in average (5MB ~ 10MB) as using the business objects concept with CSLA?

ajj3085 replied on Thursday, August 24, 2006

It depends on your environment.  This choice has little to do with Csla, since with or without Csla, you still have to determine how fast 10 meg can get from the database to the client machine.  Will it be faster or slower than opening a file on a share?  You'll have to do some testing.

SonOfPirate replied on Thursday, August 24, 2006

The general rule of thumb as recommended by DBA's is to NOT store binary information inline with your records in a database.  While I have seen and made arguments for doing so, the preferred route is to store a reference to the file located in the physical file system.  Of course, there are always exceptions - that's why it's a rule of thumb.  If you have a solid reason for embedding, then do so; otherwise, lean towards using the file system.

My understanding of the rationale has to do with performance scanning and retrieving records.  Like you inferred, the larger the attachment, the larger the record.  Obviously this plays a role in the performance you see in your DB when trying to retrieve records.

Also consider that in most cases, especially with web apps, all you need is the reference to the file at first glance anyway.  Attachments are typically shown as a link so you don't need the binary column with the rest of the data anyway.

And, in all of our applications (preference), our attachments are defined in a separate table that references the parent (document) because they are optional have a many-to-one relationship with the parent doc.  While this does reduce the impact on the parent table, you still have to perform a join to the attachments table to retrieve the list.

Ultimately, much like almost everything else we do, it is up to you.

Hope that helps.

ajj3085 replied on Thursday, August 24, 2006

SonOfPirate:
And, in all of our applications (preference), our attachments are defined in a separate table that references the parent (document) because they are optional have a many-to-one relationship with the parent doc.  While this does reduce the impact on the parent table, you still have to perform a join to the attachments table to retrieve the list.


This is what I do as well.  If you don't need the attachment, you don't need to join at all, and your performance should not suffer.  Aside from that, I'm not sure what the arguments against storing binary data in the database are. 

SonOfPirate:
If you have a solid reason for embedding, then do so; otherwise, lean towards using the file system.


I'd say lean toward storing in the database; you cannot maintain the integrity of data which is not stored in the database, and files are pieces of data which can easily be modifed by anyone (assuming permissions).  Also, you have more overhead for backup, a huge PITA should you need to move the file store for some reason, etc.

SonOfPirate replied on Thursday, August 24, 2006

Well, I did say the rule of thumb came from DBA's.........

ajj3085 replied on Thursday, August 24, 2006

Fair enough Smile [:)]

What are their reasons though?  Just curious..

SonOfPirate replied on Thursday, August 24, 2006

I would have to ask again, but what I remember has to do with performance in general with large data types (binary, image, ntext, etc.).  There was also an issue of data-portability from platform to platform should that need arise.

From a development standpoint it also makes life easier to deal with a file system, obviously, because your not contending with streaming content and can allow the OS to handle retrieving the file.  Changes to the file system are easily handled by using a config setting to indicate the physical path and only storing relative paths in the database.  But you are right about data integrity as someone can obviously remove the file with a reference remaining in the db (my biggest argument with the DBA's!)

Oh, another issue presented by the DBA's was the size and time required to backup the database when 2/3 of its size are blobs that could be stored externally.  The contents of the file will most likely change much less frequently than the database contents, so separating them would separate the backup requirements as well.

I know there was more and am sure there are a ton of articles online listing the pro's and con's.  Not saying I agree 100%; as I said, I've made arguments for doing so as well.  It's just the rule of thumb I've been told.  I think the application dictates more than the DBA (at least for me) if we're going to follow the rule of thumb or not.  I typically look at the size and quantity of the files and use that as a guideline.  If you are dealing with a LOT of files and/or files that have substantial size, then why bog your database down with the task of streaming the file when you can leave it to do its job (managing data) and let the file system manage the files?

Everything in programming is subjective, eh?

 

xal replied on Thursday, August 24, 2006

A well designed DB can have different FileGroups and files. You could easily create an mdf for the purpose of holding large blobs of binary data and that way you wouldn't suffer fragmentation or bottlenecks when working with regular data.
Of course, DBAs don't want to have to maintain this structure, so the will want you to put the binary information directly into disk. Big Smile [:D]

Andrés

yipchunyu replied on Monday, August 28, 2006

thx all for your advice.

guyroch replied on Monday, August 28, 2006

I agree with the general spirit of this thread in that a DB is design to store data and a file system is design to store files.  With that said I would definitly store the attachements on the file system.  In fact I had that very similar use case a few months ago.  I had attachements stored into the database and that did put a pay load on performance and had to revert back to storing the attachements on the file system instead of the DB.  Take it from me, store the attachements on the file system.

yipchunyu replied on Tuesday, August 29, 2006

a related question.
In the above case, the attachment(s) is one of the property of the object 'document' (in my design).  Must I seperate it as a new classes or use some other way to handle this? (assuming that I just stored the file reference in another table in DB)

ajj3085 replied on Tuesday, August 29, 2006

What is the attachement now?  A byte array?  I would say seperate it into its own class which knows how to 'persist' itself.  Then if you ever want to change how you store the attachment (file vs. db)  you'll have one class to change and its parent will be none the wiser.

Copyright (c) Marimer LLC