OT: Database design question

Old forum URL: forums.lhotka.net/forums/t/610.aspx


albruan posted on Thursday, July 13, 2006

I received a signed contract today to rewrite an old dBase III/IV application that has been in use since 1991 at a large consulting engineering/project management firm.  Having gone through the logic and the data tables in the existing product, I'm ready to begin design work using SQL Server 2005 for the back-end.  The problem is I'm not really sure which is the best approach to take with it.

Currently, when a new project is entered into the system, an entry is made into a Projects table and four table templates are copied and renamed according to the Project ID.  One table contains information pertaining to drawings for the project; another contains text describing the scope of the work to be done; a third contains area descriptions for areas within the disciplines (Architectural, Foundation, Structural, Piping, etc.) involved in performing the work; and the fourth contains departmental- and drawing-specific qualifications for the drawings.  As new drawings are entered into the system, data is written to a system-wide Drawing table and to at least three of the four copied tables.  After a certain amount of time...just guessing here, but I believe it would be shortly after completion and final sign-off on the project...they can elect to archive the data for the project.  This involves copying the four tables that were created from the templates onto a floppy disk, setting the Archived field for that project to True in the Projects table, and deleting the tables from the directory on the file server; the project can be restored by reversing the operation.

My first thought was to keep the database down to five or six tables and use the DwgInfo, Scope, Areas and Qualifications tables for all the projects; I quickly rejected that due to the enormous task of removing data from those tables when a project is archived.  I next thought about adding tables to the Projects database as new projects were created and then using SQL Server's bcp utility to copy the data in/out of the tables when a project was being archived/restored.  My third option is to create an entirely new database for each project entered into the system and use the TransactionalDataPortal to keep everything in sync; in that case, I could tell SQL Server to detach the database when it's to be archived and make calls to the System.IO namespace to handle copying the database to another location.  The only problem I see with the third option is having to write all the SQL code to handle creating a new database, creating the tables and indices for it, and creating all the stored procs to operate on the data within the new database.
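For what it's worth, the first option (one shared set of tables keyed by ProjectID, rather than four copied tables per project) can be sketched roughly like this.  The table and column names are illustrative guesses based on the description above, and SQLite is used here purely as a stand-in for SQL Server:

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server back-end.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Projects (
        ProjectID INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL,
        Archived  INTEGER NOT NULL DEFAULT 0   -- boolean flag
    );
    -- One shared table per kind of data, keyed by ProjectID,
    -- instead of four renamed template tables per project.
    CREATE TABLE DwgInfo (
        DrawingID INTEGER PRIMARY KEY,
        ProjectID INTEGER NOT NULL REFERENCES Projects(ProjectID),
        Title     TEXT
    );
    CREATE TABLE Scope (
        ScopeID     INTEGER PRIMARY KEY,
        ProjectID   INTEGER NOT NULL REFERENCES Projects(ProjectID),
        Description TEXT
    );
""")

conn.execute("INSERT INTO Projects (ProjectID, Name) VALUES (1, 'Refinery Expansion')")
conn.execute("INSERT INTO DwgInfo (ProjectID, Title) VALUES (1, 'P-101 Piping Plan')")

# Every per-project query simply filters on ProjectID.
rows = conn.execute(
    "SELECT Title FROM DwgInfo WHERE ProjectID = ?", (1,)
).fetchall()
print(rows)  # [('P-101 Piping Plan',)]
```

The same pattern translates directly to T-SQL; the point is just that "a project" becomes a key value rather than a set of physical tables.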

Any opinions on which would be the best option?

Allen B. Anderson

david.wendelken replied on Thursday, July 13, 2006

Archiving the data in an MDF file may be a bad idea, because it's tied to a specific version of SQL Server.  Depending upon when they need to restore it, it may not be possible.  They may have lost their SQL Server 2005 CD, and SQL Server 2011 may not be able to read it...

Using separate MDF files for the live system is also problematic when it comes to cross-project reporting. 

From your comments, I'm guessing that each project may have LOTS of data?  Have you looked into the table partitioning feature in the database?  It's like having separate tables inside the same table.  I've used it very handily in an Oracle environment, and SQL Server 2005 appears to have implemented the core functionality.  I'll be testing it in the next month, so I can't say for sure yet whether it's up to snuff.  Of course, you would need to automate the creation of the table partitions, but that shouldn't be too hard.

Other than the volume of data (which is what I'm guessing your comment about archiving being an enormous task refers to), what's the big deal about archiving the data out into XML or comma-delimited files?

albruan replied on Thursday, July 13, 2006

David,

Thanks for your response.  I hadn't thought of partitioning the tables until I read your post.  It sounds like a super idea except that it can only be used in the Developer Edition or the Enterprise Edition.  I seriously doubt I can get them to spring for roughly USD $24,000 for a copy of the Enterprise Edition, and I don't want to recommend using the Developer Edition as that would be a violation of Microsoft's licensing for the product.  My choice of database for them is the Express Edition, as it should be able to easily handle all the data the dBase system has been handling.  Perhaps I could use MySQL, as it has the ability to partition tables; the only question there is whether or not MySQL Connector/Net works with .NET 2.0.

I did some more delving into using bcp with SQL Server, as mentioned in my original post, and see that Microsoft added a SqlBulkCopy class to .NET 2.0.  Unfortunately, it only permits bulk copying of data INTO SQL Server and not from SQL Server to other formats.

Maybe the best thing to do is revisit my first option and keep the database down to five or six tables.  That thought occurred to me after reading your last sentence...simply execute a "SELECT ... FOR XML AUTO", write the results out to a disk file, and then execute another SQL command to delete the records for that particular project.  I'll have to do some exploring, but it may be that I can then use the new SqlBulkCopy class to restore the project data into the database should the need arise.
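The round trip I have in mind (serialize a project's rows to XML, delete them, reload later) would look something like the sketch below.  Again, SQLite and Python's xml.etree are only stand-ins for SQL Server's FOR XML and bulk-load machinery, and the table/column names are made up for illustration:

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DwgInfo (ProjectID INTEGER, Title TEXT)")
conn.executemany(
    "INSERT INTO DwgInfo VALUES (?, ?)",
    [(1, "P-101 Piping Plan"), (1, "S-200 Framing"), (2, "A-001 Site Plan")],
)

def archive_project(conn, project_id):
    """Serialize one project's rows to an XML document, then delete them."""
    root = ET.Element("DwgInfo")
    cur = conn.execute(
        "SELECT ProjectID, Title FROM DwgInfo WHERE ProjectID = ?", (project_id,)
    )
    for pid, title in cur:
        ET.SubElement(root, "Row", ProjectID=str(pid), Title=title)
    conn.execute("DELETE FROM DwgInfo WHERE ProjectID = ?", (project_id,))
    return ET.tostring(root, encoding="unicode")  # would be written to disk

def restore_project(conn, xml_text):
    """Reload archived rows from the XML document."""
    for row in ET.fromstring(xml_text):
        conn.execute(
            "INSERT INTO DwgInfo VALUES (?, ?)",
            (int(row.get("ProjectID")), row.get("Title")),
        )

xml_text = archive_project(conn, 1)       # project 1 rows leave the table
remaining = conn.execute("SELECT COUNT(*) FROM DwgInfo").fetchone()[0]
restore_project(conn, xml_text)           # ...and come back on restore
restored = conn.execute("SELECT COUNT(*) FROM DwgInfo").fetchone()[0]
print(remaining, restored)  # 1 3
```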

Thanks again for your response!  :)

Allen B. Anderson

ajj3085 replied on Thursday, July 13, 2006

I think you'll find that 'archiving' would be best achieved by creating a flag column on the relevant tables (at the project level, I'd think) and simply filtering on 'unarchived' projects by default.  Be sure to put an index on that column as well as any others which would be appropriate.
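A minimal sketch of that flag-and-index approach, once more using SQLite in place of SQL Server with made-up names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Projects (
        ProjectID INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL,
        Archived  INTEGER NOT NULL DEFAULT 0   -- flag instead of moving data
    );
    -- Index the flag so the default "unarchived only" filter stays cheap.
    CREATE INDEX IX_Projects_Archived ON Projects (Archived);
""")
conn.executemany(
    "INSERT INTO Projects (Name, Archived) VALUES (?, ?)",
    [("Refinery Expansion", 0), ("Old Warehouse", 1)],
)

# Default queries see only live projects.
active = conn.execute("SELECT Name FROM Projects WHERE Archived = 0").fetchall()

# "Archiving" is a one-row UPDATE; nothing is copied or deleted.
conn.execute("UPDATE Projects SET Archived = 1 WHERE Name = 'Refinery Expansion'")
still_active = conn.execute("SELECT Name FROM Projects WHERE Archived = 0").fetchall()

print(active, still_active)  # [('Refinery Expansion',)] []
```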

SQL Server can handle quite a bit of data in its tables before you need to physically separate the data for performance reasons.  I don't think I've come across any project which required me to do so.

Just my $0.02.
Andy

Copyright (c) Marimer LLC