Update/delete database speed

Old forum URL: forums.lhotka.net/forums/t/7725.aspx


JonM posted on Saturday, October 03, 2009

I'm using SQL Server 2008 and have optimized my database as much as I can. The problem is that large updates take a long time. Retrieving 100,000+ rows works very well, but updating 100,000 can take several minutes. The real problem is that the retrieve is one SQL command while the updates/deletes are all separate commands. Does anyone have a workaround to make this faster?

DancesWithBamboo replied on Saturday, October 03, 2009

Check out the System.Data.SqlClient.SqlBulkCopy object.  I insert about 100,000 records in just a few seconds using it.
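
For reference, a minimal sketch of a SqlBulkCopy call looks something like the following; the destination table name is made up for illustration, and in practice you would first map your object data into the DataTable being written:

// Minimal sketch of SqlBulkCopy usage (table and column names are hypothetical).
using System.Data;
using System.Data.SqlClient;

public static class BulkLoader
{
    public static void BulkInsert(DataTable rows, string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "dbo.OrderDetail"; // hypothetical table
                bulkCopy.BatchSize = 10000;      // rows sent per round trip
                bulkCopy.BulkCopyTimeout = 120;  // seconds
                bulkCopy.WriteToServer(rows);    // streams all rows in one operation
            }
        }
    }
}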

Marjon1 replied on Sunday, October 04, 2009

The previous suggestion may work for you, but you could also try the new table-valued parameters in SQL Server 2008. These would allow you to pass the entire collection to be inserted/updated in a single call to a stored procedure. I've personally not used them before, but they appear a lot easier to use than sending XML to a stored procedure and then parsing it.

Here are a couple of links I found on the topic:

http://www.sqlteam.com/article/sql-server-2008-table-valued-parameters
http://technet.microsoft.com/en-us/library/bb510489.aspx
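
To illustrate, a table-valued parameter call from ADO.NET looks roughly like the sketch below; the dbo.ItemUpdateType table type and dbo.UpdateItems procedure are invented names and would have to be created in the database first:

// Sketch of passing all changed rows to the server in one call via a
// table-valued parameter (type, procedure, and columns are hypothetical).
using System.Data;
using System.Data.SqlClient;

public static class TvpExample
{
    public static void UpdateItems(DataTable changedRows, string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("dbo.UpdateItems", connection))
        {
            command.CommandType = CommandType.StoredProcedure;

            var parameter = command.Parameters.AddWithValue("@Items", changedRows);
            parameter.SqlDbType = SqlDbType.Structured;   // marks the parameter as a TVP
            parameter.TypeName = "dbo.ItemUpdateType";    // the user-defined table type

            connection.Open();
            command.ExecuteNonQuery();                    // one round trip for all rows
        }
    }
}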

I'm curious as to what optimizations you've made to get that number of records back and what sort of performance timings you are achieving. Returning that number of records has always been an issue for us.

tiago replied on Sunday, October 04, 2009

JonM:
I'm using SQL Server 2008 and have optimized my database as much as I can. The problem is that large updates take a long time. Retrieving 100,000+ rows works very well, but updating 100,000 can take several minutes. The real problem is that the retrieve is one SQL command while the updates/deletes are all separate commands. Does anyone have a workaround to make this faster?

I'm not sure I follow you, but in any case the fastest way to do a lot of work in SQL Server or Oracle is a CLR stored procedure. You write your code in C#/VB, build an assembly, and put it inside the database. Your code runs much faster:
- it runs locally (on the same computer as the database)
- it uses the local context (no need to open a database connection, etc.)

You can debug your CLR stored procedure by hitting F5, just like normal .NET code. You can even call your CLR stored procedures from a T-SQL stored procedure.

There are some examples on www.codeproject.com - just search for "CLR Stored Procedure".

I did try it successfully on SQL Server. It's quite easy to do and it's faster in several use cases.
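
For a sense of what this looks like, here is a bare-bones sketch of a CLR stored procedure; the procedure name, table, and SQL are invented for illustration, and the compiled assembly is then registered in the database with CREATE ASSEMBLY and CREATE PROCEDURE:

// Minimal sketch of a SQL CLR stored procedure (names are hypothetical).
using Microsoft.SqlServer.Server;
using System.Data.SqlClient;

public class StoredProcedures
{
    [SqlProcedure]
    public static void ApplyPriceIncrease(decimal percent)
    {
        // "context connection=true" uses the in-process connection of the
        // current session, so there is no separate network round trip.
        using (var connection = new SqlConnection("context connection=true"))
        {
            connection.Open();
            using (var command = new SqlCommand(
                "UPDATE dbo.Item SET Price = Price * (1 + @pct / 100)", connection))
            {
                command.Parameters.AddWithValue("@pct", percent);
                command.ExecuteNonQuery();
            }
        }
    }
}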

JonM replied on Monday, October 05, 2009

Most of what I need is mass updates of CSLA objects. I need to load 100,000+ objects, make changes to all of them, and then save them. SqlBulkCopy is cool but doesn't really apply. I don't see how embedding code into the CLR is going to work. The data needs to be loaded into the UI where the user can interact with it, and then saved. That still requires getting all of this data back into the SQL process. It seems like this would be a common performance issue.

rsbaker0 replied on Monday, October 05, 2009

I don't think it's all that common to have a user directly make individual updates to 100,000+ objects.

At what I would expect to be a reasonable rate for a database client submitting individual updates, say 100 per second or so, it's going to take more than 15 minutes for 100,000 rows. (Even at 1,000/second, the user will still sit there for almost 2 minutes.)

If the user is making the *same* update to a large number of objects, then I would do it with a SQL script or equivalent (e.g. stored procedure).

msk replied on Monday, October 05, 2009

As others have suggested, using CSLA the way you describe may not be the best approach. CSLA is designed around taking data entered by the user and storing it in the database. Presumably your users are not entering or editing 100,000+ objects directly. If there is an automated process that you can move to your database layer and run in SQL or SQL CLR as suggested, you may be able to use a CommandBase object to trigger it.
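
As a rough sketch of that idea, assuming the CSLA 3.x command object pattern (the class name, SQL, and connection string below are all hypothetical):

// Rough sketch of a CSLA command object that runs one set-based UPDATE on the
// server instead of saving 100,000 objects individually.
using System;
using System.Data.SqlClient;
using Csla;

[Serializable]
public class ApplyPriceIncreaseCommand : CommandBase
{
    private const string ConnectionString = "...your connection string...";

    private decimal _percent;

    public static void Execute(decimal percent)
    {
        var command = new ApplyPriceIncreaseCommand();
        command._percent = percent;
        DataPortal.Execute<ApplyPriceIncreaseCommand>(command);
    }

    protected override void DataPortal_Execute()
    {
        // Runs on the data portal / application server, close to the database.
        using (var connection = new SqlConnection(ConnectionString))
        {
            connection.Open();
            using (var cmd = new SqlCommand(
                "UPDATE dbo.Item SET Price = Price * (1 + @pct / 100)", connection))
            {
                cmd.Parameters.AddWithValue("@pct", _percent);
                cmd.ExecuteNonQuery();
            }
        }
    }
}

The point is that the work happens server-side in one statement; the client only triggers it.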

Fintanv replied on Monday, October 05, 2009

JonM:
Most of what I need is mass updates of CSLA objects. I need to load 100,000+ objects, make changes to all of them, and then save them. SqlBulkCopy is cool but doesn't really apply. I don't see how embedding code into the CLR is going to work. The data needs to be loaded into the UI where the user can interact with it, and then saved. That still requires getting all of this data back into the SQL process. It seems like this would be a common performance issue.

I am not sure what you mean by saying that "SqlBulkCopy ... doesn't really apply". In a prior life I wrote a CSLA-based app that pulled down a lot of data from the DB, allowed the user to make changes, ran a bunch of calculations (a 3-5 year forecasting application), and then used SqlBulkCopy to push the 40,000 to 50,000 changes back to the database. Worked a treat...

DancesWithBamboo replied on Monday, October 05, 2009

Fintanv:

I am not sure what you mean by saying that "SqlBulkCopy ... doesn't really apply". In a prior life I wrote a CSLA-based app that pulled down a lot of data from the DB, allowed the user to make changes, ran a bunch of calculations (a 3-5 year forecasting application), and then used SqlBulkCopy to push the 40,000 to 50,000 changes back to the database. Worked a treat...

Exactly what I was going to say.

rsbaker0 replied on Monday, October 05, 2009

Doesn't SqlBulkCopy just insert data?

DancesWithBamboo replied on Monday, October 05, 2009

Yes, it only inserts. But you insert into a temporary table and then MERGE the data into the real table using a sproc. Depending on your use case, you can also truncate the "real" destination table and then insert right into it.

When updating tens of thousands of records, the goal is to get the data to the db server in one large chunk. Once it is there, you can use many techniques to update your data quickly. The thing that will kill you is round-tripping to the database for each record update.
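
A rough sketch of that staging-table-plus-MERGE approach is below; the table names, columns, and MERGE statement are invented for illustration, and MERGE requires SQL Server 2008 or later:

// Sketch of "bulk copy into a staging table, then MERGE into the real table".
using System.Data;
using System.Data.SqlClient;

public static class BulkUpdater
{
    public static void BulkUpdate(DataTable changedRows, string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // 1. Push all changed rows to a staging table in one shot.
            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "dbo.Item_Staging";
                bulkCopy.WriteToServer(changedRows);
            }

            // 2. Apply the changes set-based on the server (typically in a sproc).
            const string merge = @"
                MERGE dbo.Item AS target
                USING dbo.Item_Staging AS source ON target.Id = source.Id
                WHEN MATCHED THEN
                    UPDATE SET target.Price = source.Price
                WHEN NOT MATCHED THEN
                    INSERT (Id, Price) VALUES (source.Id, source.Price);

                TRUNCATE TABLE dbo.Item_Staging;";

            using (var command = new SqlCommand(merge, connection))
            {
                command.ExecuteNonQuery();
            }
        }
    }
}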

Other options to get data to the db server are:
- bulk update
- xml
- flat file

Fintanv replied on Tuesday, October 06, 2009

Ah yes, the temp table and the merge proc. That's the bit I left out; thanks for filling it in. Put it down to 'bit rot' in the old noggin :-)

Copyright (c) Marimer LLC