Inserting 100000 records

Old forum URL: forums.lhotka.net/forums/t/11684.aspx


neo_u posted on Thursday, November 08, 2012

Hello,

I'm trying to do a transactional insert with CSLA.

Fetching the 100,000 records doesn't take long, but inserting them takes a very long time, even though a plain transactional insert (without CSLA) finishes in about 10 seconds.

I found that part of the time is spent in Clone() calls, so I disabled cloning on save. However, that didn't solve the problem: inserting 100k records still takes far too long.

I also applied TransactionalTypes.TransactionScope to the DataPortal_Update method of my list of objects.

 

I thought passing the DalManager into Child_Update would solve the problem, but still no luck at all :(

 

Can someone help?

 

Thanks in advance.

sergeyb replied on Thursday, November 08, 2012

You can comment out the attribute and create the transaction scope in code instead, so that you can set the desired timeout.  However, I think you need to change the app so that you do not have to insert that much data from the client.  You would be better off using a stored procedure to create the data on the server.  Calling insert 100,000 times in a row will never be fast.  My 2 cents

skagen00 replied on Thursday, November 08, 2012

Why are you mass-inserting 100,000 records? Are you stress-testing or is this an actual use case?

Clone() is super-expensive, totally agree.  We had a complex process in our application and 80% of the CPU time according to ANTS was Clone() calls!

And I agree with Sergey - it's going to take a long time to save 100,000 BO's...

 

neo_u replied on Thursday, November 08, 2012

Yes. It's a test application.

The main idea is that inserting the ~100k records will be done only once, on first launch.

On subsequent launches it will insert/update about 1,000-3,000 records.

A stored procedure is not an option.

It's a research app. I'm trying to find out whether it's possible to use CSLA with a WCF service to synchronize two databases, or whether it's better to write a db-to-db synchronization service without CSLA but with custom validation of the data.

 

PS: I forgot to tell you that I'm using a SQLite database.
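With SQLite the biggest single win is usually wrapping all the inserts in one explicit transaction: by default every INSERT is its own transaction and forces a sync to disk. A minimal sketch of the technique (Python's built-in sqlite3 is used here purely for illustration; from .NET the same idea applies by opening a single transaction on the SQLite connection):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # with a file DB the difference is even larger
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")

rows = [(i, f"row-{i}") for i in range(100_000)]

# One transaction around all 100k inserts instead of 100k implicit ones.
with conn:  # commits once on success, rolls back on error
    conn.executemany("INSERT INTO records (id, payload) VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
print(count)  # 100000
```

The same batch-commit idea is what the TransactionScope attribute is meant to give you; the point is that the scope must span the whole list, not each child insert.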

JonnyBee replied on Thursday, November 08, 2012

In general terms I favor BulkInsert for data loading. SQLite has no bulk insert to my knowledge, but there are some good tips here:

http://www.mostlydevelopers.com/blog/post/2012/02/19/SQLite-Bulk-Insert.aspx
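Common SQLite bulk-load tuning (stated here on my own account, independent of that post) is to relax durability settings for the duration of a one-off load and reuse a single prepared statement. A hedged sketch in Python's sqlite3, again just to illustrate the pragmas involved:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trade durability for speed during the one-off initial load;
# restore the defaults (or reopen the connection) afterwards.
conn.execute("PRAGMA journal_mode = MEMORY")
conn.execute("PRAGMA synchronous = OFF")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")

with conn:  # single transaction; executemany reuses one prepared statement
    conn.executemany("INSERT INTO t VALUES (?, ?)",
                     ((i, "x") for i in range(50_000)))

n = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(n)  # 50000
```

These pragmas are only safe for a reloadable import: a crash mid-load can corrupt or lose the data, which is acceptable when you can simply start over.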

Could you post a sample project here for us to play with/test? 

Troncho replied on Thursday, November 08, 2012

Hi Jonny, Hi everybody.

The project I'm working on has a real use case involving a monthly upload of about 2,800,000 (yes, almost 3 million) contributor tax-related records. The source is a delimited ASCII text file.

I've come up with a very fast working solution, implementing a mix of SqlBulkCopy and a custom implementation of IDataReader (for SqlBulkCopy) that works with IEnumerables obtained by reading the ASCII file via Linq 2 Files.

I am not using any kind of transaction control, as I can always delete the imported records and start over.
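The streaming idea described above — reading the delimited file lazily and feeding rows straight into a bulk loader instead of materializing everything first — can be sketched like this (a Python/sqlite3 analogue; SqlBulkCopy plus a custom IDataReader is the .NET equivalent, and the file layout and column names here are made up for illustration):

```python
import csv
import io
import sqlite3

# Stand-in for the delimited ASCII source file (layout is hypothetical).
ascii_file = io.StringIO("1|100.50\n2|200.75\n3|300.00\n")

def rows(stream):
    # Lazy reader: yields one tuple at a time, never loads the whole file.
    for rec in csv.reader(stream, delimiter="|"):
        yield int(rec[0]), float(rec[1])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tax (contributor_id INTEGER, amount REAL)")

with conn:  # one commit for the whole stream, like a bulk-copy batch
    conn.executemany("INSERT INTO tax VALUES (?, ?)", rows(ascii_file))

total = conn.execute("SELECT SUM(amount) FROM tax").fetchone()[0]
print(total)  # 601.25
```

The generator keeps memory flat no matter how many millions of records the file holds, which is the same property the IDataReader-over-IEnumerable approach gives SqlBulkCopy.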

I'm also using a BackgroundWorker to do the upload and free the UI (which tracks progress via async events).

The only catch is that, for now, it only works with a WinForms UI, which uses a derived CSLA business class for some initial property validation rules. This derived class declares three public methods (Upload, Delete, Cancel). These methods can be called directly from the UI and open direct connections to the database:

using (var conn = ConnectionManager<System.Data.SqlClient.SqlConnection>.GetManager(NrgNet.DalLinq.Database.NrgNet))

I'm not using any DataPortal operations, so far.

One way it could fit with CSLA is via an intermediate operation that places (uploads) the source file on the server (the DB server, or an app server with a very fast connection to the DB server), and then calls the BulkCopy through a CommandBase class. I'll leave this design for the second half of next year, when I upgrade to CSLA 4 and an MVC UI.

BTW, this solution is working just fine. Running tests with the UI working directly on the DB server, uploading 2.8 million records to a table with 2 indexes takes about 1 minute 30 seconds...

If my solution can prove useful to anybody, please tell me how you suggest I upload this part of the project so I can share my results with you.

Troncho

Copyright (c) Marimer LLC