So, I'm implementing an object transformation that involves changing basically every object in a particular table. It can't be done with a single direct script, and stored procedures are out. The entire update must happen in a transaction.
The naive implementation is just to load a BLB, apply the transformation to each object in the list, and then save the list. However, in the extreme case I could have 100,000 or so objects.
When I populate a BLB (or an ERLB, for that matter), I'm working with an IDataReader-driven object factory, so in theory I could basically duplicate this logic and just "scroll" through the objects once, updating each one as I go, rather than loading the entire list into memory first. This seems like a better approach.
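To make that concrete, here's roughly the kind of loop I have in mind; this is just a sketch with made-up names (UserDto, ConvertPassword, SaveUser, connectionString), not our actual factory code:

// requires System.Collections.Generic, System.Data, and System.Data.SqlClient
// sketch: yield objects one at a time from the reader instead of building the whole list
class UserDto { public int Id; public string Password; }

static IEnumerable<UserDto> FetchUsers(SqlConnection cn)
{
    using (var cm = cn.CreateCommand())
    {
        cm.CommandText = "SELECT UserId, Password FROM Users";
        using (var dr = cm.ExecuteReader())
        {
            while (dr.Read())
                yield return new UserDto { Id = dr.GetInt32(0), Password = dr.GetString(1) };
        }
    }
}

// inside the conversion routine: only one object is materialized at a time
using (var cn = new SqlConnection(connectionString))   // connectionString assumed defined elsewhere
{
    cn.Open();
    foreach (var user in FetchUsers(cn))
    {
        user.Password = ConvertPassword(user.Password);  // placeholder transformation
        SaveUser(user);  // placeholder per-object update (needs its own connection while this reader is open)
    }
}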
How have other folks handled similar situations? I know that you could also do this in a stored procedure, but one of our back-end databases doesn't support them.
My first instinct is to create a query and fire it at the back end. I think any other solution will be many times slower. Usually, whatever you can do in a stored procedure you can do in a dynamic script. I would be wary of fetching 100,000 rows just to send them right back.
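For instance, a single set-based statement fired from code; this is purely a sketch, and the PasswordFormat column and values are invented just to show the shape:

// requires System.Data.SqlClient; connectionString assumed to be defined elsewhere
// sketch: one dynamic SQL statement does all the work server-side, no stored procedure needed
using (var cn = new SqlConnection(connectionString))
{
    cn.Open();
    using (var cm = cn.CreateCommand())
    {
        cm.CommandText = "UPDATE Users SET PasswordFormat = @newFormat WHERE PasswordFormat = @oldFormat";
        cm.Parameters.AddWithValue("@newFormat", 2);
        cm.Parameters.AddWithValue("@oldFormat", 1);
        cm.ExecuteNonQuery();
    }
}

Of course, this only works if the transformation itself can be expressed in SQL.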
sergeyb: My first instinct is to create a query and fire it at the back end. I think any other solution will be many times slower. Usually, whatever you can do in a stored procedure you can do in a dynamic script. I would be wary of fetching 100,000 rows just to send them right back.
I don't disagree -- it's certainly not something I would do often. However, in this case, I'm changing the encryption method for the passwords for the entire user table, and the encryption algorithm cannot be easily implemented in the database.
The legacy application we're porting does this via a cursor, scrolling through the user records and updating each one.
I see… I would be inclined, then, to chunk the work somehow so that you do not have to fetch all 100,000 rows at once. Depending on the logging scheme, how wide the table is, and transaction support, this may take quite some time and a lot of resources. Maybe something like "SELECT TOP 1000 * FROM Users WHERE IsPasswordConverted = 0" inside a loop. Just an idea…
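A rough sketch of that chunking pattern; IsPasswordConverted, ConvertPassword, and the column names are all hypothetical:

// requires System.Collections.Generic, System.Data, and System.Data.SqlClient
// connectionString assumed to be defined elsewhere
// sketch: convert the table in chunks so only a small batch is held in memory at a time
while (true)
{
    var batch = new List<KeyValuePair<int, string>>();
    using (var cn = new SqlConnection(connectionString))
    {
        cn.Open();
        using (var selectCm = cn.CreateCommand())
        {
            selectCm.CommandText =
                "SELECT TOP 1000 UserId, Password FROM Users WHERE IsPasswordConverted = 0";
            using (var dr = selectCm.ExecuteReader())
            {
                while (dr.Read())
                    batch.Add(new KeyValuePair<int, string>(dr.GetInt32(0), dr.GetString(1)));
            }
        }
        if (batch.Count == 0)
            break;  // nothing left to convert
        foreach (var row in batch)
        {
            using (var updateCm = cn.CreateCommand())
            {
                updateCm.CommandText =
                    "UPDATE Users SET Password = @p, IsPasswordConverted = 1 WHERE UserId = @id";
                updateCm.Parameters.AddWithValue("@p", ConvertPassword(row.Value));  // placeholder transform
                updateCm.Parameters.AddWithValue("@id", row.Key);
                updateCm.ExecuteNonQuery();
            }
        }
    }
}

Each chunk here commits as it goes; if the whole conversion really must be a single transaction, the same loop could run on one connection inside a SqlTransaction, at the cost of holding locks for the duration.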
Ouch! We all know what happens to the best laid plans...
It turns out that (at least with SQL Server 2000), if you have started a transaction and opened a DataReader on it, you can't do any updates -- or do much of anything else in the transaction -- as long as the DataReader is open.
At least, that's what I appear to be seeing. This would seem to rule out any cursor-style approach to this.
In theory, I could scroll through the objects in a separate database transaction, but that seems perilous.
Your chunking idea is a good one, but I would need to add something to the schema. I'll look at doing that down the road. Thanks!
rsbaker0: It turns out that (at least with SQL Server 2000), if you have started a transaction and opened a DataReader on it, you can't do any updates -- or do much of anything else in the transaction -- as long as the DataReader is open.
Not quite true, actually. You cannot do anything on the connection while the reader is open (and that has nothing to do with whether there is a transaction started or not), but there is nothing preventing you from opening another connection to do something else...
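A minimal sketch of that two-connection pattern, assuming SQL Server and made-up table/column names (ConvertPassword and connectionString are placeholders): the reader keeps one connection busy, while the updates and the transaction live on a second connection.

// requires System.Data and System.Data.SqlClient; connectionString assumed defined elsewhere
using (var readCn = new SqlConnection(connectionString))
using (var writeCn = new SqlConnection(connectionString))
{
    readCn.Open();
    writeCn.Open();
    using (var tr = writeCn.BeginTransaction())
    {
        using (var readCm = readCn.CreateCommand())
        {
            readCm.CommandText = "SELECT UserId, Password FROM Users";
            using (var dr = readCm.ExecuteReader())
            {
                while (dr.Read())
                {
                    using (var writeCm = writeCn.CreateCommand())
                    {
                        writeCm.Transaction = tr;  // every command on writeCn must join the transaction
                        writeCm.CommandText = "UPDATE Users SET Password = @p WHERE UserId = @id";
                        writeCm.Parameters.AddWithValue("@p", ConvertPassword(dr.GetString(1)));  // placeholder transform
                        writeCm.Parameters.AddWithValue("@id", dr.GetInt32(0));
                        writeCm.ExecuteNonQuery();
                    }
                }
            }
        }
        tr.Commit();
    }
}

Note that updating the same table you are scanning can still block or behave oddly, which is part of why the chunked approach above may be the safer route.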