SELECT COUNT(*)...

SELECT COUNT(*)...

Old forum URL: forums.lhotka.net/forums/t/5229.aspx


COBRASoft posted on Monday, August 11, 2008

Hi,

I'm quite new to CSLA.Net 2.0 and higher, so sorry if I say something stupid :)

I've seen the 'SELECT COUNT(*) FROM ...' structure to know if an object exists or not. Why not using a SPROC here and use the EXIST statement instead of the COUNT? EXIST returns after finding 1 row, COUNT keeps counting until all are found. Should be faster.

Also, when used in a SPROC, it would be easy to extend this functionality to check for child tables aswell. You could even go further and put this check in the same SPROC as the delete and return if the delete did happen. This makes the need of triggers unneeded and also doesn't require to load all child objects before deleting a parent object, again speed improvement. Here is an example...

Greetings,

Sigurd

CREATE PROCEDURE ArticleDelete
@ARTL_ID
uniqueidentifier
AS
BEGIN
SET NOCOUNT ON
DECLARE @Found bit
SELECT @Found = 0
DECLARE @Result bit
SELECT @Result = 0

IF EXISTS(SELECT * FROM Article WHERE ARTL_ID = @ARTL_ID)
   
SELECT @Found = 1
IF @Found = 0 AND EXISTS(SELECT * FROM ArticleLanguage WHERE ALLG_ARTL_ID = @ARTL_ID)
   
SELECT @Found = 1
IF @Found = 0 AND EXISTS(SELECT * FROM ArticleInfo WHERE ALIF_ARTL_ID = @ARTL_ID)
   
SELECT @Found = 1

IF @Found = 0
   BEGIN
      DELETE FROM ArticleInfo WHERE ALIF_ARTL_ID = @ARTL_ID
      DELETE FROM ArticleLanguage WHERE ALLG_ARTL_ID = @ARTL_ID
      DELETE FROM Article WHERE ARTL_ID = @ARTL_ID

      SELECT @Result = 1
   END

SELECT
@Result
END

RockfordLhotka replied on Monday, August 11, 2008

This is in reference to the sample implementation in ProjectTracker? Thanks for the tip, that sounds useful.

I freely (and frequently) admit to being a developer, not a database guy. Which is why I am a strong supporter of using stored procedures (even in this era of ORMs).

I know my SQL skills are lame. I am just good enough to get the functionality I want, and I work with database experts who can fix my sprocs (and table designs, etc) to make them sing.

In my view, we're in a period of increasing specialization, much like the medical industry. It is unrealistic to expect an x-ray expert to do surgery, or for a surgeon to understand the deep nuances of x-ray results. Both experts have overlap in their skills, but neither is qualified to do the other's job.

I'm an OO specialist. I dabble in the UI (all 8 of the technologies in .NET), and I dabble in data access (all 7+ of those technologies in .NET) and I dabble in database design (at least that's just SQL - whew!). And while dabbling I do learn a lot, but I don't even pretend to be an expert in those areas.

And I don't even try to get into BizTalk/Sharepoint/MOSS/Office automation/etc. That way leads to madness :)

COBRASoft replied on Tuesday, August 12, 2008

Hey Rocky,

I'm far from an expert either, but since I'm the only developer in my company, I have to try to be a bit all-rounder. Customers ask crazy things these days. WinForms, WebForms, WebServices, SharePoint and Office integration with support for 1000 users... All within the coming 6 months and starting from scratch Big Smile [:D]

Greetings,
Sigurd

rsbaker0 replied on Tuesday, August 12, 2008

COBRASoft:

...

I've seen the 'SELECT COUNT(*) FROM ...' structure to know if an object exists or not. Why not using a SPROC here and use the EXIST statement instead of the COUNT? EXIST returns after finding 1 row, COUNT keeps counting until all are found. Should be faster.

...

If the WHERE clause is testing the primary key for equality, why would the database "keep counting"? There can be at most one entry anyway. Of course, I'm not sure if all database engines are smart enough to optimize this way.

We have to support various versions of MSSQL, Oracle, and Access, so the sproc approach is out and you can't generally do direct EXISTS tests in direct SQL except in your WHERE clause.(Some db's will let you use exists in the SELECT list)

Another approach similar to what you suggest would be to just attempt to retrieve the object key itself as a scalar. (e.g. SELECT KEY FROM TABLE WHERE KEY=xxx). If you get the value back, the object exists, otherwise you'd get NULL, meaning the object doesn't exist.

ajj3085 replied on Tuesday, August 12, 2008

I think he's refering to what the db will actually do.  With EXISTS, it will scan the index until it finds the specified key, and since it's an index its very easy to figure out if the key exists or not.  But with count, it MAY have to look through the entire index... indexes aren't necessarly unique.  Now, maybe Sql Server is smart enough when handing PK columns to know there can only be one... but maybe not.

Exists AFAIK is also part of the Sql-92 standard, so it should be available in all the majors db.

COBRASoft replied on Tuesday, August 12, 2008

Hi,

I don't know so much of other database engines, but I've tested on a very large SQL Server 2005 database with millions of records. The EXIST clause was defintely faster. The difference is even more noticable when you go through the 'foreign key tables aswell like in my sample. This trick is even faster than triggers (a lot faster).

Greetings,
Sigurd

rsbaker0 replied on Tuesday, August 12, 2008

COBRASoft:

Hi,

I don't know so much of other database engines, but I've tested on a very large SQL Server 2005 database with millions of records. The EXIST clause was defintely faster. The difference is even more noticable when you go through the 'foreign key tables aswell like in my sample. This trick is even faster than triggers (a lot faster).

Greetings,
Sigurd

I just tried COUNT(*) in SQL 2000 on a table with 2.5 million rows and if you are doing a COUNT(*) on a primary key value, it does a clustered index SEEK with an 1 as the both the estimated number of executes and resulting rows.

select count(*) from trans where transnumber='990407'

The query plan for the EXISTS was almost identical (but in fact just a hair  more complex) as the COUNT(*):

SELECT case WHEN EXISTS (SELECT * FROM TRANS WHERE TRANSNUMBER='990407') then 1 else 0 end

So, for clustered primary keys in SQL Server 2000, COUNT(*) and EXISTS for a primary key value will perform almost identically.

I certainly would agree that EXISTS will do better for a nonunique or (especially) unindexed query.

 

 

 

 

matt tag replied on Friday, August 15, 2008

some other questions about your stored procedure:

1. shouldn't the IF @Found = 0 statement be @Found = 1?  You want to delete only if you've found records.

2. Why bother checking to see if the records exists before deleting?  The DELETE FROM statements do their own search/scan through the records.  By searching/scanning yourself with IF EXISTS, you're actually doing 2 search/scans. Wouldn't this entire sproc be faster if it were just the 3 delete statements?

3. To optimize even further - your ArticleLanguage and ArticleInfo tables can be set with cascading delete referential integrity.  Then the entire stored proc would consist of the single line

delete from Article where ARTL_ID = @ARTL_ID

Perhaps I'm misunderstanding something in what you're trying to accomplish...

COBRASoft replied on Friday, August 15, 2008

Hi,

The idea is to search for any related records, if at least 1 is found, no delete may happen becuase of concurrency.

The idea is this. You have a 'contact' table, a 'customer' table, a 'supplier' table, a 'user' table, a 'userrights' table, ... and so on. All of them have a 1-1 relationship with 'contact', 'userrights' has a 1-n relation with 'user'. What if I want to delete a customer? This customer could be a supplier or a user, but it is certainly a contact. With triggers this would already become more complex and you don't have much control what trigger would be fired first for the integrity check with those other tables. In my sproc you can safely check the smallest tables first (e.g. user). If found, no reason to search further, the delete of the 'contact' record must be canceled, so you gain speed, if nothing is found, the delete can be done.

Triggers can become a pain in th *ss. Specially when you have more than 300 tables in your database.

This sproc is not optimized completely, you could turn of the integrity checks, so the double scan won't happen.

Greetings,
Sigurd

Copyright (c) Marimer LLC