SQL SERVER: Why We Should Avoid COUNT (*) : Connect SQL

Wednesday, January 12, 2011

SQL SERVER: Why We Should Avoid COUNT (*)

A blog reader asked that why should we avoid COUNT () and why it is declared forbidden in an early post SQL SERVER: 9 Most Forbidden Things . Actual point was that we should avoid to use COUNT () to check records existence. How few developers check record existence:

USE AdventureWorks

-- Declare a variable to store intermediate results

DECLARE @TotalCount INT

-- Count total records and place resultant figure in variable

SELECT @TotalCount=COUNT(*) FROM Sales.SalesOrderDetail

IF @TotalCount >0

-- If records exists then

SELECT 1

If we place SET STATISTICS IO ON before query to check logical reads we will get following results.

Table 'SalesOrderDetail'. Scan count 1, logical reads 228, physical reads 0, read-ahead reads 0

Now modify above statements as follow to get same results and check how costly the query is.

IF EXISTS(SELECT 1 FROM Sales.SalesOrderDetail)

SELECT 1

Table 'SalesOrderDetail'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0

Difference between logical reads of both queries is huge and that is the reason we should avoid count (*)

If your target is to count all records for some other options use following query

USE AdventureWorks

SELECT sys.partitions.rows

FROM sys.partitions

WHERE index_id = 1

AND object_id = OBJECT_ID('Sales.SalesOrderDetail')

Table 'sysrowsets'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0

I urge you to execute above query and COUNT (*) query for largest table of your own database and check the difference of logical reads.

6 comments:

Shantanu KumarJanuary 12, 2011 at 7:41 PM
Is it possible to use this technique in criteria queries?
ReplyDelete
Replies
UnknownJanuary 13, 2011 at 12:29 AM
Why can't SQL Server optimize our count(*) query to perform as well as your version?
ReplyDelete
Replies
Adrian HillsJanuary 13, 2011 at 9:02 AM
One thing to bear in mind is accuracy - the sys.partitions approach may not be 100% accurate depending on when the metadata was last updated by SQL Server. If you don't need it to be 100% accurate then it's a great way to do it!
ReplyDelete
Replies
aasim abdullahJanuary 15, 2011 at 7:27 AM
Thanx Adrian for mentioning this important point.
ReplyDelete
Replies
Gabriel LopesMarch 30, 2011 at 11:55 PM
Im sorry, but i dont see this improvement in a real-life sittuation, this is: to check if a record exist in some conditions (it is not usuall for me to check if a table has records, it will always have, i want to know about the exact condition).

That said, i tried to check if i could use your tip to improve the performance, i used the following code

trial 1
SELECT COUNT(*) FROM dbo.Orders where CustomerID = 'VINET'

trial 2
IF EXISTS(SELECT 1 FROM dbo.Orders where CustomerID = 'VINET') SELECT 1

And the result on the execution plan is 50% - 50% !!!!

Both effort will reside in the index seek in that case, i think your tip will only work when there is no need to read anything from the database to check the conditions, is this right? Or do i have a better way to check for records?

Thanks and regards
ReplyDelete
Replies
JimOctober 7, 2011 at 4:44 PM
Thanks for the article. Sorry Gabriel I saw great improvement with the tip.

Trial 1
SET STATISTICS IO ON
SELECT COUNT(*) FROM dbo.myTable where ID = 113
(1 row(s) affected) count = 430,115
Table 'properties'. Scan count 1, logical reads 592, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Trial 2
SET STATISTICS IO ON
IF EXISTS(SELECT 1 FROM dbo.myTable where ID = 113)
SELECT 1

Table 'properties'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
ReplyDelete
Replies

Add comment

All suggestions are welcome

Connect SQL

Wednesday, January 12, 2011