A blog reader asked that why should we avoid COUNT (*) and why it is declared forbidden in an early post SQL SERVER: 9 Most Forbidden Things . Actual point was that we should avoid to use COUNT (*) to check records existence. How few developers check record existence:
USE AdventureWorks
GO
-- Declare a variable to store intermediate results
DECLARE @TotalCount INT
-- Count total records and place resultant figure in variable
SELECT @TotalCount=COUNT(*) FROM Sales.SalesOrderDetail
IF @TotalCount >0
-- If records exists then
SELECT 1
If we place SET STATISTICS IO ON before query to check logical reads we will get following results.
Table 'SalesOrderDetail'. Scan count 1, logical reads 228, physical reads 0, read-ahead reads 0
Now modify above statements as follow to get same results and check how costly the query is.
IF EXISTS(SELECT 1 FROM Sales.SalesOrderDetail)
SELECT 1
Table 'SalesOrderDetail'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0
Difference between logical reads of both queries is huge and that is the reason we should avoid count (*)
If your target is to count all records for some other options use following query
USE AdventureWorks
GO
SELECT sys.partitions.rows
FROM sys.partitions
WHERE index_id = 1
AND object_id = OBJECT_ID('Sales.SalesOrderDetail')
Table 'sysrowsets'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0
I urge you to execute above query and COUNT (*) query for largest table of your own database and check the difference of logical reads.
Is it possible to use this technique in criteria queries?
ReplyDeleteWhy can't SQL Server optimize our count(*) query to perform as well as your version?
ReplyDeleteOne thing to bear in mind is accuracy - the sys.partitions approach may not be 100% accurate depending on when the metadata was last updated by SQL Server. If you don't need it to be 100% accurate then it's a great way to do it!
ReplyDeleteThanx Adrian for mentioning this important point.
ReplyDeleteIm sorry, but i dont see this improvement in a real-life sittuation, this is: to check if a record exist in some conditions (it is not usuall for me to check if a table has records, it will always have, i want to know about the exact condition).
ReplyDeleteThat said, i tried to check if i could use your tip to improve the performance, i used the following code
trial 1
SELECT COUNT(*) FROM dbo.Orders where CustomerID = 'VINET'
trial 2
IF EXISTS(SELECT 1 FROM dbo.Orders where CustomerID = 'VINET') SELECT 1
And the result on the execution plan is 50% - 50% !!!!
Both effort will reside in the index seek in that case, i think your tip will only work when there is no need to read anything from the database to check the conditions, is this right? Or do i have a better way to check for records?
Thanks and regards
Thanks for the article. Sorry Gabriel I saw great improvement with the tip.
ReplyDeleteTrial 1
SET STATISTICS IO ON
SELECT COUNT(*) FROM dbo.myTable where ID = 113
(1 row(s) affected) count = 430,115
Table 'properties'. Scan count 1, logical reads 592, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Trial 2
SET STATISTICS IO ON
IF EXISTS(SELECT 1 FROM dbo.myTable where ID = 113)
SELECT 1
Table 'properties'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.