This post has been republished via RSS; it originally appeared at: SQL Server Support articles.
The engineer asked me a question, how does the optimizer estimate the row for index scan(range scan). he observed 100-109 estimated row for index scan but he wants to know the reason.
As my understanding, if there is no predicated, the estimated row should equal total of index row count. Then I requested the plan and test query.
Here is my test script
create table ta(c1 int ,c2 int, c3 varchar(1000))
go
create table tb(c1 int ,c2 int, c3 varchar(1000))
go
with cte
as
(
select ROW_NUMBER() over(order by o.object_id) id from sys.columns o
)
insert ta
select top 1200 id ,id ,REPLICATE('a',1000) from cte
----insert 1,200 rows into ta
go
create index ic1 on ta(c1)
go
insert tb
select top 1100 * from ta
----insert 1,000 rows into tb
go
create index ic2 on tb(c1)
go
select tx.c1, tx.c2
from
(
select row_number() over(order by ta.c1) as id,
ta.c1,tb.c2 From ta inner loop join tb on ta.c1=tb.c1
) tx where tx.id =1
I can see both tables inserted 1200 and 1000 rows.
To analyze this sample, I want to introduce the method to analyze SQL CE detail.
Firstly, I check the index statistics on the table 'ta':
select * from sys.stats where object_id=object_id('ta')
To verify which stats uses by SQL CE, we could turn on TF2363 and xevent:
dbcc freeproccache
dbcc traceon(2363,3604)
CREATE EVENT SESSION [XeNewCE] ON SERVER
ADD EVENT sqlserver.query_optimizer_estimate_cardinality(
ACTION(sqlserver.sql_text)),
ADD EVENT sqlserver.query_optimizer_force_both_cardinality_estimation_behaviors
ADD TARGET package0.event_file(SET filename=N'D:\XeNewCE.xel',max_file_size=(50),max_rollover_files=(2))
run the query again and include real plan.
check node 8 from xml plan:
<RelOp AvgRowSize="11" EstimateCPU="0.0012867" EstimateIO="0.00460648" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="100" EstimateRowsWithoutRowGoal="1027" EstimatedRowsRead="1027" LogicalOp="Index Scan" NodeId="8" Parallel="false" PhysicalOp="Index Scan" EstimatedTotalSubtreeCost="0.00353495" TableCardinality="1027" StatsCollectionId="1">
We can see this node gets the TableCardinality--total table row 1027 from statistics.
TableCardinality="1027" StatsCollectionId="1"
The StatsCollectionId="1" can use to search TF 2363 result:
TF 2363 trace:
============================
Begin selectivity computation
Input tree:
LogOp_Join
CStCollBaseTable(ID=1, CARD=1027 TBL: ta) --this is card result 1027
CStCollBaseTable(ID=2, CARD=1027 TBL: tb)
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [tempdb].[dbo].[tb].c1
ScaOp_Identifier QCOL: [tempdb].[dbo].[ta].c1
Plan for computation:
CSelCalcExpressionComparedToExpression( QCOL: [tempdb].[dbo].[ta].c1 x_cmpEq QCOL: [tempdb].[dbo].[tb].c1 )
Loaded histogram for column QCOL: [tempdb].[dbo].[ta].c1 from stats with id 2 --this indicates the stats ID referenced from table
Loaded histogram for column QCOL: [tempdb].[dbo].[tb].c1 from stats with id 2
Selectivity: 0.00097371
Stats collection generated:
CStCollJoin(ID=3, CARD=1027 x_jtInner)
CStCollBaseTable(ID=1, CARD=1027 TBL: ta)
CStCollBaseTable(ID=2, CARD=1027 TBL: tb)
End selectivity computation
Then why the final estimated row changed to 100?
<RelOp AvgRowSize="11" EstimateCPU="0.0012867" EstimateIO="0.00460648" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="100" EstimateRowsWithoutRowGoal="1027" EstimatedRowsRead="1027" LogicalOp="Index Scan" NodeId="8" Parallel="false" PhysicalOp="Index Scan" EstimatedTotalSubtreeCost="0.00353495" TableCardinality="1027" StatsCollectionId="1">
We can see the xml plan includes EstimateRowsWithoutRowGoal="1027". this introduced in new SSMS version (18.x) and you may easy to identify the rowgoal applied or not.
On this case, the row goal impacted the estimated row result. We can disable the rowgoal by using the query hint:
select tx.c1, tx.c2
from
(
select row_number() over(order by ta.c1) as id,
ta.c1,tb.c2 From ta inner loop join tb on ta.c1=tb.c1
) tx where tx.id =1
option(use hint('DISABLE_OPTIMIZER_ROWGOAL'))
More information, please reference:
https://blog.sqlauthority.com/2020/02/07/sql-server-row-goal-and-performance/