Lesson Learned #69: Azure SQL Database – How are the statistics updated automatically by SQL engine

This post has been republished via RSS; it originally appeared at: Azure Database Support Blog articles.

First published on MSDN on Jan 29, 2019
Hello,

Today I worked on a service request that our customer needs to know how the statistics are automatically updated by SQL Engine and what is the impact of this process in terms of consumption resources and distribution of the statistics.

STEP 1 – Created the table and statistics.

I have created an example table with name Example1 that contains two fields ID and Name.

[code language="SQL"]
CREATE TABLE [dbo].[Example1](
[Id] [int] NOT NULL IDENTITY,
[Name] [varchar](200) NULL,
CONSTRAINT [idcontraint_name] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)

CREATE STATISTICS [xxExample1] ON [dbo].[Example1]([Name])
[/code]

STEP 2 – I inserted around 40000 rows.




[code language="SQL"]
DECLARE @T AS INT = 0
WHILE @T < = 40001
BEGIN
SET @T = @T +1
INSERT EXAMPLE1 (ID,NAME) VALUES(@T,'A' + CONVERT(VARCHAR(20),@T))
END
[/code]

STEP 3 – If we run the DBCC SHOW_STATISTICS, I have empty result set.




[code language="SQL"]
dbcc show_statistics('example1',[idcontraint_name])
dbcc show_statistics('example1',[xxExample1])
[/code]

STEP 3 – Running the sys.dm_db_index_operational_stats I saw 40002 inserted correctly on the index.




[code language="SQL"]
select * from sys.dm_db_index_operational_stats(null,null,null,null) where object_id=object_id('example1')
[/code]

STEP 4 – Running the sys.dm_db_partition_stats I saw 40002 inserted correctly on the index.




[code language="SQL"]
SELECT * FROM sys.dm_db_partition_stats where object_id=object_id('example1')
[/code]

STEP 5 – As we explained, if you run any query and the number of updates are greater of 20% of the total of the rows and this query is using any field that the statistics is based, you will get an update statistics.




[code language="SQL"]
select name from Example1 where name = 'A200' order by Id
[/code]


[code language="SQL"]
dbcc show_statistics('example1',[idcontraint_name])
dbcc show_statistics('example1',[xxExample1])
[/code]

STEP 6 – On the same way, you will get an updated/delete statistics when you update data if this is more 20% of total rows.




[code language="SQL"]
update EXAMPLE1 SET name = 'B2'+NAME where id >100 AND ID <= 8200
[/code]

For this example I changed the compatibility level of the database to 110, please, review this URL because depending on the compatibility model you could have a different way to calculate automatically the statistics.

Also, I would like to highlight two points:

  • Depending on the number of rows of the table, SQL Engine could choose update the statistics using only a sample instead of all the rows of the table and depending on this distribution we may have an incorrect distribution data.

  • Rebuild indexes or update the statistics will generate a new execution plan for your queries.




Enjoy!

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.