This post has been republished via RSS; it originally appeared at: Azure Database Support Blog articles.First published on MSDN on Jan 29, 2019
Today I worked on a service request that our customer needs to know how the statistics are automatically updated by SQL Engine and what is the impact of this process in terms of consumption resources and distribution of the statistics.
STEP 1 – Created the table and statistics.
I have created an example table with name Example1 that contains two fields ID and Name.
CREATE TABLE [dbo].[Example1](
[Id] [int] NOT NULL IDENTITY,
[Name] [varchar](200) NULL,
CONSTRAINT [idcontraint_name] PRIMARY KEY CLUSTERED
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
CREATE STATISTICS [xxExample1] ON [dbo].[Example1]([Name])
STEP 2 – I inserted around 40000 rows.
DECLARE @T AS INT = 0
WHILE @T < = 40001
SET @T = @T +1
INSERT EXAMPLE1 (ID,NAME) VALUES(@T,'A' + CONVERT(VARCHAR(20),@T))
STEP 3 – If we run the DBCC SHOW_STATISTICS, I have empty result set.
STEP 3 – Running the sys.dm_db_index_operational_stats I saw 40002 inserted correctly on the index.
select * from sys.dm_db_index_operational_stats(null,null,null,null) where object_id=object_id('example1')
STEP 4 – Running the sys.dm_db_partition_stats I saw 40002 inserted correctly on the index.
SELECT * FROM sys.dm_db_partition_stats where object_id=object_id('example1')
STEP 5 – As we explained, if you run any query and the number of updates are greater of 20% of the total of the rows and this query is using any field that the statistics is based, you will get an update statistics.
select name from Example1 where name = 'A200' order by Id
STEP 6 – On the same way, you will get an updated/delete statistics when you update data if this is more 20% of total rows.
update EXAMPLE1 SET name = 'B2'+NAME where id >100 AND ID <= 8200
For this example I changed the compatibility level of the database to 110, please, review this URL because depending on the compatibility model you could have a different way to calculate automatically the statistics.
Also, I would like to highlight two points:
- Depending on the number of rows of the table, SQL Engine could choose update the statistics using only a sample instead of all the rows of the table and depending on this distribution we may have an incorrect distribution data.
- Rebuild indexes or update the statistics will generate a new execution plan for your queries.