Lesson Learned #483: Optimizing Performance in Azure SQL Database ReadScale Out Databases

This post has been republished via RSS; it originally appeared at: Azure Database Support Blog articles.

Introduction:

 

Today, I worked on a service request that our customer has a Business Critical database with 4 vCores and Read-Scale Out enabled. Our customer noticed several performance issues using Read-Scale Out database and I would like to explain several lessons learned found during the troubleshooting steps.

 

Initial Script Execution and Synchronization:

The investigation began with the execution of a script on the primary database, which populated approximately 7,864,320 rows. Synchronization between the primary and secondary databases was successfully completed, ensuring consistency in data distribution.

 

 

DROP TABLE IF EXISTS [dbo].[Notes] CREATE TABLE [dbo].[Notes]( [ID] [int] NULL, [NAME] [varchar](200) NULL, [id2] [int] IDENTITY(1,1) NOT NULL, CONSTRAINT [PK_Notes] PRIMARY KEY CLUSTERED ([id2] ASC)) ------------------------------------------------- -- Create the store procedure -------------------------------------------------- CREATE OR ALTER PROCEDURE dbo.GiveNotes @N AS INT = null AS SELECT count(Name),name FROM Notes where ID<@n group by Name -------------------------------------------------- -- Insert data -------------------------------------------------- INSERT INTO Notes (ID,Name) SELECT RAND()*(100000 - 1) + 1, 'Info:'+convert(varchar(200),RAND()*(100000 - 1) + 1) -------------------------------------------------- -- Execute this query several times - -------------------------------------------------- INSERT INTO Notes (ID,Name) SELECT RAND()*(100000 - 1) + 1, 'Info:'+convert(varchar(200),RAND()*(100000 - 1) + 1) FROM Notes

 

Observations on Secondary Database:

  • Running the following script (we never executed this query previously in the primary) it was observed that temporary automatic statistics were created (sys.stats (Transact-SQL) - SQL Server | Microsoft Learn ) , indicating potential areas for optimization and needed for this workload and will be deleted in case of any reconfiguration/failover. You could see them using the column is_temporary in sys.stats.
  • Analysis of the execution plan highlighted a recommendation for index creation to improve query performance.

 

declare @n as int = 0 set @n=RAND()*(100000 - 1) + 1 EXEC dbo.GiveNotes @n

 

 

Jose_Manuel_Jurado_0-1715264460989.png

 

Jose_Manuel_Jurado_1-1715264507315.png

 

Identifying Performance Bottlenecks:

 

Further analysis revealed that the query execution was consuming a significant amount of CPU resources on the secondary database. To pinpoint the underlying issues, a query was executed to identify the top CPU-intensive operations.

 

 

-- Which Queries are taking the most time/cpu to execute SELECT TOP 5 total_worker_time, total_elapsed_time, total_worker_time/execution_count AS avg_cpu_cost, execution_count, (SELECT DB_NAME(dbid) + ISNULL('..' + OBJECT_NAME(objectid), '') FROM sys.dm_exec_sql_text([sql_handle])) AS query_database, (SELECT SUBSTRING(est.[text], statement_start_offset/2 + 1, (CASE WHEN statement_end_offset = -1 THEN LEN(CONVERT(nvarchar(max), est.[text])) * 2 ELSE statement_end_offset END - statement_start_offset) / 2 ) FROM sys.dm_exec_sql_text([sql_handle]) AS est) AS query_text, total_logical_reads/execution_count AS avg_logical_reads, total_logical_writes/execution_count AS avg_logical_writes, last_worker_time, min_worker_time, max_worker_time, last_elapsed_time, min_elapsed_time, max_elapsed_time, plan_generation_num, qp.query_plan FROM sys.dm_exec_query_stats OUTER APPLY sys.dm_exec_query_plan([plan_handle]) AS qp WHERE [dbid] >= 5 AND DB_NAME(dbid) IS NOT NULL AND (total_worker_time/execution_count) > 100 --ORDER BY avg_cpu_cost DESC; --ORDER BY execution_count DESC; ORDER BY total_worker_time DESC;

 

 

Jose_Manuel_Jurado_2-1715264624721.png

 

Proposed Solutions and Implementation: To address the performance challenges, several actions were proposed:

  1. Removing temporary statistics generated on the secondary database.
  2. Creating the recommended index on the primary replica to enhance query execution.
  3. Executing necessary scripts on the primary replica to ensure optimal performance.

Actions done:

 

  • If the indexes is needed, the first topic is to remove statistics using the command: DROP STATISTICS NOTES.[_WA_Sys_00000002_5AEE82B9_readonly_database_statistics]

 

select * from sys.stats inner join sys.tables on sys.stats.object_id = sys.tables.object_id where is_temporary=1

 

  • Create the index needed in primary replica CREATE NONCLUSTERED INDEX [NOTES_IX1] ON [dbo].[Notes] ([ID] ASC) include (name)
  • Execute the script in the primary replica if any additional statistics is needed to create or any other topic:

 

declare @n as int = 0 set @n=RAND()*(100000 - 1) + 1 EXEC dbo.GiveNotes @n

 

  • We could see that right now the primary has all the details and secundary will not needed to create additional statistics and will use the index to improve the query performance. 

Jose_Manuel_Jurado_3-1715265885910.png

 

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.