Lesson Learned #487: Identifying Parallel and High-Volume Queries in Azure SQL Database

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

Today, I worked on a service request that our customer needs to monitor and optimize queries that can significantly impact performance. This includes both queries running in parallel and those that, although executed with a single thread (MAXDOP 1), have a high volume of executions. In this article, we will explain a query that helps us identify these two types of queries in an Azure SQL Database.

 

The provided SQL query uses Dynamic Management Views (DMVs) in SQL Server to group and analyze execution statistics for queries. Let's break down and explain each part of this query:

 

WITH QueryStats AS ( SELECT query_hash, SUM(total_worker_time) AS total_worker_time, SUM(total_elapsed_time) AS total_elapsed_time, SUM(execution_count) AS execution_count, MAX(max_dop) AS total_dop FROM sys.dm_exec_query_stats GROUP BY query_hash ) SELECT qs.query_hash, qs.execution_count, qs.total_worker_time, qs.total_elapsed_time, qs.total_dop, SUBSTRING(st.text, (qs_statement.statement_start_offset/2) + 1, ((CASE qs_statement.statement_end_offset WHEN -1 THEN DATALENGTH(st.text) ELSE qs_statement.statement_end_offset END - qs_statement.statement_start_offset)/2) + 1) AS query_text FROM QueryStats qs CROSS APPLY (SELECT TOP 1 * FROM sys.dm_exec_query_stats qs_statement WHERE qs.query_hash = qs_statement.query_hash) qs_statement CROSS APPLY sys.dm_exec_sql_text(qs_statement.sql_handle) AS st ORDER BY qs.total_worker_time DESC;

Breakdown of the Query

  1. Common Table Expression (CTE): QueryStats

    • This part aggregates statistics from the sys.dm_exec_query_stats view, which contains performance statistics for cached query plans.
    • query_hash: A hash value used to identify queries that are similar in structure.
    • total_worker_time: Total CPU time used by the query.
    • total_elapsed_time: Total time taken for the query to execute.
    • execution_count: Number of times the query has been executed.
    • total_dop: Maximum degree of parallelism (DOP) used by the query.
  2. Main Query:

    • qs.query_hash, qs.execution_count, qs.total_worker_time, qs.total_elapsed_time, qs.total_dop: These columns are selected from the CTE QueryStats.
    • query_text: Extracts the text of the query using the substring function, which extracts a portion of the query text from the sys.dm_exec_sql_text.
  3. CROSS APPLY:

    • qs_statement: Retrieves the top row from sys.dm_exec_query_stats  where the query_hash matches, providing detailed information for each query hash.
    • st: Retrieves the SQL text of the query using sys.dm_exec_sql_text with the SQL handle from qs_statement. 
  4. ORDER BY:

    • The final result is ordered by total_worker_time in descending order, showing the most CPU-intensive queries at the top.

How This Helps Identify Performance-Impacting Queries

  • Parallel Queries: Queries with a high total_dop value indicate parallel execution. While parallel execution can speed up individual queries, it can also lead to resource contention, especially if many queries run in parallel simultaneously.
  • High-Volume Queries: Queries with a high execution_count value but low total_dop are typically executed with MAXDOP 1. If these queries are executed frequently, they can still significantly impact the overall performance of the database.

By identifying both types of queries, database administrators can take actions such as query optimization, indexing, or adjusting the degree of parallelism settings to improve overall database performance.

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.