Monitoring Synapse serverless SQL pool query history

Posted by

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

This article will walk you through the different ways of how to monitor serverless SQL requests. By reviewing this article, you will understand what to see, compare and how to troubleshoot the query execution behavior. At the end of this article, I will also share how to create a solution to retain historical data from SQL requests forever. 

 

Historic SQL Serverless query analysis is one of the crucial needs of data engineers. Serverless monitoring supports 5 main ways to analyze query history and performance. This includes Synapse Monitor Hub, DMVs, QPI library, Azure Log Analytics, and Azure Data Explorer (not covered on this article).

 

Let's start by reviewing Monitor hub.

 

Monitor hub

Monitor hub is a main feature of Synapse Studio that provides options to help you monitor the historical details about activities that happened in your workspace. e.g. pipeline runs, view the status of IRs, view Synapse Spark jobs, view SQL requests (Serverless and Dedicated), and more.

 

Monitor hub is divided into three groups, Analytics pools, Activities and Integration. We can quickly find the status and additional details about the built-in from these views (Analytics pool and Activities).

 

We will now focus on how to check initial information in Synapse serverless SQL. Using the left menu select Analytics pools > SQL pools > Built-In

 

scirqueira_0-1673016689346.png

 

After clicking on the Built-in link, we can see a bit more details as shown in the following screenshots.

 

scirqueira_1-1673016719035.png

 

 

scirqueira_2-1673016719037.png

 

The SQL request history for the built-in serverless SQL pool is shown at Activities > SQL Requests.

 

scirqueira_3-1673016719038.png

 

By default, the initial monitoring screen shows only the previous 24 hours of the submitted queries.

 

From this view we can edit, add, remove and re-order the most useful columns which will help us to have an initial visualization of the built-in SQL ended requests.

 

For the selected columns below, we can start our monitoring by looking at the following results: Query text of the SQL requests, submitted time, ended time, running time, data processed per each query, submitter and query status (only completed and failed).

 

scirqueira_4-1673016764098.png

 

If you want to extend the monitoring time range, you can also edit the filter where shows the historical query execution per more than 24 hours, 7 days, 30 days and customized time range.

 

scirqueira_5-1673016764100.png

 

Paying attention in the screenshot below, I'm writing this article as of 12/17/2022, if we try to go back in the time range, we can only see 45 days of availability of historical executions.

 

scirqueira_6-1673016764100.png

 

 

After editing the filter, my latest instance of the query execution was in 11/07/2022 as we can see below.

 

scirqueira_7-1673016764102.png

 

If you want to see if a query execution is performing worse than in the last execution for two or more months ago it will not be possible. Using the monitor hub we can only see the maximum of 45 days of historic query. The next section of this article is to go through the steps on how to store the query history forever.

Let's continue…

 

DMVs

Dynamic Management View (DMVs) allow us to create an out-of-the-box solution to monitor and analyzing crucial metrics available by using T-SQL.

 

All the DMVs below will be the same as in SQL Server and some columns will be irrelevant for serverless SQL ended requests.

 

 

 

 

 

SELECT * FROM sys.dm_exec_connections SELECT * FROM sys.dm_exec_sessions SELECT * FROM sys.dm_exec_requests

 

 

 

 

 

 

Public Reference: 

sys.dm_exec_connections (Transact-SQL) - SQL Server

sys.dm_exec_sessions (Transact-SQL) - SQL Server

sys.dm_exec_requests (Transact-SQL) - SQL Server

 

My colleague  wrote the following article in Azure Synapse Analytics Blog - 

Monitoring Synapse serverless SQL open connections, where you can see how to check Serverless Open connections and how to summarize the connections per program, login and database in the built-in SQL Pool.
 

Some more DMVs examples for reference:

 

The T-SQL query is recommended to check the current running queries by joining some useful DMVs.

 

 

 

 

 

SELECT 'Running' AS [Status] ,Transaction_id AS [Request ID] ,'SQL On-demand' AS [SQL Resource] ,s.login_name AS [Submitter] ,s.Session_Id AS [Session ID] ,req.start_time AS [Submit time] ,req.start_time AS [Start time] ,'N/A' AS [End time] ,req.command AS [Request Type] ,SUBSTRING(sqltext.TEXT, (req.statement_start_offset / 2) + 1, ( ( CASE req.statement_end_offset WHEN - 1 THEN DATALENGTH(sqltext.TEXT) ELSE req.statement_end_offset END - req.statement_start_offset ) / 2 ) + 1) AS [Query Text] ,req.total_elapsed_time AS [Duration] ,req.total_elapsed_time AS [Running Duration] FROM sys.dm_exec_requests req CROSS APPLY sys.dm_exec_sql_text(sql_handle) sqltext JOIN sys.dm_exec_sessions s ON req.session_id = s.session_id

 

 

 

 

 

 

Looking for a historic query? Use the following T-SQL code which is the same query which runs behind the scenes in the Monitor hub

 

Note: From Microsoft Support perspective we share the DMV "sys.dm_exec_requests_history" for ticket purposes to help customers better troubleshooting the SQL requests.

 

 

 

 

 

 

SELECT [Status] ,Transaction_id AS [Request ID] ,'SQL On-demand' AS [SQL Resource] , distributed_Statement_Id AS [Distributed Statement ID] , query_hash AS [Query Hash] ,login_name AS [Submitter] ,start_time AS [Submit time] ,start_time AS [Start time] ,end_time AS [End time] ,command AS [Request Type] ,query_text AS [Query Text] ,total_elapsed_time_ms AS [Duration] ,data_processed_mb AS [Data processed in MB] ,[Error] , error_code AS [Error Code] FROM sys.dm_exec_requests_history

 

 

 

 

 

 

 

Field reference returned from "sys.dm_exec_requests_history".

scirqueira_8-1673017010442.png

 

Note: In the field reference example returned by querying this internal DMV, the distributed_statement_id is one of the most important column to reference for Microsoft support troubleshooting. If you are facing any issue in the serverless SQL engine, please reach the Synapse support team out by opening a service request and providing the statement id for further investigation.


You can find detailed information of how to find the statement id for support ticket purposes in the following article: How to provide statement id in the serverless SQL pool support ticket?

Running this query from SSMS, we can see the following results:

scirqueira_9-1673017019646.png

 

Data retention in sys.dm_exec_requests_history depends on a series of internal factors, if this information is very important to you, the best way to achieve this goal is to keep it in a Log Analytics as shown in the Log Analytics session on this article, or by creating an out of box solution to store the historical query execution forever (I will discuss this in a future article).

Additionally, the following DMV can be used to check how much data was processed during the current day, week, or month. 

 

 

 

 

 

SELECT * FROM sys.dm_external_data_processed

 

 

 

 

 

 

Public Reference:

https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/data-processed

 

QPI Library

Query Performance Insights (QPI) for Synapse serverless SQL pool is an open-source and free set of T-SQL views that you can set up on your database and get some useful information about your queries. You can simply install this library by executing this script.

 

Reviewing this script code you can see that it is just referencing all DMVs mentioned above on this article with some improvements.

 

 wrote an article where you can find all needed information to troubleshoot performance, check out Troubleshooting performance on serverless Synapse SQL pool using QPI library.

 

The interesting point here is that by using the "qpi.recommendations" view mentioned in the article from Jovan, you can also find the recommendations for optimizing your database schema.

 

Additionally, I am sharing below another really good article related to database optimization also written by Jovan, check out

Optimize database schema in serverless SQL pools using QPI library. In this article, Jovan is talking about the important view which returns the recommendations explaining how to optimize the schema (column types) in your serverless database.

 

Here are some other examples about the serverless QPI Views:

 

Monitoring the queries that are running on your serverless SQL pool:

 

 

 

 

 

SELECT * FROM qpi.queries

 

 

 

 

 

 

 

scirqueira_10-1673017111754.png

 

Looking for query history, we can use the following QPI view:

 

 

 

 

 

SELECT * FROM qpi.query_history WHERE request_id LIKE '{<Statement id here>}' ORDER BY start_time DESC

 

 

 

 

 

 

scirqueira_11-1673017111754.png

 

Note: Same data from DMV sys.dm_exec_requests_history with some improvements.

 

Based on the best practices for serverless SQL, use this view to get the best improvement recommendations to apply to your database:

 

 

 

 

 

 

SELECT * FROM qpi.recommendations

 

 

 

 

 

 

scirqueira_12-1673017111755.png

 

By running the following function, you can compare the characteristics of two query executions based on Distributed Statement IDs:

 

 

 

 

 

SELECT * FROM qpi.cmp_queries('{<Statement id here>}', '{<Statement id here>}')

 

 

 

 

 

 

scirqueira_13-1673017111756.png

 

Log Analytics

Last but not least, you can setup a Log Analytics workspace and connect Synapse with Log Analytics to send built-in serverless SQL requests logs.  You can also setup a configurable retention period for data store and start to run customized KQL queries for better monitoring of your serverless SQL queries.

 

Check out this article on how to setup a Log Analytics Workspace and connecting the Synapse workspace to send log to Log Analytics. This article features the rich capability of how to monitor serverless SQL ended requests metrics and logs by using workbooks. #enjoy

 

Public reference: Monitoring serverless SQL ended requests by using Log Analytics.

 

Conclusion

The idea of this article was to share with you some straightforward ways of how to monitor your Serverless SQL Pool and help you proceed with further analysis through your query history and performance considerations. 

 

Coming soon, I will release an article on how to create an out-of-box solution to store serverless SQL Ended requests data forever. 

 

Follow me or subscribe to this blog channel to learn more.

 

SIDNEY CIRQUEIRA

Synapse Analytics Support Engineer - Americas

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.