SQL Server on Azure VMs: I/O analysis (preview)

This post has been republished via RSS; it originally appeared at: Microsoft Tech Community - Latest Blogs - .

Analyzing I/O problems just got easier for SQL Server on Azure VMs

 

It is not easy to understand what's going on when you run into an I/O related performance problem on an Azure Virtual Machine. It is a common, but complex problem. What you need is to figure out what's happening at both the host level and your SQL Server instance where often, correlating host metrics with SQL Server workloads can be a challenge.

 

We developed a new experience that helps you do exactly that.

 

When you visit the Storage blade of your SQL virtual machine resource page on Azure portal, you will see two new tabs:

  • I/O analysis
  • I/O related best practices

 

EbruErsan_0-1717426872473.png

 

 The I/O analysis tab will tell you if you are having a performance issue stemming from IOPS and/or throughput throttling, caused by exceeding virtual machine or data disks limits. It will further show you the exact metric(s) and time where this issue shows itself down to the disk or VM level. Once you identify the problem on this tab, you might want to go to our documentation by following the "Learn more" link on the page as shown in the image above. The documentation details each scenario, what might have caused the problem, and provides guidance on how to resolve it. 

 

How does it determine if there is a problem or not? It uses Azure metrics to understand what is going on in the system. I/O analysis checks metric health data for the last 24 hours. You will see the Azure metric charts on the page.

EbruErsan_1-1717426872477.png

 

If you click on these charts, they will take you directly to the Azure metrics page as it is simply leveraging what's available in Azure already.

EbruErsan_2-1717426872487.png

 

It first looks to see if there is disk latency above a certain threshold. Throttling might occur but it is not considered problematic unless it results in a latency condition. Once latency is detected, it then analyzes Azure Metrics and shows you which one(s) demonstrate the problem.

 

The Azure metrics are: 

 

You can find detailed information about the metrics and the algorithms used in the documentation.

 

Detecting Latency
In the example below, you see that it detected an issue (disk latency was over the threshold for a certain amount of time). The problem occurred on May 20th at 12:56pm UTC. If there are multiple spikes in the chart, I/O Analysis helps you pinpoint the issue. Two metrics show why the latency occurred.

In this case, it is a throughput problem both at VM level and disk level. The disk related charts have a chart line for each disk you have in your virtual machine, labeled with the LUN number. In the below graphic, you can see there is an I/O latency issue due to throughput for ‘VM Uncached Bandwidth Consumed Percentage’ and ‘Data Disk Bandwidth Consumed Percentage’.

 

EbruErsan_3-1717426872489.png

You can then explore the details further by expanding the VM level metrics and / or disk level metrics sections below the Disk Latency chart.

EbruErsan_4-1717426872490.png

 

For this scenario, expanding the VM level metrics reveals the following data on cached and uncached IOPS and throughput health, where you again can go to the metrics data, for further analysis, by clicking the chart.

 

EbruErsan_5-1717426872492.png

 

You may also want to explore the disk level, by expanding the disk level metrics section as shown below.

EbruErsan_0-1717428112239.png

 

 

I/O related best practices tab checks to make sure your system is following the configuration best practices relating to I/O on a SQL Server on Azure VM. Poorly configured systems tend to lead to performance problems, which often get exposed under workload pressure. Running an assessment will give you recommendations with various severity ratings based on risk and impact.

 

We recommend you implement them starting with the highest severity.

 

EbruErsan_7-1717426872502.png

 

PowerShell script

If you prefer scripting to using Azure portal, you can also use the I/O Analysis PowerShell script to analyze the I/O performance of your SQL Server VM.

 

We would love to hear your feedback. Please feel free to comment here.

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.