Lesson Learned #460: DataIO vs Buffer Pool

This post has been republished via RSS; it originally appeared at: Azure Database Support Blog articles.

In the past week, we have dedicated our efforts to a service request where our customer asked questions:

Whether DataIO signifies the utilization of memory by the Buffer Pool.
The rationale behind Azure SQL Database not releasing the memory allocated in the buffer pool.

In the following discussion, I would like to share with you the details of this explanation.

The first thing is to know, how buffer pool works: Azure SQL Server employs a sophisticated approach to manage its buffer pool, a critical component in the overall performance of the database system. The lifecycle of data within the buffer pool, especially in response to query execution, involves several key stages:

1. Receiving the Query:

When a query is submitted to Azure SQL Server, it first assesses whether the required data pages are available in the buffer pool. The buffer pool is a cache in the server’s memory, holding copies of data pages read from the database files on storage.

2. Loading Data into the Buffer Pool:

Data Retrieval: If the data pages needed for the query are not in the buffer pool, Azure SQL Server retrieves them from the storage storage. This process involves reading the physical data pages and loading them into the buffer pool, residing in the server's RAM.
Efficient Data Access: Subsequent queries requiring the same data pages can quickly access them from the buffer pool, significantly reducing the need for disk I/O operations and speeding up query processing.

For this scenario, let's try to perform an example, with the following table:

CREATE TABLE [MSxyzTest].[_x_y_z_MS_HighCPU]( [Id] [bigint] NOT NULL, [TextToSearch] [varchar](200) NULL, PRIMARY KEY CLUSTERED ( [Id] ASC )WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY] ) ON [PRIMARY] GO DECLARE @i bigint = 1; WHILE @i <= 4000000 BEGIN INSERT INTO MSxyzTest._x_y_z_MS_HighCPU (Id, TextToSearch) VALUES (@i, NEWID()); -- NEWID() genera un valor aleatorio para el ejemplo SET @i = @i + 1; END;

Once we have the data we are going to run a CHECKPOINT to save the data to the storage and run DBCC DROPCLEANBUFFER to clean the buffer pool.

Why Execute `CHECKPOINT`:

Purpose: The CHECKPOINT command in SQL Server is used to manually trigger the process of writing all dirty pages (pages in the buffer pool that have been modified but not yet written to disk) and log records to disk. By doing so, it ensures that the data is stored persistently.
Use Cases: This command is particularly useful in testing environments for simulating a scenario where SQL Server needs to recover from an unexpected shutdown. Executing a CHECKPOINT ensures that all the recent changes are safely written to disk.

Why Execute `DBCC DROPCLEANBUFFERS`:

Purpose: DBCC DROPCLEANBUFFERS is a command used to clear the buffer pool, specifically by removing the clean buffers. It's primarily used to test the performance of queries without the benefit of the cache, mimicking a scenario where queries have to retrieve data from the disk.
Use Cases: This is especially useful in a development or testing environment when you want to measure the performance of queries and ensure that your tests are not artificially benefitted by data already being cached in the buffer pool.

Important Considerations:

Non-Production Environments: Both CHECKPOINT and DBCC DROPCLEANBUFFERS should ideally be executed only in non-production environments. They are valuable tools for testing and understanding how queries and data management processes behave under certain conditions.
Impact on Performance: It's critical to note that these commands can have significant implications on performance:
- CHECKPOINT can cause a surge in disk I/O as it writes a large amount of data to disk.
- DBCC DROPCLEANBUFFERS can lead to a temporary performance degradation. When the buffer pool is cleared, subsequent queries will not benefit from cached data, resulting in increased disk I/O and slower query performance as data must be read from disk again.
Caution in Production: In a production environment, these commands can disrupt the normal optimization and efficiency of the database. Clearing the buffer pool or forcing checkpoints can lead to unnecessary strain on the system, impacting the performance of the database for all users.

The next action we are going to run the following query: select COUNT(*) from [MSxyzTest].[_x_y_z_MS_HighCPU] and we could see two things:

Query might took time in this execution.
And running SELECT * from sys.dm_db_resource_stats the data_io is high and avg_memory_usage_percent has been increased. This is expected because the data is not in the memory and we need to bring the memory.

If we run the following query we could see the distribution in the buffer pool about our table and space usage.

DECLARE @TableObjectId INT; SET @TableObjectId = OBJECT_ID('[MSxyzTest].[_x_y_z_MS_HighCPU]'); SELECT bd.page_type, o.name AS ObjectName, i.name AS IndexName, bd.file_id, bd.page_level, COUNT(*) AS NumberOfPages, SUM(bd.row_count) AS TotalRowCount, AVG([bd].[references]) AS AvgReferences, MIN([bd].[references]) AS MinReferences, MAX([bd].[references]) AS MaxReferences, AVG(bd.row_count) AS AvgRowCount, MIN(bd.row_count) AS MinRowCount, MAX(bd.row_count) AS MaxRowCount, AVG(bd.read_microsec) AS AvgReadMicrosec, MIN(bd.read_microsec) AS MinReadMicrosec, MAX(bd.read_microsec) AS MaxReadMicrosec, AVG(CAST(bd.is_modified AS INT)) AS AvgIsModified, MIN(CAST(bd.is_modified AS INT)) AS MinIsModified, MAX(CAST(bd.is_modified AS INT)) AS MaxIsModified, SUM(CAST(bd.is_modified AS INT)) AS SumIsModified FROM sys.dm_os_buffer_descriptors AS bd INNER JOIN sys.allocation_units AS au ON bd.allocation_unit_id = au.allocation_unit_id INNER JOIN sys.partitions AS p ON au.container_id = p.partition_id INNER JOIN sys.objects AS o ON p.object_id = o.object_id LEFT JOIN sys.indexes AS i ON p.index_id = i.index_id AND p.object_id = i.object_id WHERE o.object_id = @TableObjectId GROUP BY bd.page_type, o.name, i.name, bd.file_id, bd.page_level;

So, as expected if we run again the query: select COUNT(*) from [MSxyzTest].[_x_y_z_MS_HighCPU] and we run SELECT * FROM sys.dm_db_resource_stats the results will be memory in use and data io low.

So, the first answer will be that data io will be when we need to bring data from storage to the memory. So, for this reason, it is very important to use buffer pool and have the most frecuent data in the memory to improve the performance and SQL Server will not release the memory until the following aspects.

For example, let's update 10000 rows, Update top(1000) [MSxyzTest].[_x_y_z_MS_HighCPU] set TextToSearch ='Example _x_2' running the same query about buffer descriptions we could see an indicators that some pages in memory have been modified. If we run only DBCC DROPCLEANBUFFER only the pages that have not been modified will be release, we need to run a checkpoint to free these pages from the buffer pool.

3. Managing Data in the Buffer Pool:

Least Recently Used (LRU) Algorithm: Azure SQL Server uses an algorithm, typically a variation of the LRU strategy, to manage the data in the buffer pool. This algorithm helps in determining which data pages should be retained in the buffer pool and which can be evicted when more space is needed.
Dirty Pages: When data in the buffer pool is modified (due to update operations, for example), these pages are marked as ‘dirty.’ Dirty pages indicate that the in-memory version of the page is different from the disk version.

4. Writing Data from Memory to Disk (Checkpoint Process):

Checkpoint Operation: To maintain database integrity and ensure that data modifications are saved to disk, Azure SQL Server periodically runs a checkpoint operation. This process involves writing all dirty pages from the buffer pool to disk.
Triggering Factors: Checkpoints can be triggered by several factors, including a set interval of time, a certain number of transactions, or a manual command.

Finally, we have other situations that Azure SQL Server needs to release memory from the buffer pool and here the counter called Life Page Expectancy is our friend to analyse if we need more memory or not for our current workload needs.

5. Releasing Memory in the Buffer Pool:

Dynamic Memory Management: Azure SQL Server dynamically manages the memory allocated to the buffer pool. When the system is under memory pressure, or when the buffer pool is not fully utilized, it can release memory back to the OS or reallocate it within SQL Server.
Data Eviction: As new data pages are needed for other queries and operations, older, less frequently accessed pages (that are not dirty) may be evicted from the buffer pool to make room.

Conclusion:

The buffer pool in Azure SQL Server plays a pivotal role in optimizing query performance by reducing the reliance on slower stroage operations. Its management involves a delicate balance between holding frequently accessed data in memory for rapid access and ensuring data persistence and integrity through regular checkpoints. This efficient handling of the buffer pool is central to the performance and reliability of Azure SQL Server databases.