Lesson Learned #159: Compressing data and LOB data type in Azure SQL Managed Instance

This post has been republished via RSS; it originally appeared at: Azure Database Support Blog articles.

Today, I worked on a service request in which our customer asked to compress data because they are reaching the database size limit and do not want to scale up to the next database tier to get more space.

 

Among the several options we have to reduce the size, and given that our customer needs a quick solution and CPU usage is low, we suggested using compression.

 

In this situation, we have a table with an XML column that uses almost all the space of the database, so we focused on compressing that data.

 

In my first proof of concept before sending this recommendation, I found something we need to know before compressing data when LOB data types are involved.

 

SQL Server will not compress data when the size of the data exceeds the space available on a data page (8096 bytes), because such values are stored off-row and row and page compression only apply to in-row data. For this reason, we need to analyze whether this solution applies. Let me show you an example:

 

We have two tables, Compressed and NotCompressed, with this layout:

 

-- Both tables use PAGE compression; they differ only in the size of the XML values loaded.
CREATE TABLE Compressed (ID INT IDENTITY(1,1), DETAILS XML);
CREATE TABLE NotCompressed (ID INT IDENTITY(1,1), DETAILS XML);
GO

ALTER TABLE Compressed REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);
ALTER TABLE NotCompressed REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);
GO

-- REPLICATE truncates its result at 8000 bytes unless the input is a (max) type, hence the CAST.
INSERT INTO Compressed (DETAILS) VALUES (REPLICATE(CAST('x' AS VARCHAR(MAX)), 900));
INSERT INTO Compressed (DETAILS) VALUES (REPLICATE(CAST('x' AS VARCHAR(MAX)), 9000));
INSERT INTO NotCompressed (DETAILS) VALUES (REPLICATE(CAST('x' AS VARCHAR(MAX)), 9000));
INSERT INTO NotCompressed (DETAILS) VALUES (REPLICATE(CAST('x' AS VARCHAR(MAX)), 90000));
GO

-- Run the following queries multiple times to grow the tables
INSERT INTO Compressed (DETAILS) SELECT DETAILS FROM Compressed;
INSERT INTO NotCompressed (DETAILS) SELECT DETAILS FROM NotCompressed;
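One way to see where the data ends up after loading the two tables is to compare in-row and LOB page counts per table. The query below is only a sketch: it assumes the demo tables above exist in the current database and that you have permission to read the catalog and DMV objects it uses.

```sql
-- Sketch: compare in-row and LOB page usage for the two demo tables.
-- Pages counted under lob_used_page_count hold off-row data,
-- which row and page compression do not touch.
SELECT  o.name                          AS table_name,
        p.data_compression_desc,
        ps.row_count,
        ps.in_row_used_page_count,
        ps.lob_used_page_count,
        ps.row_overflow_used_page_count
FROM    sys.dm_db_partition_stats AS ps
JOIN    sys.partitions AS p ON p.partition_id = ps.partition_id
JOIN    sys.objects    AS o ON o.object_id    = ps.object_id
WHERE   o.name IN ('Compressed', 'NotCompressed');
```

If most of the pages of a table are reported as LOB pages, page compression is unlikely to reduce its size by much.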

 

After running the INSERT INTO...SELECT statements multiple times, we could see that the rows with more than 8096 bytes are not compressed. My suggestion is to use sp_estimate_data_compression_savings to estimate the savings you would get.
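A call to the procedure could look like the sketch below. It assumes the table lives in the dbo schema (the demo does not name a schema) and asks for an estimate across all indexes and partitions. Because the demo tables are already PAGE compressed, the estimate here will show little difference; against a real table that is not yet compressed, the same call shows the expected gain.

```sql
-- Sketch: estimate the size of the table under PAGE compression.
-- 'dbo' is an assumption; replace the schema and table name with your own.
EXEC sys.sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'NotCompressed',
     @index_id         = NULL,   -- NULL = all indexes (and the heap)
     @partition_number = NULL,   -- NULL = all partitions
     @data_compression = 'PAGE';
```

Compare the size_with_current_compression_setting(KB) and size_with_requested_compression_setting(KB) columns in the result to judge whether compression is worth applying.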

 

In this situation, it is important to know how much data you have in your table. Also, if you are using replication, for example from on-premises to Azure or from Azure to on-premises, please review the following details

 

Enjoy!!!
