It just works – Backup compression on TDE databases

This post has been republished via RSS; it originally appeared at: SQL Server articles.

There’s not much logic to native database backups in SQL Server – read bits from the data file, write bits to the backup file – so the performance of a backup is essentially a factor of I/O speed.  Back in SQL Server 2008, backup compression was introduced in order to reduce not only the size of backup files, but the amount of I/O generated by a backup operation. By compressing the data before writing it to the backup file, we’re able to reduce the amount of data that gets written to the backup file, which in turn reduces the amount of I/O required to complete the backup operation. This takes a bit more CPU, but the reduction in I/O typically outweighs the extra CPU, so the backup is faster.

Unfortunately, with encrypted databases the compression ratio is near 0. Because there’s not much repetition in encrypted data, the data can’t effectively be compressed. This means if you have backup compression turned on for a database with transparent data encryption (TDE), you’ll be getting all the overhead of backup compression with none of the benefit. For this reason, up until SQL Server 2016, we did not support backup compression with TDE.

 

In SQL Server 2016, we introduced a new backup compression algorithm to be able to support compressed backups of TDE databases. There’s no magic to this algorithm, we simply decrypt the data before compressing it, then encrypt the compressed data before we write it to the backup file. The steps for a compressed backup of a TDE database are as follows:

  1. Read one block of data from the database file. 
  2. Decrypt a 64KB extent of data.
  3. Compress the extent.
  4. Encrypt the extent.
  5. Write the newly encrypted extent to a buffer.
  6. Repeat steps 2 to 5 until the entire block has been written to the buffer.
  7. When the buffer fills up, flush the buffer to the backup file.

The size of the block that gets read from the database file and the buffer that gets written to the backup file is based on the MAXTRANSFERSIZE parameter of the backup command. Keep in mind that after compression, the block of data that was read from the disk should be smaller than MAXTRANSFERSIZE, so we can typically fit more than one read block into the buffer before writing to the backup file. When we restore the backup, we do the same steps in reverse: read from the backup file in chunks based on MAXTRANSFERSIZE, decrypt the data, decompress it, then re-encrypt it and write the blocks to the database file . When we do this in reverse, we may need to stitch blocks together if they span multiple buffers, so we need a little extra space to do this. For this reason, we need the buffer to be bigger than 64KB, thus we need MAXTRANSFERSIZE to be bigger than 64KB. The default value for MAXTRANSFERSIZE varies based on the backup target, be it disk, URL or virtual device (VDI), and of course users can specify their own MAXTRANSFERSIZE. It was decided that the best way to guarantee the correct MAXTRANSFERSIZE was to require users to include this parameter with a value greater than 64KB in order to enable the new compression algorithm on backups of TDE databases.

 

This makes sense given the technical requirements, but it’s is not at all intuitive to users. There’s a very good chance that folks out there do not know about this obscure configuration parameter, and if they do and they want this new backup compression, they have to modify all their existing backup scripts to add the MAXTRANSFERSIZE parameter. We thought it would be a good idea to just make backup compression work, no matter whether you specify this MAXTRANSFERSIZE parameter or not. And so, starting with SQL Server 2019 CU5, if you have backup compression turned on at the instance level, or specify WITH COMPRESSION on a backup command, you will get this optimized compression algorithm for a TDE database even if you specify a smaller MAXTRANSFERSIZE, or don’t specify MAXTRANSFERSIZE at all .  We do still need MAXTRANSFERSIZE to be greater than 64KB, so if you specify a smaller size for this parameter, we will automatically increase it to 128KB behind the scenes in order to effectively compress the encrypted data. If for some reason you need MAXTRANSFERSIZE to be smaller than 128KB for any databases that have TDE enabled, you’ll need to ensure that backup compression is turned off by using WITH NO_COMPRESSION on the backup command or turning it off at the instance level.

REMEMBER: these articles are REPUBLISHED. Your best bet to get a reply is to follow the link at the top of the post to the ORIGINAL post! BUT you're more than welcome to start discussions here:

This site uses Akismet to reduce spam. Learn how your comment data is processed.