This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.
Introduction and Motivation
Many customers want to deploy application and database schema updates to a production environment with zero or minimal downtime. The SQL Server dev team has done a lot of work to reduce or eliminate locking during schema updates (i.e. add a column), but there are still situations where such schema changes can have a direct and negative impact on production workloads.
One model that has emerged to facilitate minimal downtime deployments is the Blue/Green model. This technique was described in Continuous Delivery by Jez Humble and David Farley and Martin Fowler’s Blue Green Deployment model. The basic concept is that you have two parallel systems; the blue system is the current production, while the green system is kept up to date with the DML changes on the blue. By making this green update process asynchronous, blocking schema changes can be made on green and then the data changes to the blue database can be caught up afterward.
For application servers, you can either have a parallel set of blue and green servers, unless you can simultaneously deploy green application updates to blue servers. The challenge is how to seamlessly migrate users from the blue to the green version of the application, since in most cases, the application maintains some kind of state. Obviously, there may be timing issues and conflicting write scenarios, but this is outside of the scope of this blog. In this blog we will focus on the database tier.
Michael J Swart, describes a blue/green type deployment, but his company does not actually use separate blue and green databases, but instead use an “aqua” database with versioned SQL objects. Martin Fowler suggested potentially making the database read-only during the transition, but that would be very difficult to achieve, so instead Michael’s company decided to combine multiple versions into a single database. This technique requires very careful planning of schema changes and may even preclude certain types of changes.
In this blog we want to explore the ability to do a full blue/green set of databases and how we can enable change propagation from blue to green databases using an asynchronous update method with minimal performance impact on the blue server.
One option worth exploring that may offer an alternative to handling change data and schema changes is a combination of Azure Data Factory (ADF) - Native Change Data Capture (CDC) and ADF Data Flow - Schema Drift features.
In developing this Proof of Concept (POC) application, we considered using Availability Groups (AGs), Replication, Change Data Capture (CDC) and Change Tracking (CT). The synchronous nature of AGs and the fact that the secondary (green) is read only disqualifies using them in this application. Replication and CDC both turn on verbose logging and require a logreader etc. therefore we decided that CT was the lighter weight solution that we would focus on.
The key requirement of the POC was to allow green schema changes without breaking the data change synchronization from blue. Not only do we need to be able to handle any schema differences between tables in blue and green, but we need to be able to pause the update activity during impactful schema changes.
During a review of the initial solution, it became clear that there was a requirement of additional column mapping capabilities from blue to green, i.e. the ability to replace a column with a primary key from a lookup table using the original column value, the ability to rename columns, change column types and split columns between two tables.
The result of our POC work is a .NET 7 C# console application that uses an application configuration file and an optional Excel file to drive the synchronization and mapping process from Blue to Green. Although .Net 7 is cross platform, the OLEDB driver used to read Excel is Windows only.
Once the switch over from Blue to Green happens, the intent behind this design is to start the process again by using a backup of the new Blue database, restore it over the old Blue database making it the new Green database. This would likely be easier than trying to synchronize schema by applying all of the change scripts applied to the Green and then synchronizing the data.
Application Installation
There is no installation required, just unzip the application into a folder. The installation includes an application configuration (app.config) file and optional Excel file. The app.config includes blue (source) and green (target) connection strings, the last synchronization version starting point, optional Excel input and several other parameters:
The input Excel file allows five mapping operations from blue to green (operation input column):
N = Name change.
V = Write CT current Version into a new column.
T = Type change.
S = Split column.
L = Replace column with Lookup.
The input spreadsheet has the following columns:
The application assumes that Change Tracking has been turned on and at least one table has been enabled for change tracking (see script in Appendix A). At the app.config configurable wait interval, the application checks if there have been any data changes made in the blue environment and if so, it will bulk insert the changes into a staging table in the green database, from where it will apply the changes to the green table(s). In this way, synchronization between Azure SQL DB databases is also supported.
Tables are processed in dependency order, so if there are foreign key (FK) constraints defined, the “dimension” table insert, and update operations will be processed first and then the referenced “fact” tables will be processed. For dimension tables with references, the row deletes from those tables will be deferred until after the dependent fact tables are processed, to avoid FK delete conflicts.
To avoid processing changes more than once, when starting a synchronization, the application gets the current change tracking version and only processes changes between the previous sync and this version. This avoids the need to use read only snapshot isolation (RCSI) during change processing. There is also some additional processing done on the green database, to avoid duplicate inserts and updates of non-existing records, to ensure the set operations succeed and are not rolled back.
To run the application, make the appropriate changes to the app.config file, the optional Excel file and run the application. While the application is running the following single key commands are available:
X – Exit
E – Reload Excel Mapping File
T – Shows Statistics at a Table level – i.e. DML operations per table.
S – Shows Total Statistics – i.e. Total DML operations.
P – Pause synchronization
R – Resume synchronization
The application creates an XEvent listener on the green database to monitor schema changes to tables. When a change tracked table is modified the application automatically reloads the metadata for the green table and re-evaluates the column mappings. To prevent application failures, you should pause synchronization during green schema modifications.
Every time there is a synchronization, a console message is written that includes the starting and ending version and the number of inserts, updates, and deletes. The number of deletes has two numbers, the total deletes, and the number of deferred deletes (in brackets).
The total operations are displayed when the applications exits but are also available during application execution using the T command for totals by table or the S command for overall totals.
The sample application can be downloaded here: CTSync application.
Note: The sample application is a Proof of Concept to explore the feasibility of the technique proposed. It comes with no warrantees or support, therefore use it at your own risk. There are additional details in the User’s Guide included in the package. For information related to support for this tool, please refer the license agreement that is included in the download.
Feedback and suggestions
If you have feedback or suggestions for improving this team asset, please contact the Azure Databases SQL Customer Success Engineering Team. Thank you for your support!