Filtering tables from Azure Database for MySQL read replica servers

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

In certain scenarios, users may want to limit the tables that are replicated to a read replica server to improve performance and save costs. For example, many reporting/analytics workloads require only subset of the tables rather than the entire database schema(s). Similarly, a database schema for a web application often includes audit or logging tables designed to capture user behavior and to accommodate audit and compliance requirements. These tables have heavy write activities and often account for 80-90% of the instance size. While the tables are essential to ensuring that the application meets compliance requirements, they aren’t required for read-only functionality. In these examples there’s no need to replicate anything other than the required tables to meet the business requirements.

 

However, when you set up an Azure Database for MySQL read replica, all the tables from your primary instance are copied to the read replicas by default. While the Azure Database for MySQL service doesn’t support filtering a table or a database schema from a read replica, it does support the replicate_wild_ignore_table parameter, a native MySQL capability, which can prevent one or more tables from being included in MySQL replication.

 

To accomplish this goal, we simply need to:

  1. Set up MySQL Data-in replication, which allows you to synchronize data between two Azure Database for MySQL instances using the binary log (binlog) file position-based or GTID-based replication that is native to MySQL.
  2. Limit the tables replicated to the secondary replica instance via the Azure Portal or the Azure CLI.
  3. Provision the required Azure Database for MySQL read replica from the secondary instance.

The rest of this blog post discusses how best to leverage the replicate_wild_ignore_table parameter and MySQL replication to prevent one or more tables from being replicated to the Azure Database for MySQL read replica.

 

Note: This blog post applies to both Single Server and Flexible Server.

 

Reference architecture using Azure DB for MySQL read replica

 

With Azure Database for MySQL, the appropriate reference architecture to offload the read-only workload is shown in Figure-1 below.

 

Fig1a.png

 

In this architecture, myrw-1 is the source instance of Azure Database for MySQL, and it supports all Data Manipulation Language (DML) activities. This instance has four Azure Database for MySQL read replica servers (myrr-1, etc.) available for offloading read-only workloads, and by default, each of the read replica servers has a compute and storage configuration that is identical to the source instance.

 

In this example, the source instance is in East US2, with read replica servers in East US and Central US. Each read replica server is updated asynchronously using the MySQL engine's native binary log (binlog)-based replication technology.

 

Note: Flexible Server doesn’t currently support cross-region read replicas or replication on a High Availability (HA) configuration. As a result, Flexible Server read replicas would be located in the same region as the primary instance. Cross-region read replicas and support for replication on an HA Flexible Server are on the product roadmap for future delivery.

 

Set up MySQL Data-in replication

 

First, we need to set up MySQL Data-in replication for the primary instance, myrw-1. In so doing, we’ll create the secondary instance, myrw-2, as shown in Figure 2 below.

 

Fig2a.png

 

Note: For more information, see How to configure Azure Database for MySQL Data-in Replication.

 

Limit tables replicated to the secondary instance

 

Next, we need to limit the tables that are replicated from the myrw-1 instance to the myrw-2 instance. Let’s assume that the myrw-1 instance has two databases (Sales and Album), and that each database includes some tables that are required by the application or for read-only operations and some that aren’t, as listed in the following table:

 

Database

Required tables

Tables that aren’t required

Sales

City, Country, CountryLanguage

SalesAudit, SalesLog, ComplianceLog

Album

Album, Track

TrackAudit, TrackLog, ComplianceLog

 

As you can see, read operations don’t require the SalesAudit, SalesLog, and ComplianceLog tables in Sales database. Similarly, the TrackAudit, TrackLog, ComplianceLog tables in the Album database aren’t required.

 

To filter these tables from replication to the myrw-2 instance, use the replicate_wild_ignore_table parameter, which by default has the following value:

 

mysql.%,information_schema.%,performance_schema.%,sys.%

 

To skip these tables during replication, either specify each table separately or use “%” pattern matching via the Azure Portal or the Azure CLI.

 

Filter tables via the Azure portal

 

In the Azure portal, on the Server Parameters blade under Settings, update the default value of the replicate_wild_ignore_table parameter by appending the default value with the tables that you want to exclude from replication, as shown below:

 

mysql.%,information_schema.%,performance_schema.%,sys.%,Sales.SalesAudit,Sales.SalesLog,Sales.ComplianceLog,Album.TrackAudit,Album.TrackLog,Album.ComplianceLog

 

Alternately, you can use “%” pattern matching in the parameter value to avoid specifying the SalesLog and ComplianceLog tables in the Sales database and the TrackLog and ComplianceLog tables in the Album database, schema as all these tables ending with “Log”. In this case, the parameter value would be:

 

mysql.%,information_schema.%,performance_schema.%,sys.%,Sales.SalesAudit,Sales.%Log,Album.TrackAudit,Album.%Log

 

Filter tables via the Azure CLI

 

You can also use the Azure CLI to update the replicate_wild_ignore_table parameter. Below are the CLI commands that you can use to list and update the replicate_wild_ignore_table parameter in Azure Database for MySQL Server.

 

To show the current value of the replicate_wild_ignore_table parameter, run the following command:

 

Single Server

Flexible Server

az mysql server configuration show -g <resource-group> -s <server-name> -n replicate_wild_ignore_table --query value

az mysql flexible-server parameter show -g <resource-group> -s <server-name> -n replicate_wild_ignore_table --query value

 

To update the value of the replicate_wild_ignore_table parameter, run the following command:

 

Single Server

Flexible Server

az mysql server configuration set -g <resource-group> -s <server-name> -n replicate_wild_ignore_table --value <new-value>

az mysql flexible-server parameter set -g <resource-group> -s <server-name> -n replicate_wild_ignore_table --value <new-value>

 

Restart the Azure Database for MySQL instance by running the following command:

 

Single Server

Flexible Server

az mysql server restart -g <resource-group> -n <server-name>

az mysql flexible-server restart -g <resource-group> -n <server-name>

 

Provision read replicas from the secondary instance

 

The secondary instance, myrw-2, now includes only the tables that are required for read-only operations; the remaining tables have been excluded from replication to the secondary instance. You can now use the myrw-2 instance to provision the Azure Database for MySQL read replicas via the Azure Portal or Azure CLI, which will offload the read-only workload, as shown in Figure 3 below.

 

Fig3a.png

 

Finally, remember that the Azure Database for MySQL read replicas will by default have identical Compute and storage configuration as primary instance. If, after filtering the tables, your read replica instance is significantly smaller than the primary instance, then you can consider reducing the Compute size to reduce cost. It’s recommended to monitor the replication_lag_in_seconds metric regularly to ensure that the replication lag between the primary and read replica isn’t impacted because of the reduced Compute size.

 

Conclusion

 

Using the steps discussed above, you can easily filter tables that aren’t required for your read-only operations from replication. This configuration will help improve replication performance and potentially provide an option for reducing costs by lowering the Compute requirement for your read-replicas.

 

If you have any feedback or questions, please leave a comment below or email our team at AskAzureDBforMySQL@service.microsoft.com.

 

Thank you!

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.