Performance Benchmark – Azure Synapse Analytics (Data Warehouse)

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

Today, any organization who is ready to spin Azure service always curious of performance, load, design etc. The story is no different with data. With Azure Synapse DWH organization want to make sure they choose right skus and design to benefit customer and themselves. Any organization can come up with following questions 

 

  • Are we choosing right sku for current and future workload? 
  • Are we utilizing the resources well?
  • What about the latency?
  • Based on upcoming request, what will be the impact on allocated resources aka load testing?
  • How to make sure queries are performing well on DWH design?
  • How to build baseline matrix?
Well, one of the ways is to monitor (and setup alerts) the service on  production environment and adjust the sku. However this may impact the user experience  and become hard practice to follow. This blog post try to overcome such problems in advance to avoid any surprises on production environment. The focus will be on building a framework to simulate the load with multiple sql queries and get answer for above questions.

 

Reference Architecture
In this end to end architecture, we'll simulate the load from different regions with different queries. The outcome will help us understand if we choose right sku and setup good  DWH design like table distribution, index, cache etc . Here is the reference architecture of the framework
 
Architecture diagramArchitecture diagram

All scripts and step by step guide can be found here. However, below are the steps to build the framework along with few pictures.  

 
Setting up environment
  1. Azure Synapse Analytics (Data warehouse)
  2. Setup Data warehouse
  3. Download and configure Apache Jmeter
  4. Build Java Management Extension (aka jmx) file
  5. Move scripts to Azure storage
  6. Execute work load
JDBC Request DiagramJDBC Request Diagram

PowerShell Outcome DiagramPowerShell Outcome Diagram

Analyse the Test Results

  1. Investigate result over Azure portal dashboard
  2. Understand the bottleneck and performance
           1) Table Distribution
           2) Partition strategy
           3) Index
           4) Cache
           6) Views
  3. Apache Jmeter Report

PBI Dashboard DiagramPBI Dashboard Diagram

Query Execution DiagramQuery Execution Diagram

 

REMEMBER: these articles are REPUBLISHED. Your best bet to get a reply is to follow the link at the top of the post to the ORIGINAL post! BUT you're more than welcome to start discussions here:

This site uses Akismet to reduce spam. Learn how your comment data is processed.