Database templates in Azure Synapse Analytics

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

Azure Synapse Analytics is a limitless analytics service that brings together data integration, enterprise data warehousing, and big data analytics. It gives you the freedom to query data on your terms, using either serverless or dedicated resources—at scale. Azure Synapse brings these worlds together with a unified experience to ingest, explore, prepare, manage, and serve data for immediate BI and machine learning needs.

 

One of the challenges that users in key industry areas face is how to describe and shape the mass of data that they are gathering. Most of this data is currently stored in data lakes or in application-specific data silos. The challenge is to bring all this data together in a standardized format enabling it to be more easily analyzed and understood and for ML and AI to be applied to it.

 

Azure Synapse solves this problem by introducing industry-specific templates for your data, providing a standardized way to store and shape data. These templates provide schemas for predefined business areas, enabling data to be loaded into a database in a structured way.

 

Database templates in Azure Synapse are industry-specific schema definitions that provide a quick method of creating a database known as a lake database. As with any database, it contains a well-defined schema for a business solution. This schema includes tables, columns, and relationships that represent transactional, operational, and business semantic data. Azure Synapse is used to perform big data operations on a lake database via either SQL or Spark compute pools. By using database templates, we are leveraging decades of specific industry experience to form an extensible schema. This schema can be expanded or derived from your own business terminology and taxonomies. Data processing pipelines in Azure Synapse provide hundreds of connectors (e.g., SAP ECC, Dynamics CRM, Magento), making it easy to connect to different source systems. The data in the lake database is stored in , and creates the foundation of an enterprise data lake, where data from the different sources are combined for analytics and reporting.

 

Finally, database templates have been built with an ecosystem in mind. Customers and partners can rapidly build analytics-infused industry use cases by customizing and extending the standard templates using the database editor in Azure Synapse. Here’s what some of the early adopters have noted:

 

“Having an opportunity to test and use Azure Synapse database templates, our team at dunnhumby were impressed by the breadth of data domain coverage, along with Azure Synapse’s feature and tools for development, engineering, and delivery of outstanding data science.

As a global leader in retail data science and knowing how hard navigating data fundamentals can be - we can see the benefits for Azure Synapse customers in helping them to rapidly unlock the value in their data assets, enabling them to evolve and scale their insight capabilities.

Azure Synapse database templates can be a key enabler in breaking down data silos and unlocking potential value in enterprise data.” - David Jack | dunnhumby Chief Technology Officer

 

“We have further deepened our Qlik Data Integration capabilities with Azure Synapse with the availability of database templates. Our retail customers now have increased ability to access and transform SAP’s complex, application-specific data structures from any SAP source, and into formats optimized for analytics within Azure Synapse” – Matt Hayes, VP SAP Business at Qlik

 

Currently, Azure Synapse includes database templates for Retail, Consumer Goods, Banking, Fund Management, and Property and Casualty Insurance, with more industry-specific templates to be added in the near future:

 

Figure 1: Database templates in Azure SynapseFigure 1: Database templates in Azure Synapse

 

Getting started with database templates

Here’s how the database templates can be used from Azure Synapse Studio (the browser-based editor for Azure Synapse).

 

1. Select a database template from the Azure Synapse gallery. Let’s use Retail as an example:

 

Figure 2 - Database template - RetailFigure 2 - Database template - Retail

 

 

2. You’ll see a set of eight tables pre-populated in a lightweight data model editor to get you started. From here you can add, remove or edit, tables, columns and relationships. Click on “Create Database” when done:

 

Figure 3 – Database template (Retail) : Enrich data model with a library of tablesFigure 3 – Database template (Retail) : Enrich data model with a library of tables

 

3. In the database designer, you can further refine your data model by making edits to the tables specified previously. You can also add tables from an existing data lake or create brand new ones; the former being ideal if you already have a data lake in use. The Properties tab allows you to specify the name of your lake database, data format (delimited text or parquet) and storage account settings. The default storage account is the one that comes with your Synapse workspace. Click on “Publish All” to commit your changes:

 

Figure 4 - Publish to Data LakeFigure 4 - Publish to Data Lake

 

That’s it! In a few minutes you’re up and going with a lake database and ready to load it with data. The data can be loaded via Synapse Pipelines; an ELT engine with the lake database being the target, and your operational data being the source. With hundreds of data connectors (including 3rd party, ODBC, REST, OData, HTTP), you can ensure that any source data can be loaded to your lake database.

 

After the data is loaded, you can take advantage of pre-built AI and ML models that understand your data based on the lake database template schema. An example is the Retail-Product recommendation solution in the Gallery:

 

Figure 5 - Retail - Product Recommendations Solution based on Lake Database TemplatesFigure 5 - Retail - Product Recommendations Solution based on Lake Database Templates

 

Knowing the shape of the data allows us to provide pre-built industry AI solutions. The AI solution for Retail Product Recommendation provides a robust and scalable recommendation engine for out-of-the-box development in Synapse. No additional data shaping is needed, the solution can work on the data out of the box.  This accelerates productivity of existing or emerging data scientists for solving a specific problem in the Retail domain.

 

Figure 6 - PySpark notebook for Retail RecommendationsFigure 6 - PySpark notebook for Retail Recommendations

 

Figure 7 – Feature Importance of Retail RecommendationFigure 7 – Feature Importance of Retail Recommendation

 

Considering database templates as a core component in your next big data project will help you with a better integrated and scalable architecture:

 

Figure 8 – Database templates and Azure Synapse for Retail Solutions - ArchitectureFigure 8 – Database templates and Azure Synapse for Retail Solutions - Architecture

 

Benefits of database templates

 

By leveraging Azure Synapse’s library of database templates derived from decades of industry implementations, you can

  • Accelerate time to insights based on a standardized business area schemas for different industries
  • Identify gaps and opportunities in your existing enterprise data model
  • Consolidate data silos and query from a single pane of glass (Synapse Studio)
  • Create a well-formed data lake ready for the analytics at scale
  • Enrich your data with Azure Cognitive Services and Azure Machine Learning
  • Develop reports easily using Power BI

Conclusion

Azure Synapse Analytics gives you the freedom to query data on your terms, using either serverless or dedicated resources—at scale. It solves many of the productivity and scalability challenges that prevent you from maximizing the value of your data assets with a service that is ready to meet your current and future business needs.

 

To learn more, visit:

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.