How to reduce infrastructure costs by up to 80% with Azure Databricks and Delta Lake

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

Delta Lake and Azure Databricks enable a modern data architecture that simplifies and accelerates data and AI solutions at any scale. Implementing this architecture allowed Relogix to cut wasted compute costs by 80% while further empowering their data team.
 
Figure 1. Modern data architecture with Delta Lake and Azure Databricks
 
The medallion architecture (shown in Figure 2 below) allows for flexible access and extensible data processing. Bronze tables handle data ingestion and provide quick access, without any up-front data modeling, to a single source of truth for incoming IoT and transactional events. As data flows into Silver tables, it is refined and optimized for business intelligence and data science use cases through data transformations and feature engineering. The Bronze and Silver tables also act as Operational Data Store (ODS)-style tables, allowing agile modification and reproducibility of downstream tables. Deeper analysis happens on Gold tables, where analysts use their method of choice (PySpark, Koalas, SQL, BI tools, and Excel all enable business analytics at Relogix) to derive new insights and formulate queries.
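To make the Bronze → Silver → Gold flow concrete, here is a minimal, library-free Python sketch of the same idea. It uses plain lists and dicts as stand-ins for Delta tables (a real pipeline would use PySpark DataFrames and Delta Lake writes); the sensor fields, the bad-record rule, and the occupancy metric are illustrative assumptions, not Relogix's actual schema.

```python
from collections import defaultdict
from datetime import datetime

# Bronze: raw IoT events stored as-received -- no modeling, nulls allowed.
bronze = [
    {"sensor_id": "s1", "ts": "2021-03-01T09:00:00", "occupied": "1"},
    {"sensor_id": "s2", "ts": "2021-03-01T09:00:00", "occupied": "0"},
    {"sensor_id": "s1", "ts": "2021-03-01T10:00:00", "occupied": None},  # bad reading
]

def to_silver(rows):
    """Silver: drop bad records and cast types (a simple transformation step)."""
    out = []
    for r in rows:
        if r["occupied"] is None:
            continue  # quarantine-or-drop logic would live here
        out.append({
            "sensor_id": r["sensor_id"],
            "ts": datetime.fromisoformat(r["ts"]),
            "occupied": int(r["occupied"]),
        })
    return out

def to_gold(rows):
    """Gold: aggregate an occupancy rate per sensor for analysts."""
    totals, hits = defaultdict(int), defaultdict(int)
    for r in rows:
        totals[r["sensor_id"]] += 1
        hits[r["sensor_id"]] += r["occupied"]
    return {s: hits[s] / totals[s] for s in totals}

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'s1': 1.0, 's2': 0.0}
```

The key property the sketch preserves is that each layer is derived from the one before it, so downstream (Silver/Gold) tables can be rebuilt at any time from the untouched Bronze source of truth.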
 

Figure 2. Architecting your Delta Lake with the medallion data quality data flow

 

