This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.
Data is quite possibly the world’s most valuable resource. Recent assertions kicked off a debate about whether data is even more valuable than oil. A resource with this much value should be available to as many of the people who can benefit from it as possible. At the same time, however, access to data must respect privacy and security for all.
Microsoft’s mission of empowering every person and every organization on the planet to achieve more is grounded in a deep respect for data privacy and security as a fundamental right, ultimately ensuring trust. Our customers and partners are similarly challenged to provide trustworthy, secure access to data to as many people as possible while respecting privacy, and this challenge is amplified by the current “modern data landscape”.
Modern data landscape
The modern data landscape is a complex web of applications, storage systems, and data types, only some of which connect to each other in any usable way.
Figure 1. The modern data landscape
Data is everywhere. It is in the public cloud, in private clouds, at the edge, and resident in applications and services that you may not own but to which you have access. This leads to extreme cost, security, and privacy issues related to accomplishing the five “Cs” of modern data: Connect, Combine, Consume, and Collaborate with Compliance. Complying with laws, regulations, and internal and external policies while delivering data to as many people as possible at the lowest cost presents a major challenge. This, in turn, leads to a lack of peace and harmony within and across organizations. People are demanding more access, but technology leaders need to govern access for compliance while lowering cost.
In the past, these efforts led to a lot of data movement and centralization. However, the current data landscape makes physical data movement and centralization too complex and costly, leading people and organizations to make choices that are becoming suboptimal given the benefits that data virtualization can bring. These choices include:
- Accepting insufficient access to data.
- Investing in costly, monolithic data migrations.
- Conducting multiple initiatives to incrementally improve the data ecosystem.
Data virtualization technologies are emerging to solve the five “Cs” and relieve people and organizations of the tension that exists between data access, security, compliance, and cost. Data virtualization is all about the ability to interact with data without needing to understand the details of data location, data format, and data security.
Lightweight data virtualization
Conduit, a lightweight data virtualization solution from Microsoft Partner Blueprint Technologies, is a good example of an emerging technology that addresses the five “Cs.” It provides direct query access to numerous data sources, regardless of their location, through a centralized security model that enables decentralized access. Conduit’s Apache Spark-powered, lightweight data virtualization unifies your data access across the enterprise, regardless of source, type, or environment. From on-premises to hybrid and multi-cloud, from flat files to relational tables, Conduit allows you to unlock both no-code and programmatic developer data access for anyone.
Figure 2. Blueprint Technologies’ Conduit solution
Data virtualization helps unlock the value of your data in your environment, whether that is a single cloud, multi-cloud, on-premises, or any combination thereof. Using Conduit, for example, with a few clicks you can connect a variety of relational and non-relational data sources directly to your BI tool, analytics platform, and other tools such as Dynamics 365, regardless of native connectivity. Because direct query works against data at rest, you do not have to move the data, which lowers the cost of data access while giving you the most recent and relevant data. Conduit connects directly to several data sources not yet available in some popular reporting tools such as Power BI. For example, you can use Conduit with Power BI to directly connect to Azure Database for MySQL, Azure Database for MariaDB, Azure Database for PostgreSQL - Single Server, and Azure Database for PostgreSQL - Hyperscale (Citus). You can also expose your datasets as REST APIs for developers and automation initiatives.
With Conduit, an organization can centralize and simplify data security and compliance. Connections are run through a central security console that can integrate with Azure-based or on-premises Active Directory for a single sign-on experience. Or you can quickly create simple, role-based access rights within Conduit itself. Conduit also provides both no-code and programmatic developer access for users, from business analysts to data scientists, with a single connection. The endpoints stay consistent; as data is migrated or moved, the connection can be updated in a centralized location, eliminating access downtime.
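To make the idea of a stable endpoint with centralized, role-based access more concrete, here is a minimal Python sketch. It is not Conduit's API: the `ConnectionRegistry` class, its method names, and the example connection strings are all hypothetical, chosen only to illustrate how a logical name can stay fixed while the physical location behind it changes in one place.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    """One registered data source (hypothetical structure, for illustration)."""
    location: str                                   # physical location; may change over time
    allowed_roles: set = field(default_factory=set)  # roles permitted to use it


class ConnectionRegistry:
    """Central place that maps stable logical names to physical locations."""

    def __init__(self):
        self._connections = {}

    def register(self, name, location, allowed_roles):
        # Access rights are defined once, next to the connection itself.
        self._connections[name] = Connection(location, set(allowed_roles))

    def relocate(self, name, new_location):
        # Data was migrated: update one entry; the logical name is unchanged.
        self._connections[name].location = new_location

    def resolve(self, name, role):
        # Every lookup passes through the same role check.
        connection = self._connections[name]
        if role not in connection.allowed_roles:
            raise PermissionError(f"role {role!r} may not access {name!r}")
        return connection.location


registry = ConnectionRegistry()
registry.register("sales", "mysql://onprem-host/sales", {"analyst"})
before = registry.resolve("sales", "analyst")

# Simulate a migration to the cloud; users keep asking for "sales".
registry.relocate("sales", "mysql://cloud-host/sales")
after = registry.resolve("sales", "analyst")
```

In this sketch a user with the `analyst` role asks for `"sales"` before and after the migration and never needs to know that the underlying location changed, while a role that is not listed is rejected with a `PermissionError`.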
Conduit homogenizes queries across data types and storage environments so that disconnected datasets can be queried in parallel and easily combined in analysis and reporting. Users can quickly and easily combine CSV and JSON files in cloud storage with tables in relational databases without expensive engineering resources.
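As a rough illustration of what combining a CSV file, a JSON file, and a relational table in a single query can look like, the sketch below uses only the Python standard library. It is an assumption-laden stand-in, not how Conduit works: the sample data, table names, and function names are hypothetical, and the small samples are loaded into an in-memory SQLite database for simplicity, whereas a data virtualization engine would query the sources where they reside.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample data standing in for files in cloud storage.
CSV_TEXT = "customer_id,region\n1,East\n2,West\n"
JSON_TEXT = '[{"customer_id": 1, "clicks": 12}, {"customer_id": 2, "clicks": 7}]'


def build_query_layer():
    """Expose three differently stored datasets behind one SQL interface."""
    conn = sqlite3.connect(":memory:")

    # Source 1: an ordinary relational table.
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 40.0), (1, 10.0), (2, 25.0)],
    )

    # Source 2: a CSV file, parsed and presented as a table.
    conn.execute("CREATE TABLE regions (customer_id INTEGER, region TEXT)")
    csv_rows = csv.DictReader(io.StringIO(CSV_TEXT))
    conn.executemany(
        "INSERT INTO regions VALUES (?, ?)",
        [(int(row["customer_id"]), row["region"]) for row in csv_rows],
    )

    # Source 3: a JSON document, parsed and presented as a table.
    conn.execute("CREATE TABLE web_activity (customer_id INTEGER, clicks INTEGER)")
    conn.executemany(
        "INSERT INTO web_activity VALUES (?, ?)",
        [(item["customer_id"], item["clicks"]) for item in json.loads(JSON_TEXT)],
    )
    return conn


def combined_report(conn):
    """Join all three sources with one query, ignoring their original formats."""
    query = """
        SELECT r.region, SUM(o.amount) AS total_amount, w.clicks
        FROM orders AS o
        JOIN regions AS r ON r.customer_id = o.customer_id
        JOIN web_activity AS w ON w.customer_id = o.customer_id
        GROUP BY r.region, w.clicks
        ORDER BY r.region
    """
    return conn.execute(query).fetchall()


report = combined_report(build_query_layer())
# report == [('East', 50.0, 12), ('West', 25.0, 7)]
```

The point of the sketch is the final query: once every source is presented as a table, the person writing the report joins them without caring which one started life as a CSV file, a JSON file, or a database table.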
Data virtualization technologies are relieving the tension between data access and data governance. They provide secure data access with privacy and trust while lowering the cost of accomplishing the five “Cs” of modern data: Connect, Combine, Consume, and Collaborate with Compliance.