The Technical Professional’s Learning Scope for SQL Server Big Data Clusters

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

If you’re a data professional, you know how important it is to set aside time for training when your platform ships a new release or introduces a new paradigm for building solutions. In the case of SQL Server 2019 (and later), you’ll want to pay close attention to the Big Data Clusters feature. It brings a steep increase in what you need to know, and that’s no exaggeration.


There’s a lot to learn to implement a SQL Server Big Data Cluster. I’ll be covering these topics in more depth at workshops, events, courses, webinars and presentations around the world, and I thought I’d show a few of the things a data professional needs to understand to get ready.


Some of these technologies and concepts are not owned or created by Microsoft – the concepts are universal, and a few of the technologies are open-source. I’ve marked those in italics.


I’ve also included a few links to training resources I’ve found useful. You can use LinkedIn Learning for larger courses, along with edX, DataCamp, and many other platforms for in-depth training. The links indicated here are by no means exhaustive, but they are free and provide a good starting point.


Look for the training announcements we’ll post here on this blog to find out where our team is presenting these topics, and feel free to post comments on resources you have found useful.

Technology – Description

Linux – Operating system used in Containers and Container management (Kubernetes)

git – Source control management system

Containers – Encapsulation level for the SQL Server Big Data Cluster architecture

Kubernetes – Management, control plane and security for Containers

Microsoft Azure – Cloud environment for services

Azure Kubernetes Service (AKS) – Kubernetes as a Service

Apache HDFS – Scale-out storage subsystem

Apache Spark – In-memory, large-scale, scale-out data processing architecture used by SQL Server

Python, R, Java, SparkML – Programming languages and libraries used for Machine Learning and AI model creation

Azure Data Studio – Tooling for SQL Server, HDFS, and Kubernetes cluster management, with support for the T-SQL, R, Python, and SparkML languages

SQL Server Machine Learning Services – R, Python and Java extensions for SQL Server

Microsoft Team Data Science Process (TDSP) – Project, Development, Control and Management framework

Monitoring and Management – Dashboards, logs, APIs and other constructs to manage and monitor the solution

Security – RBAC, Keys, Secrets, VNETs and Compliance for solutions


If that looks like a lot, it’s because it’s a lot. Stay tuned – we're with you on the journey. We’ll learn together.
