Setting up SLURM Job Accounting with Azure CycleCloud and Azure Database for MySQL Flexible Server

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Community Hub.

SLURM (Simple Linux Utility for Resource Management) is a highly configurable open-source workload manager used in high-performance computing (HPC) environments. Job accounting is a crucial aspect of SLURM, allowing system administrators to track resource usage, monitor job performance, and allocate resources efficiently.

SLURM's job accounting feature records various metrics related to job execution, such as:

Job IDs: Unique identifiers for each job submitted to the SLURM system.
User Information: Details about the user who submitted the job, such as username and user ID.
Resource Usage: Information on resources consumed during job execution, including CPU time, memory usage, and disk space.
Job States: Tracking the state transitions of jobs (e.g., pending, running, completed, failed).
Start and End Times: Timestamps indicating when a job started and finished execution.
Node Allocation: Details about the nodes allocated to the job, including the number of nodes, node names, and partition information.

By enabling job accounting, system administrators gain insights into resource utilization patterns, identify potential bottlenecks, and optimize resource allocation for better efficiency and cost-effectiveness.

SLURM provides various tools and utilities for managing job accounting data, including commands for querying job records, generating reports, and integrating with external databases for long-term storage and analysis.
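As a rough illustration of how the metrics listed above map onto a query, the sketch below asks "sacct" for one field per metric. The field names are standard "sacct" format fields; the variable ACCT_FIELDS and the function show_job_records are hypothetical names chosen for this example, and the block only prints a notice when "sacct" is not installed.

```shell
# Fields chosen to mirror the metrics above: IDs, user info, resource usage,
# state, start/end times, and node allocation.
ACCT_FIELDS="JobID,User,UID,CPUTime,MaxRSS,MaxDiskWrite,State,Start,End,NNodes,NodeList,Partition"

# Hypothetical helper: pass any extra sacct options through unchanged.
show_job_records() {
  sacct --format="$ACCT_FIELDS" "$@"
}

# Only query where the Slurm client tools are available.
if command -v sacct >/dev/null 2>&1; then
  show_job_records
else
  echo "sacct not found; run this on a node with the Slurm client tools"
fi
```

Extra options such as a job ID filter can be appended to the call, for example show_job_records --jobs=1.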

Overall, SLURM job accounting plays a crucial role in ensuring the effective management and optimization of computational resources in HPC environments.

This new blog serves as a continuation of my previous post, "Enabling Job Accounting for SLURM with Azure CycleCloud 8.2 and Azure MariaDB Database." Due to the retirement of Azure Database for MariaDB, scheduled for September 19, 2025, we are transitioning to the Azure Database for MySQL Flexible Server offering for configuring SlurmDBD for job accounting. In this blog, we'll explore the process of setting up SlurmDBD with Azure Database for MySQL Flexible Server to maintain efficient job accounting within SLURM.

Starting with Azure CycleCloud 8.1.0, the Slurm template supports enabling SlurmDBD on Slurm 20.11 and later. This post assumes you have access to Azure CycleCloud 8.6 and Azure Database for MySQL Flexible Server, which are used to set up the Slurm cluster and the SlurmDBD configuration respectively.

For the purpose of this demonstration, I've created a virtual network named "hpc" consisting of two subnets: "compute" and "mysql". The "compute" subnet is designated for the creation of CycleCloud VMs and the Slurm cluster. Meanwhile, the "mysql" subnet will be utilized for Azure Database for MySQL Flexible Server to facilitate the configuration of SlurmDBD.

Create an Azure Database for MySQL Flexible Server instance from the Azure portal.

Fill in the details to match your requirements: the database name, database username, password, region, MySQL version (8.0 in this example), workload type (Business Critical in this example), authentication method (MySQL authentication only in this example), and any other settings you need.

In the Networking section, opt for Private access (VNet Integration) and choose the previously established "mysql" subnet. Proceed to create the Azure Database for MySQL Flexible Server. Upon successful deployment and initialization of the database, you will obtain the necessary details essential for configuring Slurm's job accounting setup.
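For readers who prefer the command line, the portal steps above could be approximated with the Azure CLI. This is a minimal sketch and not the method used in this post: the resource group and region are hypothetical, the compute tier, SKU, and MySQL version are left at the CLI defaults (so they would need to be set explicitly to match the Business Critical and 8.0 choices made in the portal), and the password is read from an environment variable named MYSQL_ADMIN_PASSWORD, which is also an assumption. By default the block only prints what it would do.

```shell
# Names taken from the post.
SERVER_NAME="myslurmdb"
ADMIN_USER="dbauser"
VNET_NAME="hpc"
SUBNET_NAME="mysql"
# Hypothetical values, not given in the post.
RESOURCE_GROUP="hpc-rg"
LOCATION="eastus"

# Create the server with private access by placing it in the existing subnet.
# $1 is the admin password.
create_flexible_server() {
  az mysql flexible-server create \
    --resource-group "$RESOURCE_GROUP" \
    --name "$SERVER_NAME" \
    --location "$LOCATION" \
    --admin-user "$ADMIN_USER" \
    --admin-password "$1" \
    --vnet "$VNET_NAME" \
    --subnet "$SUBNET_NAME"
}

if [ "${RUN:-0}" = "1" ]; then
  # Stops with an error if the password variable is unset or empty.
  create_flexible_server "${MYSQL_ADMIN_PASSWORD:?set MYSQL_ADMIN_PASSWORD first}"
else
  echo "Dry run: would create $SERVER_NAME in $VNET_NAME/$SUBNET_NAME (set RUN=1 to execute)"
fi
```

Keeping the password out of the script and out of the dry-run output avoids leaving it in shell history or logs.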

To configure Slurm job accounting, gather the following details from the Azure Database for MySQL Flexible Server instance:

Server name: myslurmdb.mysql.database.azure.com
Server Admin username: dbauser
Server Admin Password: **********
SSL Certificate URL: https://dl.cacerts.digicert.com/DigiCertGlobalRootCA.crt.pem

Reference: https://learn.microsoft.com/en-us/azure/mysql/flexible-server/how-to-connect-tls-ssl#download-the-public-ssl-certificate
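Before entering these values in CycleCloud, it can help to confirm that the server is reachable over TLS with the downloaded certificate. The sketch below is one way to do that, assuming the Oracle MySQL command-line client (the MariaDB client does not accept --ssl-mode) and a VM inside the "hpc" virtual network, since the server uses private access. The function name check_db_tls and the certificate path are hypothetical; the check only runs when RUN=1 is set.

```shell
# Values from the list above.
DB_HOST="myslurmdb.mysql.database.azure.com"
DB_USER="dbauser"
CA_URL="https://dl.cacerts.digicert.com/DigiCertGlobalRootCA.crt.pem"
# Hypothetical local path for the downloaded certificate.
CA_FILE="${TMPDIR:-/tmp}/DigiCertGlobalRootCA.crt.pem"

check_db_tls() {
  # Download the CA certificate; -f makes curl fail on HTTP errors.
  curl -fsSL -o "$CA_FILE" "$CA_URL" || return 1
  # VERIFY_CA requires the server certificate to chain to the CA file.
  # -p prompts for the password so it is not stored in shell history.
  mysql --host="$DB_HOST" --user="$DB_USER" -p \
        --ssl-mode=VERIFY_CA --ssl-ca="$CA_FILE" \
        -e "SHOW SESSION STATUS LIKE 'Ssl_cipher';"
}

# Run only on request, and only where both tools are installed.
if [ "${RUN:-0}" = "1" ] && command -v curl >/dev/null 2>&1 && command -v mysql >/dev/null 2>&1; then
  check_db_tls
fi
```

A non-empty Ssl_cipher value in the result indicates that the session is encrypted.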

Now, let's incorporate these configurations into the advanced settings of the CycleCloud Slurm cluster. Begin by enabling Job Accounting, then proceed to add the following details:

Slurm DBD URL = Server Name from Azure Database
Slurm DBD User = Server Admin username
Slurm DBD Password = Admin Password
SSL Cert URL = https://dl.cacerts.digicert.com/DigiCertGlobalRootCA.crt.pem

After adding the required details to set up the Slurm Cluster, save the configuration and start the cluster. Once the cluster is operational, execute a sample job and examine "sacct" to verify the functionality of job accounting.

[vinil@slurm1-scheduler ~]$ srun hostname
slurm1-hpc-1
[vinil@slurm1-scheduler ~]$ sacct
JobID           JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
1              hostname        hpc                     1  COMPLETED      0:0
1.0            hostname                                1  COMPLETED      0:0
[root@slurm1-scheduler march-2024]# sacct --format=jobid,elapsed,ncpus,ntasks,state
JobID           Elapsed      NCPUS   NTasks      State
------------ ---------- ---------- -------- ----------
1              00:00:05          1           COMPLETED
1.0            00:00:00          1        1  COMPLETED
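If "sacct" returns nothing or reports an error, a quick check of how the controller is wired to the accounting database can narrow things down. The following is a small sketch using standard Slurm commands; the function name verify_accounting is hypothetical, and the expected values depend on how CycleCloud generated the configuration.

```shell
verify_accounting() {
  # The controller's accounting storage settings (type, host, and so on).
  scontrol show config | grep -i '^AccountingStorage'
  # The cluster should appear in the accounting database once it has registered.
  sacctmgr --noheader --parsable2 show cluster format=Cluster
}

# Only run where the Slurm client tools are available.
if command -v scontrol >/dev/null 2>&1 && command -v sacctmgr >/dev/null 2>&1; then
  verify_accounting
else
  echo "Slurm client tools not found; run this on the scheduler node"
fi
```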

You can also retrieve job statistics for a particular user or a specific cluster. Refer to the "sacct" documentation for additional examples and guidance.
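As one example of such a query, the sketch below totals CPU-seconds from "sacct" rows. The function name sum_cpu_seconds is hypothetical; it expects pipe-separated rows in the order JobID, Elapsed, NCPUS, State, which is what "sacct --parsable2 --noheader --format=JobID,Elapsed,NCPUS,State" produces. The sample row mirrors job 1 from the output above, so the block runs without a cluster.

```shell
# Sum Elapsed x NCPUS over sacct rows read from stdin.
# Elapsed is expected as [DD-]HH:MM:SS.
sum_cpu_seconds() {
  awk -F'|' '
    {
      days = 0; t = $2
      # Split off an optional "DD-" day prefix.
      if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
      split(t, p, ":")
      secs = days * 86400 + p[1] * 3600 + p[2] * 60 + p[3]
      total += secs * $3
    }
    END { print total + 0 }
  '
}

# Sample row: job 1 ran for 5 seconds on 1 CPU.
printf '%s\n' '1|00:00:05|1|COMPLETED' | sum_cpu_seconds   # prints 5
```

On a live cluster, the same function could be fed from a per-user query such as: sacct --user=vinil --allocations --noheader --parsable2 --format=JobID,Elapsed,NCPUS,State | sum_cpu_seconds. The --allocations option restricts output to one row per job, so job steps such as 1.0 are not counted twice; adding --clusters=<name> limits the query to a specific cluster.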

To sum up, this blog provides a detailed walkthrough for configuring SLURM job accounting using Azure CycleCloud and Azure Database for MySQL Flexible Server. It equips administrators with the necessary tools to efficiently manage and enhance resource utilization in HPC environments. If you've found this blog helpful, please consider liking or commenting below to help me gauge its usefulness to you. Your feedback is invaluable in shaping future content.

Reference:

Quickstart: Use the Azure portal to create an Azure Database for MySQL - Flexible Server instance

Azure Database for MariaDB will be retired on 19 September 2025 – Migrate to Azure Database for MySQL Flexible Server

Azure CycleCloud documentation

Slurm Job Accounting
