Run Spack on Azure and integrate the build cache with Azure Blob Storage


If your team runs high-performance computing (HPC) workloads on Azure, you can use a package manager to save time. Spack is a popular open-source package management tool, written in Python, that builds software packages from source. Spack allows you to rapidly and consistently install many HPC packages using build scripts called recipes, which describe how to fetch, configure, and build a piece of software along with its dependencies. More than 2,500 recipes for HPC packages already exist.
In this post, we explore using Spack to build and install packages on Azure Virtual Machines for HPC workloads (HBv2, HB, and HC). For testing purposes, this deployment takes advantage of the MPI libraries installed on the CentOS-HPC 7.7 image available from Azure Marketplace. It also integrates the Spack build cache with Azure Blob storage to create a repository for the pre-built binaries.
To show how this process works, we will use Spack to build OSU micro-benchmarks that measure the performance of the CentOS MPI libraries. The steps below explain how to upload the build cache to Blob storage, install OSU micro-benchmarks using the Blob storage build cache, and use Portable Batch System (PBS) scripts to run the resulting executables.


Install Spack on Azure

The azurehpc GitHub repository contains scripts (see the apps/spack directory) that automatically install Spack on Azure, set up suitable configuration files, and integrate the CentOS-HPC 7.7 MPI libraries. The repository also provides example PBS scripts that you can use to run the OSU micro-benchmarks with the different CentOS-HPC 7.7 MPI libraries (Open MPI, MVAPICH2, HPCx, and Intel MPI).
To clone the GitHub repository:

git clone git@github.com:Azure/azurehpc.git

 

You can use the azurehpc scripts as is for testing, but you’ll probably want to customize your Spack installation. To do that, edit the configuration files (config.yaml, modules.yaml, compilers.yaml, and packages.yaml). The azurehpc Spack installation script (build_spack.sh) creates these files in ~hpcuser/.spack with suitable defaults.


The example scripts do the following:

 

  • Install all software packages in /apps/spack/$sku_type, where sku_type is hbv2, hb, or hc to indicate the target processor architecture (config.yaml).
  • Generate the Tcl module files at /apps/modulefiles/spack/tcl/$sku_type and the Lmod module files in /apps/modules/spack/lmod/$sku_type (config.yaml).
  • Use 16 processes by default for a parallel build (for example, make -j) (config.yaml).
  • Use the CentOS-HPC 7.7 MPI libraries rather than rebuilding them; see the packages.yaml and compilers.yaml configuration files for details. (A sketch of the generated config.yaml appears after this list.)
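
For reference, here is a rough sketch of the config.yaml that build_spack.sh generates, shown for an HBv2 node. The paths and job count mirror the list above; the exact keys are an assumption that depends on your Spack version (newer releases move the module roots into modules.yaml):

cat ~hpcuser/.spack/config.yaml
config:
  install_tree: /apps/spack/hbv2
  module_roots:
    tcl: /apps/modulefiles/spack/tcl/hbv2
    lmod: /apps/modules/spack/lmod/hbv2
  build_jobs: 16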


Note: Make sure you build on one of the virtual machine types designed for HPC workloads—HBv2, HB, or HC.


Build and install OSU micro-benchmarks

To build OSU micro-benchmarks with MVAPICH2 (from the CentOS-HPC 7.7 image), run the following command on the virtual machine:

spack install osu-micro-benchmarks%gcc@9.2.0^mvapich2@2.3.2 

Where %gcc@9.2.0^mvapich2@2.3.2 is the spec, the syntax you use in Spack to specify versions and configuration options. Here, the spec tells Spack to use the gcc@9.2.0 compiler and the mvapich2@2.3.2 MPI library.
To see the detailed installation options and software dependencies for a given package, use:

spack info PACKAGE_NAME
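
Before kicking off a build, it can also help to preview how Spack will concretize a spec. For example (the exact output depends on your configuration; the CentOS-HPC MPI libraries should appear as external packages rather than packages to rebuild):

spack spec osu-micro-benchmarks%gcc@9.2.0^mvapich2@2.3.2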

 

Similarly, you can build OSU micro-benchmarks using Open MPI:

spack install osu-micro-benchmarks%gcc@9.2.0^openmpi@4.0.2 


Or HPCx:

spack install osu-micro-benchmarks%gcc@9.2.0^hpcx@2.5.0 


Or Intel MPI:

. /opt/intel/impi/2019.5.281/intel64/bin/mpivars.sh 
spack install --dirty osu-micro-benchmarks%gcc@9.2.0^intel-mpi@2019.5.281


The --dirty installation option tells Spack to keep the current shell environment when building the software, which is needed here because the Intel MPI environment is set by sourcing mpivars.sh.
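
Once the builds finish, you can confirm what was installed, and note the hashes you will need when populating the build cache, with spack find:

spack find -l -d osu-micro-benchmarks

The -l option prints the short hash of each installation and -d shows its dependencies, so you can tell the MVAPICH2, Open MPI, HPCx, and Intel MPI builds apart.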


Set up the build cache on Blob storage

Creating the build cache on Azure is a two-step procedure. The first step is to create the build cache locally. Then you upload the local build cache to Blob storage. After your software is in a build cache on Blob storage, you can install it at any time without recompiling the package.

 

   1. To create the local build cache, go to the home directory of the virtual machine and run:

cd ~hpcuser 
mkdir -p buildcache/${sku_type} 
cd buildcache/${sku_type} 
spack buildcache create -k ${sku_type}_gpg SPEC

Where SPEC corresponds to the installed software (for example, osu-micro-benchmarks%gcc@9.2.0^openmpi@4.0.2), and the -k option specifies the GPG key used to sign the software. (A GPG key was generated as part of the azurehpc Spack installation; to see all available GPG keys, use spack gpg list.)
If you need to retrieve software from the build cache on other machines later, remember to export your GPG keys using spack gpg export.
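
For example, to export the azurehpc-generated signing key so that another machine can verify packages from the build cache (a sketch; the key name matches the -k option above, and the output file name is arbitrary):

spack gpg export ${sku_type}_gpg.pub ${sku_type}_gpg

Then, on a machine that will install from the build cache, trust the exported key:

spack gpg trust ${sku_type}_gpg.pub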

   2. To upload local build cache to your Blob storage account, use AzCopy V10 as follows:

azcopy sync "/share/home/hpcuser/buildcache" "<STORAGE_ENDPOINT>/buildcache<SAS_KEY>" --recursive
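
As a concrete (hypothetical) example, with a storage account named mystorageaccount, a container named buildcache, and a SAS token generated for that container, the command looks like this:

azcopy sync "/share/home/hpcuser/buildcache" "https://mystorageaccount.blob.core.windows.net/buildcache?<SAS_KEY>" --recursive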

   3. Tell Spack to use the Blob storage account as a build cache:

 

spack mirror add ${sku_type}_buildcache "<STORAGE_ENDPOINT>/buildcache/${sku_type}<SAS_KEY>"

   4. Check that you can see all the software (identified by <SPEC> or <HASH>) available in the build cache:

spack buildcache list

   5. To install software from the build cache:

spack buildcache install <SPEC or HASH>
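
For example, to install the MVAPICH2 build of the OSU micro-benchmarks on a freshly provisioned node directly from the build cache:

spack buildcache install osu-micro-benchmarks%gcc@9.2.0^mvapich2@2.3.2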

   6. If you want to add software to an already existing build cache on Blob storage, copy the build cache from Blob storage to your local directory, add the new software to the local build cache, and then copy the local build cache back to Blob storage, as shown in the sketch below.
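
A sketch of that round trip, using the same placeholders as the steps above (NEW_SPEC stands for the spec of the software you are adding):

azcopy sync "<STORAGE_ENDPOINT>/buildcache<SAS_KEY>" "/share/home/hpcuser/buildcache" --recursive
cd ~hpcuser/buildcache/${sku_type}
spack buildcache create -k ${sku_type}_gpg NEW_SPEC
azcopy sync "/share/home/hpcuser/buildcache" "<STORAGE_ENDPOINT>/buildcache<SAS_KEY>" --recursive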

 

Test the installed software

We will use PBS to test the OSU micro-benchmarks built with Spack. The PBS run scripts are available in the azurehpc repository. The following test uses the MVAPICH2 osu_bw/osu_latency PBS run script (osu_bw_latency_mvapich2.pbs). Similar scripts are available for the other MPI libraries.


#!/bin/bash
SHARED_APPS=/apps

# Run one thread per MPI rank.
export OMP_NUM_THREADS=1

# Load the compiler and MPI modules from the CentOS-HPC 7.7 image, then put
# the Spack-built OSU micro-benchmark binaries on the PATH.
module load gcc-9.2.0
module load mpi/mvapich2-2.3.2
spack load osu-micro-benchmarks^mvapich2

# Show the nodes allocated to this job.
cat $PBS_NODEFILE

# Point-to-point bandwidth, then latency, between the two MPI ranks.
mpirun osu_bw
sleep 2
mpirun osu_latency

 

With Spack’s spec syntax, you can specify the software to load using just enough of the name to uniquely identify it—for example, spack load osu-micro-benchmarks^mvapich2. You can also load the software with regular module load syntax if you prefer.
Assuming you have two HBv2 nodes, each running a single MPI process, you would submit the test script as follows:

qsub -l select=2:ncpus=120:mpiprocs=1 osu_bw_latency_mvapich2.pbs 
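
When the job completes, PBS writes its standard output to a file in the submission directory; the osu_bw and osu_latency result tables appear there. The file name below assumes the default PBS naming of <script_name>.o<job_id>; adjust it if your site names job output differently:

cat osu_bw_latency_mvapich2.pbs.o*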


Summary

Spack can really save you time when managing HPC clusters on Azure, because you don’t have to build code and libraries by hand. It’s also very flexible. You can easily customize it to suit your requirements and get started quickly using the recipes that already exist for more than 2,500 HPC packages. It’s pretty easy to write your own recipes, too. For more information, see the Spack website.
