Slurm Factory

Slurm Factory is a modern Python CLI tool that builds and deploys GPG-signed, relocatable Slurm workload manager packages for HPC environments. It can install from a public binary cache of cryptographically signed packages in minutes, or build custom packages in Docker with Spack when you have specific requirements.

Key Features

  • 🔐 GPG-Signed Packages: All packages cryptographically signed for security and integrity
  • Public Binary Cache: Pre-built packages at slurm-factory-spack-binary-cache.vantagecompute.ai
  • Instant Deployment: Install from cache in 5-15 minutes instead of 45-90 minutes of compilation
  • 📦 Relocatable Packages: Deploy to any filesystem path without recompilation
  • 🔧 Two Simple Commands: build-slurm for Slurm packages, build-compiler for GCC toolchains
  • 🏗️ Modern Architecture: Built with Python, Typer CLI, comprehensive test coverage
  • 🎮 GPU Support: CUDA/ROCm-enabled builds for GPU-accelerated HPC workloads
  • 🔄 Automated CI/CD: GitHub Actions workflows maintain the GPG-signed public buildcache
  • 📊 Module System: Lmod modules for easy environment management
  • 🌐 Global CDN: CloudFront distribution for fast worldwide access
  • 💾 Intelligent Caching: Multi-layer caching (Docker, binary packages, source archives)

Support Matrix

Slurm Factory supports Slurm builds for multiple OS distributions. All builds use the system compiler and are available as GPG-signed packages in the public buildcache:

Slurm Versions

  • 25.11 (Latest - recommended)
  • 24.11 (LTS)
  • 23.11 (Stable)

OS Toolchains

All toolchains use the system GCC and glibc from the base OS:

  • noble (Ubuntu 24.04 - Recommended, GCC 13.2, glibc 2.39)
  • jammy (Ubuntu 22.04 - LTS, GCC 11.2, glibc 2.35)
  • focal (Ubuntu 20.04 - Legacy, GCC 9.4, glibc 2.31)
  • rockylinux10 (Rocky Linux 10 / RHEL 10, GCC 14.2, glibc 2.40)
  • rockylinux9 (Rocky Linux 9 / RHEL 9, GCC 11.4, glibc 2.34)
  • rockylinux8 (Rocky Linux 8 / RHEL 8, GCC 8.5, glibc 2.28)
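
| Slurm Version | Toolchain    | Build Type | Use Case                        |
|---------------|--------------|------------|---------------------------------|
| 25.11         | noble        | CPU/GPU    | Recommended - Latest Ubuntu LTS |
| 24.11         | jammy        | CPU/GPU    | Long-term support production    |
| 25.11         | rockylinux10 | CPU/GPU    | RHEL 10 compatible              |
| 25.11         | rockylinux9  | CPU/GPU    | RHEL 9 compatible               |
| 23.11         | rockylinux8  | CPU        | RHEL 8 compatibility            |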
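
To pick a toolchain name on a given host, one option is to map the ID and VERSION_ID fields of /etc/os-release onto the names listed above. The sketch below assumes that mapping; toolchain_for is a hypothetical helper and not part of slurm-factory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map os-release ID/VERSION_ID to a toolchain name
# from the OS Toolchains list. The mapping itself is an assumption.

# toolchain_for ID VERSION_ID
toolchain_for() {
    local id="$1" version="$2"
    case "$id:$version" in
        ubuntu:24.04)       echo noble ;;
        ubuntu:22.04)       echo jammy ;;
        ubuntu:20.04)       echo focal ;;
        rocky:10*|rhel:10*) echo rockylinux10 ;;
        rocky:9*|rhel:9*)   echo rockylinux9 ;;
        rocky:8*|rhel:8*)   echo rockylinux8 ;;
        *)
            echo "unsupported OS: $id $version" >&2
            return 1 ;;
    esac
}

# Example with explicit values
toolchain_for ubuntu 24.04
```

On a real host the two arguments could come from sourcing /etc/os-release in a subshell, for example: ( . /etc/os-release; toolchain_for "$ID" "$VERSION_ID" ).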

All builds are available in the buildcache:

https://slurm-factory-spack-binary-cache.vantagecompute.ai/
├── compilers/<TOOLCHAIN>/ # GPG-signed system compiler packages
├── deps/<TOOLCHAIN>/ # GPG-signed Slurm dependencies
├── slurm/<SLURM_VERSION>/<TOOLCHAIN>/ # GPG-signed Slurm packages
└── builds/<SLURM_VERSION>/<TOOLCHAIN>/ # Pre-built tarballs with signatures
    ├── slurm-<VERSION>-<TOOLCHAIN>-software.tar.gz
    └── slurm-<VERSION>-<TOOLCHAIN>-software.tar.gz.asc # GPG signature
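
The layout above can be turned into mirror URLs with a few lines of shell. This is a minimal sketch that only assembles strings from the directory tree shown; buildcache_url and CACHE_URL are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build buildcache URLs from the documented layout.
CACHE_URL="https://slurm-factory-spack-binary-cache.vantagecompute.ai"

# buildcache_url KIND TOOLCHAIN [SLURM_VERSION]
#   KIND is one of: compilers, deps, slurm, builds
buildcache_url() {
    local kind="$1" toolchain="$2" slurm_version="${3:-}"
    case "$kind" in
        compilers|deps)
            # These trees are keyed by toolchain only
            printf '%s/%s/%s\n' "$CACHE_URL" "$kind" "$toolchain" ;;
        slurm|builds)
            # These trees are keyed by Slurm version, then toolchain
            if [ -z "$slurm_version" ]; then
                echo "error: $kind requires a Slurm version" >&2
                return 1
            fi
            printf '%s/%s/%s/%s\n' "$CACHE_URL" "$kind" "$slurm_version" "$toolchain" ;;
        *)
            echo "error: unknown kind: $kind" >&2
            return 1 ;;
    esac
}

buildcache_url deps noble
buildcache_url slurm noble 25.11
```

The printed URLs are suitable as the second argument to spack mirror add.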

GPG Package Signing

All packages in the buildcache are cryptographically signed with GPG for security:

Key Information:

  • Key ID: DFB92630BCA5AB71
  • Fingerprint: 9C4E 8B2F 3A1D 5E6C 7F8A 9B0D DFB9 2630 BCA5 AB71
  • Owner: Vantage Compute Corporation (Slurm Factory Spack Cache Signing Key)
  • Email: info@vantagecompute.ai

Why GPG Signing?

  • Authenticity: Verify packages were built by Vantage Compute
  • Integrity: Detect tampering or corruption during download
  • Security: Prevent man-in-the-middle attacks
  • Trust Chain: Establish provenance for production deployments
  • Compliance: Meets security requirements for production deployments

Keys are imported and trusted by running spack buildcache keys --install --trust after the buildcache mirrors are added. See GPG Package Verification below for details.

Quick Start

Choose your installation method:

Option 1: Install from Buildcache with Spack

No slurm-factory tool needed - just Spack:

# Install Spack v1.0.0
git clone --depth 1 --branch v1.0.0 https://github.com/spack/spack.git
source spack/share/spack/setup-env.sh

# Set versions and configure three-tier mirrors
SLURM_VERSION=25.11
COMPILER_VERSION=15.2.0
CLOUDFRONT_URL=https://slurm-factory-spack-binary-cache.vantagecompute.ai

spack mirror add slurm-factory-build-toolchain "${CLOUDFRONT_URL}/compilers/${COMPILER_VERSION}"
spack mirror add slurm-factory-slurm-deps "${CLOUDFRONT_URL}/deps/${COMPILER_VERSION}"
spack mirror add slurm-factory-slurm "${CLOUDFRONT_URL}/slurm/${SLURM_VERSION}/${COMPILER_VERSION}"

# Import GPG keys and install Slurm (5-15 minutes)
spack buildcache keys --install --trust
spack install slurm@${SLURM_VERSION}%gcc@${COMPILER_VERSION} target=x86_64_v3

→ See the complete guide: Installing Slurm from Buildcache

Option 1b: Download Pre-built Tarball (Alternative)

Download a complete Slurm installation as a GPG-signed tarball:

# Set versions
SLURM_VERSION=25.11
COMPILER_VERSION=15.2.0
CLOUDFRONT_URL=https://slurm-factory-spack-binary-cache.vantagecompute.ai

# Download tarball and GPG signature
wget "${CLOUDFRONT_URL}/builds/${SLURM_VERSION}/${COMPILER_VERSION}/slurm-${SLURM_VERSION}-gcc${COMPILER_VERSION}-software.tar.gz"
wget "${CLOUDFRONT_URL}/builds/${SLURM_VERSION}/${COMPILER_VERSION}/slurm-${SLURM_VERSION}-gcc${COMPILER_VERSION}-software.tar.gz.asc"

# Import GPG key and verify signature
gpg --keyserver keyserver.ubuntu.com --recv-keys DFB92630BCA5AB71
gpg --verify slurm-${SLURM_VERSION}-gcc${COMPILER_VERSION}-software.tar.gz.asc \
slurm-${SLURM_VERSION}-gcc${COMPILER_VERSION}-software.tar.gz

# Extract and install (only if signature is valid!)
sudo tar -xzf slurm-${SLURM_VERSION}-gcc${COMPILER_VERSION}-software.tar.gz -C /opt/
cd /opt && sudo ./data/slurm_assets/slurm_install.sh --full-init

→ See GPG verification guide: Verifying GPG Signatures

→ Full workflow and Dockerfiles: Installing Slurm from Tarball

Option 2: Build Custom Slurm with slurm-factory Tool

Install the slurm-factory tool for custom builds:

# Install Docker
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER && newgrp docker

# Install slurm-factory from PyPI
pip install slurm-factory

# Build Slurm (~45 minutes)
slurm-factory build-slurm --slurm-version 25.11 --compiler-version 13.4.0

→ See the complete guide: Installing slurm-factory Tool

Two Primary Commands

build-compiler

Build GCC compiler toolchains for use with Slurm builds:

# Build GCC 13.4.0 (default recommended version)
slurm-factory build-compiler

# Build specific version
slurm-factory build-compiler --compiler-version 14.2.0

# Build and publish to buildcache (requires AWS credentials)
slurm-factory build-compiler --compiler-version 13.4.0 --publish

Supported Versions: 15.2.0, 14.2.0, 13.4.0, 12.5.0, 11.5.0, 10.5.0, 9.5.0, 8.5.0, 7.5.0

Output: Docker image and optional buildcache upload to S3

build-slurm

Build complete Slurm packages with all dependencies:

# Standard build (CPU-optimized, ~2-5GB)
slurm-factory build-slurm --slurm-version 25.11

# GPU support (includes CUDA/ROCm, ~15-25GB)
slurm-factory build-slurm --slurm-version 25.11 --gpu

# Use specific compiler version
slurm-factory build-slurm --slurm-version 25.11 --compiler-version 14.2.0

# Publish to buildcache (requires AWS credentials)
slurm-factory build-slurm --slurm-version 25.11 --publish=all

Supported Versions: 25.11, 24.11, 23.11

Output: Tarball at ~/.slurm-factory/builds/slurm-{version}-gcc{compiler}-software.tar.gz

Public Buildcache

Pre-built packages are available at slurm-factory-spack-binary-cache.vantagecompute.ai via global CloudFront CDN.

Available Packages

Compilers

  • URL: https://slurm-factory-spack-binary-cache.vantagecompute.ai/compilers/{version}/
  • Versions: GCC 7.5.0 through 15.2.0
  • Includes: gcc, gcc-runtime, binutils, gmp, mpfr, mpc, zlib-ng, zstd

Slurm Packages

  • URL: https://slurm-factory-spack-binary-cache.vantagecompute.ai/slurm/{slurm_version}/{compiler_version}/
  • Slurm Versions: 25.11, 24.11, 23.11
  • Compiler Combinations: Each Slurm version × each GCC version
  • Includes: All dependencies (OpenMPI, OpenSSL, Munge, PMIx, HDF5, etc.)

Benefits

  • 10x Faster: Install in 5-15 minutes vs 45-90 minutes building from source
  • 🔒 Verified Builds: All packages GPG-signed and tested via GitHub Actions CI/CD
  • 🌐 Global CDN: CloudFront distribution for fast worldwide access
  • 🔄 Always Current: Automated workflows keep packages up-to-date
  • 💾 Storage Efficient: Download only what you need (2-25GB vs 50GB build requirements)

Usage Example

Install Slurm 25.11 from Buildcache with GCC 13.4.0

# Add the Slurm Factory Spack repo
spack repo add https://github.com/vantagecompute/slurm-factory-spack-repo ~/slurm-factory-spack-repo

# Add the Slurm Factory mirrors for Slurm built with gcc@13.4.0
spack mirror add slurm-factory-compiler \
  https://slurm-factory-spack-binary-cache.vantagecompute.ai/compilers/13.4.0/
spack mirror add slurm-factory-deps \
  https://slurm-factory-spack-binary-cache.vantagecompute.ai/deps/13.4.0/
spack mirror add slurm-factory-slurm \
  https://slurm-factory-spack-binary-cache.vantagecompute.ai/slurm/25.11/13.4.0/

spack buildcache keys --install --trust
spack install slurm@25.11

Install Slurm 25.11 Tarball from Buildcache

# Download and install tarball (placeholder - see installation guide for details)
wget https://slurm-factory-spack-binary-cache.vantagecompute.ai/builds/25.11/15.2.0/slurm-25.11-gcc15.2.0-software.tar.gz
sudo tar -xzf slurm-25.11-gcc15.2.0-software.tar.gz -C /opt/
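
For anything beyond a quick test, extraction is better gated on the signature check. The following is a minimal sketch, assuming the .asc signature sits next to the tarball and the signing key has already been imported; verify_and_extract is a hypothetical helper name. Note that gpg --verify confirms the signature is valid for a key in the keyring; confirming that the key is the expected one is a separate fingerprint check.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract a tarball only if its detached GPG
# signature (TARBALL.asc) verifies.

# verify_and_extract TARBALL DEST
#   returns 2 if inputs are missing, 1 if verification fails
verify_and_extract() {
    local tarball="$1" dest="$2" sig="$1.asc"

    # Both the tarball and its detached signature must be present
    if [ ! -f "$tarball" ] || [ ! -f "$sig" ]; then
        echo "error: need both $tarball and $sig" >&2
        return 2
    fi

    # Refuse to extract when the signature does not verify
    if ! gpg --verify "$sig" "$tarball"; then
        echo "error: signature check failed; not extracting" >&2
        return 1
    fi

    tar -xzf "$tarball" -C "$dest"
}

# Usage (run as root or via sudo when DEST is /opt):
#   verify_and_extract slurm-25.11-gcc15.2.0-software.tar.gz /opt
```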

GPG Package Verification

All packages in the Slurm Factory buildcache are cryptographically signed with GPG for security and integrity.

Importing GPG Keys

# After adding a buildcache mirror, import the signing keys
spack buildcache keys --install --trust

# Verify keys are imported
spack gpg list

Automatic Verification

Once keys are imported, Spack automatically verifies signatures:

# Install with signature verification (default behavior)
spack install slurm@25.11

# Spack will verify the GPG signature before installation

Why This Matters

  • Integrity: Ensures packages haven't been tampered with
  • Authenticity: Confirms packages were published by Slurm Factory
  • Trust: Cryptographic proof of package origin
  • Compliance: Meets security requirements for production deployments

Manual Key Verification

For production environments, verify the key fingerprint:

# Import the public key
spack buildcache keys --install --trust

# Verify the fingerprint
gpg --list-keys --keyid-format LONG

# Should show:
# pub rsa4096/DFB92630BCA5AB71 2025-01-XX
# 9C4E 8B2F 3A1D 5E6C 7F8A 9B0D DFB9 2630 BCA5 AB71

All packages are signed during the CI/CD build process and verified before deployment.
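
The fingerprint comparison can also be scripted instead of checked by eye. A minimal sketch, assuming the fingerprint and key ID published in this document and a key that is visible to the default gpg keyring; normalize_fpr, key_id_of, fpr_matches, and installed_fpr are hypothetical helper names. It relies on the long key ID of a version 4 OpenPGP key being the last 16 hex digits of its fingerprint.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the signing key fingerprint.
EXPECTED_FPR="9C4E 8B2F 3A1D 5E6C 7F8A 9B0D DFB9 2630 BCA5 AB71"
EXPECTED_KEY_ID="DFB92630BCA5AB71"

# normalize_fpr FPR: strip whitespace and uppercase the hex digits
normalize_fpr() {
    printf '%s' "$1" | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]'
}

# key_id_of FPR: last 16 hex digits of the fingerprint (long key ID)
key_id_of() {
    normalize_fpr "$1" | tail -c 16
}

# fpr_matches ACTUAL: succeed only if ACTUAL equals the published fingerprint
fpr_matches() {
    [ "$(normalize_fpr "$1")" = "$(normalize_fpr "$EXPECTED_FPR")" ]
}

# installed_fpr KEY_ID: fingerprint of the primary key as reported by gpg
# (field 10 of the first "fpr" record in --with-colons output)
installed_fpr() {
    gpg --with-colons --fingerprint "$1" | awk -F: '$1 == "fpr" { print $10; exit }'
}

# Usage, once the key has been imported:
#   fpr_matches "$(installed_fpr "$EXPECTED_KEY_ID")" && echo "fingerprint OK"
```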

Build Types Comparison

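| Build Type  | Dependencies  | Size    | Build Time | Buildcache Time | Use Case            |
|-------------|---------------|---------|------------|-----------------|---------------------|
| CPU-only    | ~45 packages  | 2-5GB   | 35-45 min  | 5-10 min        | Production clusters |
| GPU-enabled | ~180 packages | 15-25GB | 75-90 min  | 15-20 min       | GPU/CUDA clusters   |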

Requirements

For Using Buildcache

  • Python 3.12+ (for slurm-factory tool - optional, can use Spack directly)
  • Spack v1.0.0+
  • 10-25GB disk space (depending on build type)
  • Internet connection for buildcache download

For Local Builds

  • Python 3.12+
  • Docker 24.0+
  • 50GB disk space
  • 4+ CPU cores, 16GB RAM recommended
  • Internet connection for initial Docker image pull
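
A quick pre-flight check can catch a full disk before a long build starts. This is a minimal sketch using POSIX df; free_gb and has_space are hypothetical helper names, and the thresholds simply mirror the disk figures listed above (treated as GiB for simplicity).

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the disk space requirements.

# free_gb PATH: whole GiB available on the filesystem holding PATH
free_gb() {
    # -P forces one line per filesystem, -k reports 1024-byte blocks;
    # column 4 is the available space
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# has_space PATH NEEDED_GB: succeed if at least NEEDED_GB is available
has_space() {
    local free
    free="$(free_gb "$1")" || return 1
    [ -n "$free" ] && [ "$free" -ge "$2" ]
}

# Usage:
#   has_space "$HOME" 50 || echo "less than 50GB free for a local build" >&2
#   has_space /opt 25    || echo "less than 25GB free for a GPU install" >&2
```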

GitHub Actions CI/CD

Slurm Factory uses three GitHub Actions workflows to maintain the public buildcache:

  1. build-and-publish-compiler-buildcache.yml: Builds and publishes GCC compiler toolchains
  2. build-and-publish-slurm-deps-all-compilers.yml: Builds Slurm dependencies for all compiler combinations
  3. build-and-publish-all-packages.yml: Builds complete Slurm packages (Slurm + dependencies)

All workflows use:

  • AWS OIDC authentication for secure S3 access
  • Self-hosted runners for faster builds
  • Matrix builds for parallel execution
  • Automated testing of buildcache installations

See Contributing Guide for details on setting up CI/CD workflows.

Use Cases

  • HPC Cluster Deployment: Standardized Slurm installations across heterogeneous clusters
  • Development Environments: Quick Slurm setup for testing without lengthy compilation
  • Multi-Version Support: Running different Slurm versions side-by-side with module system
  • Performance Testing: Optimized builds for specific hardware configurations
  • Container Deployment: Portable packages for containerized HPC environments
  • Air-Gapped Installations: Download buildcache once, deploy offline
  • Research Computing Centers: Standardize Slurm deployments across multiple clusters
  • Cloud HPC Providers: Rapidly provision clusters with consistent, tested software stacks
  • Educational Institutions: Provide reproducible HPC environments for teaching and research
  • Industry HPC: Deploy compliance-ready solutions with full audit trails and security
  • CI/CD Pipelines: Automated testing and validation of HPC software stacks

Architecture

Slurm Factory uses a modern, modular architecture:

Key Components:

  • Typer CLI: Auto-completion, rich help text, type-safe command validation
  • Pydantic Configuration: Type-safe settings with environment variable support
  • Docker Isolation: Reproducible builds with version-controlled dependencies
  • Dynamic spack.yaml: Programmatically generated Spack environment specs
  • GPG Signing: Automatic cryptographic signing of all packages
  • Multi-Layer Caching: Docker layers, binary packages, source archives, compiler cache

Infrastructure

Slurm Factory is supported by comprehensive AWS infrastructure:

Components:

  • S3 Buildcache Bucket: slurm-factory-spack-buildcache-4b670
  • CloudFront Distribution: Global CDN for fast buildcache access
  • Route53 DNS: slurm-factory-spack-binary-cache.vantagecompute.ai
  • GitHub OIDC: Secure, keyless authentication for CI/CD
  • AWS CDK: Infrastructure as code for reproducible deployments

CI/CD Workflows:

Three GitHub Actions workflows maintain the buildcache:

  1. Compiler Buildcache: Build and publish GCC toolchains
  2. Slurm Dependencies: Build Slurm dependencies for all compiler combinations
  3. Tarball Publishing: Create and publish relocatable tarballs

All workflows run on self-hosted runners with GPG signing and automated testing.

See Infrastructure and GitHub Actions for details.

Built with ❤️ by Vantage Compute