API Reference
Complete reference for the slurm-factory Python API and command-line interface.
Command Line Interface
The slurm-factory CLI is built with Typer and provides a modern, user-friendly interface for building Slurm packages.
Global Options
All commands support these global options:
slurm-factory [GLOBAL_OPTIONS] COMMAND [COMMAND_OPTIONS]
Global Options:
--project-name TEXT
: LXD project name (default: “slurm-factory”, env:IF_PROJECT_NAME
)--verbose
,-v
: Enable verbose output and debug logging--help
: Show help message
slurm-factory build
Build a Slurm package with specified version and options.
slurm-factory build [OPTIONS]
Options:
--slurm-version [25.05|24.11|23.11|23.02]
: Slurm version to build (default: 25.05)--gpu / --no-gpu
: Enable GPU support (CUDA/ROCm) - creates larger packages (default: False)--minimal / --no-minimal
: Build minimal Slurm only (no OpenMPI, smaller size) (default: False)--verify / --no-verify
: Enable relocatability verification for CI/testing (default: False)--base-only / --no-base-only
: Only build base images (no extra features) (default: False)--help
: Show help message
Build Types:
- Default: ~2-5GB, CPU-only with OpenMPI and standard features
- GPU (
--gpu
): ~15-25GB, includes CUDA/ROCm support for GPU workloads - Minimal (
--minimal
): ~1-2GB, basic Slurm only without OpenMPI or extra features
Note: --gpu
and --minimal
cannot be used together.
Examples:
# Build default CPU version (25.05)
slurm-factory build
# Build specific version
slurm-factory build --slurm-version 24.11
# Build with GPU support
slurm-factory build --gpu
# Build minimal version
slurm-factory build --minimal
# Build with verification (CI)
slurm-factory build --verify
# Verbose build with custom project
slurm-factory --verbose --project-name my-project build --slurm-version 25.05
slurm-factory clean
Clean up LXD instances from the project.
slurm-factory clean [OPTIONS]
Options:
--full / --no-full
: Completely delete the LXD project (default: False)--help
: Show help message
Behavior:
- Default: Cleans build instances, keeps base instances for faster subsequent builds
- Full (
--full
): Completely deletes the LXD project including base instances
Examples:
# Clean build instances, keep base instances
slurm-factory clean
# Completely delete the LXD project
slurm-factory clean --full
# Clean with custom project
slurm-factory --project-name my-project clean --full
Python API
Module Structure
The slurm-factory Python package is organized into the following modules:
import slurm_factory
from slurm_factory import builder, config, constants, exceptions, spack_yaml, utils
Core Enums
SlurmVersion
Enumeration of supported Slurm versions.
from slurm_factory.constants import SlurmVersion
# Available versions
SlurmVersion.v25_05 # "25.05"
SlurmVersion.v24_11 # "24.11"
SlurmVersion.v23_11 # "23.11"
SlurmVersion.v23_02 # "23.02"
Configuration
Settings
Configuration management using Pydantic.
from slurm_factory.config import Settings
settings = Settings()
print(settings.project_name) # Default: "slurm-factory"
print(settings.home_cache_dir) # Cache directory path
# Ensure cache directories exist
settings.ensure_cache_dirs()
Builder Module
build()
Function
Main build function (typically called via CLI).
from slurm_factory.builder import build
import typer
# Build context (normally provided by CLI)
ctx = typer.Context(typer.Typer())
ctx.obj = {
"settings": Settings(),
"project_name": "slurm-factory"
}
# Build with options
build(
ctx=ctx,
slurm_version=SlurmVersion.v25_05,
gpu=False,
minimal=False,
verify=False,
base_only=False
)
Exception Handling
Custom Exceptions
from slurm_factory.exceptions import (
SlurmFactoryError,
ConfigurationError,
BuildError,
LXDError,
SpackError
)
try:
# Slurm Factory operations
pass
except BuildError as e:
print(f"Build failed: {e}")
except LXDError as e:
print(f"LXD operation failed: {e}")
except SlurmFactoryError as e:
print(f"General error: {e}")
Spack Configuration
generate_spack_yaml()
Function
Generate dynamic Spack configuration based on build options.
from slurm_factory.spack_yaml import generate_spack_yaml
from slurm_factory.constants import SlurmVersion
# Generate configuration
config = generate_spack_yaml(
slurm_version=SlurmVersion.v25_05,
gpu_enabled=False,
minimal=False
)
print(config) # YAML configuration as string
Utility Functions
LXD Operations
from slurm_factory.utils import (
get_base_instance_name,
set_profile,
create_buildcache_from_base_instance,
create_slurm_package
)
# Get standardized instance name
name = get_base_instance_name(SlurmVersion.v25_05, gpu=False, minimal=False)
print(name) # "slurm-factory-base-25.05"
# Set LXD profile for build optimization
set_profile("default", "slurm-factory", "/path/to/cache")
Constants
Build Configuration
from slurm_factory.constants import (
INSTANCE_NAME_PREFIX,
SPACK_VIEW_SCRIPT,
SLURM_PACKAGES,
EXTERNAL_BUILD_TOOLS
)
print(INSTANCE_NAME_PREFIX) # "slurm-factory"
print(SLURM_PACKAGES) # Dict of Slurm package specs
Error Handling
All functions raise specific exceptions for different error conditions:
SlurmFactoryError
: Base exception for all library errorsConfigurationError
: Configuration-related errorsBuildError
: Build process failuresLXDError
: LXD operation failuresSpackError
: Spack operation failures
Example error handling:
from slurm_factory.exceptions import BuildError, LXDError
from slurm_factory.builder import build
try:
build(ctx, SlurmVersion.v25_05)
except LXDError as e:
print(f"LXD setup failed: {e}")
except BuildError as e:
print(f"Build process failed: {e}")
except Exception as e:
print(f"Unexpected error: {e}")
Environment Variables
IF_PROJECT_NAME
: Default LXD project name (default: “slurm-factory”)
Exit Codes
0
: Success1
: General error (build failure, configuration error, etc.)
Logging
The library uses Python’s standard logging module. Enable debug logging with:
slurm-factory --verbose build
Or programmatically:
import logging
logging.getLogger("slurm_factory").setLevel(logging.DEBUG)
"""
def build_slurm_package(
self,
slurm_version: str,
gpu: bool = False
) -> BuildResult:
"""Build a Slurm package.
Args:
slurm_version: Version of Slurm to build
gpu: Enable GPU support
Returns:
BuildResult with success status and package paths
Raises:
BuildError: If build fails
ContainerError: If container operations fail
""" ```
Example Usage:
from slurm_factory.builder import SlurmBuilder
# Create builder
builder = SlurmBuilder()
# Build package
result = builder.build_slurm_package("25.05", gpu=False)
if result.success:
print(f"Built packages: {result.packages}")
else:
print(f"Build failed: {result.errors}")
SlurmVersion
Enum for supported Slurm versions.
from slurm_factory.constants import SlurmVersion
class SlurmVersion(str, Enum):
V25_05 = "25.05"
V24_11 = "24.11"
V24_05 = "24.05"
V23_11 = "23.11"
V23_02 = "23.02"
BuildResult
Result object returned by build operations.
@dataclass
class BuildResult:
success: bool
packages: List[Path]
build_time: timedelta
errors: List[str]
slurm_version: str
gpu_enabled: bool
Configuration
Config
Configuration management class.
from slurm_factory.config import Config
class Config:
@property
def build_dir(self) -> Path:
"""Build directory path."""
@property
def cache_dir(self) -> Path:
"""Cache directory path."""
@property
def output_dir(self) -> Path:
"""Output directory path."""
def get_container_config(self) -> ContainerConfig:
"""Get container configuration."""
def get_spack_config(self, version: str, gpu: bool) -> SpackConfig:
"""Get Spack configuration for version."""
Constants
Version Mappings
from slurm_factory.constants import SLURM_VERSIONS
# Dictionary mapping versions to download URLs and hashes
SLURM_VERSIONS = {
"25.05": {
"url": "https://github.com/SchedMD/slurm/archive/refs/tags/slurm-25-05.tar.gz",
"hash": "sha256:...",
"spack_spec": "slurm@25.05"
},
# ... other versions
}
Script Templates
from slurm_factory.constants import (
SPACK_INSTALL_SCRIPT,
SLURM_BUILD_SCRIPT,
PACKAGE_CREATION_SCRIPT
)
# Access script templates for customization
script = SPACK_INSTALL_SCRIPT.substitute(
spack_version="v0.20.1",
python_version="3.11"
)
Exceptions
BuildError
Base exception for build-related errors.
class BuildError(Exception):
"""Base exception for build errors."""
def __init__(self, message: str, details: Optional[Dict] = None):
super().__init__(message)
self.details = details or {}
ContainerError
Exception for container-related errors.
class ContainerError(BuildError):
"""Exception for container operations."""
def __init__(self, message: str, container_id: Optional[str] = None):
super().__init__(message)
self.container_id = container_id
SpackError
Exception for Spack-related errors.
class SpackError(BuildError):
"""Exception for Spack operations."""
def __init__(self, message: str, command: Optional[str] = None):
super().__init__(message)
self.command = command
Utilities
Package Management
from slurm_factory.utils import (
list_built_packages,
clean_build_artifacts,
validate_package
)
def list_built_packages() -> List[PackageInfo]:
"""List all built packages."""
def clean_build_artifacts(keep_cache: bool = True) -> CleanResult:
"""Clean build artifacts."""
def validate_package(package_path: Path) -> ValidationResult:
"""Validate package integrity."""
Container Utilities
from slurm_factory.container import (
list_containers,
cleanup_containers,
get_container_info
)
def list_containers() -> List[ContainerInfo]:
"""List active build containers."""
def cleanup_containers() -> int:
"""Clean up orphaned containers."""
def get_container_info(container_id: str) -> ContainerInfo:
"""Get container information."""
Data Structures
Configuration Objects
ContainerConfig
@dataclass
class ContainerConfig:
image: str = "ubuntu:22.04"
cpu_limit: Optional[str] = None
memory_limit: str = "16GB"
swap_limit: bool = False
network: str = "default"
storage_pool: str = "default"
SpackConfig
@dataclass
class SpackConfig:
version: str
install_dir: Path
cache_dir: Path
build_jobs: int
specs: List[str]
variants: Dict[str, str]
PackageInfo
@dataclass
class PackageInfo:
slurm_version: str
gpu_enabled: bool
build_date: datetime
software_package: Path
module_package: Path
size_mb: int
Result Objects
ValidationResult
@dataclass
class ValidationResult:
valid: bool
errors: List[str]
warnings: List[str]
package_info: Optional[PackageInfo] = None
CleanResult
@dataclass
class CleanResult:
files_removed: int
space_freed_mb: int
errors: List[str]
Environment Variables
Configuration Variables
# Build configuration
export SLURM_FACTORY_BUILD_DIR="$HOME/.slurm-factory"
export SLURM_FACTORY_CACHE_DIR="$HOME/.slurm-factory/cache"
export SLURM_FACTORY_OUTPUT_DIR="$HOME/.slurm-factory/builds"
# Container configuration
export SLURM_FACTORY_CONTAINER_CPU="8"
export SLURM_FACTORY_CONTAINER_MEMORY="16GB"
export SLURM_FACTORY_CONTAINER_IMAGE="ubuntu:22.04"
# Spack configuration
export SLURM_FACTORY_SPACK_JOBS="8"
export SLURM_FACTORY_SPACK_CACHE="$HOME/.slurm-factory/spack-cache"
# Logging
export SLURM_FACTORY_LOG_LEVEL="INFO"
export SLURM_FACTORY_LOG_FILE="$HOME/.slurm-factory/logs/slurm-factory.log"
Exit Codes
Code | Meaning |
---|---|
0 | Success |
1 | General error |
2 | Invalid arguments |
3 | Container error |
4 | Build error |
5 | Package error |
6 | Configuration error |
Examples
Basic Usage
#!/usr/bin/env python3
"""Example: Build and deploy Slurm package."""
from pathlib import Path
from slurm_factory.builder import SlurmBuilder
from slurm_factory.utils import validate_package
def main():
# Initialize builder
builder = SlurmBuilder(output_dir=Path("/custom/output"))
# Build package
result = builder.build_slurm_package("25.05", gpu=False)
if not result.success:
print(f"Build failed: {result.errors}")
return 1
# Validate packages
for package in result.packages:
validation = validate_package(package)
if not validation.valid:
print(f"Package validation failed: {validation.errors}")
return 1
print(f"Successfully built {len(result.packages)} packages")
print(f"Build time: {result.build_time}")
return 0
if __name__ == "__main__":
exit(main())
Custom Configuration
#!/usr/bin/env python3
"""Example: Custom build configuration."""
from slurm_factory.builder import SlurmBuilder
from slurm_factory.config import Config, ContainerConfig, SpackConfig
def main():
# Custom container configuration
container_config = ContainerConfig(
cpu_limit="16",
memory_limit="32GB",
swap_limit=False
)
# Custom Spack configuration
spack_config = SpackConfig(
build_jobs=16,
variants={"slurm": "+pmix +hwloc +numa"}
)
# Initialize with custom config
config = Config()
config.container_config = container_config
config.spack_config = spack_config
builder = SlurmBuilder(config=config)
# Build with custom configuration
result = builder.build_slurm_package("25.05", gpu=True)
return 0 if result.success else 1
if __name__ == "__main__":
exit(main())
See also:
- Architecture documentation for detailed design
- Examples repository
- Installation guide