Skip to main content

FreeIPMI Package

This package provides FreeIPMI support for hardware monitoring in Slurm clusters.

Overview

FreeIPMI is a collection of Intelligent Platform Management Interface (IPMI) system software that provides in-band and out-of-band management of local and remote systems. When integrated with Slurm via the +ipmi variant, it enables hardware monitoring and power management features.

Why Use FreeIPMI with Slurm?

Integrating FreeIPMI with Slurm enables:

  • Hardware Monitoring - Temperature, voltage, fan speeds
  • Power Management - Power capping and monitoring
  • Energy Accounting - Track energy consumption per job
  • Health Checks - Automated node health monitoring
  • Predictive Maintenance - Identify failing hardware before it causes issues

Installation

With Slurm

Enable IPMI support when installing Slurm:

spack install slurm@25-11-0-1 +ipmi

This automatically installs FreeIPMI as a dependency.

Standalone Installation

To install FreeIPMI independently:

spack install slurm_factory.freeipmi

Usage with Slurm

Once Slurm is built with +ipmi, configure IPMI monitoring in slurm.conf:

# Enable IPMI-based power management
AcctGatherEnergyType=acct_gather_energy/ipmi

# Optional: Set IPMI polling frequency
AcctGatherNodeFreq=30

Viewing Energy Data

After configuration, energy data appears in job accounting:

sacct -j <jobid> --format=JobID,ConsumedEnergy,ConsumedEnergyRaw

Node Power Monitoring

Monitor real-time power consumption:

scontrol show node <nodename> | grep CurrentWatts

Configuration

IPMI Credentials

FreeIPMI may require IPMI credentials. Configure in /etc/freeipmi/freeipmi.conf or via Slurm's IPMI configuration.

Permissions

IPMI device access requires appropriate permissions:

# Add slurm user to ipmi group (if applicable)
usermod -a -G ipmi slurm

# Or ensure /dev/ipmi0 is accessible
ls -l /dev/ipmi*

Supported Features

FreeIPMI provides access to:

  • Sensor Data - Temperature, voltage, current, fan speeds
  • Power Consumption - Real-time and historical power usage
  • System Event Log (SEL) - Hardware event logging
  • Chassis Control - Power on/off, reset
  • Field Replaceable Units (FRU) - Hardware inventory

Hardware Requirements

IPMI support requires:

  • BMC (Baseboard Management Controller) - IPMI-compliant hardware
  • IPMI Interface - In-band (KCS) or out-of-band (LAN) access
  • Kernel Support - IPMI device drivers loaded

Check IPMI availability:

# Check for IPMI device
ls /dev/ipmi*

# Test IPMI access
ipmi-sensors --quiet-cache --sdr-cache-recreate

Troubleshooting

No IPMI Device

If /dev/ipmi* doesn't exist:

# Load IPMI kernel modules
modprobe ipmi_devintf
modprobe ipmi_si

# Verify
lsmod | grep ipmi

Permission Denied

If Slurm can't access IPMI:

# Check device permissions
ls -l /dev/ipmi0

# Fix permissions (temporary)
chmod 666 /dev/ipmi0

# Fix permanently via udev rules
echo 'KERNEL=="ipmi*", MODE="0666"' > /etc/udev/rules.d/90-ipmi.rules

No Sensor Data

If sensors return no data:

# Clear and rebuild sensor cache
ipmi-sensors --quiet-cache --sdr-cache-recreate

# Test with verbose output
ipmi-sensors -vv

Package Source

License

FreeIPMI is licensed under GPL-3.0-or-later.

See Also