Working with HPC Profiles

HPC (High-Performance Computing) profiles provide pre-configured knowledge about specific HPC systems, including their partitions, resource limits, and optimal settings. Torc uses this information to automatically match job requirements to appropriate partitions.

Overview

HPC profiles contain:

  • Partition definitions: Available queues with their resource limits (CPUs, memory, walltime, GPUs)
  • Detection rules: How to identify when you're on a specific HPC system
  • Default settings: Account names and other system-specific defaults

Built-in profiles are available for systems like NLR's Kestrel. You can also define custom profiles for private clusters.

Listing Available Profiles

View all known HPC profiles:

torc hpc list

Example output:

Known HPC profiles:

╭─────────┬──────────────┬────────────┬──────────╮
│ Name    │ Display Name │ Partitions │ Detected │
├─────────┼──────────────┼────────────┼──────────┤
│ kestrel │ NLR Kestrel  │ 15         │ ✓        │
╰─────────┴──────────────┴────────────┴──────────╯

The "Detected" column indicates whether Torc has detected that you are currently on that system.

Dynamic Slurm Profiles

For Slurm-based clusters without a built-in profile, Torc can dynamically generate a profile by querying the cluster itself. This means you can use Torc on almost any Slurm system without manual configuration.

To use dynamic Slurm detection, you can:

  1. Explicitly request it: Use --hpc-profile slurm in any command that requires a profile.
  2. Let Torc auto-detect it: If you're on a Slurm system and haven't specified a profile or matched a built-in one, Torc will automatically fall back to dynamic Slurm detection.

Dynamic profiles are generated by:

  • Running sinfo to discover partitions, CPU/memory limits, and GRES (GPUs).
  • Running scontrol show partition to find shared node settings and default QOS.
  • Heuristically inferring GPU types if not explicitly reported by Slurm.
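The raw data behind this process is easy to inspect yourself. As a rough sketch, the snippet below parses one line of `sinfo -h -o '%P %c %m %G'` output (partition, CPUs per node, memory in MB, GRES), the kind of data dynamic profile generation relies on. The sample line is invented, not from a real cluster, and the exact fields Torc requests are an assumption:

```shell
# Illustrative only: parse one line of `sinfo -h -o '%P %c %m %G'` output.
# The sample values below are invented, not from a real cluster.
sample='gpu 32 128000 gpu:a100:4'
set -- $sample
part=$1; cpus=$2; mem_mb=$3; gres=$4
# Split the GRES string (type:model:count) to recover GPU information.
gpu_type=$(echo "$gres" | cut -d: -f2)
gpus=$(echo "$gres" | cut -d: -f3)
echo "partition=$part cpus_per_node=$cpus memory_mb=$mem_mb gpu_type=$gpu_type gpus_per_node=$gpus"
```

When Slurm does not report a GPU model in GRES, Torc falls back to heuristics, as noted above.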

Detecting the Current System

Torc can automatically detect which HPC system you're on:

torc hpc detect

Torc uses a prioritized detection strategy:

  1. Built-in Profiles: Matches known systems via environment variables or hostname patterns.
  2. Custom Profiles: Matches your configured custom profiles.
  3. Dynamic Slurm: If Slurm commands (sinfo) are available, generates a profile from the current cluster.
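The priority order above can be sketched in shell terms. The environment variable names here are assumptions for illustration only, not Torc's actual checks:

```shell
# Rough sketch of the prioritized detection order (illustrative; the
# NLR_CLUSTER and MY_CLUSTER variable names are assumptions, not Torc's code):
detect_profile() {
  if [ "${NLR_CLUSTER:-}" = "kestrel" ]; then    # 1. built-in profile match
    echo "kestrel"
  elif [ "${MY_CLUSTER:-}" = "research" ]; then  # 2. custom profile match
    echo "mycluster"
  elif command -v sinfo >/dev/null 2>&1; then    # 3. dynamic Slurm fallback
    echo "slurm"
  else
    echo "none"
  fi
}
```

torc hpc detect reports whichever of these strategies matches first.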

Viewing Profile Details

See detailed information about a specific profile:

torc hpc show kestrel

You can also view the dynamically detected Slurm profile:

torc hpc show slurm

Viewing Available Partitions

List all partitions for a profile:

torc hpc partitions kestrel

For the current Slurm cluster:

torc hpc partitions slurm

Finding Matching Partitions

Find partitions that can satisfy specific resource requirements:

torc hpc match --cpus 32 --memory 64g --walltime 02:00:00

If no profile is specified, Torc uses the profile for the detected system (including a dynamically generated Slurm profile).
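Conceptually, matching is a comparison of the normalized request against each partition's limits. A minimal sketch of that check (illustrative, not Torc's implementation; the partition limits are taken from the custom-profile example later on this page, and the assumption that 1g = 1024 MB is mine):

```shell
# Illustrative sketch of the matching check, not Torc's implementation:
# normalize the request into canonical units, then compare against limits.
walltime="02:00:00"                    # --walltime 02:00:00
h=${walltime%%:*}
rest=${walltime#*:}
m=${rest%%:*}
s=${rest#*:}
req_secs=$(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
req_mb=$(( 64 * 1024 ))                # --memory 64g, assuming 1g = 1024 MB
req_cpus=32                            # --cpus 32
# Example partition limits (the "compute" partition from the example below):
part_cpus=64; part_mem_mb=256000; part_max_secs=172800
if [ "$req_cpus" -le "$part_cpus" ] \
   && [ "$req_mb" -le "$part_mem_mb" ] \
   && [ "$req_secs" -le "$part_max_secs" ]; then
  echo "compute can satisfy the request"
fi
```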

Custom HPC Profiles

If your HPC system doesn't have a built-in profile, you have three options:

  1. Use Dynamic Slurm Detection (Easiest): Let Torc automatically discover your cluster's capabilities.
  2. Generate and Customize a Profile: Run torc hpc generate to create a TOML template based on your cluster, then customize it in your config file.
  3. Request Built-in Support: If your HPC is widely used, open an issue requesting built-in support.

Quick Example

Define custom profiles in your configuration file:

# ~/.config/torc/config.toml

[client.hpc.custom_profiles.mycluster]
display_name = "My Research Cluster"
description = "Internal research HPC system"
detect_env_var = "MY_CLUSTER=research"  # matched when $MY_CLUSTER equals "research"
default_account = "default_project"

[[client.hpc.custom_profiles.mycluster.partitions]]
name = "compute"
cpus_per_node = 64
memory_mb = 256000          # ~250 GB per node
max_walltime_secs = 172800  # 48 hours
shared = false

[[client.hpc.custom_profiles.mycluster.partitions]]
name = "gpu"
cpus_per_node = 32
memory_mb = 128000          # ~125 GB per node
max_walltime_secs = 86400   # 24 hours
gpus_per_node = 4
gpu_type = "A100"
shared = false

See Configuration Reference for full configuration options.

Using Profiles with Slurm Workflows

HPC profiles are used by Slurm-related commands to automatically generate scheduler configurations. See Advanced Slurm Configuration for details on:

  • torc submit-slurm - Submit workflows with auto-generated schedulers
  • torc workflows create-slurm - Create workflows with auto-generated schedulers

See Also