Working with HPC Profiles

HPC (High-Performance Computing) profiles provide pre-configured knowledge about specific HPC systems, including their partitions, resource limits, and optimal settings. Torc uses this information to automatically match job requirements to appropriate partitions.

Overview

HPC profiles contain:

  • Partition definitions: Available queues with their resource limits (CPUs, memory, walltime, GPUs)
  • Detection rules: How to identify when you're on a specific HPC system
  • Default settings: Account names and other system-specific defaults

Built-in profiles are available for systems like NLR's Kestrel. You can also define custom profiles for private clusters.

Listing Available Profiles

View all known HPC profiles:

torc hpc list

Example output:

Known HPC profiles:

╭─────────┬──────────────┬────────────┬──────────╮
│ Name    │ Display Name │ Partitions │ Detected │
├─────────┼──────────────┼────────────┼──────────┤
│ kestrel │ NLR Kestrel  │ 15         │ ✓        │
╰─────────┴──────────────┴────────────┴──────────╯

The "Detected" column indicates whether Torc has detected that you are currently on that system.

Dynamic Slurm Profiles

For Slurm-based clusters without a built-in profile, Torc can dynamically generate a profile by querying the cluster itself. This means you can use Torc on almost any Slurm system without manual configuration.

To use dynamic Slurm detection, you can:

  1. Explicitly request it: Use --hpc-profile slurm in any command that requires a profile.
  2. Let Torc auto-detect it: If you're on a Slurm system and haven't specified a profile or matched a built-in one, Torc will automatically fall back to dynamic Slurm detection.

Dynamic profiles are generated by:

  • Running sinfo to discover partitions, CPU/memory limits, and GRES (GPUs).
  • Running scontrol show partition to find shared node settings and default QOS.
  • Heuristically inferring GPU types if not explicitly reported by Slurm.
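The raw data behind this process is easy to inspect yourself. As a rough sketch, the snippet below parses one line of `sinfo -h -o '%P %c %m %G'` output (partition, CPUs per node, memory in MB, GRES), the kind of data dynamic profile generation relies on. The sample line is invented, not from a real cluster, and the exact fields Torc requests are an assumption:

```shell
# Illustrative only: parse one line of `sinfo -h -o '%P %c %m %G'` output.
# The sample values below are invented, not from a real cluster.
sample='gpu 32 128000 gpu:a100:4'
set -- $sample
part=$1; cpus=$2; mem_mb=$3; gres=$4
# Split the GRES string (type:model:count) to recover GPU information.
gpu_type=$(echo "$gres" | cut -d: -f2)
gpus=$(echo "$gres" | cut -d: -f3)
echo "partition=$part cpus_per_node=$cpus memory_mb=$mem_mb gpu_type=$gpu_type gpus_per_node=$gpus"
```

When Slurm does not report a GPU model in GRES, Torc falls back to heuristics, as noted above.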

Detecting the Current System

Torc can automatically detect which HPC system you're on:

torc hpc detect

Torc uses a prioritized detection strategy:

  1. Built-in Profiles: Matches known systems via environment variables or hostname patterns.
  2. Custom Profiles: Matches your configured custom profiles.
  3. Dynamic Slurm: If Slurm commands (sinfo) are available, generates a profile from the current cluster.
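The priority order above can be sketched in shell terms. The environment variable names here are assumptions for illustration only, not Torc's actual checks:

```shell
# Rough sketch of the prioritized detection order (illustrative; the
# NLR_CLUSTER and MY_CLUSTER variable names are assumptions, not Torc's code):
detect_profile() {
  if [ "${NLR_CLUSTER:-}" = "kestrel" ]; then    # 1. built-in profile match
    echo "kestrel"
  elif [ "${MY_CLUSTER:-}" = "research" ]; then  # 2. custom profile match
    echo "mycluster"
  elif command -v sinfo >/dev/null 2>&1; then    # 3. dynamic Slurm fallback
    echo "slurm"
  else
    echo "none"
  fi
}
```

torc hpc detect reports whichever of these strategies matches first.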

Viewing Profile Details

See detailed information about a specific profile:

torc hpc show kestrel

You can also view the dynamically detected Slurm profile:

torc hpc show slurm

Viewing Available Partitions

List all partitions for a profile:

torc hpc partitions kestrel

For the current Slurm cluster:

torc hpc partitions slurm

Finding Matching Partitions

Find partitions that can satisfy specific resource requirements:

torc hpc match --cpus 32 --memory 64g --walltime 02:00:00

If no profile is specified, Torc uses the profile for the detected system (including a dynamically generated Slurm profile).
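Conceptually, matching is a comparison of the normalized request against each partition's limits. A minimal sketch of that check (illustrative, not Torc's implementation; the partition limits are taken from the custom-profile example later on this page, and the assumption that 1g = 1024 MB is mine):

```shell
# Illustrative sketch of the matching check, not Torc's implementation:
# normalize the request into canonical units, then compare against limits.
walltime="02:00:00"                    # --walltime 02:00:00
h=${walltime%%:*}
rest=${walltime#*:}
m=${rest%%:*}
s=${rest#*:}
req_secs=$(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
req_mb=$(( 64 * 1024 ))                # --memory 64g, assuming 1g = 1024 MB
req_cpus=32                            # --cpus 32
# Example partition limits (the "compute" partition from the example below):
part_cpus=64; part_mem_mb=256000; part_max_secs=172800
if [ "$req_cpus" -le "$part_cpus" ] \
   && [ "$req_mb" -le "$part_mem_mb" ] \
   && [ "$req_secs" -le "$part_max_secs" ]; then
  echo "compute can satisfy the request"
fi
```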

Custom HPC Profiles

If your HPC system doesn't have a built-in profile, you have three options:

  1. Use Dynamic Slurm Detection (Easiest): Let Torc automatically discover your cluster's capabilities.
  2. Generate and Customize a Profile: Run torc hpc generate to create a TOML template based on your cluster, then customize it in your config file.
  3. Request Built-in Support: If your HPC is widely used, open an issue requesting built-in support.

Quick Example

Define custom profiles in your configuration file:

# ~/.config/torc/config.toml

[client.hpc.custom_profiles.mycluster]
display_name = "My Research Cluster"
description = "Internal research HPC system"
detect_env_var = "MY_CLUSTER=research"  # matched when $MY_CLUSTER equals "research"
default_account = "default_project"

[[client.hpc.custom_profiles.mycluster.partitions]]
name = "compute"
cpus_per_node = 64
memory_mb = 256000          # ~250 GB per node
max_walltime_secs = 172800  # 48 hours
shared = false

[[client.hpc.custom_profiles.mycluster.partitions]]
name = "gpu"
cpus_per_node = 32
memory_mb = 128000          # ~125 GB per node
max_walltime_secs = 86400   # 24 hours
gpus_per_node = 4
gpu_type = "A100"
shared = false

See Configuration Reference for full configuration options.

Using Profiles with Slurm Workflows

HPC profiles are used by Slurm-related commands to automatically generate scheduler configurations. See Advanced Slurm Configuration for details on:

  • torc submit-slurm - Submit workflows with auto-generated schedulers
  • torc workflows create-slurm - Create workflows with auto-generated schedulers

See Also