Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Quick Start (HPC)

This guide walks you through running your first Torc workflow on an HPC cluster with Slurm. Jobs are submitted to Slurm and run on compute nodes.

For local execution (testing, development, or non-HPC environments), see Quick Start (Local).

Prerequisites

  • Access to an HPC cluster with Slurm
  • A Slurm account/allocation for submitting jobs
  • Torc installed (see Installation)

Start the Server

On the login node, start a Torc server with a local database:

Note: This uses a specific hostname routable from compute nodes, which may vary across HPC systems. Adjust as necessary or exclude --host to use the default.

torc-server run \
    --database torc.db \
    --host kl1.hsn.cm.kestrel.hpc.nrel.gov \
    --port 0 \
    --completion-check-interval-secs 5

With port=0 torc will find a random free port. It will print the port number on the console like below. You will need this number when connecting from the client.

2026-02-04T14:31:33.396627Z  INFO ThreadId(01) torc_server::server: 263: Starting a server (over http, so no TLS) on port 52619

Security Note: The server starts without authentication and is accessible from any machine that can reach this host. For networks with untrusted users, see Authentication to secure your server.

Setup the client

Set the Torc API URL in your environment using the port number from above:

export TORC_API_URL=http://kl1.hsn.cm.kestrel.hpc.nrel.gov:52619/torc-service/v1

Note: You can also set a custom port number as long as it does not conflict with others.

Check Your HPC Profile

Torc includes built-in profiles for common HPC systems. Check if your system is detected:

torc hpc detect

If detected, you'll see your HPC system name. To see available partitions:

torc hpc partitions <profile-name>

Note: If your HPC system isn't detected, see Custom HPC Profile or request built-in support.

Create a Workflow with Resource Requirements

Save this as workflow.yaml:

name: hpc_hello_world
description: Simple HPC workflow

resource_requirements:
  - name: small
    num_cpus: 4
    memory: 8g
    runtime: PT30M

jobs:
  - name: job1
    command: echo "Hello from compute node!" && hostname
    resource_requirements: small

  - name: job2
    command: echo "Hello again!" && hostname
    resource_requirements: small
    depends_on: [job1]

Key differences from local workflows:

  • resource_requirements: Define CPU, memory, and runtime needs
  • Jobs reference these requirements by name
  • Torc matches requirements to appropriate Slurm partitions

Submit the Workflow

Submit with your Slurm account:

torc submit-slurm --account <your-account> workflow.yaml

Torc will:

  1. Detect your HPC system
  2. Match job requirements to appropriate partitions
  3. Generate Slurm scheduler configurations
  4. Create and submit the workflow

Monitor Progress

Check workflow status:

torc workflows list
torc jobs list <workflow-id>

Or use the interactive TUI:

torc tui

Check Slurm queue:

squeue --me

View Results

Once jobs complete:

torc results list <workflow-id>

Job output is stored in the torc_output/ directory by default.

Example: Multi-Stage Pipeline

A more realistic workflow with different resource requirements per stage:

name: analysis_pipeline
description: Data processing pipeline

resource_requirements:
  - name: light
    num_cpus: 4
    memory: 8g
    runtime: PT30M

  - name: compute
    num_cpus: 32
    memory: 64g
    runtime: PT2H

  - name: gpu
    num_cpus: 8
    num_gpus: 1
    memory: 32g
    runtime: PT1H

jobs:
  - name: preprocess
    command: python preprocess.py
    resource_requirements: light

  - name: train
    command: python train.py
    resource_requirements: gpu
    depends_on: [preprocess]

  - name: evaluate
    command: python evaluate.py
    resource_requirements: compute
    depends_on: [train]

Torc stages resource allocation based on dependencies:

  • preprocess resources are allocated at workflow start
  • train resources are allocated when preprocess completes
  • evaluate resources are allocated when train completes

This prevents wasting allocation time on resources that aren't needed yet.

Preview Before Submitting

For production workflows, preview the generated Slurm configuration first:

torc slurm generate --account <your-account> workflow.yaml

This shows what schedulers and actions Torc will create without submitting anything.

Next Steps