Resource Requirements Reference

Technical reference for job resource specifications and allocation strategies.

Resource Requirements Fields

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| name | string | Yes | | Identifier to reference from jobs |
| num_cpus | integer | No | 1 | Number of CPU cores |
| num_gpus | integer | No | 0 | Number of GPUs |
| num_nodes | integer | No | 1 | Number of nodes per job (srun --nodes); allocation size is set via Slurm scheduler config |
| memory | string | No | 1m | Memory allocation (see format below) |
| runtime | string | No | PT1H | Maximum runtime (ISO 8601 duration) |

Example

resource_requirements:
  - name: small
    num_cpus: 2
    num_gpus: 0
    num_nodes: 1
    memory: 4g
    runtime: PT30M

  - name: large
    num_cpus: 16
    num_gpus: 2
    num_nodes: 1
    memory: 128g
    runtime: PT8H

  - name: mpi_job       # multi-node MPI or Julia Distributed.jl
    num_cpus: 32
    num_nodes: 4        # each job step spans 4 nodes
    memory: 128g
    runtime: PT8H

num_nodes

The num_nodes field controls how many nodes each job step spans (srun --nodes). The Slurm allocation size (sbatch --nodes) is set separately via the Slurm scheduler configuration.

Most jobs use the default of 1. Set a larger value for multi-node jobs such as MPI or Julia Distributed.jl. To run single-node jobs inside a multi-node allocation, keep num_nodes at 1 and set the allocation size in the Slurm scheduler configuration.

See Multi-Node Jobs for detailed examples and guidance.

Memory Format

String format with unit suffix:

| Suffix | Unit | Example |
| --- | --- | --- |
| k | Kilobytes | 512k |
| m | Megabytes | 512m |
| g | Gigabytes | 16g |

Examples:

memory: 512m    # 512 MB
memory: 1g      # 1 GB
memory: 16g     # 16 GB
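
A memory string can be converted to a byte count with a small helper. The sketch below is illustrative only: parse_memory is a hypothetical name, not part of Torc, and it assumes binary multiples (1k = 1024 bytes) and lowercase suffixes, neither of which this reference states.

```python
import re

# Assumption: binary multiples (1k = 1024 bytes). The reference does not say
# whether the units are decimal or binary.
_UNIT_BYTES = {"k": 1024, "m": 1024**2, "g": 1024**3}
_MEMORY_RE = re.compile(r"(\d+)([kmg])")


def parse_memory(value: str) -> int:
    """Convert a memory string such as '512m' to a number of bytes."""
    match = _MEMORY_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"invalid memory string: {value!r}")
    amount, suffix = match.groups()
    return int(amount) * _UNIT_BYTES[suffix]
```

A helper like this is mainly useful for validating a workflow file before submission, since a missing or misspelled suffix raises an error immediately.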

Runtime Format

ISO 8601 duration format:

| Format | Description | Example |
| --- | --- | --- |
| PTnM | Minutes | PT30M (30 minutes) |
| PTnH | Hours | PT2H (2 hours) |
| PnD | Days | P1D (1 day) |
| PnDTnH | Days and hours | P1DT12H (1.5 days) |

Examples:

runtime: PT10M      # 10 minutes
runtime: PT4H       # 4 hours
runtime: P1D        # 1 day
runtime: P1DT12H    # 1 day, 12 hours
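
To check a runtime value before using it, the duration forms listed above can be parsed into a timedelta. This is a minimal sketch, not Torc's parser: parse_runtime is a hypothetical name, and the pattern covers only days, hours, and minutes, so other ISO 8601 components (years, months, weeks, seconds) are rejected here.

```python
import re
from datetime import timedelta

# Covers only the forms shown in this reference: days, hours, and minutes.
_DURATION_RE = re.compile(
    r"P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?)?"
)


def parse_runtime(value: str) -> timedelta:
    """Convert a duration such as 'P1DT12H' to a timedelta."""
    match = _DURATION_RE.fullmatch(value)
    if match is None or value.endswith("T"):
        raise ValueError(f"invalid runtime: {value!r}")
    # Keep only the components that were actually present in the string.
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    if not parts:
        raise ValueError(f"invalid runtime: {value!r}")
    return timedelta(**parts)
```

Note that the T separator matters: PT30M is 30 minutes, while an M before the T would mean months in full ISO 8601.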

Job Allocation Strategies

Resource-Based Allocation (Default)

The server considers each job's resource requirements and only returns jobs that fit within available compute node resources.

Behavior:

  • Considers CPU, memory, and GPU requirements
  • Prevents resource over-subscription
  • Enables efficient packing of heterogeneous workloads

Configuration: Run without --max-parallel-jobs:

torc run $WORKFLOW_ID
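
The idea behind resource-based allocation can be illustrated with a short sketch. Everything here is an assumption made for illustration: the Job class, the select_jobs_that_fit function, and the greedy first-fit order are hypothetical, and the reference does not describe the server's actual selection policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical stand-in for a ready job and its resource requirements."""
    name: str
    num_cpus: int = 1
    memory_gb: float = 0.0
    num_gpus: int = 0


def select_jobs_that_fit(ready, free_cpus, free_memory_gb, free_gpus):
    """Greedy first-fit: walk the ready list in order and keep jobs that fit."""
    selected = []
    for job in ready:
        if (job.num_cpus <= free_cpus
                and job.memory_gb <= free_memory_gb
                and job.num_gpus <= free_gpus):
            selected.append(job)
            # Deduct what this job will use so later jobs see the remainder.
            free_cpus -= job.num_cpus
            free_memory_gb -= job.memory_gb
            free_gpus -= job.num_gpus
    return selected


ready = [Job("large", 16, 128.0, 2), Job("small_a", 2, 4.0), Job("small_b", 2, 4.0)]
picked = select_jobs_that_fit(ready, free_cpus=8, free_memory_gb=32.0, free_gpus=0)
print([job.name for job in picked])  # ['small_a', 'small_b']
```

In this example the large job is skipped because the node has too few CPUs and no GPUs, while both small jobs are packed onto the same node.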

Queue-Based Allocation

The server returns the next N ready jobs regardless of resource requirements.

Behavior:

  • Ignores job resource requirements
  • Only limits concurrent job count
  • Simpler and faster (no resource calculation)

Configuration: Run with --max-parallel-jobs:

torc run $WORKFLOW_ID --max-parallel-jobs 10

Use cases:

  • Homogeneous workloads where all jobs need similar resources
  • Simple task queues
  • Workloads where the overhead of resource tracking is not worth paying
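
The queue-based behavior described in this section amounts to a simple cap on concurrency. The sketch below is a hypothetical illustration, not Torc's code: next_jobs is an invented name, and it assumes that N is the number of free slots (the limit minus the jobs already running), which the reference does not spell out.

```python
def next_jobs(ready, running_count, max_parallel_jobs):
    """Return the next ready jobs without exceeding the concurrency limit.

    Resource requirements are never inspected; only the job count matters.
    """
    open_slots = max(0, max_parallel_jobs - running_count)
    return ready[:open_slots]


# With a limit of 10 and 7 jobs already running, 3 more jobs are handed out.
print(next_jobs(["a", "b", "c", "d", "e"], running_count=7, max_parallel_jobs=10))
# ['a', 'b', 'c']
```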

Resource Tracking

When using resource-based allocation, the job runner tracks:

| Resource | Description |
| --- | --- |
| CPUs | Number of CPU cores in use |
| Memory | Total memory allocated to running jobs |
| GPUs | Number of GPUs in use |
| Nodes | Number of jobs running per node |

Jobs are only started when sufficient resources are available.
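
The bookkeeping described above can be pictured as a pair of reserve and release operations. This is a minimal sketch under stated assumptions: ResourceTracker, try_start, and finish are hypothetical names rather than the job runner's real interface, and the per-node job count from the table is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class ResourceTracker:
    """Hypothetical bookkeeping for one compute node; not Torc's actual code."""
    total_cpus: int
    total_memory_gb: float
    total_gpus: int
    used_cpus: int = 0
    used_memory_gb: float = 0.0
    used_gpus: int = 0

    def try_start(self, cpus: int, memory_gb: float, gpus: int = 0) -> bool:
        """Reserve resources for a job, or return False if it does not fit."""
        if (self.used_cpus + cpus > self.total_cpus
                or self.used_memory_gb + memory_gb > self.total_memory_gb
                or self.used_gpus + gpus > self.total_gpus):
            return False
        self.used_cpus += cpus
        self.used_memory_gb += memory_gb
        self.used_gpus += gpus
        return True

    def finish(self, cpus: int, memory_gb: float, gpus: int = 0) -> None:
        """Release the resources held by a completed job."""
        self.used_cpus -= cpus
        self.used_memory_gb -= memory_gb
        self.used_gpus -= gpus
```

A job that does not fit is simply retried later: once a running job calls finish, the freed resources make a subsequent try_start succeed.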