https://doi.org/10.5281/zenodo.18728790

Papermill Slurm Job Submission#

Author: Steffen Bollmann

Date: 22 June 2025

License:

Note: If this notebook uses neuroimaging tools from Neurocontainers, those tools retain their original licenses. Please see Neurodesk citation guidelines for details.

Citation and Resources:#

Tools included in this workflow#

FSL

Papermill

  • nteract. Papermill [Software]: Parameterize and execute Jupyter Notebooks. nteract/papermill

SLURM

  • Yoo, A.B., Jette, M.A., Grondona, M. (2003). SLURM: Simple Linux Utility for Resource Management. In: Feitelson, D., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2003. Lecture Notes in Computer Science, vol 2862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10968987_3

Dataset#

MP2RAGE T1-weighted average 7T model (human brain model)

  • Bollmann, Steffen, Andrew Janke, Lars Marstaller, David Reutens, Kieran O’Brien, and Markus Barth. “MP2RAGE T1-weighted average 7T model” January 1, 2017. doi:10.14264/uql.2017.266

Educational resources#

Introduction#

Scaling Notebook Analyses with Papermill and SLURM#

When running the same analysis across many subjects, manually re-running a notebook for each one quickly becomes impractical. Papermill solves this by allowing you to parameterise and execute Jupyter notebooks programmatically: you define which variables can be swapped out (e.g. a subject ID), and Papermill handles the rest.

SLURM (Simple Linux Utility for Resource Management) is a job scheduling system used on most HPC (High Performance Computing) clusters. It allows you to submit jobs to a queue, and it allocates compute resources automatically across many tasks running in parallel.

This notebook demonstrates how to combine both tools: a neuroimaging analysis (brain extraction with FSL BET) is wrapped in a parameterised notebook and submitted as a SLURM job array, processing multiple subjects in parallel. As this example notebook runs on Neurodesk, Papermill does not need to be installed separately; it is pre-installed. Papermill also does not need to be imported in the notebook itself: the only requirement is tagging the parameters cell as described below.

By the end of this notebook, you will be able to:

  • Parameterise a Jupyter notebook using Papermill cell tagging

  • Test a parameterised notebook locally from the terminal

  • Write a SLURM job array script to run the notebook across multiple subjects on an HPC cluster
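Papermill does not edit the tagged cell itself; it inserts a new "injected-parameters" cell directly after it, so ordinary top-to-bottom execution lets the injected value override the default. In plain Python terms (a minimal sketch of the mechanism, not Papermill itself):

```python
# The tagged "parameters" cell holds the default; Papermill injects a new cell
# right after it, so the later assignment simply wins.
subject_id = '01'  # default from the tagged "parameters" cell
subject_id = '02'  # value Papermill injects at runtime
print(subject_id)  # prints 02
```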

Prerequisites:

  • This notebook requires a Python 3 kernel. Papermill must be called with the -k flag matching the registered kernelspec name on your system. On Neurodesk Play, use -k python3. Run jupyter kernelspec list in a terminal to check available kernels on your system.

  • Running the SLURM section requires access to an HPC cluster with SLURM. The Papermill terminal command can be tested locally without HPC access.

Load software tools#

# Load FSL 6.0.4
import module
await module.load('fsl/6.0.4')
await module.list()
['fsl/6.0.4']
%%bash

# Check papermill version - papermill is a Python package but is called as a 
# command line tool here, so it does not need to be imported in the notebook
papermill --version
2.6.0 from /opt/conda/lib/python3.13/site-packages/papermill/cli.py (3.13.9)

Data download#

# Check if the file already exists before downloading - avoids re-downloading on repeated runs
# If mp2rage-01.nii exists: print that it exists, otherwise download it
![ -f ./mp2rage-01.nii ] && echo "mp2rage-01.nii exists." || wget https://imaging.org.au/uploads/Human7T/mp2rageModel_L13_work03-plus-hippocampus-7T-sym-norm-mincanon_v0.8.nii -O ./mp2rage-01.nii

# Create a second subject file by copying mp2rage-01.nii to simulate a two-subject dataset
# If mp2rage-02.nii exists: print that it exists, otherwise copy from mp2rage-01.nii
![ -f ./mp2rage-02.nii ] && echo "mp2rage-02.nii exists." || cp ./mp2rage-01.nii ./mp2rage-02.nii
--2026-04-09 04:11:24--  https://imaging.org.au/uploads/Human7T/mp2rageModel_L13_work03-plus-hippocampus-7T-sym-norm-mincanon_v0.8.nii
Resolving imaging.org.au (imaging.org.au)... 203.101.229.7
Connecting to imaging.org.au (imaging.org.au)|203.101.229.7|:443...
failed: Connection timed out.
Retrying.
[... 19 further attempts, each timing out, omitted ...]
Giving up.
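The existence-guard pattern used in the cells above can be exercised with a scratch file; a harmless `touch` stands in for the real `wget` (the file name below is illustrative):

```shell
# Same [ -f ... ] && ... || ... guard as above, with "touch" in place of wget.
f=/tmp/guard_demo.nii
rm -f "$f"
[ -f "$f" ] && echo "$f exists." || touch "$f"   # file missing: create it
[ -f "$f" ] && echo "$f exists." || touch "$f"   # file present: guard prints and skips
```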

Analysis#

# This cell defines the analysis parameters. It must be tagged as "parameters" 
# so Papermill can inject new values at runtime (e.g. a different subject ID).
# To tag this cell in JupyterLab: click the property inspector (⚙️ icon, right sidebar),
# type "parameters" in the Add Tag box and press Enter.

subject_id = '01'
# Run FSL BET (Brain Extraction Tool) with robust brain centre estimation (-R flag) on the mp2rage image for the current subject.
# This strips the skull and produces a brain mask saved as mask-sub-{subject_id}.nii

!bet mp2rage-{subject_id}.nii mask-sub-{subject_id}.nii -R
Error: short read, file may be truncated
Error: short read, file may be truncated
Error: short read, file may be truncated
Error: short read, file may be truncated
Error: short read, file may be truncated
Error: short read, file may be truncated
Image Exception : #22 :: Failed to read volume mp2rage-01
Error : Error: short read, file may be truncated
terminate called after throwing an instance of 'std::runtime_error'
  what():  Failed to read volume mp2rage-01
Error : Error: short read, file may be truncated
/opt/fsl-6.0.4/bin/bet: line 399:  7174 Aborted                 (core dumped) ${FSLDIR}/bin/bet2 $IN $OUT $bet2opts
/opt/fsl-6.0.4/bin/bet failed during command:mp2rage-01.nii mask-sub-01.nii -R

Running it on the HPC#

Step 1: Test locally in the terminal#

Before submitting to the cluster, test that Papermill runs the notebook correctly for a single subject. Save the notebook, open a terminal, navigate to this notebook’s directory, and run the following command.

Run this cell to check your available kernels:

! jupyter kernelspec list
Available kernels:
  python3    /opt/conda/share/jupyter/kernels/python3

Note: Set the -k flag to specify the kernel to use. On Neurodesk, use -k python3.

# papermill papermill-slurm-submission-example.ipynb papermill_output.ipynb --parameters_raw subject_id 02 -k python3

Once the command completes, Papermill creates papermill_output.ipynb, a fully executed copy of the notebook in which a cell setting subject_id = "02" has been injected after the tagged parameters cell and all cells have been run in sequence.

Note that the data preparation cells will also re-run, though the download will be skipped if the files already exist. This output notebook serves as a reproducible record of that specific run.

Step 2: Submit to the HPC cluster#

Create an sbatch script: save the following content as papermill.sbat, adapting the account, partition, time limit, and modules to match your cluster environment.

#!/bin/bash

# --- Job settings ---
#SBATCH --job-name=papermill_analysis       # Name shown in the job queue
#SBATCH --output=papermill_%A_%a.out        # Standard output log (%A = job ID, %a = array index)
#SBATCH --error=papermill_%A_%a.err         # Error log
#SBATCH --time=00:05:00                     # Maximum runtime (hh:mm:ss) - adjust to your analysis
#SBATCH --nodes=1                           # Number of compute nodes
#SBATCH --ntasks-per-node=1                 # Tasks per node
#SBATCH --cpus-per-task=4                   # CPU cores per task
#SBATCH --mem=20G                           # Memory per job
#SBATCH --partition=general                 # Cluster partition to use - check with your HPC admins
#SBATCH --account=a_barth                   # Your HPC account - replace with your own
#SBATCH --array=1-2                         # Job array: one job per subject (here: 2 subjects)


# --- Load modules ---
# Load the same modules active in your Jupyter session.
# Find out which modules you have loaded by running "ml" in the terminal.
module load julia/1.10.4
module load openssl/1.1
module load python/3.10.4-gcccore-11.3.0
module load libxslt/1.1.34-gcccore-11.3.0
module load lxml/4.9.1-gcccore-11.3.0
module load beautifulsoup/4.10.0-gcccore-11.3.0
module load jupyter-server/1.21.0-gcccore-11.3.0
module load jupyterlab/3.5.0-gcccore-11.3.0

# --- Setup ---
# Create output directory
mkdir -p papermill_outputs

# --- Subject selection ---
# Define the list of subject IDs to process
subjects=(01 02)

# SLURM_ARRAY_TASK_ID is automatically set by SLURM (1 for first job, 2 for second, etc.)
# This maps the array index to the corresponding subject ID
subject_id=${subjects[$((SLURM_ARRAY_TASK_ID-1))]}

echo "Processing subject: $subject_id"
echo "Job ID: $SLURM_JOB_ID"
echo "Array Task ID: $SLURM_ARRAY_TASK_ID"
echo "Started at: $(date)"

# --- Run Papermill ---
# Execute the notebook for this subject, saving the output as a separate notebook.
# Add -k <kernel-name> here if your cluster's registered kernelspec differs.
papermill papermill-slurm-submission-example.ipynb \
    papermill_outputs/papermill_output_sub-${subject_id}.ipynb \
    --parameters_raw subject_id ${subject_id}

# --- Check result ---
# $? holds the exit code of the last command (0 = success, anything else = failure)
if [ $? -eq 0 ]; then
    echo "Successfully processed subject $subject_id"
else
    echo "ERROR: Failed to process subject $subject_id"
    exit 1
fi

echo "Completed subject $subject_id at $(date)"
# Submit the job to SLURM by running this command in the terminal:
# sbatch papermill.sbat
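The array-index to subject-ID mapping in the script can be verified locally by simulating SLURM_ARRAY_TASK_ID, the only scheduler-provided variable it depends on:

```shell
# Simulate the SLURM_ARRAY_TASK_ID -> subject_id mapping from papermill.sbat.
subjects=(01 02)
for SLURM_ARRAY_TASK_ID in 1 2; do
    subject_id=${subjects[$((SLURM_ARRAY_TASK_ID-1))]}
    echo "task ${SLURM_ARRAY_TASK_ID} -> subject ${subject_id}"
done
```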

Dependencies in Jupyter/Python#

  • We use the watermark package to document the system environment and software versions used in this notebook, alongside the Neurodesktop version extracted from the JUPYTER_IMAGE or NEURODESKTOP_VERSION environment variables.

import os

%load_ext watermark

%watermark
%watermark --iversions

neurodesktop_version = (
    os.environ.get('JUPYTER_IMAGE', '').split(':')[-1] or
    os.environ.get('NEURODESKTOP_VERSION', 'unknown')
)

print(f"Neurodesktop version: {neurodesktop_version}")
Last updated: 2026-04-09T04:57:07.779556+00:00

Python implementation: CPython
Python version       : 3.13.9
IPython version      : 9.7.0

Compiler    : GCC 14.3.0
OS          : Linux
Release     : 5.15.0-171-generic
Machine     : x86_64
Processor   : x86_64
CPU cores   : 32
Architecture: 64bit


Neurodesktop version: 2025-12-20