Papermill Slurm Job Submission#
Author: Steffen Bollmann
Date: 22 June 2025
License:
Note: If this notebook uses neuroimaging tools from Neurocontainers, those tools retain their original licenses. Please see Neurodesk citation guidelines for details.
Citation and Resources:#
Tools included in this workflow#
FSL
Jenkinson, M., Beckmann, C. F., Behrens, T. E. J., Woolrich, M. W., & Smith, S. M. (2012). FSL. NeuroImage, 62(2), 782–790. https://doi.org/10.1016/j.neuroimage.2011.09.015
Papermill
nteract. Papermill [Software]: Parameterize and execute Jupyter Notebooks. nteract/papermill
SLURM
Yoo, A.B., Jette, M.A., Grondona, M. (2003). SLURM: Simple Linux Utility for Resource Management. In: Feitelson, D., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2003. Lecture Notes in Computer Science, vol 2862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10968987_3
Dataset#
MP2RAGE T1-weighted average 7T model (human brain model)
Bollmann, Steffen, Andrew Janke, Lars Marstaller, David Reutens, Kieran O’Brien, and Markus Barth. “MP2RAGE T1-weighted average 7T model” January 1, 2017. doi:10.14264/uql.2017.266
Educational resources#
Introduction#
Scaling Notebook Analyses with Papermill and SLURM#
When running the same analysis across many subjects, manually re-running a notebook for each one quickly becomes impractical. Papermill solves this by letting you parameterise and execute Jupyter notebooks programmatically: you define which variables can be swapped out (e.g. a subject ID), and Papermill handles the rest.

SLURM (Simple Linux Utility for Resource Management) is a job scheduling system used on most HPC (High Performance Computing) clusters. It allows you to submit jobs to a queue, allocating compute resources automatically across many tasks in parallel.

This notebook demonstrates how to combine both tools: a neuroimaging analysis (brain extraction with FSL BET) is wrapped in a parameterised notebook and submitted as a SLURM job array, processing multiple subjects in parallel. As this example notebook runs on Neurodesk, Papermill does not need to be installed separately; it is pre-installed. Papermill also does not need to be imported in the notebook itself: the only requirement is tagging the parameters cell as described below.
By the end of this notebook, you will be able to:
Parameterise a Jupyter notebook using Papermill cell tagging
Test a parameterised notebook locally from the terminal
Write a SLURM job array script to run the notebook across multiple subjects on an HPC cluster
Prerequisites:
This notebook requires a Python 3 kernel. Papermill must be called with the `-k` flag matching the registered kernelspec name on your system. On Neurodesk Play, use `-k python3`. Run `jupyter kernelspec list` in a terminal to check available kernels on your system.
Running the SLURM section requires access to an HPC cluster with SLURM. The Papermill terminal command can be tested locally without HPC access.
Load software tools#
#load FSL 6.0.4
import module
await module.load('fsl/6.0.4')
await module.list()
['fsl/6.0.4']
%%bash
# Check papermill version - papermill is a Python package but is called as a
# command line tool here, so it does not need to be imported in the notebook
papermill --version
2.6.0 from /opt/conda/lib/python3.13/site-packages/papermill/cli.py (3.13.9)
Data download#
# Check if the file already exists before downloading - avoids re-downloading on repeated runs
# If mp2rage-01.nii exists: print a message, otherwise download it
![ -f ./mp2rage-01.nii ] && echo "mp2rage-01.nii exists." || wget https://imaging.org.au/uploads/Human7T/mp2rageModel_L13_work03-plus-hippocampus-7T-sym-norm-mincanon_v0.8.nii -O ./mp2rage-01.nii
# Create a second subject file by copying mp2rage-01.nii to simulate a two-subject dataset
# If mp2rage-02.nii exists: print a message, otherwise copy from mp2rage-01.nii
![ -f ./mp2rage-02.nii ] && echo "mp2rage-02.nii exists." || cp ./mp2rage-01.nii ./mp2rage-02.nii
--2026-04-09 04:11:24-- https://imaging.org.au/uploads/Human7T/mp2rageModel_L13_work03-plus-hippocampus-7T-sym-norm-mincanon_v0.8.nii
Resolving imaging.org.au (imaging.org.au)... 203.101.229.7
Connecting to imaging.org.au (imaging.org.au)|203.101.229.7|:443...
failed: Connection timed out.
Retrying.
--2026-04-09 04:54:55-- (try:20) https://imaging.org.au/uploads/Human7T/mp2rageModel_L13_work03-plus-hippocampus-7T-sym-norm-mincanon_v0.8.nii
Connecting to imaging.org.au (imaging.org.au)|203.101.229.7|:443...
failed: Connection timed out.
Giving up.
Analysis#
# This cell defines the analysis parameters. It must be tagged as "parameters"
# so Papermill can inject new values at runtime (e.g. a different subject ID).
# To tag this cell in JupyterLab: click the property inspector (⚙️ icon, right sidebar),
# type "parameters" in the Add Tag box and press Enter.
subject_id='01'
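Under the hood, the "parameters" tag is just an entry in the cell's JSON metadata inside the `.ipynb` file. The following sketch (standard library only, with a hypothetical two-cell notebook dict standing in for a file read via `json.load`) shows how to check which cell carries the tag:

```python
import json

# Hypothetical minimal notebook, mirroring the structure of a saved .ipynb file;
# in practice, load a real file with: nb = json.load(open("your-notebook.ipynb"))
nb = {
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["parameters"]},
         "source": ["subject_id='01'\n"]},
        {"cell_type": "code", "metadata": {},
         "source": ["!bet mp2rage-{subject_id}.nii mask-sub-{subject_id}.nii -R\n"]},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
}

# Papermill scans for the cell tagged "parameters" and injects the new values
# in a cell placed directly after it
for i, cell in enumerate(nb["cells"]):
    if "parameters" in cell.get("metadata", {}).get("tags", []):
        print(f"Cell {i} is tagged 'parameters': {''.join(cell['source']).strip()}")
```

If nothing prints for your own notebook, the tag has not been saved; re-check the property inspector step above.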
# Run FSL BET (Brain Extraction Tool) with robust brain centre estimation (-R flag) on the mp2rage image for the current subject.
# This strips the skull and produces a brain mask saved as mask-sub-{subject_id}.nii
!bet mp2rage-{subject_id}.nii mask-sub-{subject_id}.nii -R
Error: short read, file may be truncated
Error: short read, file may be truncated
Error: short read, file may be truncated
Error: short read, file may be truncated
Error: short read, file may be truncated
Error: short read, file may be truncated
Image Exception : #22 :: Failed to read volume mp2rage-01
Error : Error: short read, file may be truncated
terminate called after throwing an instance of 'std::runtime_error'
what(): Failed to read volume mp2rage-01
Error : Error: short read, file may be truncated
/opt/fsl-6.0.4/bin/bet: line 399: 7174 Aborted (core dumped) ${FSLDIR}/bin/bet2 $IN $OUT $bet2opts
/opt/fsl-6.0.4/bin/bet failed during command:mp2rage-01.nii mask-sub-01.nii -R
Running it on the HPC#
Step 1: Test locally in the terminal#
Before submitting to the cluster, test that Papermill runs the notebook correctly for a single subject. Save the notebook, open a terminal, navigate to this notebook’s directory, and run the following command.
Run this cell to check your available kernels:
! jupyter kernelspec list
Available kernels:
python3 /opt/conda/share/jupyter/kernels/python3
Note: Set the -k flag to specify the kernel to use. On Neurodesk, use -k python3.
# Run this in a terminal, not in this notebook cell - it is commented out here
# so that Papermill does not re-execute it when this notebook itself is run by Papermill:
# papermill papermill-slurm-submission-example.ipynb papermill_output.ipynb --parameters_raw subject_id 02 -k python3
Once the command completes, Papermill creates papermill_output.ipynb: a fully executed copy of the notebook in which the injected parameters cell sets subject_id = "02" and all cells have been run in sequence.
Note that the data preparation cells will also re-run, though the download will be skipped if the files already exist. This output notebook serves as a reproducible record of that specific run.
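The terminal command above generalises to a simple serial loop over subjects, which is a reasonable fallback when no cluster is available. The sketch below only builds and prints the commands (filenames match this tutorial) rather than executing them:

```python
# Generate one papermill invocation per subject; running them is left to the
# user (e.g. via subprocess.run, or by pasting them into a terminal)
subjects = ["01", "02"]
commands = [
    "papermill papermill-slurm-submission-example.ipynb "
    f"papermill_output_sub-{sid}.ipynb "
    f"--parameters_raw subject_id {sid} -k python3"
    for sid in subjects
]
for cmd in commands:
    print(cmd)
```

Such a loop processes subjects one after another; the SLURM job array in the next step runs them in parallel instead.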
Step 2: Submit to the HPC cluster#
Create an sbatch script and save the following content as papermill.sbat, adapting the account, partition, time limit, and modules to match your cluster environment.
#!/bin/bash
# --- Job settings ---
#SBATCH --job-name=papermill_analysis # Name shown in the job queue
#SBATCH --output=papermill_%A_%a.out # Standard output log (%A = job ID, %a = array index)
#SBATCH --error=papermill_%A_%a.err # Error log
#SBATCH --time=00:05:00 # Maximum runtime (hh:mm:ss) - adjust to your analysis
#SBATCH --nodes=1 # Number of compute nodes
#SBATCH --ntasks-per-node=1 # Tasks per node
#SBATCH --cpus-per-task=4 # CPU cores per task
#SBATCH --mem=20G # Memory per job
#SBATCH --partition=general # Cluster partition to use - check with your HPC admins
#SBATCH --account=a_barth # Your HPC account - replace with your own
#SBATCH --array=1-2 # Job array: one job per subject (here: 2 subjects)
# --- Load modules ---
# Load the same modules active in your Jupyter session.
# Find out which modules you have loaded by running "ml" in the terminal.
module load julia/1.10.4
module load openssl/1.1
module load python/3.10.4-gcccore-11.3.0
module load libxslt/1.1.34-gcccore-11.3.0
module load lxml/4.9.1-gcccore-11.3.0
module load beautifulsoup/4.10.0-gcccore-11.3.0
module load jupyter-server/1.21.0-gcccore-11.3.0
module load jupyterlab/3.5.0-gcccore-11.3.0
# --- Setup ---
# Create output directory
mkdir -p papermill_outputs
# --- Subject selection ---
# Define the list of subject IDs to process
subjects=(01 02)
# SLURM_ARRAY_TASK_ID is automatically set by SLURM (1 for first job, 2 for second, etc.)
# This maps the array index to the corresponding subject ID
subject_id=${subjects[$((SLURM_ARRAY_TASK_ID-1))]}
echo "Processing subject: $subject_id"
echo "Job ID: $SLURM_JOB_ID"
echo "Array Task ID: $SLURM_ARRAY_TASK_ID"
echo "Started at: $(date)"
# --- Run Papermill ---
# Execute the notebook for this subject, saving the output as a separate notebook
papermill papermill-slurm-submission-example.ipynb \
papermill_outputs/papermill_output_sub-${subject_id}.ipynb \
--parameters_raw subject_id ${subject_id}
# --- Check result ---
# $? holds the exit code of the last command (0 = success, anything else = failure)
if [ $? -eq 0 ]; then
echo "Successfully processed subject $subject_id"
else
echo "ERROR: Failed to process subject $subject_id"
exit 1
fi
echo "Completed subject $subject_id at $(date)"
# Submit the job to SLURM by running this command in the terminal:
# sbatch papermill.sbat
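The least obvious line in the script is the bash array lookup that maps SLURM_ARRAY_TASK_ID (which starts at 1, per `--array=1-2`) onto the zero-indexed subjects list. The same logic in Python, as a quick sanity check of the off-by-one handling:

```python
subjects = ["01", "02"]

def subject_for_task(task_id: int) -> str:
    # SLURM array indices start at 1, while bash/Python array indices start
    # at 0, hence the "- 1" (mirroring ${subjects[$((SLURM_ARRAY_TASK_ID-1))]})
    return subjects[task_id - 1]

for task_id in (1, 2):
    print(f"Array task {task_id} -> subject {subject_for_task(task_id)}")
```

To process more subjects, extend the subjects list and widen the `--array` range to match its length.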
Dependencies in Jupyter/Python#
Using the package watermark to document the system environment and software versions used in this notebook, alongside the Neurodesktop version extracted from the `JUPYTER_IMAGE` or `NEURODESKTOP_VERSION` environment variables.
import os
%load_ext watermark
%watermark
%watermark --iversions
neurodesktop_version = (
os.environ.get('JUPYTER_IMAGE', '').split(':')[-1] or
os.environ.get('NEURODESKTOP_VERSION', 'unknown')
)
print(f"Neurodesktop version: {neurodesktop_version}")
Last updated: 2026-04-09T04:57:07.779556+00:00
Python implementation: CPython
Python version : 3.13.9
IPython version : 9.7.0
Compiler : GCC 14.3.0
OS : Linux
Release : 5.15.0-171-generic
Machine : x86_64
Processor : x86_64
CPU cores : 32
Architecture: 64bit
Neurodesktop version: 2025-12-20