https://doi.org/10.5281/zenodo.18717009

Nextflow in Neurodesk#

Author: Steffen Bollmann

Date: 13 Feb 2026

License:

Note: If this notebook uses neuroimaging tools from Neurocontainers, those tools retain their original licenses. Please see Neurodesk citation guidelines for details.

Citation and Resources#

Workflow engine#

  • Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4), 316-319. https://doi.org/10.1038/nbt.3820

Tools included in this workflow#

FSL - Brain Extraction Tool (BET)

  • Smith, S. M. (2002). Fast robust automated brain extraction. Human Brain Mapping, 17(3), 143-155. https://doi.org/10.1002/hbm.10062

Dataset from OSF#

  • TOMCAT dataset (OSF project bt4ez): https://osf.io/bt4ez/

Table of contents#

  1. Introduction

  2. Setup

  3. Download test data

  4. Hello Nextflow

  5. Core concepts

  6. Single-subject brain extraction

  7. Multi-subject pipeline

  8. Visualize results

  9. Next steps

  10. Cleanup

  11. Dependencies and environment capture

Introduction#

Nextflow is a workflow engine that lets you write data-driven computational pipelines. It handles parallelism, file staging, and error recovery so you can focus on the analysis logic.

Why use Nextflow for neuroimaging?

  • Automatically runs independent subjects in parallel

  • Tracks which steps have finished so you can resume failed runs with -resume

  • Works the same way on your laptop, an HPC cluster, or in the cloud

  • Large ecosystem of ready-made pipelines at nf-core

This notebook teaches Nextflow from scratch through three progressively complex examples:

  1. A “hello world” pipeline

  2. Brain extraction on a single subject

  3. A multi-subject pipeline with quality-control summary

Setup#

Load FSL via the Neurodesk module system.

import module
await module.load('fsl/6.0.7.16')
await module.list()
['fsl/6.0.7.16']
# Nextflow is installed inside Neurodesk
!nextflow -version
      N E X T F L O W
      version 25.10.4 build 11173
      created 10-02-2026 15:17 UTC 
      cite doi:10.1038/nbt.3820
      http://nextflow.io

Download test data#

We download a T1-weighted MRI scan from the TOMCAT dataset on OSF. The multi-subject pipeline later selects its inputs with a glob pattern (sub-*.nii), so any additional subject files you place in data/ will be picked up automatically.

%%bash
mkdir -p data

if [ -f ./data/sub-01.nii ]; then
    echo "sub-01.nii already exists, skipping download"
else
    if [ ! -f ./data/sub-01.nii.gz ]; then
        echo "Downloading T1w from OSF..."
        osf -p bt4ez fetch osfstorage/TOMCAT_DIB/sub-01/ses-01_7T/anat/sub-01_ses-01_7T_T1w_defaced.nii.gz ./data/sub-01.nii.gz
    fi
    echo "Decompressing..."
    gunzip ./data/sub-01.nii.gz
fi

echo "Done."
ls -lh ./data/sub-01.nii
sub-01.nii already exists, skipping download
Done.
-rw-rw-r-- 1 jovyan jovyan 88M Feb 28 05:35 ./data/sub-01.nii

Hello Nextflow#

Every Nextflow pipeline is a .nf script containing processes (units of work) and a workflow that wires them together using channels (asynchronous data queues).

Let’s start with the simplest possible pipeline.

%%writefile hello.nf
process SAY_HELLO {
    input:
    val greeting

    output:
    stdout

    script:
    """
    echo "$greeting from Nextflow!"
    """
}

workflow {
    greetings = Channel.of('Hello', 'Bonjour', 'Hola')
    SAY_HELLO(greetings) | view
}
Overwriting hello.nf
!nextflow run hello.nf
 N E X T F L O W   ~  version 25.10.4

Launching `hello.nf` [insane_faraday] DSL2 - revision: 59446397d2

executor >  local (3)
[c8/c7d545] SAY_HELLO (1) | 3 of 3
Bonjour from Nextflow!
Hello from Nextflow!
Hola from Nextflow!

What just happened?

  1. Channel.of(...) created a channel with three items

  2. Nextflow launched the SAY_HELLO process three times — once per item, potentially in parallel

  3. Each execution ran in its own isolated work/ subdirectory

Let’s peek inside the work directory to see how Nextflow organises task execution:

%%bash
echo "=== work directory structure ==="
# Show the first task directory as an example
TASK_DIR=$(find work -name '.command.sh' -print -quit 2>/dev/null | xargs dirname)
if [ -n "$TASK_DIR" ]; then
    echo "Example task dir: $TASK_DIR"
    echo ""
    echo "--- .command.sh (the actual script that ran) ---"
    cat "$TASK_DIR/.command.sh"
    echo ""
    echo "--- .command.log (stdout/stderr) ---"
    cat "$TASK_DIR/.command.log"
else
    echo "No work directory found (pipeline may not have run)"
fi
=== work directory structure ===
Example task dir: work/21/fb9529b232a67eef566871a63697b0

--- .command.sh (the actual script that ran) ---
#!/bin/bash -ue
bet sub-01.nii sub-01_brain -f 0.4 -m

--- .command.log (stdout/stderr) ---

Core concepts#

| Concept | Description |
| --- | --- |
| Process | A unit of work with defined inputs, outputs, and a script. Runs in an isolated directory. |
| Channel | An asynchronous queue that connects processes. Data flows through channels. |
| Workflow | The top-level block that creates channels and wires processes together. |
| publishDir | Copies output files from the work directory to a permanent results folder. |
| params | Pipeline parameters that can be set on the command line with --name value. |

Nextflow automatically handles:

  • Parallelism: If a channel has N items, the process runs N times (potentially concurrently)

  • File staging: Input files are symlinked into each task’s work directory

  • Resumability: Use -resume to skip already-completed tasks after a failure

Single-subject brain extraction#

Now let’s do something useful: run FSL’s bet (Brain Extraction Tool) on a single T1w image via Nextflow.

This introduces:

  • path inputs (file handling)

  • params for configurable settings

  • publishDir to save outputs to a results folder

%%writefile bet_single.nf
params.input = './data/sub-01.nii'
params.outdir = './results_single'
params.frac  = 0.4

process BET {
    publishDir params.outdir, mode: 'copy'

    input:
    path t1w

    output:
    path '*_brain.*'

    script:
    """
    bet ${t1w} ${t1w.baseName}_brain -f ${params.frac} -m
    """
}

workflow {
    input_ch = Channel.fromPath(params.input)
    BET(input_ch)
}
Overwriting bet_single.nf
!nextflow run bet_single.nf
 N E X T F L O W   ~  version 25.10.4

Launching `bet_single.nf` [wise_sax] DSL2 - revision: 0e8f469548

executor >  local (1)
[1b/a6faf9] BET (1) | 1 of 1
!ls -lh results_single/
total 18M
-rw-rw-r-- 1 jovyan jovyan 18M Feb 28 05:36 sub-01_brain.nii.gz

Key points:

  • Channel.fromPath(...) creates a channel from a file path

  • Inside the script block, ${t1w} refers to the staged input file

  • publishDir copies the outputs matching '*_brain.*' to our results folder

  • We could override any parameter from the command line, e.g. nextflow run bet_single.nf --frac 0.3

Multi-subject pipeline#

Real neuroimaging studies have multiple subjects. Nextflow makes this easy — we just put multiple files into a channel and Nextflow fans out automatically.

This pipeline has two processes:

  1. BET — runs brain extraction per subject (parallel fan-out)

  2. QC_SUMMARY — collects all results and generates a summary table (runs once after all BET tasks finish)

%%writefile bet_multi.nf
params.inputs = './data/sub-*.nii'
params.outdir = './results_multi'
params.frac   = 0.4

process BET {
    publishDir "${params.outdir}/bet", mode: 'copy'

    input:
    path t1w

    output:
    path '*_brain.nii.gz'
    path '*_brain_mask.nii.gz'

    script:
    """
    bet ${t1w} ${t1w.baseName}_brain -f ${params.frac} -m
    """
}

process QC_SUMMARY {
    publishDir params.outdir, mode: 'copy'

    input:
    path brain_images
    path mask_images

    output:
    path 'qc_summary.tsv'

    script:
    """
    echo -e "subject\tbrain_volume_voxels" > qc_summary.tsv
    for mask in *_brain_mask.nii.gz; do
        subj=\$(echo \$mask | sed 's/_brain_mask.nii.gz//')
        nvox=\$(fslstats \$mask -V | awk '{print \$1}')
        echo -e "\${subj}\t\${nvox}" >> qc_summary.tsv
    done
    echo "=== QC Summary ==="
    cat qc_summary.tsv
    """
}

workflow {
    t1w_ch = Channel.fromPath(params.inputs)

    BET(t1w_ch)

    QC_SUMMARY(
        BET.out[0].collect(),
        BET.out[1].collect()
    )
}
Overwriting bet_multi.nf
!nextflow run bet_multi.nf
 N E X T F L O W   ~  version 25.10.4

Launching `bet_multi.nf` [reverent_picasso] DSL2 - revision: ef713f98e8

executor >  local (2)
[61/81fd38] BET (1)    | 1 of 1
[e7/2ec298] QC_SUMMARY | 1 of 1
%%bash
echo "=== Output files ==="
ls -lh results_multi/bet/
echo ""
echo "=== QC Summary ==="
cat results_multi/qc_summary.tsv
=== Output files ===
total 19M
-rw-rw-r-- 1 jovyan jovyan  18M Feb 28 05:37 sub-01_brain.nii.gz
-rw-rw-r-- 1 jovyan jovyan 324K Feb 28 05:37 sub-01_brain_mask.nii.gz

=== QC Summary ===
subject	brain_volume_voxels
sub-01	7879179
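
The QC table is plain TSV, so it can be loaded for downstream analysis with Python's standard csv module. A minimal sketch (parse_qc and the inline sample string are illustrative, not part of the pipeline):

```python
import csv
import io

def parse_qc(tsv_text):
    """Parse QC_SUMMARY output into {subject: brain volume in voxels}."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["subject"]: int(row["brain_volume_voxels"]) for row in reader}

# In the notebook you would read the published file instead:
# tsv_text = open("results_multi/qc_summary.tsv").read()
sample = "subject\tbrain_volume_voxels\nsub-01\t7879179\n"
volumes = parse_qc(sample)  # {'sub-01': 7879179}
```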

What’s new here?

  • Channel.fromPath('data/sub-*.nii') matches every file that fits the glob; in this run only sub-01.nii is present

  • Nextflow launches one BET task per matching file, running them in parallel when there is more than one subject

  • .collect() gathers all per-subject outputs into a single list and passes it to QC_SUMMARY

  • QC_SUMMARY runs once, after all BET tasks complete, and generates a combined table

This fan-out/collect pattern is the foundation of most neuroimaging Nextflow pipelines.
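The same fan-out/collect shape can be expressed in plain Python, which may help if dataflow thinking is new to you. This is an analogy only: Nextflow additionally isolates each task in its own directory and caches results, and bet here is a stand-in function, not FSL:

```python
from concurrent.futures import ThreadPoolExecutor

def bet(subject):
    # Stand-in for the BET process: one independent task per channel item.
    return f"{subject}_brain"

subjects = ["sub-01", "sub-02"]             # the channel contents
with ThreadPoolExecutor() as pool:
    brains = list(pool.map(bet, subjects))  # fan-out: tasks may run concurrently

# .collect(): wait for every task, then hand the full list to the summary step
summary_input = sorted(brains)  # ['sub-01_brain', 'sub-02_brain']
```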

Visualize results#

Use ipyniivue to visualize the results. Uncomment the first volume to overlay the original T1w image and compare it with the extracted brain.

from ipyniivue import NiiVue

nv = NiiVue()
nv.load_volumes([
    # {"url": "https://huggingface.co/datasets/neurodeskorg/neurodeskedu/resolve/main/data/examples/workflows/nextflow_neurodesk/sub-01_4f9df47e3a33.nii", "colormap": "gray"},
    {"url": "https://huggingface.co/datasets/neurodeskorg/neurodeskedu/resolve/main/data/examples/workflows/nextflow_neurodesk/sub-01_brain_9170ee068925.nii.gz", "colormap": "red"}
])
nv

Next steps#

You now know the core Nextflow patterns. Here are some ways to extend what you’ve learned:

  • -resume: Add this flag to skip already-completed tasks when re-running a pipeline after a failure or parameter change

  • nextflow.config: Move parameters, executor settings (local/SLURM/PBS), and resource limits (CPUs, memory) into a separate config file

  • Containers: Nextflow can pull and run Docker/Singularity containers per process — set container in a process or config

  • nf-core: Browse nf-co.re for production-grade neuroimaging pipelines and community best practices

  • More modalities: Extend the glob pattern (params.inputs) to pick up T2w, FLAIR, or functional data
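
As a starting point for the nextflow.config idea above, a minimal sketch might look like this (the values and executor choice are illustrative, not tested here):

```groovy
// nextflow.config -- sketch only; adjust to your site
params.frac   = 0.4
params.outdir = './results_multi'

process {
    executor = 'local'   // or 'slurm' / 'pbspro' on an HPC cluster
    cpus     = 2
    memory   = '4 GB'
}
```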

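Cleanup#

To free disk space after experimenting, remove Nextflow's work directory, cache, and logs, and, if you no longer need them, the generated results. A minimal sketch (run inside a %%bash cell; keep the results_* folders if you want to retain the outputs):

```shell
# Remove Nextflow task directories, cache, and logs
rm -rf work .nextflow .nextflow.log*
# Alternatively: nextflow clean -f  (removes work dirs tracked by Nextflow)

# Remove the pipeline outputs (optional)
rm -rf results_single results_multi
```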
Dependencies and environment capture#

  • We use the watermark package to document the system environment and software versions used in this notebook

%load_ext watermark

%watermark
%watermark --iversions
Last updated: 2026-02-28T05:37:18.027773+00:00

Python implementation: CPython
Python version       : 3.13.11
IPython version      : 9.9.0

Compiler    : GCC 14.3.0
OS          : Linux
Release     : 5.15.0-170-generic
Machine     : x86_64
Processor   : x86_64
CPU cores   : 32
Architecture: 64bit

ipyniivue: 2.4.4