Nipype on Neurodesk#
An interactive RISE slideshow#
Author: Monika Doerig
Press Space
to proceed through the slideshow.
Set up Neurodesk#
In code cells you press Shift-Enter
(as usual) to evaluate your code and directly move to the next cell if it is already displayed.
Press Ctrl-Enter
to run a command without directly moving to the next cell.
%%capture
import os
import sys
IN_COLAB = 'google.colab' in sys.modules
if IN_COLAB:
    os.environ["LD_PRELOAD"] = ""
    os.environ["APPTAINER_BINDPATH"] = "/content,/tmp,/cvmfs"
    os.environ["MPLCONFIGDIR"] = "/content/matplotlib-mpldir"
    os.environ["LMOD_CMD"] = "/usr/share/lmod/lmod/libexec/lmod"
    !curl -J -O https://raw.githubusercontent.com/NeuroDesk/neurocommand/main/googlecolab_setup.sh
    !chmod +x googlecolab_setup.sh
    !./googlecolab_setup.sh
    module_root = os.path.abspath('/cvmfs/neurodesk.ardc.edu.au/neurodesk-modules/')
    os.environ["MODULEPATH"] = ':'.join(os.path.join(module_root, d) for d in os.listdir(module_root))
# Output CPU information:
!cat /proc/cpuinfo | grep 'vendor' | uniq
!cat /proc/cpuinfo | grep 'model name' | uniq
vendor_id : GenuineIntel
model name : Intel(R) Xeon(R) Gold 6126 CPU @ 2.60GHz
Keep pressing Space
to advance to the next slide.
Objectives
- Know the basics of Nipype
- And how to use it on Neurodesk
- Learn how Python can be applied to analyze neuroimaging data through practical examples
- Get pointers to resources
Be aware ...
- Nipype is part of a large ecosystem
- Therefore, it is about knowing what is out there and empowering you with new tools
- Sometimes, the devil is in the details
- Things take time
Table of contents#
1. Introduction to Nipype
2. Nipype in Jupyter Notebooks on Neurodesk
3. Exploration of Nipype’s building blocks
4. Pydra: A modern dataflow engine developed for the Nipype project
1. Introduction to Nipype#
Open-source Python project that originated within the neuroimaging community
Provides a unified interface to diverse neuroimaging packages including ANTS, SPM, FSL, FreeSurfer, and others
Facilitates seamless interaction between these packages
Its flexibility has made it a preferred basis for widely used pre-processing tools such as fMRIPrep
$\rightarrow$ A primary goal driving Nipype is to simplify the integration of various analysis packages, allowing for the utilization of algorithms that are most appropriate for specific problems.
2. Nipype in Jupyter Notebooks on Neurodesk#
The Neurodesk project enables the use of all neuroimaging applications inside computational notebooks
Demonstration of the module system in Python and Nipype:
We will use the software tool lmod
to manage and load different software packages and libraries. It simplifies the process of accessing and utilizing various software applications and allows users to easily switch between different versions of software packages, manage dependencies, and ensure compatibility with their computing environment.
# In code cells you press Shift-Enter to evaluate your code and directly move to the next cell if it is already displayed.
# Or press Ctrl-Enter to run a command without directly moving to the next cell.
# Use lmod to load any software tool with a specific version
import lmod
await lmod.load('fsl/6.0.7.4')
await lmod.list()
['fsl/6.0.7.4']
import os
os.environ["FSLOUTPUTTYPE"]="NIFTI_GZ" # Default is NIFTI
from nipype.interfaces.fsl.base import Info
print(Info.version())
print(Info.output_type())
# If the FSL version is changed using lmod above, the kernel of the notebook needs to be restarted!
6.0.7.4
NIFTI_GZ
# Load afni and spm as well
await lmod.load('afni/22.3.06')
await lmod.load('spm12/r7771')
await lmod.list()
['fsl/6.0.7.4', 'afni/22.3.06', 'spm12/r7771']
3. Exploration of Nipype’s building blocks#
Interfaces: Wraps a program/function
Workflow engine:
Nodes: Wraps an interface for use in a workflow
Workflows: A directed graph or forest of graphs whose edges represent data flow
Data Input: Many different modules to grab/select data depending on the data structure
Data Output: Different modules to handle data stream output
Plugin: A component that describes how a Workflow should be executed
Preparation: Download of opensource data, installations and imports#
# Download 2 subjects of the Flanker Dataset
PATTERN = "sub-0[1-2]"
!datalad install https://github.com/OpenNeuroDatasets/ds000102.git
!cd ds000102 && datalad get $PATTERN
[INFO ] scanning for unlocked files (this may take some time)
[INFO ] Remote origin not usable by git-annex; setting annex-ignore
[INFO ] access to 1 dataset sibling s3-PRIVATE not auto-enabled, enable with:
| datalad siblings -d "/home/jovyan/neurodesktop-storage/ds000102" enable -s s3-PRIVATE
install(ok): /home/jovyan/neurodesktop-storage/ds000102 (dataset)
get(ok): sub-02/anat/sub-02_T1w.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-02/func/sub-02_task-flanker_run-1_bold.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-02/func/sub-02_task-flanker_run-2_bold.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-01/anat/sub-01_T1w.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-01/func/sub-01_task-flanker_run-1_bold.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-01/func/sub-01_task-flanker_run-2_bold.nii.gz (file) [from s3-PUBLIC...]
get(ok): sub-02 (directory)
get(ok): sub-01 (directory)
action summary:
get (ok: 8)
! pip install nilearn
Requirement already satisfied: nilearn in /opt/conda/lib/python3.11/site-packages (0.10.4)
from nipype import Node, Workflow, DataGrabber, DataSink
from nipype.interfaces.utility import IdentityInterface
from nipype.interfaces import fsl
from nilearn import plotting
from IPython.display import Image
import os
from os.path import join as opj
import matplotlib.pyplot as plt
import numpy as np
import nibabel as nib
# Create directory for all the outputs (if it doesn't exist yet)
! [ ! -d output ] && mkdir output
3.1. Interfaces: The core pieces of Nipype#
A Python wrapper around a particular piece of software (even if it is written in a programming language other than Python):
- FSL
- AFNI
- ANTS
- FreeSurfer
- SPM
- dcm2nii
- Nipy
- MNE
- DIPY
- ...
Such an interface knows what sort of options an external program has and how to execute it (e.g., keeps track of the inputs and outputs, and checks their expected types).
In the Nipype framework we can get an information page on an interface class by using the help()
function.
Example: Interface for FSL’s Brain Extraction Tool BET#
# help() function to get a general explanation of the class as well as a list of possible (mandatory and optional) input and output parameters
fsl.BET.help()
Wraps the executable command ``bet``.
FSL BET wrapper for skull stripping
For complete details, see the `BET Documentation.
<https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/BET/UserGuide>`_
Examples
--------
>>> from nipype.interfaces import fsl
>>> btr = fsl.BET()
>>> btr.inputs.in_file = 'structural.nii'
>>> btr.inputs.frac = 0.7
>>> btr.inputs.out_file = 'brain_anat.nii'
>>> btr.cmdline
'bet structural.nii brain_anat.nii -f 0.70'
>>> res = btr.run() # doctest: +SKIP
Inputs::
[Mandatory]
in_file: (a pathlike object or string representing an existing file)
input file to skull strip
argument: ``%s``, position: 0
[Optional]
out_file: (a pathlike object or string representing a file)
name of output skull stripped image
argument: ``%s``, position: 1
outline: (a boolean)
create surface outline image
argument: ``-o``
mask: (a boolean)
create binary mask image
argument: ``-m``
skull: (a boolean)
create skull image
argument: ``-s``
no_output: (a boolean)
Don't generate segmented output
argument: ``-n``
frac: (a float)
fractional intensity threshold
argument: ``-f %.2f``
vertical_gradient: (a float)
vertical gradient in fractional intensity threshold (-1, 1)
argument: ``-g %.2f``
radius: (an integer)
head radius
argument: ``-r %d``
center: (a list of at most 3 items which are an integer)
center of gravity in voxels
argument: ``-c %s``
threshold: (a boolean)
apply thresholding to segmented brain image and mask
argument: ``-t``
mesh: (a boolean)
generate a vtk mesh brain surface
argument: ``-e``
robust: (a boolean)
robust brain centre estimation (iterates BET several times)
argument: ``-R``
mutually_exclusive: functional, reduce_bias, robust, padding,
remove_eyes, surfaces, t2_guided
padding: (a boolean)
improve BET if FOV is very small in Z (by temporarily padding end
slices)
argument: ``-Z``
mutually_exclusive: functional, reduce_bias, robust, padding,
remove_eyes, surfaces, t2_guided
remove_eyes: (a boolean)
eye & optic nerve cleanup (can be useful in SIENA)
argument: ``-S``
mutually_exclusive: functional, reduce_bias, robust, padding,
remove_eyes, surfaces, t2_guided
surfaces: (a boolean)
run bet2 and then betsurf to get additional skull and scalp surfaces
(includes registrations)
argument: ``-A``
mutually_exclusive: functional, reduce_bias, robust, padding,
remove_eyes, surfaces, t2_guided
t2_guided: (a pathlike object or string representing a file)
as with creating surfaces, when also feeding in non-brain-extracted
T2 (includes registrations)
argument: ``-A2 %s``
mutually_exclusive: functional, reduce_bias, robust, padding,
remove_eyes, surfaces, t2_guided
functional: (a boolean)
apply to 4D fMRI data
argument: ``-F``
mutually_exclusive: functional, reduce_bias, robust, padding,
remove_eyes, surfaces, t2_guided
reduce_bias: (a boolean)
bias field and neck cleanup
argument: ``-B``
mutually_exclusive: functional, reduce_bias, robust, padding,
remove_eyes, surfaces, t2_guided
output_type: ('NIFTI' or 'NIFTI_PAIR' or 'NIFTI_GZ' or
'NIFTI_PAIR_GZ')
FSL output type
args: (a string)
Additional parameters to the command
argument: ``%s``
environ: (a dictionary with keys which are a bytes or None or a value
of class 'str' and with values which are a bytes or None or a
value of class 'str', nipype default value: {})
Environment variables
Outputs::
out_file: (a pathlike object or string representing a file)
path/name of skullstripped file (if generated)
mask_file: (a pathlike object or string representing a file)
path/name of binary brain mask (if generated)
outline_file: (a pathlike object or string representing a file)
path/name of outline file (if generated)
meshfile: (a pathlike object or string representing a file)
path/name of vtk mesh file (if generated)
inskull_mask_file: (a pathlike object or string representing a file)
path/name of inskull mask (if generated)
inskull_mesh_file: (a pathlike object or string representing a file)
path/name of inskull mesh outline (if generated)
outskull_mask_file: (a pathlike object or string representing a file)
path/name of outskull mask (if generated)
outskull_mesh_file: (a pathlike object or string representing a file)
path/name of outskull mesh outline (if generated)
outskin_mask_file: (a pathlike object or string representing a file)
path/name of outskin mask (if generated)
outskin_mesh_file: (a pathlike object or string representing a file)
path/name of outskin mesh outline (if generated)
skull_mask_file: (a pathlike object or string representing a file)
path/name of skull mask (if generated)
skull_file: (a pathlike object or string representing a file)
path/name of skull file (if generated)
References:
-----------
None
# Create an instance of the fsl.BET object
skullstrip = fsl.BET()
# Set input (and output)
skullstrip.inputs.in_file = 'ds000102/sub-01/anat/sub-01_T1w.nii.gz'
skullstrip.inputs.out_file = 'output/T1w_nipype_bet.nii.gz' # Interfaces by default write results to the current directory, which is why relative paths work here (unlike Nodes/Workflows, which run in their own working directories)
# Execute the node and shows outputs
res = skullstrip.run()
res.outputs
inskull_mask_file = <undefined>
inskull_mesh_file = <undefined>
mask_file = <undefined>
meshfile = <undefined>
out_file = /home/jovyan/neurodesktop-storage/output/T1w_nipype_bet.nii.gz
outline_file = <undefined>
outskin_mask_file = <undefined>
outskin_mesh_file = <undefined>
outskull_mask_file = <undefined>
outskull_mesh_file = <undefined>
skull_file = <undefined>
skull_mask_file = <undefined>
# The cmdline property gives you transparency into what's happening under the hood with one additional line
skullstrip.cmdline
'bet ds000102/sub-01/anat/sub-01_T1w.nii.gz output/T1w_nipype_bet.nii.gz'
3.2. Nodes: The light wrapper around interfaces#
To streamline the analysis and to execute multiple interfaces in a sensible order, each interface needs to be wrapped in a Node.
A node is an object that executes a certain function: a Nipype interface, a user-specified function, or an external script.
Each node has a name, an interface, at least one input field, and at least one output field.
$\rightarrow$ Nodes expose inputs and outputs of the Interface as its own and add additional functionality allowing to connect Nodes into a Workflow (directed graph):
MapNode#
Quite similar to a normal Node, but it can take a list of inputs and operate over each input separately, ultimately returning a list of outputs.
Example: Multiple functional images (A) and each of them should be motion corrected (B1, B2, B3,..). Afterwards, put them all together into a GLM, i.e. the input for the GLM should be an array of [B1, B2, B3, …].
Iterables#
For repetitive steps: Iterables split up the execution workflow into many different branches.
Example: Running the same preprocessing on multiple subjects or doing statistical inference on multiple files.
JoinNode#
Has the opposite effect of iterables: JoinNode merges the different branches back into one node.
A JoinNode generalizes MapNode to operate in conjunction with an upstream iterable node to reassemble downstream results, e.g., to merge files into a group level analysis.
Example: Node#
nipype.pipeline.engine.nodes module
nodename = Nodetype(interface_function(), name='labelname')
nodename: Variable name of the node in the python environment.
Nodetype: Type of node: Node, MapNode or JoinNode.
interface_function: Function the node should execute. Can be user-defined or come from an Interface.
labelname: Label name of the node in the workflow environment (defines the name of the working directory).
To execute a node, call its .run() method.
To return the output fields of the underlying interface, use .outputs.
To print the interface help, call .help().
The specification of base_dir is very important (and is why we need to pass absolute paths for the input files) because otherwise all the outputs would be saved somewhere in temporary files. Unlike interfaces, which by default write results to the local directory, the workflow engine executes each node off in its own directory hierarchy.
# For reasons that will become clear in the Workflow section, it's important to pass filenames to Nodes as absolute paths.
input_file = opj(os.getcwd(), 'ds000102/sub-01/anat/sub-01_T1w.nii.gz')
output_file = opj(os.getcwd(), 'output/T1w_nipype_bet.nii.gz')
# Create FSL BET Node with fractional intensity threshold of 0.3 and create a binary mask image
bet = Node(fsl.BET(), name='bet_node')
# Define inputs
bet.inputs.frac = 0.3
bet.inputs.mask = True
bet.inputs.in_file = input_file
bet.inputs.out_file = output_file
# Run the node
res = bet.run()
240613-06:55:40,240 nipype.workflow INFO:
[Node] Setting-up "bet_node" in "/tmp/tmpbdkoaxdf/bet_node".
240613-06:55:40,244 nipype.workflow INFO:
[Node] Executing "bet_node" <nipype.interfaces.fsl.preprocess.BET>
240613-06:55:44,124 nipype.workflow INFO:
[Node] Finished "bet_node", elapsed time 3.879271s.
# Shows produced outputs
res.outputs
inskull_mask_file = <undefined>
inskull_mesh_file = <undefined>
mask_file = /home/jovyan/neurodesktop-storage/output/T1w_nipype_bet_mask.nii.gz
meshfile = <undefined>
out_file = /home/jovyan/neurodesktop-storage/output/T1w_nipype_bet.nii.gz
outline_file = <undefined>
outskin_mask_file = <undefined>
outskin_mesh_file = <undefined>
outskull_mask_file = <undefined>
outskull_mesh_file = <undefined>
skull_file = <undefined>
skull_mask_file = <undefined>
# Plot original input file
plotting.plot_anat(input_file, title='BET input', cut_coords=(10,10,10),
display_mode='ortho', dim=-1, draw_cross=False, annotate=False);
# Plot skullstripped output file (out_file) through the outputs property
plotting.plot_anat(res.outputs.out_file, title='BET output', cut_coords=(10,10,10),
display_mode='ortho', dim=-1, draw_cross=False, annotate=False);
3.3. Workflows#
Define functionality for pipelined execution of interfaces
Consist of multiple nodes, each representing a specific interface.
The processing stream is encoded as a directed acyclic graph (DAG), where each stage of processing is a node. Nodes are unidirectionally dependent on others, ensuring no cycles and clear directionality. The Node and Workflow classes make these relationships explicit.
Edges represent the data flow between nodes.
Control the setup and the execution of individual interfaces.
Will take care of inputs and outputs of each interface and arrange the execution of each interface in the most efficient way.
nipype.pipeline.engine.workflows module
Workflow(name, base_dir=None)
name: Label name of the workflow.
base_dir: Defines the working directory for this instance of workflow element. Unlike interfaces, which by default store results in the local directory, the Workflow engine executes things off in its own directory hierarchy. By default (if not set manually), it is a temporary directory (/tmp).
Workflow methods that we will use during this tutorial:
Workflow.connect(): Connect nodes in the pipeline
Workflow.write_graph(): Generate a graphviz dot file and a png file
Workflow.run(): Execute the workflow
Example: Workflow#
First, define different nodes to:
Skullstrip an image to obtain a mask
Smooth the original image
Mask the smoothed image
in_file = input_file # See node example
# Skullstrip process
skullstrip = Node(fsl.BET(in_file=in_file, mask=True), name="skullstrip")
# Smooth process
smooth = Node(fsl.IsotropicSmooth(in_file=in_file, fwhm=4), name="smooth")
# Mask process
mask = Node(fsl.ApplyMask(), name="mask")
# Create a working directory for all workflows created during this workshop
! [ ! -d output/working_dir ] && mkdir output/working_dir
wf_work_dir = opj(os.getcwd(), 'output/working_dir')
# Initiation of a workflow with specifying the working directory.
# This specification of base_dir is very important (and is why we needed to use absolute paths above for the input files) because otherwise all the outputs would be saved somewhere in the temporary files.
wf = Workflow(name="smoothflow", base_dir=wf_work_dir )
Connect nodes within a workflow#
Workflows have a method called connect that does most of the work: it checks whether the inputs and outputs are actually provided by the nodes being connected.
$\rightarrow$ There are two different ways to call connect:
Establish one connection at a time:
wf.connect(source, "source_output", dest, "dest_input")
Establish multiple connections between two nodes at once:
wf.connect([(source, dest, [("source_output1", "dest_input1"),
("source_output2", "dest_input2")
])
])
# Option 1: connect the binary mask of the skullstripping process to the mask node
wf.connect(skullstrip, "mask_file", mask, "mask_file")
# Option 2: connect the output of the smoothing node to the input of the masking node
wf.connect([(smooth, mask, [("out_file", "in_file")])])
# Explore the workflow visually
wf.write_graph("workflow_graph.dot")
Image(filename=opj(wf_work_dir,"smoothflow/workflow_graph.png"))
240613-06:55:48,704 nipype.workflow INFO:
Generated workflow graph: /home/jovyan/neurodesktop-storage/output/working_dir/smoothflow/workflow_graph.png (graph2use=hierarchical, simple_form=True).
# Certain graph types also allow you to further inspect the individual connections between the nodes
wf.write_graph(graph2use='flat')
Image(filename=opj(wf_work_dir,"smoothflow/graph_detailed.png"))
240613-06:55:49,23 nipype.workflow INFO:
Generated workflow graph: /home/jovyan/neurodesktop-storage/output/working_dir/smoothflow/graph.png (graph2use=flat, simple_form=True).
# Execute the workflow (running serially here)
wf.run()
240613-06:55:49,35 nipype.workflow INFO:
Workflow smoothflow settings: ['check', 'execution', 'logging', 'monitoring']
240613-06:55:49,45 nipype.workflow INFO:
Running serially.
240613-06:55:49,46 nipype.workflow INFO:
[Node] Setting-up "smoothflow.skullstrip" in "/home/jovyan/neurodesktop-storage/output/working_dir/smoothflow/skullstrip".
240613-06:55:49,49 nipype.workflow INFO:
[Node] Executing "skullstrip" <nipype.interfaces.fsl.preprocess.BET>
240613-06:55:52,889 nipype.workflow INFO:
[Node] Finished "skullstrip", elapsed time 3.8385249999999997s.
240613-06:55:52,892 nipype.workflow INFO:
[Node] Setting-up "smoothflow.smooth" in "/home/jovyan/neurodesktop-storage/output/working_dir/smoothflow/smooth".
240613-06:55:52,894 nipype.workflow INFO:
[Node] Executing "smooth" <nipype.interfaces.fsl.maths.IsotropicSmooth>
240613-06:56:00,41 nipype.workflow INFO:
[Node] Finished "smooth", elapsed time 7.145348s.
240613-06:56:00,50 nipype.workflow INFO:
[Node] Setting-up "smoothflow.mask" in "/home/jovyan/neurodesktop-storage/output/working_dir/smoothflow/mask".
240613-06:56:00,53 nipype.workflow INFO:
[Node] Executing "mask" <nipype.interfaces.fsl.maths.ApplyMask>
240613-06:56:01,408 nipype.workflow INFO:
[Node] Finished "mask", elapsed time 1.353534s.
<networkx.classes.digraph.DiGraph at 0x7f5f6c965450>
# Check the working directories of the workflow
!tree output/working_dir/smoothflow/ -I '*js|*json|*html|*pklz|_report'
output/working_dir/smoothflow/
├── graph.dot
├── graph.png
├── graph_detailed.dot
├── graph_detailed.png
├── mask
│ ├── command.txt
│ └── sub-01_T1w_smooth_masked.nii.gz
├── skullstrip
│ ├── command.txt
│ └── sub-01_T1w_brain_mask.nii.gz
├── smooth
│ ├── command.txt
│ └── sub-01_T1w_smooth.nii.gz
├── workflow_graph.dot
└── workflow_graph.png
3 directories, 12 files
# Helper function to plot 3D NIfTI images
def plot_slice(fname):
    # Load the image
    img = nib.load(fname)
    data = img.get_fdata()
    # Cut in the middle of the brain
    cut = int(data.shape[-1]/2) + 10
    # Plot the data
    plt.imshow(np.rot90(data[..., cut]), cmap="gray")
    plt.gca().set_axis_off()

f = plt.figure(figsize=(12, 4))
for i, img in enumerate([input_file,
                         opj(wf_work_dir, "smoothflow/smooth/sub-01_T1w_smooth.nii.gz"),
                         opj(wf_work_dir, "smoothflow/skullstrip/sub-01_T1w_brain_mask.nii.gz"),
                         opj(wf_work_dir, "smoothflow/mask/sub-01_T1w_smooth_masked.nii.gz")]):
    f.add_subplot(1, 4, i + 1)
    plot_slice(img)
3.4. Execution Plugins: Execution on different systems#
Plugins allow seamless execution across many architectures and make parallel computation easy.
Local Machines:
Serial: Runs the workflow one node at a time in a single process locally. The order of the nodes is determined by a topological sort of the workflow.
Multicore: Uses the Python multiprocessing library to distribute jobs as new processes on a local system.
Submission to Cluster Schedulers:
Plugins like HTCondor, PBS, SLURM, SGE, OAR, and LSF submit jobs to clusters managed by these job scheduling systems.
Advanced Cluster Integration:
DAGMan: Manages complex workflow dependencies for submission to DAGMan cluster scheduler.
IPython: Utilizes IPython parallel computing capabilities for distributed execution in clusters.
Specialized Execution Plugins:
Soma-Workflow: Integrates with Soma-Workflow system for distributed execution in HPC environments.
Cluster operation often needs a special setup.
All plugins can be executed with:
workflow.run(plugin=PLUGIN_NAME, plugin_args=ARGS_DICT)
To run the workflow one node at a time:
wf.run(plugin='Linear')
To distribute processing on a multicore machine, number of processors/threads will be automatically detected:
wf.run(plugin='MultiProc')
Plugin arguments:
arguments = {'n_procs' : num_threads,
'memory_gb' : num_gb}
wf.run(plugin='MultiProc', plugin_args=arguments)
To use Nipype with SLURM, simply call:
wf.run(plugin='SLURM')
Optional arguments:
template: Use your own job submission template (the plugin generates a basic one by default).
sbatch_args: Any arguments (such as nodes/partitions/gres/etc.) that you want to pass on to the underlying sbatch command.
jobid_re: Regular expression for custom job submission id search.
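Putting these together, a submission for the smoothing workflow above might look like the following sketch (the sbatch values are placeholders, not recommendations; adjust them to your cluster):

```python
# Hypothetical SLURM submission; requires a cluster where sbatch is available.
# wf is the workflow defined earlier; time/memory/partition values are assumptions.
wf.run(plugin='SLURM',
       plugin_args={'sbatch_args': '--time=01:00:00 --mem=4G --partition=general'})
```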
3.5. Data Input: First step of every analysis#
Nipype provides many different modules for getting data into the framework.
We will work through an example with the DataGrabber module:
DataGrabber: Versatile input module to retrieve data from a local file system based on user-defined search criteria, including wildcard patterns, regular expressions, and directory hierarchies. It supports almost any file organization of your data.
But there are many more alternatives available:
SelectFiles: A simpler alternative to the DataGrabber interface, built on Python format strings. Format strings allow you to replace named sections of template strings set off by curly braces ({}).
BIDSDataGrabber: Get neuroimaging data organized in BIDS-compliant directory structures. It simplifies the process of accessing and organizing neuroimaging data for analysis pipelines.
DataFinder: Search for paths that match a given regular expression. Allows a less prescriptive approach to gathering input files compared to DataGrabber.
FreeSurferSource: Specific case of a file grabber that facilitates the data import of outputs from the FreeSurfer recon-all algorithm.
JSONFileGrabber: Datagrabber interface that loads a json file and generates an output for every first-level object.
S3DataGrabber: Pull data from an Amazon S3 Bucket.
SSHDataGrabber: Extension of the DataGrabber module that downloads the file list, and optionally the files, from an SSH server.
XNATSource: Pull data from an XNAT server.
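The format strings that SelectFiles builds on are plain Python `str.format` templates. The sketch below shows only that formatting mechanism, using templates matching the Flanker dataset layout used later, not an actual SelectFiles call:

```python
# SelectFiles-style templates: named placeholders in curly braces.
templates = {
    'anat': 'sub-{subject_id}/anat/sub-{subject_id}_T1w.nii.gz',
    'func': 'sub-{subject_id}/func/sub-{subject_id}_task-flanker_run-{run_id}_bold.nii.gz',
}

# Filling the placeholders yields concrete relative paths.
anat_path = templates['anat'].format(subject_id='01')
func_path = templates['func'].format(subject_id='01', run_id=1)
print(anat_path)  # sub-01/anat/sub-01_T1w.nii.gz
```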
Example: DataGrabber#
Let’s assume we want to grab the anatomical and functional images of certain subjects of the Flanker dataset:
!tree -L 4 ds000102/ -I '*csv|*pdf'
ds000102/
├── CHANGES
├── README
├── T1w.json
├── dataset_description.json
├── derivatives
│ └── mriqc
├── participants.tsv
├── sub-01
│ ├── anat
│ │ └── sub-01_T1w.nii.gz -> ../../.git/annex/objects/Pf/6k/MD5E-s10581116--757e697a01eeea5c97a7d6fbc7153373.nii.gz/MD5E-s10581116--757e697a01eeea5c97a7d6fbc7153373.nii.gz
│ └── func
│ ├── sub-01_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/5m/w9/MD5E-s28061534--8e8c44ff53f9b5d46f2caae5916fa4ef.nii.gz/MD5E-s28061534--8e8c44ff53f9b5d46f2caae5916fa4ef.nii.gz
│ ├── sub-01_task-flanker_run-1_events.tsv
│ ├── sub-01_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/2F/58/MD5E-s28143286--f0bcf782c3688e2cf7149b4665949484.nii.gz/MD5E-s28143286--f0bcf782c3688e2cf7149b4665949484.nii.gz
│ └── sub-01_task-flanker_run-2_events.tsv
├── sub-02
│ ├── anat
│ │ └── sub-02_T1w.nii.gz -> ../../.git/annex/objects/3m/FF/MD5E-s10737123--cbd4181ee26559e8ec0a441fa2f834a7.nii.gz/MD5E-s10737123--cbd4181ee26559e8ec0a441fa2f834a7.nii.gz
│ └── func
│ ├── sub-02_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/8v/2j/MD5E-s29188378--80050f0deb13562c24f2fc23f8d095bd.nii.gz/MD5E-s29188378--80050f0deb13562c24f2fc23f8d095bd.nii.gz
│ ├── sub-02_task-flanker_run-1_events.tsv
│ ├── sub-02_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/fM/Kw/MD5E-s29193540--cc013f2d7d148b448edca8aada349d02.nii.gz/MD5E-s29193540--cc013f2d7d148b448edca8aada349d02.nii.gz
│ └── sub-02_task-flanker_run-2_events.tsv
├── sub-03
│ ├── anat
│ │ └── sub-03_T1w.nii.gz -> ../../.git/annex/objects/7W/9z/MD5E-s10707026--8f1858934cc7c7457e3a4a71cc2131fc.nii.gz/MD5E-s10707026--8f1858934cc7c7457e3a4a71cc2131fc.nii.gz
│ └── func
│ ├── sub-03_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/q6/kF/MD5E-s28755729--b19466702eee6b9385bd6e19e362f94c.nii.gz/MD5E-s28755729--b19466702eee6b9385bd6e19e362f94c.nii.gz
│ ├── sub-03_task-flanker_run-1_events.tsv
│ ├── sub-03_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/zV/K1/MD5E-s28782544--8d9700a435d08c90f0c1d534efdc8b69.nii.gz/MD5E-s28782544--8d9700a435d08c90f0c1d534efdc8b69.nii.gz
│ └── sub-03_task-flanker_run-2_events.tsv
├── sub-04
│ ├── anat
│ │ └── sub-04_T1w.nii.gz -> ../../.git/annex/objects/FW/14/MD5E-s10738444--2a9a2ba4ea7d2324c84bf5a2882f196c.nii.gz/MD5E-s10738444--2a9a2ba4ea7d2324c84bf5a2882f196c.nii.gz
│ └── func
│ ├── sub-04_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/9Z/0Q/MD5E-s29062799--27171406951ea275cb5857ea0dc32345.nii.gz/MD5E-s29062799--27171406951ea275cb5857ea0dc32345.nii.gz
│ ├── sub-04_task-flanker_run-1_events.tsv
│ ├── sub-04_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/FW/FZ/MD5E-s29071279--f89b61fe3ebab26df1374f2564bd95c2.nii.gz/MD5E-s29071279--f89b61fe3ebab26df1374f2564bd95c2.nii.gz
│ └── sub-04_task-flanker_run-2_events.tsv
├── sub-05
│ ├── anat
│ │ └── sub-05_T1w.nii.gz -> ../../.git/annex/objects/k2/Kj/MD5E-s10753867--c4b5788da5f4c627f0f5862da5f46c35.nii.gz/MD5E-s10753867--c4b5788da5f4c627f0f5862da5f46c35.nii.gz
│ └── func
│ ├── sub-05_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/VZ/z5/MD5E-s29667270--0ce9ac78b6aa9a77fc94c655a6ff5a06.nii.gz/MD5E-s29667270--0ce9ac78b6aa9a77fc94c655a6ff5a06.nii.gz
│ ├── sub-05_task-flanker_run-1_events.tsv
│ ├── sub-05_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/z7/MP/MD5E-s29660544--752750dabb21e2cf28e87d1d550a71b9.nii.gz/MD5E-s29660544--752750dabb21e2cf28e87d1d550a71b9.nii.gz
│ └── sub-05_task-flanker_run-2_events.tsv
├── sub-06
│ ├── anat
│ │ └── sub-06_T1w.nii.gz -> ../../.git/annex/objects/5w/G0/MD5E-s10620585--1132eab3830fe59b8a10b6582bb49004.nii.gz/MD5E-s10620585--1132eab3830fe59b8a10b6582bb49004.nii.gz
│ └── func
│ ├── sub-06_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/3x/qj/MD5E-s29386982--e671c0c647ce7d0d4596e35b702ee970.nii.gz/MD5E-s29386982--e671c0c647ce7d0d4596e35b702ee970.nii.gz
│ ├── sub-06_task-flanker_run-1_events.tsv
│ ├── sub-06_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/9j/6P/MD5E-s29379265--e513a2746d2b5c603f96044cf48c557c.nii.gz/MD5E-s29379265--e513a2746d2b5c603f96044cf48c557c.nii.gz
│ └── sub-06_task-flanker_run-2_events.tsv
├── sub-07
│ ├── anat
│ │ └── sub-07_T1w.nii.gz -> ../../.git/annex/objects/08/fF/MD5E-s10718092--38481fbc489dfb1ec4b174b57591a074.nii.gz/MD5E-s10718092--38481fbc489dfb1ec4b174b57591a074.nii.gz
│ └── func
│ ├── sub-07_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/z1/7W/MD5E-s28946009--5baf7a314874b280543fc0f91f2731af.nii.gz/MD5E-s28946009--5baf7a314874b280543fc0f91f2731af.nii.gz
│ ├── sub-07_task-flanker_run-1_events.tsv
│ ├── sub-07_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/Jf/W7/MD5E-s28960603--682e13963bfc49cc6ae05e9ba5c62619.nii.gz/MD5E-s28960603--682e13963bfc49cc6ae05e9ba5c62619.nii.gz
│ └── sub-07_task-flanker_run-2_events.tsv
├── sub-08
│ ├── anat
│ │ └── sub-08_T1w.nii.gz -> ../../.git/annex/objects/mw/MM/MD5E-s10561256--b94dddd8dc1c146aa8cd97f8d9994146.nii.gz/MD5E-s10561256--b94dddd8dc1c146aa8cd97f8d9994146.nii.gz
│ └── func
│ ├── sub-08_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/zX/v9/MD5E-s28641609--47314e6d1a14b8545686110b5b67f8b8.nii.gz/MD5E-s28641609--47314e6d1a14b8545686110b5b67f8b8.nii.gz
│ ├── sub-08_task-flanker_run-1_events.tsv
│ ├── sub-08_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/WZ/F0/MD5E-s28636310--4535bf26281e1c5556ad0d3468e7fe4e.nii.gz/MD5E-s28636310--4535bf26281e1c5556ad0d3468e7fe4e.nii.gz
│ └── sub-08_task-flanker_run-2_events.tsv
├── sub-09
│ ├── anat
│ │ └── sub-09_T1w.nii.gz -> ../../.git/annex/objects/QJ/ZZ/MD5E-s10775967--e6a18e64bc0a6b17254a9564cf9b8f82.nii.gz/MD5E-s10775967--e6a18e64bc0a6b17254a9564cf9b8f82.nii.gz
│ └── func
│ ├── sub-09_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/k9/1X/MD5E-s29200533--59e86a903e0ab3d1d320c794ba1f0777.nii.gz/MD5E-s29200533--59e86a903e0ab3d1d320c794ba1f0777.nii.gz
│ ├── sub-09_task-flanker_run-1_events.tsv
│ ├── sub-09_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/W3/94/MD5E-s29223017--7f3fb9e260d3bd28e29b0b586ce4c344.nii.gz/MD5E-s29223017--7f3fb9e260d3bd28e29b0b586ce4c344.nii.gz
│ └── sub-09_task-flanker_run-2_events.tsv
├── sub-10
│ ├── anat
│ │ └── sub-10_T1w.nii.gz -> ../../.git/annex/objects/5F/3f/MD5E-s10750712--bde2309077bffe22cb65e42ebdce5bfa.nii.gz/MD5E-s10750712--bde2309077bffe22cb65e42ebdce5bfa.nii.gz
│ └── func
│ ├── sub-10_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/3p/qp/MD5E-s29732696--339715d5cec387f4d44dfe94f304a429.nii.gz/MD5E-s29732696--339715d5cec387f4d44dfe94f304a429.nii.gz
│ ├── sub-10_task-flanker_run-1_events.tsv
│ ├── sub-10_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/11/Zx/MD5E-s29724034--16f2bf452524a315182f188becc1866d.nii.gz/MD5E-s29724034--16f2bf452524a315182f188becc1866d.nii.gz
│ └── sub-10_task-flanker_run-2_events.tsv
├── sub-11
│ ├── anat
│ │ └── sub-11_T1w.nii.gz -> ../../.git/annex/objects/kj/xX/MD5E-s10534963--9e5bff7ec0b5df2850e1d05b1af281ba.nii.gz/MD5E-s10534963--9e5bff7ec0b5df2850e1d05b1af281ba.nii.gz
│ └── func
│ ├── sub-11_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/35/fk/MD5E-s28226875--d5012074c2c7a0a394861b010bcf9a8f.nii.gz/MD5E-s28226875--d5012074c2c7a0a394861b010bcf9a8f.nii.gz
│ ├── sub-11_task-flanker_run-1_events.tsv
│ ├── sub-11_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/j7/ff/MD5E-s28198976--c0a64e3b549568c44bb40b1588027c9a.nii.gz/MD5E-s28198976--c0a64e3b549568c44bb40b1588027c9a.nii.gz
│ └── sub-11_task-flanker_run-2_events.tsv
├── sub-12
│ ├── anat
│ │ └── sub-12_T1w.nii.gz -> ../../.git/annex/objects/kx/2F/MD5E-s10550168--a7f651adc817b6678148b575654532a4.nii.gz/MD5E-s10550168--a7f651adc817b6678148b575654532a4.nii.gz
│ └── func
│ ├── sub-12_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/M0/fX/MD5E-s28403807--f1c3eb2e519020f4315a696ea845fc01.nii.gz/MD5E-s28403807--f1c3eb2e519020f4315a696ea845fc01.nii.gz
│ ├── sub-12_task-flanker_run-1_events.tsv
│ ├── sub-12_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/vW/V0/MD5E-s28424992--8740628349be3c056a0411bf4a852b25.nii.gz/MD5E-s28424992--8740628349be3c056a0411bf4a852b25.nii.gz
│ └── sub-12_task-flanker_run-2_events.tsv
├── sub-13
│ ├── anat
│ │ └── sub-13_T1w.nii.gz -> ../../.git/annex/objects/wM/Xw/MD5E-s10609761--440413c3251d182086105649164222c6.nii.gz/MD5E-s10609761--440413c3251d182086105649164222c6.nii.gz
│ └── func
│ ├── sub-13_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/mf/M4/MD5E-s28180916--aa35f4ad0cf630d6396a8a2dd1f3dda6.nii.gz/MD5E-s28180916--aa35f4ad0cf630d6396a8a2dd1f3dda6.nii.gz
│ ├── sub-13_task-flanker_run-1_events.tsv
│ ├── sub-13_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/XP/76/MD5E-s28202786--8caf1ac548c87b2b35f85e8ae2bf72c1.nii.gz/MD5E-s28202786--8caf1ac548c87b2b35f85e8ae2bf72c1.nii.gz
│ └── sub-13_task-flanker_run-2_events.tsv
├── sub-14
│ ├── anat
│ │ └── sub-14_T1w.nii.gz -> ../../.git/annex/objects/Zw/0z/MD5E-s9223596--33abfb5da565f3487e3a7aebc15f940c.nii.gz/MD5E-s9223596--33abfb5da565f3487e3a7aebc15f940c.nii.gz
│ └── func
│ ├── sub-14_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/Jp/29/MD5E-s29001492--250f1e4daa9be1d95e06af0d56629cc9.nii.gz/MD5E-s29001492--250f1e4daa9be1d95e06af0d56629cc9.nii.gz
│ ├── sub-14_task-flanker_run-1_events.tsv
│ ├── sub-14_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/PK/V2/MD5E-s29068193--5621a3b0af8132c509420b4ad9aaf8fb.nii.gz/MD5E-s29068193--5621a3b0af8132c509420b4ad9aaf8fb.nii.gz
│ └── sub-14_task-flanker_run-2_events.tsv
├── sub-15
│ ├── anat
│ │ └── sub-15_T1w.nii.gz -> ../../.git/annex/objects/Mz/qq/MD5E-s10752891--ddd2622f115ec0d29a0c7ab2366f6f95.nii.gz/MD5E-s10752891--ddd2622f115ec0d29a0c7ab2366f6f95.nii.gz
│ └── func
│ ├── sub-15_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/08/JJ/MD5E-s28285239--feda22c4526af1910fcee58d4c42f07e.nii.gz/MD5E-s28285239--feda22c4526af1910fcee58d4c42f07e.nii.gz
│ ├── sub-15_task-flanker_run-1_events.tsv
│ ├── sub-15_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/9f/0W/MD5E-s28289760--433000a1def662e72d8433dba151c61b.nii.gz/MD5E-s28289760--433000a1def662e72d8433dba151c61b.nii.gz
│ └── sub-15_task-flanker_run-2_events.tsv
├── sub-16
│ ├── anat
│ │ └── sub-16_T1w.nii.gz -> ../../.git/annex/objects/4g/8k/MD5E-s10927450--a196f7075c793328dd6ff3cebf36ea6b.nii.gz/MD5E-s10927450--a196f7075c793328dd6ff3cebf36ea6b.nii.gz
│ └── func
│ ├── sub-16_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/9z/g2/MD5E-s29757991--1a1648b2fa6cc74e31c94f109d8137ba.nii.gz/MD5E-s29757991--1a1648b2fa6cc74e31c94f109d8137ba.nii.gz
│ ├── sub-16_task-flanker_run-1_events.tsv
│ ├── sub-16_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/k8/4F/MD5E-s29773832--fe08739ea816254395b985ee704aaa99.nii.gz/MD5E-s29773832--fe08739ea816254395b985ee704aaa99.nii.gz
│ └── sub-16_task-flanker_run-2_events.tsv
├── sub-17
│ ├── anat
│ │ └── sub-17_T1w.nii.gz -> ../../.git/annex/objects/jQ/MQ/MD5E-s10826014--8e2a6b062df4d1c4327802f2b905ef36.nii.gz/MD5E-s10826014--8e2a6b062df4d1c4327802f2b905ef36.nii.gz
│ └── func
│ ├── sub-17_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/Wz/2P/MD5E-s28991563--9845f461a017a39d1f6e18baaa0c9c41.nii.gz/MD5E-s28991563--9845f461a017a39d1f6e18baaa0c9c41.nii.gz
│ ├── sub-17_task-flanker_run-1_events.tsv
│ ├── sub-17_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/jF/3m/MD5E-s29057821--84ccc041163bcc5b3a9443951e2a5a78.nii.gz/MD5E-s29057821--84ccc041163bcc5b3a9443951e2a5a78.nii.gz
│ └── sub-17_task-flanker_run-2_events.tsv
├── sub-18
│ ├── anat
│ │ └── sub-18_T1w.nii.gz -> ../../.git/annex/objects/3v/pK/MD5E-s10571510--6fc4b5792bc50ea4d14eb5247676fafe.nii.gz/MD5E-s10571510--6fc4b5792bc50ea4d14eb5247676fafe.nii.gz
│ └── func
│ ├── sub-18_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/94/P2/MD5E-s28185776--5b3879ec6fc4bbe1e48efc64984f88cf.nii.gz/MD5E-s28185776--5b3879ec6fc4bbe1e48efc64984f88cf.nii.gz
│ ├── sub-18_task-flanker_run-1_events.tsv
│ ├── sub-18_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/qp/6K/MD5E-s28234699--58019d798a133e5d7806569374dd8160.nii.gz/MD5E-s28234699--58019d798a133e5d7806569374dd8160.nii.gz
│ └── sub-18_task-flanker_run-2_events.tsv
├── sub-19
│ ├── anat
│ │ └── sub-19_T1w.nii.gz -> ../../.git/annex/objects/Zw/p8/MD5E-s8861893--d338005753d8af3f3d7bd8dc293e2a97.nii.gz/MD5E-s8861893--d338005753d8af3f3d7bd8dc293e2a97.nii.gz
│ └── func
│ ├── sub-19_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/04/k6/MD5E-s28178448--3874e748258cf19aa69a05a7c37ad137.nii.gz/MD5E-s28178448--3874e748258cf19aa69a05a7c37ad137.nii.gz
│ ├── sub-19_task-flanker_run-1_events.tsv
│ ├── sub-19_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/mz/P4/MD5E-s28190932--91e6b3e4318ca28f01de8cb967cf8421.nii.gz/MD5E-s28190932--91e6b3e4318ca28f01de8cb967cf8421.nii.gz
│ └── sub-19_task-flanker_run-2_events.tsv
├── sub-20
│ ├── anat
│ │ └── sub-20_T1w.nii.gz -> ../../.git/annex/objects/g1/FF/MD5E-s11025608--5929806a7aa5720fc755687e1450b06c.nii.gz/MD5E-s11025608--5929806a7aa5720fc755687e1450b06c.nii.gz
│ └── func
│ ├── sub-20_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/v5/ZJ/MD5E-s29931631--bf9abb057367ce66961f0b7913e8e707.nii.gz/MD5E-s29931631--bf9abb057367ce66961f0b7913e8e707.nii.gz
│ ├── sub-20_task-flanker_run-1_events.tsv
│ ├── sub-20_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/J3/KW/MD5E-s29945590--96cfd5b77cd096f6c6a3530015fea32d.nii.gz/MD5E-s29945590--96cfd5b77cd096f6c6a3530015fea32d.nii.gz
│ └── sub-20_task-flanker_run-2_events.tsv
├── sub-21
│ ├── anat
│ │ └── sub-21_T1w.nii.gz -> ../../.git/annex/objects/K6/6K/MD5E-s8662805--77b262ddd929fa08d78591bfbe558ac6.nii.gz/MD5E-s8662805--77b262ddd929fa08d78591bfbe558ac6.nii.gz
│ └── func
│ ├── sub-21_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/Wz/p9/MD5E-s28756041--9ae556d4e3042532d25af5dc4ab31840.nii.gz/MD5E-s28756041--9ae556d4e3042532d25af5dc4ab31840.nii.gz
│ ├── sub-21_task-flanker_run-1_events.tsv
│ ├── sub-21_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/xF/M3/MD5E-s28758438--81866411fc6b6333ec382a20ff0be718.nii.gz/MD5E-s28758438--81866411fc6b6333ec382a20ff0be718.nii.gz
│ └── sub-21_task-flanker_run-2_events.tsv
├── sub-22
│ ├── anat
│ │ └── sub-22_T1w.nii.gz -> ../../.git/annex/objects/JG/ZV/MD5E-s9282392--9e7296a6a5b68df46b77836182b6681a.nii.gz/MD5E-s9282392--9e7296a6a5b68df46b77836182b6681a.nii.gz
│ └── func
│ ├── sub-22_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/qW/Gw/MD5E-s28002098--c6bea10177a38667ceea3261a642b3c6.nii.gz/MD5E-s28002098--c6bea10177a38667ceea3261a642b3c6.nii.gz
│ ├── sub-22_task-flanker_run-1_events.tsv
│ ├── sub-22_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/VX/Zj/MD5E-s28027568--b34d0df9ad62485aba25296939429885.nii.gz/MD5E-s28027568--b34d0df9ad62485aba25296939429885.nii.gz
│ └── sub-22_task-flanker_run-2_events.tsv
├── sub-23
│ ├── anat
│ │ └── sub-23_T1w.nii.gz -> ../../.git/annex/objects/4Z/4x/MD5E-s10626062--db5a6ba6730b319c6425f2e847ce9b14.nii.gz/MD5E-s10626062--db5a6ba6730b319c6425f2e847ce9b14.nii.gz
│ └── func
│ ├── sub-23_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/VK/8F/MD5E-s28965005--4a9a96d9322563510ca14439e7fd6cea.nii.gz/MD5E-s28965005--4a9a96d9322563510ca14439e7fd6cea.nii.gz
│ ├── sub-23_task-flanker_run-1_events.tsv
│ ├── sub-23_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/56/20/MD5E-s29050413--753b0d2c23c4af6592501219c2e2c6bd.nii.gz/MD5E-s29050413--753b0d2c23c4af6592501219c2e2c6bd.nii.gz
│ └── sub-23_task-flanker_run-2_events.tsv
├── sub-24
│ ├── anat
│ │ └── sub-24_T1w.nii.gz -> ../../.git/annex/objects/jQ/fV/MD5E-s10739691--458f0046eff18ee8c43456637766a819.nii.gz/MD5E-s10739691--458f0046eff18ee8c43456637766a819.nii.gz
│ └── func
│ ├── sub-24_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/km/fV/MD5E-s29354610--29ebfa60e52d49f7dac6814cb5fdc2bc.nii.gz/MD5E-s29354610--29ebfa60e52d49f7dac6814cb5fdc2bc.nii.gz
│ ├── sub-24_task-flanker_run-1_events.tsv
│ ├── sub-24_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/Wj/KK/MD5E-s29423307--fedaa1d7c6e34420735bb3bbe5a2fe38.nii.gz/MD5E-s29423307--fedaa1d7c6e34420735bb3bbe5a2fe38.nii.gz
│ └── sub-24_task-flanker_run-2_events.tsv
├── sub-25
│ ├── anat
│ │ └── sub-25_T1w.nii.gz -> ../../.git/annex/objects/Gk/FQ/MD5E-s8998578--f560d832f13e757b485c16d570bf6ebc.nii.gz/MD5E-s8998578--f560d832f13e757b485c16d570bf6ebc.nii.gz
│ └── func
│ ├── sub-25_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/XW/1v/MD5E-s29473003--49b04e7e4b450ec5ef93ff02d4158775.nii.gz/MD5E-s29473003--49b04e7e4b450ec5ef93ff02d4158775.nii.gz
│ ├── sub-25_task-flanker_run-1_events.tsv
│ ├── sub-25_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/Qm/M7/MD5E-s29460132--b0e9039e9f33510631f229c8c2193285.nii.gz/MD5E-s29460132--b0e9039e9f33510631f229c8c2193285.nii.gz
│ └── sub-25_task-flanker_run-2_events.tsv
├── sub-26
│ ├── anat
│ │ └── sub-26_T1w.nii.gz -> ../../.git/annex/objects/kf/9F/MD5E-s10850250--5f103b2660f488e4afa193f9307c1291.nii.gz/MD5E-s10850250--5f103b2660f488e4afa193f9307c1291.nii.gz
│ └── func
│ ├── sub-26_task-flanker_run-1_bold.nii.gz -> ../../.git/annex/objects/QV/10/MD5E-s30127491--8e30aa4bbfcc461bac8598bf621283c5.nii.gz/MD5E-s30127491--8e30aa4bbfcc461bac8598bf621283c5.nii.gz
│ ├── sub-26_task-flanker_run-1_events.tsv
│ ├── sub-26_task-flanker_run-2_bold.nii.gz -> ../../.git/annex/objects/3G/Q6/MD5E-s30162480--80fd132e7cb1600ab248249e78f6f1aa.nii.gz/MD5E-s30162480--80fd132e7cb1600ab248249e78f6f1aa.nii.gz
│ └── sub-26_task-flanker_run-2_events.tsv
└── task-flanker_bold.json
80 directories, 136 files
The two files we desire are at the following locations:
anatomical image: ds000102/sub-01/anat/sub-01_T1w.nii.gz
functional image: ds000102/sub-01/func/sub-01_task-flanker_run-1_bold.nii.gz
This means that we can rewrite the paths as follows:
anat = base_directory/sub-[subject_id]/anat/sub-[subject_id]_T1w.nii.gz
func = base_directory/sub-[subject_id]/func/sub-[subject_id]_task-flanker_run-[run_id]_bold.nii.gz
Therefore, we need the parameter subject_id for the anatomical image, and the parameters subject_id and run_id for the functional images. In the context of DataGrabber, this is specified as follows:
# Input: Set the input file path
data_dir = opj(os.getcwd(), 'ds000102') #base_directory of the data
# Dynamic parameters
subj_list = ['01', '02'] #subject_id
run_list = [1, 2] #run_id
# Initialise workflow
wf_input = Workflow(name='data_input', base_dir=wf_work_dir) # base_dir: Set path where nipype will store stepwise results
wf_input.config["execution"]["crashfile_format"] = "txt"
# Create DataGrabber node with input fields for all dynamic parameters (e.g. subject identifier, run identifier),
# as well as the two desired output fields anat and func.
dg = Node(
interface= DataGrabber(infields=["subject_id","run_id"], outfields=["anat", "func"]),
name="dg")
# Location of dataset folder
dg.inputs.base_directory = data_dir
# Necessary default parameters
dg.inputs.sort_filelist = True #return a sorted filelist to ensure to match files to correct runs/tasks
dg.inputs.template = "*" #wildcard
# Specify run_ids
dg.inputs.run_id = run_list
# Define arguments to fill the wildcards in the below paths
dg.inputs.template_args = dict(
anat=[["subject_id","subject_id"]],
func=[["subject_id","subject_id","run_id"]],
)
# Specify the template structure to find the specific data
dg.inputs.field_template = dict(
anat="sub-%s/anat/sub-%s_T1w.nii.gz",
func="sub-%s/func/sub-%s_task-flanker_run-%d_bold.nii.gz",
)
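The interaction between field_template and template_args is plain %-style string formatting: the values named in template_args fill the %s/%d placeholders in order. This can be checked directly:

```python
# Same template as in the DataGrabber node above.
field_template = "sub-%s/func/sub-%s_task-flanker_run-%d_bold.nii.gz"

# template_args named subject_id, subject_id, run_id; here for sub-01, run 1.
template_args = ("01", "01", 1)

path = field_template % template_args
print(path)  # sub-01/func/sub-01_task-flanker_run-1_bold.nii.gz
```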
Dynamic parameters can be fed into the node either by specifying them directly as node inputs, or by using another node and feeding subject_id into the DataGrabber node via connections.
Specifying the input fields of DataGrabber directly for one subject:
dg.inputs.subject_id = '01'
If we want to run our workflow by creating subgraphs, i.e. running it for more than one subject, we can use another node: IdentityInterface, a special use case of iterables. It allows us to create nodes that perform simple identity mapping, i.e. nodes that only operate on parameters/strings.
infosource = Node(IdentityInterface(fields=["subject_id"]),
name="infosource")
# Run a workflow iterating over various inputs using the iterables attribute of nodes --> splits up the workflow
infosource.iterables = [("subject_id", subj_list)]
# Connect the nodes and run the workflow
wf_input.connect([(infosource, dg, [("subject_id", "subject_id")])])
wf_input.run()
240613-06:56:04,896 nipype.workflow INFO:
Workflow data_input settings: ['check', 'execution', 'logging', 'monitoring']
240613-06:56:04,903 nipype.workflow INFO:
Running serially.
240613-06:56:04,904 nipype.workflow INFO:
[Node] Setting-up "data_input.dg" in "/home/jovyan/neurodesktop-storage/output/working_dir/data_input/_subject_id_01/dg".
240613-06:56:04,906 nipype.workflow INFO:
[Node] Executing "dg" <nipype.interfaces.io.DataGrabber>
240613-06:56:04,908 nipype.workflow INFO:
[Node] Finished "dg", elapsed time 0.000506s.
240613-06:56:04,910 nipype.workflow INFO:
[Node] Setting-up "data_input.dg" in "/home/jovyan/neurodesktop-storage/output/working_dir/data_input/_subject_id_02/dg".
240613-06:56:04,913 nipype.workflow INFO:
[Node] Executing "dg" <nipype.interfaces.io.DataGrabber>
240613-06:56:04,914 nipype.workflow INFO:
[Node] Finished "dg", elapsed time 0.000332s.
<networkx.classes.digraph.DiGraph at 0x7f5f6ca78dd0>
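The log above shows one working subdirectory per iterable value (`_subject_id_01`, `_subject_id_02`): iterables expand a node into one copy per parameter value. The naming pattern visible in the log can be sketched as:

```python
# Each iterable value gets its own expanded node and working subdirectory,
# named after the parameter and its value (as seen in the log above).
subj_list = ['01', '02']
subdirs = [f"_subject_id_{s}" for s in subj_list]
print(subdirs)  # ['_subject_id_01', '_subject_id_02']
```

With several iterables, Nipype forms the cartesian product of the values, so the number of subgraphs grows multiplicatively.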
3.6. Data Output#
A workflow's working directory acts like a cache: it contains the outputs of the various processing stages along with auxiliary information such as execution reports and hashfiles that record the input state of each process.
!tree output/working_dir/smoothflow/ -I '*js|*dot|*png|*html|*json'
output/working_dir/smoothflow/
├── mask
│ ├── _inputs.pklz
│ ├── _node.pklz
│ ├── _report
│ │ └── report.rst
│ ├── command.txt
│ ├── result_mask.pklz
│ └── sub-01_T1w_smooth_masked.nii.gz
├── skullstrip
│ ├── _inputs.pklz
│ ├── _node.pklz
│ ├── _report
│ │ └── report.rst
│ ├── command.txt
│ ├── result_skullstrip.pklz
│ └── sub-01_T1w_brain_mask.nii.gz
└── smooth
├── _inputs.pklz
├── _node.pklz
├── _report
│ └── report.rst
├── command.txt
├── result_smooth.pklz
└── sub-01_T1w_smooth.nii.gz
6 directories, 18 files
Data output modules allow you to restructure and rename computed outputs, and to separate the relevant output files from the temporary intermediate files in the working directory.
In this tutorial, we will look into the DataSink module:
DataSink: Nipype’s standard output module, which allows the creation of arbitrary input attributes. The names of these attributes define the directory structure to be created for storing the files or directories.
Nipype also provides some simple frontends for storing values in a JSON file, a MySQL or SQLite database, or an XNAT server.
JSONFileSink
MySQLSink
SQLiteSink
XNATSink
Example: DataSink#
The following code segment defines the DataSink node and sets the base_directory
in which all outputs will be stored. The container
input creates a subdirectory within the base_directory.
from nipype.interfaces.io import DataSink
datasink = Node(DataSink(), name='sinker')
datasink.inputs.base_directory = '/path/to/output'
workflow.connect(inputnode, 'subject_id', datasink, 'container')
To store different outputs under the same container, create a second port using the dot (.) syntax. This stores the files in a separate subfolder called mask:
workflow.connect(inputnode, 'mask_out_file', datasink, 'container.mask')
If you want to store the files in the same folder, use the .@ syntax. The @ tells the DataSink interface not to create the subfolder. This lets you create differently named input ports for DataSink while storing the files in the same folder.
workflow.connect(inputnode, 'subject_id', datasink, 'container')
workflow.connect(inputnode, 'mask_out_file', datasink, 'container.@mask')
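The folder logic behind the . and @ syntax can be sketched with a small helper. This is an illustration of the naming rule, not DataSink's actual implementation; `datasink_folder` is a hypothetical function:

```python
def datasink_folder(base, port):
    """Sketch of how a DataSink port name maps to an output folder:
    '.' introduces a subfolder; a leading '@' suppresses that subfolder."""
    parts = [p for p in port.split('.') if not p.startswith('@')]
    return '/'.join([base] + parts)

print(datasink_folder('results', 'container.mask'))   # results/container/mask
print(datasink_folder('results', 'container.@mask'))  # results/container
```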
🥳 Final Example: Mini-Preprocessing-Workflow
Input Stream: DataGrabber to grab the functional image (run-1) of sub-01
FSL-Interfaces: Motion correction and spatial smoothing (kernel of 4 mm) of the functional image
Output Stream: DataSink to grab the motion-corrected image, the motion parameters and the smoothed image
# Initialise Workflow
wf_preproc = Workflow(name='preproc', base_dir=wf_work_dir)
wf_preproc.config["execution"]["crashfile_format"] = "txt"
# DataGrabber Node
dg = Node(
interface= DataGrabber(infields=["subject_id", "run_id"], outfields=["func"]),
name="dg")
# Location of dataset folder
dg.inputs.base_directory = data_dir
# Necessary default parameters
dg.inputs.sort_filelist = True #return a sorted filelist to ensure to match files to correct runs/tasks
dg.inputs.template = "*"
dg.inputs.run_id = 1
dg.inputs.subject_id = '01'
dg.inputs.template_args = dict(
func=[["subject_id", "subject_id", "run_id"]],
)
# Specify the template structure to find the specific data
dg.inputs.field_template = dict(
func="sub-%s/func/sub-%s_task-flanker_run-%d_bold.nii.gz",
)
# Create Motion Correction Node
mcflirt = Node(fsl.MCFLIRT(save_plots=True),
name='mcflirt')
# Create Smoothing node
smooth = Node(fsl.IsotropicSmooth(fwhm=4),
name='smooth')
# Connect the three nodes to each other
wf_preproc.connect([(dg, mcflirt, [("func", "in_file")]),
(mcflirt, smooth, [("out_file", "in_file")])])
# Create DataSink object
sinker = Node(DataSink(), name='sinker')
# Name of the output folder
sinker.inputs.base_directory = opj(wf_work_dir, 'preproc/results')
# Save output in one folder called 'preproc' with .@
wf_preproc.connect([(smooth, sinker, [('out_file', 'sub-01.@in_file')]),
(mcflirt, sinker, [('out_file', 'sub-01.@mc_img'),
('par_file', 'sub-01.@par_file')]),
])
# Visualize the graph
wf_preproc.write_graph(graph2use='colored', format='png', simple_form=True)
Image(filename=opj(wf_preproc.base_dir, wf_preproc.name, 'graph.png'))
240613-06:56:06,73 nipype.workflow INFO:
Generated workflow graph: /home/jovyan/neurodesktop-storage/output/working_dir/preproc/graph.png (graph2use=colored, simple_form=True).
# Run workflow with distributed processing
wf_preproc.run('MultiProc')
240613-06:56:06,83 nipype.workflow INFO:
Workflow preproc settings: ['check', 'execution', 'logging', 'monitoring']
240613-06:56:06,88 nipype.workflow INFO:
Running in parallel.
240613-06:56:06,91 nipype.workflow INFO:
[MultiProc] Running 0 tasks, and 1 jobs ready. Free memory (GB): 219.48/219.48, Free processors: 32/32.
240613-06:56:07,56 nipype.workflow INFO:
[Node] Setting-up "preproc.dg" in "/home/jovyan/neurodesktop-storage/output/working_dir/preproc/dg".
240613-06:56:07,69 nipype.workflow INFO:
[Node] Executing "dg" <nipype.interfaces.io.DataGrabber>
240613-06:56:07,77 nipype.workflow INFO:
[Node] Finished "dg", elapsed time 0.001067s.
240613-06:56:08,93 nipype.workflow INFO:
[Job 0] Completed (preproc.dg).
240613-06:56:08,98 nipype.workflow INFO:
[MultiProc] Running 0 tasks, and 1 jobs ready. Free memory (GB): 219.48/219.48, Free processors: 32/32.
240613-06:56:08,398 nipype.workflow INFO:
[Node] Setting-up "preproc.mcflirt" in "/home/jovyan/neurodesktop-storage/output/working_dir/preproc/mcflirt".
240613-06:56:08,408 nipype.workflow INFO:
[Node] Executing "mcflirt" <nipype.interfaces.fsl.preprocess.MCFLIRT>
240613-06:56:10,91 nipype.workflow INFO:
[MultiProc] Running 1 tasks, and 0 jobs ready. Free memory (GB): 219.28/219.48, Free processors: 31/32.
Currently running:
* preproc.mcflirt
240613-06:56:35,950 nipype.workflow INFO:
[Node] Finished "mcflirt", elapsed time 27.532939s.
240613-06:56:36,93 nipype.workflow INFO:
[Job 1] Completed (preproc.mcflirt).
240613-06:56:36,95 nipype.workflow INFO:
[MultiProc] Running 0 tasks, and 1 jobs ready. Free memory (GB): 219.48/219.48, Free processors: 32/32.
240613-06:56:36,238 nipype.workflow INFO:
[Node] Setting-up "preproc.smooth" in "/home/jovyan/neurodesktop-storage/output/working_dir/preproc/smooth".
240613-06:56:36,259 nipype.workflow INFO:
[Node] Executing "smooth" <nipype.interfaces.fsl.maths.IsotropicSmooth>
240613-06:56:38,94 nipype.workflow INFO:
[MultiProc] Running 1 tasks, and 0 jobs ready. Free memory (GB): 219.28/219.48, Free processors: 31/32.
Currently running:
* preproc.smooth
240613-06:56:47,6 nipype.workflow INFO:
[Node] Finished "smooth", elapsed time 10.737365s.
240613-06:56:48,94 nipype.workflow INFO:
[Job 2] Completed (preproc.smooth).
240613-06:56:48,96 nipype.workflow INFO:
[MultiProc] Running 0 tasks, and 1 jobs ready. Free memory (GB): 219.48/219.48, Free processors: 32/32.
240613-06:56:48,225 nipype.workflow INFO:
[Node] Setting-up "preproc.sinker" in "/home/jovyan/neurodesktop-storage/output/working_dir/preproc/sinker".
240613-06:56:48,239 nipype.workflow INFO:
[Node] Executing "sinker" <nipype.interfaces.io.DataSink>
240613-06:56:48,243 nipype.workflow INFO:
[Node] Finished "sinker", elapsed time 0.001156s.
240613-06:56:50,95 nipype.workflow INFO:
[Job 3] Completed (preproc.sinker).
240613-06:56:50,96 nipype.workflow INFO:
[MultiProc] Running 0 tasks, and 0 jobs ready. Free memory (GB): 219.48/219.48, Free processors: 32/32.
<networkx.classes.digraph.DiGraph at 0x7f5f6ca35bd0>
! tree output/working_dir/preproc/results
output/working_dir/preproc/results
└── sub-01
├── sub-01_task-flanker_run-1_bold_mcf.nii.gz
├── sub-01_task-flanker_run-1_bold_mcf.nii.gz.par
└── sub-01_task-flanker_run-1_bold_mcf_smooth.nii.gz
1 directory, 3 files
Tip
DataSink offers the substitutions input field to rename output files.
For example, to get rid of the string 'bold' and to adapt the file ending of the motion parameter file:
# Define substitution strings
substitutions = [('_bold', ''),
('.nii.gz.par', '.par')]
# Feed the substitution strings to the DataSink node
sinker.inputs.substitutions = substitutions
# Run the workflow again with the substitutions in place
wf_preproc.run()
240613-06:56:53,286 nipype.workflow INFO:
Workflow preproc settings: ['check', 'execution', 'logging', 'monitoring']
240613-06:56:53,292 nipype.workflow INFO:
Running serially.
240613-06:56:53,293 nipype.workflow INFO:
[Node] Setting-up "preproc.dg" in "/home/jovyan/neurodesktop-storage/output/working_dir/preproc/dg".
240613-06:56:53,296 nipype.workflow INFO:
[Node] Executing "dg" <nipype.interfaces.io.DataGrabber>
240613-06:56:53,297 nipype.workflow INFO:
[Node] Finished "dg", elapsed time 0.000322s.
240613-06:56:53,299 nipype.workflow INFO:
[Node] Setting-up "preproc.mcflirt" in "/home/jovyan/neurodesktop-storage/output/working_dir/preproc/mcflirt".
240613-06:56:53,301 nipype.workflow INFO:
[Node] Cached "preproc.mcflirt" - collecting precomputed outputs
240613-06:56:53,302 nipype.workflow INFO:
[Node] "preproc.mcflirt" found cached.
240613-06:56:53,302 nipype.workflow INFO:
[Node] Setting-up "preproc.smooth" in "/home/jovyan/neurodesktop-storage/output/working_dir/preproc/smooth".
240613-06:56:53,304 nipype.workflow INFO:
[Node] Cached "preproc.smooth" - collecting precomputed outputs
240613-06:56:53,304 nipype.workflow INFO:
[Node] "preproc.smooth" found cached.
240613-06:56:53,304 nipype.workflow INFO:
[Node] Setting-up "preproc.sinker" in "/home/jovyan/neurodesktop-storage/output/working_dir/preproc/sinker".
240613-06:56:53,306 nipype.workflow INFO:
[Node] Outdated cache found for "preproc.sinker".
240613-06:56:53,309 nipype.workflow INFO:
[Node] Executing "sinker" <nipype.interfaces.io.DataSink>
240613-06:56:53,310 nipype.interface INFO:
sub: /home/jovyan/neurodesktop-storage/output/working_dir/preproc/results/sub-01/sub-01_task-flanker_run-1_bold_mcf_smooth.nii.gz -> /home/jovyan/neurodesktop-storage/output/working_dir/preproc/results/sub-01/sub-01_task-flanker_run-1_mcf_smooth.nii.gz
240613-06:56:53,310 nipype.interface INFO:
sub: /home/jovyan/neurodesktop-storage/output/working_dir/preproc/results/sub-01/sub-01_task-flanker_run-1_bold_mcf.nii.gz -> /home/jovyan/neurodesktop-storage/output/working_dir/preproc/results/sub-01/sub-01_task-flanker_run-1_mcf.nii.gz
240613-06:56:53,311 nipype.interface INFO:
sub: /home/jovyan/neurodesktop-storage/output/working_dir/preproc/results/sub-01/sub-01_task-flanker_run-1_bold_mcf.nii.gz.par -> /home/jovyan/neurodesktop-storage/output/working_dir/preproc/results/sub-01/sub-01_task-flanker_run-1_mcf.par
240613-06:56:53,312 nipype.workflow INFO:
[Node] Finished "sinker", elapsed time 0.00222s.
<networkx.classes.digraph.DiGraph at 0x7f5f6ca87950>
! tree output/working_dir/preproc/results
output/working_dir/preproc/results
└── sub-01
├── sub-01_task-flanker_run-1_bold_mcf.nii.gz
├── sub-01_task-flanker_run-1_bold_mcf.nii.gz.par
├── sub-01_task-flanker_run-1_bold_mcf_smooth.nii.gz
├── sub-01_task-flanker_run-1_mcf.nii.gz
├── sub-01_task-flanker_run-1_mcf.par
└── sub-01_task-flanker_run-1_mcf_smooth.nii.gz
1 directory, 6 files
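The substitutions are ordered string replacements applied to every output path. A stand-alone sketch of that renaming logic (the helper function here is ours for illustration, not part of Nipype):

```python
def apply_substitutions(path, substitutions):
    """Apply ordered (old, new) string replacements to a filename,
    mirroring what DataSink's substitutions input does for each
    output path it stores."""
    for old, new in substitutions:
        path = path.replace(old, new)
    return path

substitutions = [('_bold', ''),
                 ('.nii.gz.par', '.par')]

print(apply_substitutions('sub-01_task-flanker_run-1_bold_mcf.nii.gz.par',
                          substitutions))
# sub-01_task-flanker_run-1_mcf.par
```

Because the pairs are applied in order, the '_bold' removal happens first and the extension fix second, matching the renamed files in the tree output above.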
Nipype in a Nutshell
Nipype offers easy-to-use building blocks for:
establishing neuroimaging data processing services
constructing tailored data processing pipelines
4. Pydra: A modern dataflow engine developed for the Nipype project#
Pydra is a rewrite of the Nipype engine and forms the core of the Nipype 2.0 ecosystem. It is meant to provide additional flexibility, allowing users to define custom processing steps, interfaces, and nested workflows.
It is a standalone project designed to support analytics in any scientific domain (whereas Nipype is specifically designed for neuroimaging data analysis pipelines).
$\rightarrow$ Pydra aims to offer a lightweight Python (3.7+) dataflow engine for the construction, manipulation, and distributed execution of computational graphs. It serves as a tool for building reproducible, scalable, reusable, and fully automated scientific workflows.
The architecture in combination with several key features makes Pydra a customizable and powerful dataflow engine:
Architecture with three core components: Tasks (basic runnable components, including Workflows), Submitters (classes for unpacking Tasks into standalone jobs), and Workers (classes used to execute Tasks and coordinate resource management).
Composable dataflows: Nested dataflows of arbitrary depths encouraging the creation of reusable dataflows.
Global cache support to reduce recomputation.
Support for dataflow execution in containerized environments enabling greater consistency for reproducibility.
Splitting & combining semantics for creating nested loops over input sets (MapReduce extended to graphs).
Key features in more detail:#
Composable dataflows: A dataflow is represented as a directed acyclic graph, where each Task represents a Python function, the execution of an external tool, or another dataflow. This enables anything from simple linear pipelines to complex nested dataflows of arbitrary depth. This approach promotes the development of reusable dataflows, enhancing modularity and scalability.
Nipype-Pydra architectures
- Pydra dataflow components: Task (basic runnable component with named inputs and outputs) with subclass Workflow
- Nipype basic concepts: Node (defined inputs and outputs), Workflow
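Composability can be illustrated with a toy model in plain Python (not the Pydra API): if a workflow is itself callable, it can be dropped into another workflow as a single node, giving nesting of arbitrary depth.

```python
def make_workflow(tasks):
    """Toy composable-dataflow model (plain Python, not the Pydra API):
    a workflow is itself callable, so it can be reused as a single task
    inside another workflow -- nesting to arbitrary depth."""
    def run(x):
        for task in tasks:  # a simple linear chain for illustration
            x = task(x)
        return x
    return run

double = lambda x: 2 * x
inner = make_workflow([double, double])          # small dataflow: x -> 4x
outer = make_workflow([inner, lambda x: x + 1])  # inner reused as one node
print(outer(3))  # 13
```

In Pydra the same idea holds with directed acyclic graphs instead of linear chains: a Workflow is a Task subclass, so it can appear wherever a Task can.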
Support for Python functions (FunctionTask) and external (shell) commands (ShellCommandTask): Pydra enables the seamless incorporation and utilization of pre-existing functions within Python libraries, as well as external command-line tools. This facilitates straightforward integration of existing code and software into Pydra workflows.
Support for execution of Tasks in containerized environments (ContainerTask): Any dataflow or Task can be executed in an associated container (via Docker or Singularity) enabling greater consistency for reproducibility.
Nipype-Pydra architectures
- Pydra Task subclasses: FunctionTask, ShellCommandTask, ContainerTask (Docker, Singularity)
- Nipype advanced concepts: base interfaces for using existing functionality from other packages: wrapping of command-line tools (nipype.interfaces.base CommandLine), running an arbitrary function as a Nipype interface (nipype.interfaces.utility Function)
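Conceptually, a shell-command task assembles a command line from its inputs, runs it, and collects the outputs. A minimal stdlib sketch of that idea (our simplification, not Pydra's or Nipype's actual implementation):

```python
import subprocess

def shell_task(executable, args):
    """Minimal sketch of what a ShellCommandTask does under the hood
    (our simplification): build the command line, run the external
    tool, and capture its standard output as the task result."""
    completed = subprocess.run([executable, *args],
                               capture_output=True, text=True, check=True)
    return completed.stdout

print(shell_task("echo", ["hello"]))
```

The real task classes add typed input/output specifications and provenance on top of this, but the core remains: a declarative description of a command line, executed and captured.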
Splitting & combining semantics for creating nested loops over input sets: Tasks or dataflows can iterate over input parameter sets, and their outputs can be recombined. This resembles the Map-Reduce model, but Pydra extends the capability to graphs with nested dataflows.
Nipype-Pydra architectures
- Pydra: optional State class with splitter and combiner attributes to specify how inputs are split into parameter sets and combined after Task execution
- Nipype: MapNode, Iterables/Synchronize, JoinNode
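The splitter semantics can be sketched with a toy model (plain Python, not the Pydra API): an outer-product splitter over two inputs runs the task once per combination of the two input sets; a combiner would then gather the results along a chosen axis.

```python
from itertools import product

def run_with_outer_splitter(task, a_values, b_values):
    """Toy model of an outer-product splitter over inputs 'a' and 'b'
    (plain Python, not the Pydra API): the task runs once per
    combination of the two input sets, yielding a list of results."""
    return [task(a, b) for a, b in product(a_values, b_values)]

def add(a, b):
    return a + b

print(run_with_outer_splitter(add, [1, 2], [10, 20]))  # [11, 21, 12, 22]
```

Pydra additionally supports scalar (element-wise) splitters and lets splits propagate through nested dataflows before being combined, which is what extends the Map-Reduce idea to graphs.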
Both Pydra and Nipype offer similar functionalities regarding
Hashing to manage task execution and caching of intermediate results. Hashes are computed for task inputs and parameters and used to determine task dependencies and to avoid unnecessary recomputation.
Provenance tracking capabilities to capture dataflow execution activities as a provenance graph. It tracks the inputs, outputs, and resources consumed by each task in a workflow, providing a detailed record of the workflow execution.
A content-addressable global cache to reduce recomputation: Hash values are computed for each graph and each Task. This supports reusing previously computed and stored dataflows and Tasks. It also allows multiple people in or across laboratories to use each other's execution outputs on the same data without having to rerun the same computation.
Auditing and provenance tracking: Pydra provides a simple JSON-LD-based message passing mechanism to capture the dataflow execution activities as a provenance graph. These messages track inputs and outputs of each task in a dataflow, and the resources consumed by the task.
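The caching idea shared by both engines can be sketched in a few lines (a hypothetical model, not either library's actual implementation): hash the task name and inputs into a content-addressable key, and recompute only on a cache miss.

```python
import hashlib
import json

cache = {}

def task_hash(task_name, inputs):
    """Identical task + inputs -> identical key, so previously stored
    results can be reused (a toy stand-in for real content hashing)."""
    payload = json.dumps({"task": task_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def run_cached(task_name, inputs, func):
    key = task_hash(task_name, inputs)
    if key not in cache:              # cache miss: compute and store
        cache[key] = func(**inputs)
    return cache[key]                 # cache hit: reuse previous result

calls = []
def smooth(fwhm):
    calls.append(fwhm)                # record each real execution
    return f"smoothed with fwhm={fwhm}"

run_cached("smooth", {"fwhm": 4}, smooth)
run_cached("smooth", {"fwhm": 4}, smooth)  # served from the cache
print(len(calls))  # the function body ran only once
```

The real engines hash file contents and full task specifications rather than a JSON dump, and persist the cache on disk so results survive across sessions and users.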
Take Home Message
- Design Philosophy: Building pipelines for neuroimaging vs. pipelines for any scientific domain
- Flexibility: Very flexible splitting and merging semantics in Pydra to create complex pipelines of any depth
- Execution Model: Pydra leverages modern parallel and distributed computing frameworks such as Dask
- Community and Ecosystem: Well-established community and ecosystem vs. a growing community
Pydra and Nipype are both open-source Python projects and offer similar functionalities for building and executing computational pipelines, including caching. However, they differ in their design philosophy, flexibility, execution model, and community ecosystem.
Dependencies in Jupyter/Python#
Using the package watermark to print out computer characteristics and software versions.
!pip install watermark
Collecting watermark
Downloading watermark-2.4.3-py2.py3-none-any.whl.metadata (1.4 kB)
Requirement already satisfied: ipython>=6.0 in /opt/conda/lib/python3.11/site-packages (from watermark) (8.16.1)
Requirement already satisfied: importlib-metadata>=1.4 in /opt/conda/lib/python3.11/site-packages (from watermark) (6.8.0)
Requirement already satisfied: setuptools in /opt/conda/lib/python3.11/site-packages (from watermark) (68.2.2)
Requirement already satisfied: zipp>=0.5 in /opt/conda/lib/python3.11/site-packages (from importlib-metadata>=1.4->watermark) (3.17.0)
Requirement already satisfied: backcall in /opt/conda/lib/python3.11/site-packages (from ipython>=6.0->watermark) (0.2.0)
Requirement already satisfied: decorator in /opt/conda/lib/python3.11/site-packages (from ipython>=6.0->watermark) (5.1.1)
Requirement already satisfied: jedi>=0.16 in /opt/conda/lib/python3.11/site-packages (from ipython>=6.0->watermark) (0.19.1)
Requirement already satisfied: matplotlib-inline in /opt/conda/lib/python3.11/site-packages (from ipython>=6.0->watermark) (0.1.6)
Requirement already satisfied: pickleshare in /opt/conda/lib/python3.11/site-packages (from ipython>=6.0->watermark) (0.7.5)
Requirement already satisfied: prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30 in /opt/conda/lib/python3.11/site-packages (from ipython>=6.0->watermark) (3.0.39)
Requirement already satisfied: pygments>=2.4.0 in /opt/conda/lib/python3.11/site-packages (from ipython>=6.0->watermark) (2.16.1)
Requirement already satisfied: stack-data in /opt/conda/lib/python3.11/site-packages (from ipython>=6.0->watermark) (0.6.2)
Requirement already satisfied: traitlets>=5 in /opt/conda/lib/python3.11/site-packages (from ipython>=6.0->watermark) (5.11.2)
Requirement already satisfied: pexpect>4.3 in /opt/conda/lib/python3.11/site-packages (from ipython>=6.0->watermark) (4.8.0)
Requirement already satisfied: parso<0.9.0,>=0.8.3 in /opt/conda/lib/python3.11/site-packages (from jedi>=0.16->ipython>=6.0->watermark) (0.8.3)
Requirement already satisfied: ptyprocess>=0.5 in /opt/conda/lib/python3.11/site-packages (from pexpect>4.3->ipython>=6.0->watermark) (0.7.0)
Requirement already satisfied: wcwidth in /opt/conda/lib/python3.11/site-packages (from prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30->ipython>=6.0->watermark) (0.2.8)
Requirement already satisfied: executing>=1.2.0 in /opt/conda/lib/python3.11/site-packages (from stack-data->ipython>=6.0->watermark) (1.2.0)
Requirement already satisfied: asttokens>=2.1.0 in /opt/conda/lib/python3.11/site-packages (from stack-data->ipython>=6.0->watermark) (2.4.0)
Requirement already satisfied: pure-eval in /opt/conda/lib/python3.11/site-packages (from stack-data->ipython>=6.0->watermark) (0.2.2)
Requirement already satisfied: six>=1.12.0 in /opt/conda/lib/python3.11/site-packages (from asttokens>=2.1.0->stack-data->ipython>=6.0->watermark) (1.16.0)
Downloading watermark-2.4.3-py2.py3-none-any.whl (7.6 kB)
Installing collected packages: watermark
Successfully installed watermark-2.4.3
%load_ext watermark
%watermark
%watermark --iversions
Last updated: 2024-06-13T06:56:58.149735+00:00
Python implementation: CPython
Python version : 3.11.6
IPython version : 8.16.1
Compiler : GCC 12.3.0
OS : Linux
Release : 5.4.0-182-generic
Machine : x86_64
Processor : x86_64
CPU cores : 32
Architecture: 64bit
nibabel : 5.2.1
matplotlib: 3.8.4
sys : 3.11.6 | packaged by conda-forge | (main, Oct 3 2023, 10:40:35) [GCC 12.3.0]
nilearn : 0.10.4
nipype : 1.8.6
numpy : 1.26.4
References / Resources#
Gorgolewski K, Burns CD, Madison C, Clark D, Halchenko YO, Waskom ML, Ghosh SS. (2011). Nipype: a flexible, lightweight and extensible neuroimaging data processing framework in Python. Front. Neuroinform. 5:13.
Jarecka D, Goncalves M, Markiewicz C, Esteban O, Lo N, Kaczmarzyk J, Ghosh S. (2020). Pydra - a flexible and lightweight dataflow engine for scientific analyses. Proceedings of the Python in Science Conference (SciPy 2020), 132-139. doi:10.25080/Majora-342d178e-012
Renton, A.I., Dao, T.T., Johnstone, T. et al. Neurodesk: an accessible, flexible and portable data analysis environment for reproducible neuroimaging. Nat Methods (2024). https://doi.org/10.1038/s41592-023-02145-x