Open Data

Tutorials about publishing and accessing open datasets

1 - datalad

Using datalad to publish and access open data on OSF

This tutorial was created by Steffen Bollmann.

GitHub: @stebo85

DataLad is an open-source tool to publish and access open datasets. In addition to many open data sources (OpenNeuro, CBRAIN, brainlife.io, CONP, DANDI, Courtois Neuromod, Dataverse, Neurobagel), it can also connect to the Open Science Framework (OSF): http://osf.io/
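Before running the commands below, it can help to confirm that the required command-line tools are on your PATH. The following is a minimal sketch, not part of DataLad itself: `missing_tools` is a hypothetical helper name, and the list of tools (datalad, git, git-annex) is an assumption based on the commands used in this tutorial.

```shell
# Hypothetical helper (not part of DataLad): print the names of tools
# from the argument list that cannot be found on PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Trim the leading space; the output is empty if nothing is missing.
    printf '%s\n' "${missing# }"
}

# Check the tools this tutorial relies on (assumed list).
result=$(missing_tools datalad git git-annex)
if [ -n "$result" ]; then
    echo "Missing tools: $result" >&2
fi
```

If the check reports a missing tool, install it before continuing; the remaining steps assume all three are available.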

Publish a dataset

First we have to create a DataLad dataset:

datalad create my_dataset

# now add files to your project, then save them with datalad
datalad save -m "added new files"

Now we can create a token on OSF (Account Settings -> Personal access tokens -> Create token) and authenticate:

datalad osf-credentials

Here is an example of how to publish a dataset on the OSF:


# create sibling
datalad create-sibling-osf --title best-study-ever -s osf
# enable the datalad-next extension globally
git config --global --add datalad.extensions.load next

# push
datalad push --to osf

The last step creates a DataLad dataset on the OSF, which is not easily human-readable.

If you would like to create a human-readable dataset instead (but without the option of downloading it as a DataLad dataset later on):


# create sibling
datalad create-sibling-osf --title best-study-ever-human-readable --mode exportonly -s osf-export

git-annex export HEAD --to osf-export-storage

Access a dataset

To download a dataset from the OSF (if it was uploaded as a DataLad dataset before):

datalad clone osf://ehnwz

cd ehnwz

# now get the files you want to download:
datalad get .
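The `osf://` address used for cloning is built from the OSF project ID. As a rough sketch, assuming the project ID is the last path component of the project page URL (for example `https://osf.io/ehnwz/`), a small helper can derive the clone address. `osf_clone_url` is a hypothetical name used for illustration and is not part of DataLad.

```shell
# Hypothetical helper (not part of DataLad): turn an OSF project page URL
# or a bare project ID into an osf:// address for `datalad clone`.
osf_clone_url() {
    id=${1%/}       # drop one trailing slash, if present
    id=${id##*/}    # keep only the last path component
    printf 'osf://%s\n' "$id"
}

osf_clone_url "https://osf.io/ehnwz/"   # prints osf://ehnwz
```

The result can then be passed to the clone command, for example `datalad clone "$(osf_clone_url https://osf.io/ehnwz/)"`.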

2 - osfclient

Using osfclient to publish and access open data on OSF

This tutorial was created by Steffen Bollmann.

GitHub: @stebo85

The osfclient is an open-source tool to publish and access open datasets on the Open Science Framework (OSF): http://osf.io/

Set up an OSF token

You can generate an OSF token under your user settings. Then, set the OSF token as an environment variable:

export OSF_TOKEN=YOURTOKEN
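Because the commands that follow rely on this variable, a script can fail early when it is missing. The following is a minimal sketch; `require_osf_token` is a hypothetical helper name and is not part of osfclient.

```shell
# Hypothetical guard (not part of osfclient): return non-zero if
# OSF_TOKEN is unset or empty, so a script can stop before uploading.
require_osf_token() {
    if [ -z "${OSF_TOKEN:-}" ]; then
        echo "OSF_TOKEN is not set; export it before running osf commands." >&2
        return 1
    fi
    return 0
}
```

In a script, a line such as `require_osf_token || exit 1` placed before the first `osf` command stops execution with a clear message instead of a failed upload.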

Publish a dataset

Here is an example of how to publish a dataset on the OSF:

osf init
# enter your OSF credentials and project ID

# now copy your data into the directory, cd into the directory and then run:
osf upload -r . osfstorage/data

Access a dataset

To download a dataset from the OSF:

osf -p PROJECTID_HERE_eg_y5cq9 clone .