Neurodesk CVMFS

How to interact with our CVMFS service.

1 - Setup CVMFS Proxy

Setup CVMFS Proxy server

If you need faster access in a particular region, one option is to set up another Stratum 1 server or a proxy. We currently don’t run any proxy servers, but a proxy would be important when using CVMFS on a cluster.

Setup a CVMFS proxy server

sudo yum install -y squid

Open squid.conf and use the following configuration:

sudo vi /etc/squid/squid.conf
# List of local IP addresses (separate IPs and/or CIDR notation) allowed to access your local proxy
#acl local_nodes src YOUR_CLIENT_IPS

# Destination domains that are allowed
acl stratum_ones dstdomain .neurodesk.org
#acl stratum_ones dstdom_regex YOUR_REGEX

# Squid port
http_port 3128

# Deny access to anything which is not part of our stratum_ones ACL.
http_access deny !stratum_ones

# Only allow access from our local machines
#http_access allow local_nodes
http_access allow localhost

# Finally, deny all other access to this proxy
http_access deny all

minimum_expiry_time 0
maximum_object_size 1024 MB

cache_mem 128 MB
maximum_object_size_in_memory 128 KB
# 5 GB disk cache
cache_dir ufs /var/spool/squid 5000 16 256

Check the configuration, then start and enable squid:

sudo squid -k parse
sudo systemctl start squid
sudo systemctl enable squid
sudo systemctl status squid
sudo systemctl restart squid
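
Once squid is running, one way to confirm that it forwards CVMFS traffic is to fetch a repository manifest through it. The sketch below is not part of the original setup: `http_status_via_proxy` is a hypothetical helper, and `STRATUM1_HOST` is a placeholder for a Stratum 1 hostname that matches the `stratum_ones` ACL.

```shell
# Hypothetical helper: print the HTTP status code returned for URL when
# the request is sent through PROXY.
http_status_via_proxy() {
    hsvp_proxy="$1"
    hsvp_url="$2"
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code
    curl -s -o /dev/null -w '%{http_code}' --proxy "$hsvp_proxy" "$hsvp_url"
}

# Example (run on the proxy host; STRATUM1_HOST is a placeholder):
#   http_status_via_proxy http://localhost:3128 \
#       http://STRATUM1_HOST/cvmfs/neurodesk.ardc.edu.au/.cvmfspublished
```

A 200 suggests the request was allowed and forwarded. A 403 usually means one of the `http_access` rules denied it, for example because the destination does not match `.neurodesk.org` or the client is not in an allowed ACL.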

Then add the proxy to the cvmfs config:

CVMFS_HTTP_PROXY="http://proxy-address:3128"
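
On a client, this setting usually lives in a file such as `/etc/cvmfs/default.local`. The sketch below assumes that path and uses a hypothetical helper, `set_cvmfs_proxy`, to replace any existing `CVMFS_HTTP_PROXY` line instead of appending a second one.

```shell
# Hypothetical helper: set CVMFS_HTTP_PROXY in CONF, replacing an earlier value.
set_cvmfs_proxy() {
    scp_conf="$1"
    scp_proxy="$2"
    touch "$scp_conf"
    # Keep every line except a previous CVMFS_HTTP_PROXY setting.
    # grep exits non-zero when nothing is left, which is fine here.
    grep -v '^CVMFS_HTTP_PROXY=' "$scp_conf" > "$scp_conf.tmp" || true
    printf 'CVMFS_HTTP_PROXY="%s"\n' "$scp_proxy" >> "$scp_conf.tmp"
    mv "$scp_conf.tmp" "$scp_conf"
}

# Example (as root; the path is an assumption):
#   set_cvmfs_proxy /etc/cvmfs/default.local "http://proxy-address:3128"
```

Running `sudo cvmfs_config reload` afterwards should make an already mounted client pick up the change.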

2 - CVMFS architecture

CVMFS architecture

We store our Singularity containers unpacked on CVMFS. We initially tried the DUCC tool, but it caused too many pulls from Docker Hub and we were rate limited. The script that unpacks our Singularity containers is here: https://github.com/NeuroDesk/neurocommand/blob/main/cvmfs/sync_containers_to_cvmfs.sh

It is called by a cron job on the CVMFS Stratum 0 server and relies on the log.txt file being updated by an action in the neurocommand repository (https://github.com/NeuroDesk/neurocommand/blob/main/.github/workflows/upload_containers_simg.sh).

The Stratum 1 servers then pull this repository from the Stratum 0 server, and our desktops mount these repositories (configured here: https://github.com/NeuroDesk/neurodesktop/blob/main/Dockerfile).

The startup script (https://github.com/NeuroDesk/neurodesktop/blob/main/config/jupyter/before_notebook.sh) sets up CVMFS and tests which server is fastest during the container startup.

This can also be done manually:

sudo cvmfs_talk -i neurodesk.ardc.edu.au host info
sudo cvmfs_talk -i neurodesk.ardc.edu.au host probe
cvmfs_config stat -v neurodesk.ardc.edu.au
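
To compare servers without a mounted client, the same idea can be approximated with curl. This is a rough sketch, not the logic used by the startup script or by `cvmfs_talk host probe`; `time_server` and `pick_fastest` are hypothetical helpers and the server URLs are placeholders.

```shell
# Hypothetical helper: time one request for the repository manifest on a server.
# Prints "<seconds> <base-url>", or nothing if the request fails.
time_server() {
    ts_base="$1"
    ts_repo="$2"
    ts_secs=$(curl -sf -o /dev/null --max-time 5 -w '%{time_total}' \
        "$ts_base/cvmfs/$ts_repo/.cvmfspublished") || return 0
    printf '%s %s\n' "$ts_secs" "$ts_base"
}

# Read "<seconds> <base-url>" lines on stdin and print the fastest base URL.
pick_fastest() {
    LC_ALL=C sort -n | head -n 1 | cut -d ' ' -f 2
}

# Example (SERVER_A and SERVER_B are placeholders):
#   for s in http://SERVER_A http://SERVER_B; do
#       time_server "$s" neurodesk.ardc.edu.au
#   done | pick_fastest
```

A single request is a noisy measurement, so repeating the probe a few times gives a more reliable ordering.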

3 - Setup Stratum 0 server

Host a Stratum 0 server

Setup a Stratum 0 server:

Setup Storage

(would object storage be better? -> see comment below under next iteration ideas)

lsblk -l
sudo mkfs.ext4 /dev/vdb
sudo mkdir /storage
sudo mount /dev/vdb /storage/ -t auto
sudo chown ec2-user /storage/
sudo chmod a+rwx /storage/
sudo vi /etc/fstab
/dev/vdb  /storage    auto    defaults,nofail   0  2
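
Editing `/etc/fstab` by hand works, but when the setup is scripted it helps to add the entry only once. A minimal sketch, assuming a hypothetical helper named `add_fstab_entry`:

```shell
# Hypothetical helper: append ENTRY to FSTAB unless an uncommented line with
# the same mount point (second field) is already present.
add_fstab_entry() {
    afe_fstab="$1"
    afe_entry="$2"
    # The second whitespace-separated field of the entry is the mount point.
    afe_mnt=$(printf '%s\n' "$afe_entry" | awk '{print $2}')
    if awk -v m="$afe_mnt" '$1 !~ /^#/ && $2 == m {found=1} END {exit !found}' "$afe_fstab"; then
        return 0    # already configured, leave the file untouched
    fi
    printf '%s\n' "$afe_entry" >> "$afe_fstab"
}

# Example (as root):
#   add_fstab_entry /etc/fstab "/dev/vdb  /storage    auto    defaults,nofail   0  2"
```

Running `sudo mount -a` afterwards checks that the new entry mounts without waiting for a reboot.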

Setup server

sudo yum install -y vim htop gcc git screen
sudo timedatectl set-timezone Australia/Brisbane

sudo yum install -y https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm
sudo yum install -y cvmfs cvmfs-server

sudo systemctl enable httpd
sudo systemctl restart httpd

# sudo systemctl stop firewalld

# restore keys:
sudo mkdir /etc/cvmfs/keys/incoming
sudo chmod a+rwx /etc/cvmfs/keys/incoming
cd connections/cvmfs_keys/
scp neuro* ec2-user@203.101.226.164:/etc/cvmfs/keys/incoming
sudo mv /etc/cvmfs/keys/incoming/* /etc/cvmfs/keys/

#backup keys: 
#mkdir cvmfs_keys
#scp opc@158.101.127.61:/etc/cvmfs/keys/neuro* .

sudo cvmfs_server mkfs -o $USER neurodesk.ardc.edu.au

cd /storage
sudo mkdir -p cvmfs-storage/srv/
cd /srv/
sudo mv cvmfs/ /storage/cvmfs-storage/srv/
sudo ln -s /storage/cvmfs-storage/srv/cvmfs/

cd /var/spool
sudo mkdir /storage/spool
sudo mv cvmfs/ /storage/spool/
sudo ln -s  /storage/spool/cvmfs .

cvmfs_server transaction neurodesk.ardc.edu.au

cvmfs_server publish neurodesk.ardc.edu.au
sudo vi /etc/cron.d/cvmfs_resign
0 11 * * 1 root /usr/bin/cvmfs_server resign neurodesk.ardc.edu.au
cat /etc/cvmfs/keys/neurodesk.ardc.edu.au.pub
-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAuV9JBs9uXBR83qUs7AiE
nSQfvh6VCdNigVzOfRMol5cXsYq3cFy/Vn1Nt+7SGpDTQArQieZo4eWC9ww2oLq0
vY1pWyAms3Y4i+IUmMbwNifDU4GQ1KN9u4zl9Peun2YQCLE7mjC0ZLQtLM7Q0Z8h
NwP8jRJTN+u8mRKzkyxfSMLscVMKhm2pAwnT1zB9i3bzVV+FSnidXq8rnnzNHMgv
tfqx1h0gVyTeodToeFeGG5vq69wGZlwEwBJWVRGzzr+a8dWNBFMJ1HxamrBEBW4P
AxOKGHmQHTGbo+tdV/K6ZxZ2Ry+PVedNmbON/EPaGlI8Vd0fascACfByqqeUEhAB
dQIDAQAB
-----END PUBLIC KEY-----
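
Every change to the repository follows the transaction and publish cycle used above. The sketch below wraps that cycle so a failed change is aborted rather than left open; `publish_change` is a hypothetical name and not part of the Neurodesk scripts.

```shell
# Hypothetical wrapper: open a transaction on REPO, run a command that changes
# /cvmfs/REPO, then publish on success or abort on failure.
publish_change() {
    pc_repo="$1"
    shift
    cvmfs_server transaction "$pc_repo" || return 1
    if "$@"; then
        cvmfs_server publish "$pc_repo"
    else
        # -f skips the interactive confirmation
        cvmfs_server abort -f "$pc_repo"
        return 1
    fi
}

# Example:
#   publish_change neurodesk.ardc.edu.au touch /cvmfs/neurodesk.ardc.edu.au/hello
```

Aborting discards everything written since the transaction was opened, so the published revision stays unchanged when the command fails.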

Next iteration of this:

use object storage?

  • current implementation uses block storage, but this makes increasing the volume size a bit more work
  • we couldn’t get object storage to work on Oracle as it assumes AWS S3 -> Try again on AWS

Optimize settings for repositories for Container Images

From the CVMFS documentation: repositories containing Linux container image contents (that is, container root file systems) should use overlayfs as a union file system and have the following configuration:

CVMFS_INCLUDE_XATTRS=true
CVMFS_VIRTUAL_DIR=true

With these settings, extended attributes of files, such as file capabilities and SELinux attributes, are recorded, and previous file system revisions can be accessed from the clients.
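
Server-side repository settings are normally kept in `/etc/cvmfs/repositories.d/<repository>/server.conf`. Assuming that location, the following sketch reports which of the recommended settings are still missing; `missing_container_settings` is a hypothetical helper, and `CVMFS_UNION_FS_TYPE=overlayfs` is how the overlayfs requirement is assumed to be expressed.

```shell
# Hypothetical check: print every recommended container-image setting that is
# not present as an exact, uncommented KEY=VALUE line in the given server.conf.
missing_container_settings() {
    mcs_conf="$1"
    for mcs_want in \
        CVMFS_UNION_FS_TYPE=overlayfs \
        CVMFS_INCLUDE_XATTRS=true \
        CVMFS_VIRTUAL_DIR=true
    do
        # -x: match the whole line, -F: treat the setting as a fixed string
        grep -qxF "$mcs_want" "$mcs_conf" || printf '%s\n' "$mcs_want"
    done
}

# Example (the path is an assumption):
#   missing_container_settings /etc/cvmfs/repositories.d/neurodesk.ardc.edu.au/server.conf
```

Empty output means all three settings are already in place.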

Currently not used

We tested the DUCC tool in the beginning, but it led to too many Docker pulls, so we replaced it with our own script: https://github.com/NeuroDesk/neurocommand/blob/main/cvmfs/sync_containers_to_cvmfs.sh

This is the old DUCC setup:

sudo yum install cvmfs-ducc.x86_64
sudo -i
dnf install -y yum-utils 
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
dnf install docker-ce docker-ce-cli containerd.io
systemctl enable docker
systemctl start docker
docker version
docker info

# leave root mode
exit

sudo groupadd docker
sudo usermod -aG docker $USER
sudo chown root:docker /var/run/docker.sock
newgrp docker


vi convert_appsjson_to_wishlist.sh
export DUCC_DOCKER_REGISTRY_PASS=configure_secret_password_here_and_dont_push_to_github
cd neurodesk
git pull
./gen_cvmfs_wishlist.sh
cvmfs_ducc convert recipe_neurodesk_auto.yaml
cd ..


chmod +x convert_appsjson_to_wishlist.sh

git clone https://github.com/NeuroDesk/neurodesk/

# setup cron job
sudo vi /etc/cron.d/cvmfs_dockerpull
*/5 * * * * opc cd ~ && bash /home/opc/convert_appsjson_to_wishlist.sh



#vi recipe.yaml

#version: 1
#user: vnmd
#cvmfs_repo: neurodesk.ardc.edu.au
#output_format: '$(scheme)://$(registry)/vnmd/thin_$(image)'
#input:
#- 'https://registry.hub.docker.com/vnmd/tgvqsm_1.0.0:20210119'
#- 'https://registry.hub.docker.com/vnmd/itksnap_3.8.0:20201208'


#cvmfs_ducc convert recipe_neurodesk.yaml
#cvmfs_ducc convert recipe_unpacked.yaml

4 - Setup Stratum 1 server

Host a Stratum 1 server

The Stratum 1 servers for the desktop are configured here: https://github.com/NeuroDesk/neurodesktop/blob/main/Dockerfile

If you need faster access in a particular region, one option is to set up another Stratum 1 server or a proxy.

Setup a Stratum 1 server:

sudo yum install -y https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm
sudo yum install -y cvmfs-server squid
sudo yum install -y python3-mod_wsgi 

sudo sed -i 's/Listen 80/Listen 127.0.0.1:8080/' /etc/httpd/conf/httpd.conf

set +H
echo "http_port 80 accel" | sudo tee /etc/squid/squid.conf
echo "http_port 8000 accel" | sudo tee -a /etc/squid/squid.conf
echo "http_access allow all" | sudo tee -a /etc/squid/squid.conf
echo "cache_peer 127.0.0.1 parent 8080 0 no-query originserver" | sudo tee -a /etc/squid/squid.conf
echo "acl CVMFSAPI urlpath_regex ^/cvmfs/[^/]*/api/" | sudo tee -a /etc/squid/squid.conf
echo "cache deny !CVMFSAPI" | sudo tee -a /etc/squid/squid.conf
echo "cache_mem 128 MB" | sudo tee -a /etc/squid/squid.conf

sudo systemctl start httpd
sudo systemctl start squid
sudo systemctl enable httpd
sudo systemctl enable squid

echo 'CVMFS_GEO_LICENSE_KEY=kGepdzqbAP4fjf5X' | sudo tee -a /etc/cvmfs/server.local
sudo chmod 600 /etc/cvmfs/server.local

sudo mkdir -p /etc/cvmfs/keys/ardc.edu.au/

echo "-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAwUPEmxDp217SAtZxaBep
Bi2TQcLoh5AJ//HSIz68ypjOGFjwExGlHb95Frhu1SpcH5OASbV+jJ60oEBLi3sD
qA6rGYt9kVi90lWvEjQnhBkPb0uWcp1gNqQAUocybCzHvoiG3fUzAe259CrK09qR
pX8sZhgK3eHlfx4ycyMiIQeg66AHlgVCJ2fKa6fl1vnh6adJEPULmn6vZnevvUke
I6U1VcYTKm5dPMrOlY/fGimKlyWvivzVv1laa5TAR2Dt4CfdQncOz+rkXmWjLjkD
87WMiTgtKybsmMLb2yCGSgLSArlSWhbMA0MaZSzAwE9PJKCCMvTANo5644zc8jBe
NQIDAQAB
-----END PUBLIC KEY-----" | sudo tee /etc/cvmfs/keys/ardc.edu.au/neurodesk.ardc.edu.au.pub


sudo cvmfs_server add-replica -o $USER http://stratum0.neurodesk.cloud.edu.au/cvmfs/neurodesk.ardc.edu.au /etc/cvmfs/keys/ardc.edu.au

# CVMFS will store everything in /srv/cvmfs so make sure there is enough space or create a symlink to a bigger storage volume
# e.g.:
# cd /storage
# sudo mkdir -p cvmfs-storage/srv/
# cd /srv/
# sudo mv cvmfs/ /storage/cvmfs-storage/srv/
# sudo ln -s /storage/cvmfs-storage/srv/cvmfs/


sudo cvmfs_server snapshot neurodesk.ardc.edu.au


echo "/var/log/cvmfs/*.log {
    weekly
    missingok
    notifempty
}" | sudo tee /etc/logrotate.d/cvmfs


echo '*/5 * * * * root output=$(/usr/bin/cvmfs_server snapshot -a -i 2>&1) || echo "$output" ' | sudo tee /etc/cron.d/cvmfs_stratum1_snapshot
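
The cron entry above captures the snapshot output and prints it only when the command fails, so cron, which typically mails any job output, stays silent on successful runs. The same pattern as a reusable function, with `quiet_unless_fail` as a hypothetical name:

```shell
# Run a command, capture stdout and stderr, and print them only on failure.
# The exit status of the command is passed through.
quiet_unless_fail() {
    quf_output=$("$@" 2>&1) || {
        quf_status=$?
        printf '%s\n' "$quf_output"
        return "$quf_status"
    }
}

# Example:
#   quiet_unless_fail /usr/bin/cvmfs_server snapshot -a -i
```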

sudo yum install -y iptables
sudo iptables -t nat -A PREROUTING -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 8000

sudo systemctl disable firewalld 
sudo systemctl stop firewalld 
# make sure that port 80 is also open in any external firewall in front of this server (e.g. the cloud provider's network rules)

sudo cvmfs_server update-geodb
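
To check that the replica has caught up, the revision in its manifest can be compared with the one on the Stratum 0 server. This sketch assumes that the revision is stored in `.cvmfspublished` on the line starting with `S`; `manifest_revision` is a hypothetical helper and `STRATUM1_HOST` is a placeholder for the new server.

```shell
# Hypothetical helper: read a .cvmfspublished manifest on stdin and print the
# revision number (assumed to be the line that starts with "S").
manifest_revision() {
    # The manifest ends with a binary signature, so force the C locale.
    LC_ALL=C sed -n 's/^S\([0-9][0-9]*\)$/\1/p' | head -n 1
}

# Example (STRATUM1_HOST is a placeholder):
#   curl -s http://stratum0.neurodesk.cloud.edu.au/cvmfs/neurodesk.ardc.edu.au/.cvmfspublished | manifest_revision
#   curl -s http://STRATUM1_HOST/cvmfs/neurodesk.ardc.edu.au/.cvmfspublished | manifest_revision
```

If the two numbers match, the last snapshot has pulled the latest published revision. A lower number on the Stratum 1 usually just means the next cron-driven snapshot has not run yet.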