Add dependencies
In the previous section, reproducibility of our Viash component was ensured by a predefined Docker image such as bash:4.0 and python:3.10. However, your script might require other software dependencies, such as command-line tools or Python and R packages.
By default, Viash will build component-specific Docker images. This means that every Viash component can have its own set of dependencies.
Extended example
Below is an example where additional software is added to a base Docker image using the setup section of a Docker engine.
name: example_bash_with_setup
description: A minimal example component.
arguments:
  - type: file
    name: --input
    example: file.txt
    required: true
  - type: file
    name: --output
    direction: output
    example: output.txt
    required: true
resources:
  - type: bash_script
    path: script.sh
engines:
  - type: docker
    image: bash:4.0
    setup:
      - type: apk
        packages:
          - curl
          - wget
  - type: native
runners:
  - type: executable
  - type: nextflow
name: example_csharp_with_setup
description: A minimal example component.
arguments:
  - type: file
    name: --input
    example: file.txt
    required: true
  - type: file
    name: --output
    direction: output
    example: output.txt
    required: true
resources:
  - type: csharp_script
    path: script.csx
engines:
  - type: docker
    image: ghcr.io/data-intuitive/dotnet-script:1.3.1
    setup:
      - type: apk
        packages:
          - curl
          - wget
  - type: native
runners:
  - type: executable
  - type: nextflow
name: example_js_with_setup
description: A minimal example component.
arguments:
  - type: file
    name: --input
    example: file.txt
    required: true
  - type: file
    name: --output
    direction: output
    example: output.txt
    required: true
resources:
  - type: javascript_script
    path: script.js
engines:
  - type: docker
    image: node:19-bullseye-slim
    setup:
      - type: apt
        packages:
          - curl
          - wget
  - type: native
runners:
  - type: executable
  - type: nextflow
name: example_python_with_setup
description: A minimal example component.
arguments:
  - type: file
    name: --input
    example: file.txt
    required: true
  - type: file
    name: --output
    direction: output
    example: output.txt
    required: true
resources:
  - type: python_script
    path: script.py
engines:
  - type: docker
    image: python:3.10-slim
    setup:
      - type: apt
        packages:
          - curl
          - wget
      - type: python
        packages: anndata
  - type: native
runners:
  - type: executable
  - type: nextflow
name: example_r_with_setup
description: A minimal example component.
arguments:
  - type: file
    name: --input
    example: file.txt
    required: true
  - type: file
    name: --output
    direction: output
    example: output.txt
    required: true
resources:
  - type: r_script
    path: script.R
engines:
  - type: docker
    image: eddelbuettel/r2u:22.04
    setup:
      - type: apt
        packages:
          - curl
          - wget
      - type: r
        packages: tidyverse
  - type: native
runners:
  - type: executable
  - type: nextflow
name: example_scala_with_setup
description: A minimal example component.
arguments:
  - type: file
    name: --input
    example: file.txt
    required: true
  - type: file
    name: --output
    direction: output
    example: output.txt
    required: true
resources:
  - type: scala_script
    path: script.scala
engines:
  - type: docker
    image: sbtscala/scala-sbt:eclipse-temurin-19_36_1.7.2_2.13.10
    setup:
      - type: apt
        packages:
          - curl
          - wget
  - type: native
runners:
  - type: executable
  - type: nextflow
You can (re)build a component’s Docker image by passing the ---setup flag to the executable:
Build the executable:
viash build config.vsh.yaml --engine docker --output target
Build the Docker image:
target/example_bash_with_setup ---setup cachedbuild
[notice] Building container 'example_bash_with_setup:latest' with Dockerfile
Build the executable:
viash build config.vsh.yaml --engine docker --output target
Build the Docker image:
target/example_csharp_with_setup ---setup cachedbuild
[notice] Building container 'example_csharp_with_setup:latest' with Dockerfile
Build the executable:
viash build config.vsh.yaml --engine docker --output target
Build the Docker image:
target/example_js_with_setup ---setup cachedbuild
[notice] Building container 'example_js_with_setup:latest' with Dockerfile
Build the executable:
viash build config.vsh.yaml --engine docker --output target
Build the Docker image:
target/example_python_with_setup ---setup cachedbuild
[notice] Building container 'example_python_with_setup:latest' with Dockerfile
Build the executable:
viash build config.vsh.yaml --engine docker --output target
Build the Docker image:
target/example_r_with_setup ---setup cachedbuild
[notice] Building container 'example_r_with_setup:latest' with Dockerfile
Build the executable:
viash build config.vsh.yaml --engine docker --output target
Build the Docker image:
target/example_scala_with_setup ---setup cachedbuild
[notice] Building container 'example_scala_with_setup:latest' with Dockerfile
Alternatively, you can build the executable and its corresponding Docker image in one go:
viash build config.vsh.yaml --engine docker --output target --setup cachedbuild
Steps for creating a custom Docker engine
Here is a series of steps you can follow to add a Docker engine to your Viash component from scratch.
Step 1: Choose a base image
First, choose a base Docker image to build on. When deciding which base image to use, consider the size of the image and how trustworthy its source is.
If the container does not have Bash installed, don’t forget to install this in Step 2.
Here is a list of base images we commonly use:
- Bash: bash, ubuntu
- C#: ghcr.io/data-intuitive/dotnet-script
- JavaScript: node
- Python: python, nvcr.io/nvidia/pytorch
- R: eddelbuettel/r2u, rocker/tidyverse
- Scala: sbtscala/scala-sbt
See the section on ‘minimum requirements’ when building a custom base image.
Step 2: Installing additional dependencies
You can use the setup section to add many different types of layers. Here are some examples:
Apk requirements:
setup:
  - type: apk
    packages: [ curl ]
Apt requirements:
setup:
  - type: apt
    packages: [ curl ]
Docker requirements:
setup:
  - type: docker
    build_args: "R_VERSION=hello_world"
    run: |
      echo 'Run a custom command'
      echo 'Foo' > /path/to/file.txt
Javascript requirements:
setup:
  - type: javascript
    packages: [ express ]
    github: [ "expressjs/express" ]
Python requirements:
setup:
  - type: python
    packages: [ anndata ]
    github: [ jkbr/httpie ]
R requirements:
setup:
  - type: r
    packages: [ anndata ]
    bioc: [ AnnotationDbi, SingleCellExperiment ]
    github: rcannood/SCORPIUS
Ruby requirements:
setup:
  - type: ruby
    packages: [ pry ]
Yum requirements:
setup:
  - type: yum
    packages: [ curl ]
For more information on the possible setup entries, check out the reference documentation.
Don’t forget to rebuild the Docker image after making changes to the setup section of your Docker engine (see next step).
Step 3: Rebuild Docker image
After adding additional setup entries, it’s important to rerun ---setup cachedbuild to rebuild the Docker image, as Viash will not rebuild the Docker image when it already exists.
viash build config.vsh.yaml \
  --engine docker \
  --output target \
  --setup cachedbuild
When building an executable with a Docker engine, you can choose the setup strategy by passing the --setup option followed by one of the strategies below.
Building an image:
- alwaysbuild / build / b: Always build the image from the Dockerfile. This is the default setup strategy.
- alwayscachedbuild / cachedbuild / cb: Always build the image from the Dockerfile, with caching enabled.
- ifneedbebuild: Build the image if it does not exist locally.
- ifneedbecachedbuild: Build the image with caching enabled if it does not exist locally.
Pulling an image:
- alwayspull / pull / p: Try to pull the container from Docker Hub or the specified Docker registry.
- alwayspullelsebuild / pullelsebuild: Try to pull the image from a registry and build it if it doesn’t exist.
- alwayspullelsecachedbuild / pullelsecachedbuild: Try to pull the image from a registry and build it with caching if it doesn’t exist.
- ifneedbepull: If the image does not exist locally, pull the image.
- ifneedbepullelsebuild: If the image does not exist locally, pull the image. If it cannot be pulled, build it.
- ifneedbepullelsecachedbuild: If the image does not exist locally, pull the image. If it cannot be pulled, build it with caching enabled.
Pushing an image:
- push: Push the container to Docker Hub or the specified Docker registry.
- pushifnotpresent: Push the container to Docker Hub or the specified Docker registry if the specified tag does not exist yet.
Doing nothing:
- donothing / meh: Do not build or pull anything.
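To make the combined strategies more concrete, here is a minimal sketch of what a strategy like ifneedbepullelsecachedbuild could look like with plain Docker commands. It assumes that "pull else build" means the build only happens when the pull fails. The function name ensure_image and its arguments are hypothetical, and this is not Viash's actual implementation.

```shell
# Hypothetical helper approximating 'ifneedbepullelsecachedbuild'.
# Illustration only; Viash's own logic may differ.
ensure_image() {
  local image="$1"          # e.g. example_bash_with_setup:latest
  local context="${2:-.}"   # directory containing a Dockerfile

  # "ifneedbe": do nothing when the image already exists locally.
  if docker image inspect "$image" >/dev/null 2>&1; then
    echo "exists"
    return 0
  fi

  # "pull": try the registry first.
  if docker pull "$image" >/dev/null 2>&1; then
    echo "pulled"
    return 0
  fi

  # "elsecachedbuild": fall back to a build that reuses cached layers.
  # Adding --no-cache here would mimic the non-cached 'build' variants.
  docker build -t "$image" "$context" >/dev/null && echo "built"
}
```

Calling ensure_image example_bash_with_setup:latest in a directory with a Dockerfile prints which of the three paths was taken.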
Troubleshooting
Below are several steps that might help you troubleshoot the image when the setup fails.
View Dockerfile
You can view the actual Dockerfile used by Viash by passing the ---dockerfile flag:
target/example_bash_with_setup ---dockerfile
FROM bash:4.0
ENTRYPOINT []
RUN apk add --no-cache curl wget
LABEL org.opencontainers.image.description="Companion container for running component example_bash_with_setup"
LABEL org.opencontainers.image.created="2024-09-06T22:13:54Z"
target/example_csharp_with_setup ---dockerfile
FROM ghcr.io/data-intuitive/dotnet-script:1.3.1
ENTRYPOINT []
RUN apk add --no-cache curl wget
LABEL org.opencontainers.image.description="Companion container for running component example_csharp_with_setup"
LABEL org.opencontainers.image.created="2024-09-06T22:14:06Z"
target/example_js_with_setup ---dockerfile
FROM node:19-bullseye-slim
ENTRYPOINT []
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y curl wget && \
rm -rf /var/lib/apt/lists/*
LABEL org.opencontainers.image.description="Companion container for running component example_js_with_setup"
LABEL org.opencontainers.image.created="2024-09-06T22:14:17Z"
target/example_python_with_setup ---dockerfile
FROM python:3.10-slim
ENTRYPOINT []
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y curl wget && \
rm -rf /var/lib/apt/lists/*
RUN pip install --upgrade pip && \
pip install --upgrade --no-cache-dir "anndata"
LABEL org.opencontainers.image.description="Companion container for running component example_python_with_setup"
LABEL org.opencontainers.image.created="2024-09-06T22:14:31Z"
target/example_r_with_setup ---dockerfile
FROM eddelbuettel/r2u:22.04
ENTRYPOINT []
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y curl wget && \
rm -rf /var/lib/apt/lists/*
RUN Rscript -e 'if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")' && \
Rscript -e 'remotes::install_cran(c("tidyverse"), repos = "https://cran.rstudio.com")'
LABEL org.opencontainers.image.description="Companion container for running component example_r_with_setup"
LABEL org.opencontainers.image.created="2024-09-06T22:14:57Z"
target/example_scala_with_setup ---dockerfile
FROM sbtscala/scala-sbt:eclipse-temurin-19_36_1.7.2_2.13.10
ENTRYPOINT []
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y curl wget && \
rm -rf /var/lib/apt/lists/*
LABEL org.opencontainers.image.description="Companion container for running component example_scala_with_setup"
LABEL org.opencontainers.image.created="2024-09-06T22:15:32Z"
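If a setup step fails, it can help to write this Dockerfile to disk and build it by hand so that Docker's full build output is visible. The sketch below assumes only the ---dockerfile flag shown above. The helper name save_dockerfile, the file name Dockerfile.debug and the tag debug_image are hypothetical.

```shell
# Hypothetical helper: save the Dockerfile of a Viash executable to a file.
save_dockerfile() {
  local exe="$1"                       # path to a Viash executable
  local out="${2:-Dockerfile.debug}"   # output file (name is an assumption)
  "$exe" ---dockerfile > "$out" || return 1
  echo "$out"                          # print where the Dockerfile was written
}

# Usage (requires Docker; 'debug_image' is an arbitrary tag):
#   save_dockerfile target/example_bash_with_setup
#   docker build -t debug_image -f Dockerfile.debug .
```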
Enter debugging session
You can also hop into a Bash session inside the Docker image using the ---debug flag:
target/example_bash_with_setup ---debug
[notice] + docker run --entrypoint=bash -i --rm -v `pwd`:/pwd --workdir /pwd -t 'example_bash_with_setup:latest'
root@93c38006a124:/pwd#
target/example_csharp_with_setup ---debug
[notice] + docker run --entrypoint=bash -i --rm -v `pwd`:/pwd --workdir /pwd -t 'example_csharp_with_setup:latest'
root@93c38006a124:/pwd#
target/example_js_with_setup ---debug
[notice] + docker run --entrypoint=bash -i --rm -v `pwd`:/pwd --workdir /pwd -t 'example_js_with_setup:latest'
root@93c38006a124:/pwd#
target/example_python_with_setup ---debug
[notice] + docker run --entrypoint=bash -i --rm -v `pwd`:/pwd --workdir /pwd -t 'example_python_with_setup:latest'
root@93c38006a124:/pwd#
target/example_r_with_setup ---debug
[notice] + docker run --entrypoint=bash -i --rm -v `pwd`:/pwd --workdir /pwd -t 'example_r_with_setup:latest'
root@93c38006a124:/pwd#
target/example_scala_with_setup ---debug
[notice] + docker run --entrypoint=bash -i --rm -v `pwd`:/pwd --workdir /pwd -t 'example_scala_with_setup:latest'
root@93c38006a124:/pwd#
This is useful for interactively debugging issues inside the container, for example to figure out whether you need to use apk, apt or yum to install software, or to search for the exact name of packages like libcurl4-openssl-dev.
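Inside such a debugging session, a small check like the following can tell you which package manager the base image provides. This is a minimal sketch; the function name detect_package_manager is hypothetical and the function only looks for the three package managers mentioned above.

```shell
# Hypothetical helper: print the first package manager found on PATH, or "none".
detect_package_manager() {
  local pm
  for pm in apk apt-get yum; do
    if command -v "$pm" >/dev/null 2>&1; then
      echo "$pm"
      return 0
    fi
  done
  echo "none"
  return 1
}

# Once the package manager is known, search for exact package names, e.g.:
#   apk search curl
#   apt-get update && apt-cache search libcurl
#   yum search libcurl
```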
Alternative solutions
There are multiple ways you might try to find a Docker image which contains the right set of dependencies for your component:
- Browse Docker Hub: Look for a Docker image on Docker Hub or another Docker registry which has the right set of dependencies.
  - This is generally not recommended because it might take a long time to find a pre-existing image with the right set of dependencies.
  - It also poses a serious security risk.
- Write a custom Dockerfile: You can write a custom Dockerfile to build your own Docker image and store it in a Docker registry, effectively creating a new ‘trusted’ base image.
  - Requires manual bookkeeping of which Docker images are used in which components.
  - Not difficult, but requires more know-how on how to build custom Docker images.
- Use Viash setup to build component-specific images: The methodology described above.
  - Easier to add or change dependencies of one component without breaking another.
  - Images can be stored in a centralized container registry.
Behind the scenes
Auto-mount
Any executable built by Viash with a Docker engine will automatically mount the directories of files passed to the executable as arguments. For example, when running:
./my_executable --input /foo/bar/file.txt --output /dest/path
The executable will automatically mount the /foo/bar and /dest folders to /viash_automount/foo/bar/ and /viash_automount/dest inside the Docker container.
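The mapping in this example can be sketched as follows: take the parent directory of each file argument and prefix it with /viash_automount. This is an illustration of the rule as described here, assuming absolute host paths. The function name automount_volume is hypothetical and this is not Viash's own code.

```shell
# Hypothetical helper: print the host:container volume mapping for a file argument.
# Assumes an absolute host path.
automount_volume() {
  local host_path="$1"
  local dir
  dir="$(dirname "$host_path")"          # directory that holds the file
  echo "${dir}:/viash_automount${dir}"   # value you would pass to 'docker run -v'
}

automount_volume /foo/bar/file.txt   # prints /foo/bar:/viash_automount/foo/bar
automount_volume /dest/path          # prints /dest:/viash_automount/dest
```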
Auto-chown
By default, files created and modified by a Docker container are owned by root. Viash automatically changes the owner of any files defined in the config file to the user running the executable. This behaviour can be disabled by setting the chown setting to false in your config file.
Example with standard Docker:
docker run -v `pwd`:/pwd bash:4.0 touch /pwd/file.txt
ls -l
-rw-r--r--. 1 root root 0 Jan 26 16:03 file.txt
Example with a Viash executable:
./my_executable --output file.txt
ls -l
-rw-r--r--. 1 myuser myuser Jan 26 16:03 file.txt
Minimum requirements for custom Docker images
Viash components only require a minimal set of dependencies which need to be available inside the Docker image:
- Bash: bash.
- C#: bash and dotnet-script.
- JavaScript: bash and node (Node.js).
- Python: bash, python and pip.
- R: bash and R.
- Scala: bash, openjdk-devel and sbt.
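When preparing a custom base image, one quick way to check these requirements is to look for the corresponding commands inside the container, for instance in a debugging session. The sketch below is a hypothetical helper, not part of Viash. The tool names you pass in are up to you: openjdk-devel, for example, is a package rather than a command, so you would check for java instead.

```shell
# Hypothetical helper: print every given tool that is not available on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Usage inside a container, e.g. for a Python component:
#   missing_tools bash python pip
# No output means all listed tools were found.
```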