# Reusing rkt pods in a BioInformatics pipeline

A BioInformatics pipeline may need to invoke low-level programs such as samtools hundreds or thousands of times. If these processes are run as BioContainers, using rkt for example, a great many rkt pods get created, adding up to significant overhead in both space and time. Can we do better?

In fact we can. For the purpose of running a containerized application in this context, all that matters is which container image it is, and which user it runs as. Since rkt already supports the idea of entering a running pod using rkt enter, we can use this to share pod instances between the multiple invocations of programs such as samtools.

The idea is simple. When the pod is initially created, we choose not to run the actual application, in this example the samtools program. Instead, we run a utility, rkt-run-slave, which, given the --wait option, simply blocks forever reading from its own pipe. The pod therefore sits idle and ready for use until it is stopped explicitly by means of rkt stop.
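
The "block forever on its own pipe" trick can be illustrated with a minimal Python sketch. This is not the rkt-run-slave source; the function name `block_on_own_pipe` is made up for illustration, and the daemon thread exists only so that the demonstration can finish.

```python
import os
import threading
import time


def block_on_own_pipe():
    """Block until the process is killed, by reading a pipe nobody writes to."""
    read_fd, write_fd = os.pipe()
    # The write end stays open in this same process, so the read never sees
    # end-of-file; nothing is ever written, so it never returns data either.
    os.read(read_fd, 1)


# Demonstration only: run the blocking call in a daemon thread so that this
# script can observe it is still blocked and then exit normally.
waiter = threading.Thread(target=block_on_own_pipe, daemon=True)
waiter.start()
time.sleep(0.2)
print("still blocked:", waiter.is_alive())
```

A process blocked like this consumes no CPU, which is what makes it a cheap placeholder for keeping a pod alive.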

With the idle pod sitting there ready for use, we arrange for the actual application to be run within a rkt enter instance. Any subsequent invocations of the same container image by the same user result in further instantiations of rkt enter.
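
The reuse decision can be sketched as a lookup keyed on image and user. How rktrunner actually tracks its worker pods is not described here, so the `workers` dictionary, the function `plan_invocation`, the image name and the pod UUID below are all hypothetical; only the general shape `rkt enter UUID CMD ARGS...` is taken from rkt itself.

```python
def plan_invocation(workers, image, uid, argv):
    """Return ("enter", command) to reuse a worker pod, or ("create", key) if none exists."""
    key = (image, uid)  # a pod is shared only for the same image and the same user
    pod_uuid = workers.get(key)
    if pod_uuid is None:
        return ("create", key)
    return ("enter", ["rkt", "enter", pod_uuid] + list(argv))


# Made-up registry: one worker pod for user 1000 running a samtools image.
workers = {("samtools.aci", 1000): "pod-uuid-1"}

same_user = plan_invocation(workers, "samtools.aci", 1000, ["samtools", "view", "in.bam"])
other_user = plan_invocation(workers, "samtools.aci", 1001, ["samtools", "view", "in.bam"])
print(same_user)
print(other_user)
```

The second call comes back as a "create" request because a different user must not share the first user's pod.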

This feature is a recent addition to rktrunner, and is activated in its configuration file by the following global option:

    worker-pods = true

It is important to know which pods are currently in use by rkt enter invocations. For this, rktrunner borrows an idea from the underlying rkt program, and uses a directory within /var/lib/rktrunner for each pod. Each rkt enter invocation holds a shared lock on this directory for as long as it runs.
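
A minimal sketch of holding a shared lock on a directory while a command runs, assuming a Linux system and `flock`-style advisory locks. The function `run_with_shared_lock` is hypothetical, a temporary directory stands in for the per-pod directory under /var/lib/rktrunner, and a trivial Python child process stands in for the rkt enter invocation.

```python
import fcntl
import os
import subprocess
import sys
import tempfile


def run_with_shared_lock(lock_dir, argv):
    """Run argv while holding a shared flock on lock_dir; return its exit status."""
    fd = os.open(lock_dir, os.O_RDONLY)
    try:
        # Any number of invocations can hold a shared lock at the same time.
        fcntl.flock(fd, fcntl.LOCK_SH)
        return subprocess.call(argv)
    finally:
        os.close(fd)  # closing the descriptor releases the lock


# Stand-in for the per-pod directory.
lock_dir = tempfile.mkdtemp(prefix="worker-pod-")
# Stand-in for the real command, which would be a rkt enter invocation.
status = run_with_shared_lock(lock_dir, [sys.executable, "-c", "print('inside pod')"])
print("exit status:", status)
```

Because the lock belongs to the open descriptor, it is also released if the holding process dies unexpectedly, so a crashed invocation cannot keep a pod marked as busy.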

Reaping idle worker pods is done by a separate garbage collection cycle. The rktrunner-gc program, best run at regular intervals from a cron job, attempts to acquire an exclusive lock on each pod's directory, which cannot succeed while any rkt enter instance holds a shared lock on it. If it succeeds, it knows the pod is no longer in use, and can safely stop the pod and remove the lock directory.
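
The garbage collection check can be sketched as a non-blocking attempt at an exclusive lock. This is an illustration under the same `flock` assumption, not the rktrunner-gc source: `try_reap` is a hypothetical name, and `stop_pod` is a caller-supplied callback standing in for whatever runs rkt stop.

```python
import fcntl
import os
import tempfile


def try_reap(lock_dir, stop_pod):
    """Return True if lock_dir was idle and has been reaped, else False."""
    fd = os.open(lock_dir, os.O_RDONLY)
    try:
        try:
            # LOCK_NB makes the attempt fail at once instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False  # some invocation still holds a shared lock
        stop_pod()
        os.rmdir(lock_dir)
        return True
    finally:
        os.close(fd)


pod_dir = tempfile.mkdtemp(prefix="worker-pod-")
stopped = []

# Simulate a running invocation holding its shared lock.
user_fd = os.open(pod_dir, os.O_RDONLY)
fcntl.flock(user_fd, fcntl.LOCK_SH)
busy_result = try_reap(pod_dir, lambda: stopped.append(pod_dir))

# The invocation finishes, releasing the lock; the next cycle can reap.
os.close(user_fd)
idle_result = try_reap(pod_dir, lambda: stopped.append(pod_dir))
print(busy_result, idle_result)
```

The non-blocking flag matters: a collector that waited for the lock would stall behind long-running jobs instead of moving on to the next pod.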

In practice, the worker pods feature of rktrunner reduced the overhead of running many utility programs like samtools to approximately that of starting a process, as would be done in a non-containerized environment.

To create a worker pod ahead of its first real use, it is sufficient to run the application in a null mode, e.g. with --help; creating the worker is a side effect of doing so.
For applications which don’t have such a mode, rktrunner has had a --prepare option since version 0.19.0, which simply creates the worker pod and exits, without calling the application at all.