
Running Containerized Jobs

Containers isolate a job's environment, keep that environment consistent across nodes, simplify dependency management, and make results reproducible. That makes them a good fit for managing the dependencies of Slurm jobs. TensorWave's Slurm provides several methods for running jobs in containerized environments, including Pyxis, Docker, and Apptainer.

We recommend Pyxis because it is the most performant option. Docker is available due to popular demand, but we strongly discourage it because of performance issues and the way the Docker daemon interacts with Slurm. Apptainer offers an interface and user experience similar to Docker's while avoiding many of the problems the Docker daemon causes.
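
To make the difference between the recommended options concrete, here is a minimal sketch that only assembles the command lines, without running them. It assumes the cluster exposes the standard Pyxis `srun` flags (`--container-image`, `--container-mounts`) and the standard `apptainer exec` CLI with `--bind`. The helper names, the `ubuntu:22.04` image, and the `/data` mount are hypothetical placeholders, not TensorWave defaults.

```python
import shlex


def pyxis_command(image, command, mounts=None):
    """Build an srun argument list that launches `command` via Pyxis.

    Assumes the Pyxis plugin is enabled, so srun accepts
    --container-image and --container-mounts.
    """
    args = ["srun", f"--container-image={image}"]
    if mounts:
        # Pyxis takes a comma-separated list of SRC:DST mounts.
        args.append("--container-mounts=" + ",".join(mounts))
    return args + list(command)


def apptainer_command(image, command, binds=None):
    """Build an srun argument list that launches `command` via Apptainer."""
    args = ["srun", "apptainer", "exec"]
    for bind in binds or []:
        # Each bind is passed as its own --bind SRC:DST pair.
        args += ["--bind", bind]
    return args + [image] + list(command)


if __name__ == "__main__":
    # Hypothetical image and mount, used only for illustration.
    job = ["cat", "/etc/os-release"]
    print(shlex.join(pyxis_command("ubuntu:22.04", job, mounts=["/data:/data"])))
    print(shlex.join(apptainer_command("docker://ubuntu:22.04", job, binds=["/data:/data"])))
```

With Pyxis the container options are flags on `srun` itself, whereas Apptainer is an ordinary program that `srun` launches. Neither approach relies on a long-running daemon on the compute node.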
