Storage
Shared Home Directory
/home is a shared, persistent filesystem backed by a high-performance distributed storage system and mounted on every login pod and worker pod in the cluster. Files written to /home from a login pod are immediately visible on any worker pod running your jobs, and vice versa.
This makes /home the right place for:
Source code, scripts, and configuration files
Job output you need to keep after the job finishes
Data that needs to be accessible from multiple nodes at once
Storage quotas
Your /home allocation is defined in your deployment. To check current usage:
df -h /home/$USER
Live storage data can also be viewed via the TensorWave dashboard. If you need more space or your quota adjusted, contact your TensorWave account manager. If /home runs out of space, writes will fail, and jobs and tools that depend on them will start to break. See Common Issues for advice on managing /home.
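When usage gets close to the allocation, it can help to see which top-level entries take the most space. The following is a minimal sketch built on the standard du and sort tools. The helper name largest_entries is hypothetical, sort -h assumes GNU coreutils, and walking a large tree on a network filesystem may take a while.

```shell
# Hypothetical helper: list the N largest top-level entries in a directory.
# Assumes GNU coreutils (sort -h understands sizes such as 4.0K or 1.2G).
largest_entries() {
    dir="${1:-$HOME}"    # directory to inspect, defaults to your home
    count="${2:-10}"     # number of entries to show
    # du -s prints one total per entry; sort -h orders them smallest to largest.
    # The * glob skips hidden entries (names starting with a dot).
    du -sh -- "$dir"/* 2>/dev/null | sort -h | tail -n "$count"
}
```

For example, largest_entries "/home/$USER" 5 prints the five largest entries, with the largest on the last line.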
Setting Up a Shared Directory
For shared data that multiple users on the same team need to access, the recommended approach is a shared directory under /home with appropriate group permissions.
1. Create a shared directory:
mkdir /home/shared/<project-name>
2. Set group ownership:
3. Set permissions so group members can read and write:
The leading 2 in the mode (the setgid bit) ensures new files and subdirectories inherit the directory's group, so members do not need to manually chown files they create there.
4. Verify:
To create a globally available shared directory, use the user group. Groups are managed through LDAP. If you need a new group created or users added to an existing one, contact your cluster administrator.
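Steps 1 to 4 can be sketched as one small shell function. This is an illustration under assumptions, not the documented procedure: the function name setup_shared_dir is hypothetical, the mode 2770 (owner and group get full access, others get none, setgid set) is one reasonable choice, and the group must already exist and include you as a member.

```shell
# Hypothetical helper: create a group-shared directory with the setgid bit set.
# Arguments: $1 = directory path, $2 = group name (both supplied by the caller).
setup_shared_dir() {
    dir="$1"
    group="$2"
    mkdir -p "$dir" || return 1          # 1. create the directory
    chgrp "$group" "$dir" || return 1    # 2. set group ownership
    # 3. rwx for owner and group plus setgid. Mode 2770 is an assumption;
    #    use 2775 if users outside the group should be able to read.
    chmod 2770 "$dir" || return 1
    ls -ld "$dir"                        # 4. verify: mode should read drwxrws---
}
```

It would be called as setup_shared_dir followed by the project directory and the team's group name. The s in the group position of the ls output confirms that the setgid bit is set.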
Worker Pod Storage
Worker pods have several storage locations available during a job. Understanding which to use prevents both data loss and performance issues.
Summary
| Path | Type | Shared across pods | Persistent | Notes |
| --- | --- | --- | --- | --- |
| /home/$USER | Distributed network FS | Yes | Yes | Durable and performant; use as primary storage |
| /tmp | Memory-backed | No | No | Fast local scratch; useful for caching large files |
| /run/tmp | Memory-backed | No | No | Fast local scratch; used by Enroot for the container runtime |
| /dev/shm | Memory-backed | No | No | Fast local scratch; commonly used by PyTorch for inter-process communication |
/home: primary storage
/tmp: pod-local scratch
Each worker pod has a memory-backed /tmp. It is fast relative to a network filesystem and suitable for intermediate files your job produces and consumes within the same pod. It is not shared between pods and is not guaranteed to be empty at job start (though it is cleaned between jobs by policy). Do not write job outputs here that you need after the job finishes. Keep in mind that this space is carved out of the node's RAM: writing large amounts of data here can cause pages to be swapped to disk, which reduces the performance of your job. The filesystem is cleared when the pod is replaced.
/run/tmp and /dev/shm: tmpfs for system resources
/run/tmp and /dev/shm are memory-backed filesystems (tmpfs) mounted on each worker. They are similar to /tmp but have less space and are reserved for application use. /run/tmp is used internally by Pyxis/Enroot for container image staging (/run/tmp/enroot-data, /run/tmp/enroot-runtime). /dev/shm is used by PyTorch for inter-process communication.
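Because these mounts are backed by RAM, it can be useful to check how large they are and how full they are from inside a job. The sketch below uses the standard df tool. The helper name scratch_usage is hypothetical, and the -hP flag combination assumes GNU df.

```shell
# Hypothetical helper: report size and usage of the pod-local scratch mounts.
scratch_usage() {
    for path in /tmp /run/tmp /dev/shm; do
        if [ -d "$path" ]; then
            # -P keeps each filesystem on one line; tail drops the header row.
            df -hP "$path" | tail -n 1
        else
            echo "$path: not present on this host"
        fi
    done
}
```

Running scratch_usage prints one line per path, with the size, used, and available columns as df reports them.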
General rule: Write outputs you need to keep to /home. Use /tmp for scratch that only lives for the duration of the job. Treat all pod-local paths as ephemeral.
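One way to follow this rule is to run the job inside a private scratch directory and copy the results to durable storage before the job exits. The following is a minimal sketch of that pattern. The helper name run_in_scratch is hypothetical, and it assumes the results are small enough to fit in the memory-backed /tmp.

```shell
# Hypothetical helper illustrating the scratch-then-persist pattern.
# $1 is a durable output directory (under /home on the cluster); the remaining
# arguments are the command to run inside a private scratch directory.
run_in_scratch() {
    out_dir="$1"; shift
    scratch="$(mktemp -d "${TMPDIR:-/tmp}/job.XXXXXX")" || return 1
    status=0
    # Run in a subshell so the caller's working directory is left unchanged.
    ( cd "$scratch" && "$@" ) || status=$?
    # Copy results even if the command failed, so partial output and logs survive.
    mkdir -p "$out_dir" && cp -R "$scratch"/. "$out_dir"/
    rm -rf "$scratch"    # release the memory backing the scratch files
    return "$status"
}
```

A call might look like run_in_scratch "/home/$USER/results/run-01" /home/$USER/my_job.sh, where both paths are placeholders. The command runs with the scratch directory as its working directory, so pass absolute paths for scripts and inputs.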

