Kubernetes Quickstart
Estimated time: 6 minutes, 8 minutes with buffer.
Installing Tooling
We'll start by installing kubectl and k3d. kubectl is the command-line tool for managing Kubernetes clusters, and k3d is a lightweight wrapper that runs k3s in Docker. Download and install the latest release of kubectl using the following commands:
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
Next, install the latest release of k3d using this command:
wget -q -O - https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
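Before moving on, it can help to confirm that both tools ended up on your PATH. The helper below is a minimal sketch rather than part of either installer; the function name check_tools is hypothetical.

```shell
# Report whether each named tool is on PATH.
# Returns non-zero if any tool is missing.
# Usage: check_tools kubectl k3d docker
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Including docker in the list is worthwhile because k3d runs its cluster nodes as Docker containers.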
Creating and Configuring a Cluster
Now, you must create a cluster:
k3d cluster create hello-world-cluster
Then go ahead and check your context:
kubectl config current-context
This should print k3d-hello-world-cluster, the context for the cluster you just created. If it prints something else, run the following command to switch to the right context:
kubectl config use-context k3d-hello-world-cluster
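The check-then-switch step can also be scripted. This sketch relies on k3d naming its kubeconfig context k3d-<cluster name>, as the commands above show; the function name ensure_context is hypothetical.

```shell
# Switch kubectl to the k3d context for a cluster, but only if it is
# not already the current context.
# Usage: ensure_context hello-world-cluster
ensure_context() {
  want="k3d-$1"
  have="$(kubectl config current-context 2>/dev/null)"
  if [ "$have" != "$want" ]; then
    kubectl config use-context "$want"
  fi
}
```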
To run GPU-accelerated workloads, the cluster also needs the AMD GPU device plugin and node labeller. You can install these using the following commands:
kubectl create -f https://raw.githubusercontent.com/ROCm/k8s-device-plugin/master/k8s-ds-amdgpu-dp.yaml
kubectl create -f https://raw.githubusercontent.com/ROCm/k8s-device-plugin/master/k8s-ds-amdgpu-labeller.yaml
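Once the device plugin is running, nodes with a GPU should advertise an amd.com/gpu resource. The sketch below lists those nodes. It assumes kubectl's custom-columns output prints <none> for nodes without the resource, and the function name gpu_nodes is hypothetical.

```shell
# List nodes that advertise at least one allocatable amd.com/gpu.
# The dot in "amd.com" is escaped so it is not read as a path separator.
gpu_nodes() {
  kubectl get nodes --no-headers \
    -o 'custom-columns=NAME:.metadata.name,GPUS:.status.allocatable.amd\.com/gpu' |
    awk '$2 != "<none>" && $2 > 0 { print $1 ": " $2 " GPU(s)" }'
}
```

If gpu_nodes prints nothing, the plugin pods may still be starting, or the node may not expose the GPU devices to the cluster.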
Next, create your deployment manifest. Make a directory for it, change into that directory, and open a new deployment.yaml file:
mkdir k8s-hello-world
cd k8s-hello-world
nano deployment.yaml
From there, paste in the following YAML:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: tensorwavehq/hello_world:latest
        resources:
          limits:
            amd.com/gpu: 1
        volumeMounts:
        - name: dev-kfd
          mountPath: /dev/kfd
        - name: dev-dri
          mountPath: /dev/dri
      securityContext:
        runAsGroup: 110
      volumes:
      - name: dev-kfd
        hostPath:
          path: /dev/kfd
      - name: dev-dri
        hostPath:
          path: /dev/dri
You'll notice a few extra settings in the manifest. These are necessary for running the pod with GPU acceleration.
resources:
  limits:
    amd.com/gpu: 1
This specifies that the container requires 1 AMD GPU. You must explicitly request GPU resources so that Kubernetes can schedule the pod on a node with an available AMD GPU.
volumes:
- name: dev-kfd
  hostPath:
    path: /dev/kfd
- name: dev-dri
  hostPath:
    path: /dev/dri
These definitions expose two host device paths to the pod as hostPath volumes: /dev/kfd, the compute interface of the AMD GPU kernel driver, and /dev/dri, which holds the GPU render nodes.
volumeMounts:
- name: dev-kfd
  mountPath: /dev/kfd
- name: dev-dri
  mountPath: /dev/dri
These mounts correspond to the above volumes, allowing the container to access the GPU hardware.
securityContext:
  runAsGroup: 110
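This security context runs the containers in the pod with group ID 110, the render group, which is necessary for PyTorch to detect the devices properly (PyTorch is used in the hello world container). Group IDs are assigned per host, so if the render group on your machine has a different ID, use that value instead.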
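To check which group ID the render group has on a given host, you can look it up in the group database. This is a minimal sketch that reads /etc/group directly; the function name group_gid is hypothetical, and hosts that manage groups through a directory service may need getent instead.

```shell
# Print the numeric GID of a named group.
# Reads /etc/group by default; a different file can be passed as $2.
# Usage: group_gid render
group_gid() {
  awk -F: -v name="$1" '$1 == name { print $3 }' "${2:-/etc/group}"
}
```

If the printed value is not 110, set runAsGroup in the manifest to the value printed.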
Continue by applying the manifest using the following:
kubectl apply -f deployment.yaml
Creating the container may take a few minutes. You can monitor the status with:
kubectl get pods -l app=hello-world
Once the STATUS column in this output shows Completed, you're ready to check the output. Running:
kubectl logs -l app=hello-world
should produce output like the following:
CUDA available: True
Number of GPUs: 1
GPU 0: AMD Instinct MI300X
You'll notice that you only have one GPU. That's because, as covered earlier, we specified a resource limit of one. You may raise or lower this number as necessary.
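If you would rather not re-run the status command by hand, the wait can be scripted. The sketch below polls the same kubectl get pods command used earlier and reads the STATUS column of the first pod listed. The function name wait_for_pod_status and the POLL_SECONDS variable are hypothetical.

```shell
# Poll until the first app=hello-world pod reports the wanted STATUS.
# $1: status to wait for (e.g. Completed); $2: max attempts (default 60).
# POLL_SECONDS sets the delay between attempts (default 5).
wait_for_pod_status() {
  want="$1"
  tries="${2:-60}"
  while [ "$tries" -gt 0 ]; do
    # Column 3 of `kubectl get pods` output is STATUS.
    status="$(kubectl get pods -l app=hello-world --no-headers 2>/dev/null | awk 'NR == 1 { print $3 }')"
    if [ "$status" = "$want" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "${POLL_SECONDS:-5}"
  done
  echo "timed out waiting for status: $want" >&2
  return 1
}
```

For example, wait_for_pod_status Completed && kubectl logs -l app=hello-world prints the logs as soon as the pod finishes.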
Teardown
Navigate back to your base directory and remove your k8s-hello-world folder:
cd ~
rm -rf k8s-hello-world/
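Removing the folder leaves the deployment and the k3d cluster running. If you also want to free those resources, a sketch like the following deletes both. The function name teardown is hypothetical, and the default cluster name matches the one created earlier.

```shell
# Delete the hello-world deployment, then the k3d cluster itself.
# Usage: teardown [cluster-name]
teardown() {
  cluster="${1:-hello-world-cluster}"
  # --ignore-not-found keeps this safe to re-run after a partial teardown.
  kubectl delete deployment hello-world --ignore-not-found
  # Deleting the cluster removes its node containers from Docker.
  k3d cluster delete "$cluster"
}
```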