# Kubernetes Quickstart

***

### Installing Tooling

We'll start by installing kubectl and k3d. kubectl is the command-line tool for managing Kubernetes clusters, and k3d is a lightweight wrapper that runs k3s in Docker. Download and install the latest release of kubectl using the following commands:

```bash
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
```

Next, install the latest release of k3d using this command:

```bash
wget -q -O - https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
```

{% hint style="info" %}
Learn more about installing kubectl [here](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/), and installing k3d [here](https://k3d.io/v5.7.3/#installation).
{% endhint %}
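Before moving on, it can help to confirm that both tools ended up on your `PATH`. The snippet below is a minimal sketch; `check_tool` is a hypothetical helper name, not part of kubectl or k3d.

```bash
# Report whether a given command-line tool can be found on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Check the two tools this guide relies on; keep going even if one is missing.
for tool in kubectl k3d; do
  check_tool "$tool" || true
done
```

If both are reported as found, `kubectl version --client` and `k3d version` print the installed versions.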

***

### Creating and Configuring a Cluster

Now, you must create a cluster:

```bash
k3d cluster create hello-world-cluster
```

Then go ahead and check your context:

```bash
kubectl config current-context
```

This should print `k3d-hello-world-cluster`, the context for the cluster you just created. If it prints something else, run the following command to switch to the right context:

```bash
kubectl config use-context k3d-hello-world-cluster
```

To run GPU-accelerated workloads, the cluster also needs the AMD GPU device plugin and node labeller. You can install both using the following commands:

```bash
kubectl create -f https://raw.githubusercontent.com/ROCm/k8s-device-plugin/master/k8s-ds-amdgpu-dp.yaml
kubectl create -f https://raw.githubusercontent.com/ROCm/k8s-device-plugin/master/k8s-ds-amdgpu-labeller.yaml
```
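Once the plugin pods are running, each GPU node should advertise an `amd.com/gpu` resource. The following is a minimal sketch for checking that; `show_gpu_capacity` is a hypothetical helper name, and the column headers are arbitrary.

```bash
# List each node with the number of AMD GPUs it reports as allocatable.
# A value of <none> suggests the device plugin has not registered any GPUs
# on that node (yet).
show_gpu_capacity() {
  kubectl get nodes \
    -o 'custom-columns=NAME:.metadata.name,GPUS:.status.allocatable.amd\.com/gpu'
}
```

Call `show_gpu_capacity` after installing the plugins. The backslash in `amd\.com/gpu` keeps kubectl from treating the dot in the resource name as a path separator.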

Next, write your deployment manifest. Create a directory for the manifest, change into it, and open a new `deployment.yaml` file:

```bash
mkdir k8s-hello-world
cd k8s-hello-world
nano deployment.yaml
```

Paste in the following YAML:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
        - name: hello-world
          image: tensorwavehq/hello_world:latest
          resources:
            limits:
              amd.com/gpu: 1
          volumeMounts:
            - name: dev-kfd
              mountPath: /dev/kfd
            - name: dev-dri
              mountPath: /dev/dri
          securityContext:
            runAsGroup: 110
      volumes:
        - name: dev-kfd
          hostPath:
            path: /dev/kfd
        - name: dev-dri
          hostPath:
            path: /dev/dri
```

You'll notice a few extra settings in the manifest. These are necessary for running the pod with GPU acceleration.

* ```yaml
  resources:
    limits:
      amd.com/gpu: 1
  ```
  * This specifies that the container requires 1 AMD GPU. You must explicitly request GPU resources so that Kubernetes can schedule the pod on a node with an available AMD GPU.
* ```yaml
  volumes:
    - name: dev-kfd
      hostPath:
        path: /dev/kfd
    - name: dev-dri
      hostPath:
        path: /dev/dri
  ```
  * These definitions make the host's `/dev/kfd` and `/dev/dri` device paths available to the pod as volumes. Both are needed to use the AMD GPUs.
* ```yaml
  volumeMounts:
    - name: dev-kfd
      mountPath: /dev/kfd
    - name: dev-dri
      mountPath: /dev/dri
  ```
  * These mounts correspond to the above volumes, allowing the container to access the GPU hardware.
* ```yaml
  securityContext:
    runAsGroup: 110
  ```
  * This security context runs the container with group ID 110, the render group, which PyTorch (used in the hello world container) needs in order to detect the devices properly.
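The render group's numeric ID can differ between hosts, so it is worth confirming that 110 matches yours. Below is a minimal sketch that looks the ID up in a group file; `group_gid` is a hypothetical helper, and it assumes the group is defined locally in `/etc/group` rather than in a directory service.

```bash
# Print the numeric GID of a group by reading a group database file.
# Usage: group_gid NAME [FILE]   (FILE defaults to /etc/group)
group_gid() {
  awk -F: -v name="$1" '$1 == name { print $3; exit }' "${2:-/etc/group}"
}
```

Running `group_gid render` on the GPU host prints the ID to use for `runAsGroup`. If it prints something other than 110, put that value in the manifest instead.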

Continue by applying the manifest using the following:

```bash
kubectl apply -f deployment.yaml
```

It may take a few minutes to create the container. You can monitor the status with:

```bash
kubectl get pods -l app=hello-world
```

Once the `STATUS` column shows `Completed`, you're ready to check the output. Run:

```bash
kubectl logs -l app=hello-world
```

This should give output like:

```
CUDA available: True
Number of GPUs: 1
GPU 0: AMD Instinct MI300X
```

You'll notice that only one GPU is listed. That's because, as covered earlier, the manifest sets a resource limit of one `amd.com/gpu`. You may raise or lower this number as necessary.
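One way to change the GPU count without editing the file by hand is sketched below. `set_gpu_limit` is a hypothetical helper that prints a modified copy of the manifest and leaves the original untouched; it assumes the limit appears as `amd.com/gpu: <number>` on a single line, as it does in the manifest above.

```bash
# Print a copy of a manifest with every `amd.com/gpu: N` value replaced.
# Usage: set_gpu_limit FILE COUNT
set_gpu_limit() {
  sed -E "s|amd\.com/gpu:[[:space:]]*[0-9]+|amd.com/gpu: $2|" "$1"
}
```

For example, `set_gpu_limit deployment.yaml 2 > deployment-2gpu.yaml` writes a two-GPU variant that you can then apply with `kubectl apply -f deployment-2gpu.yaml`. The pod will only be scheduled if a node has that many GPUs available.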

***

### Teardown

Navigate back to your home directory and remove your `k8s-hello-world` folder:

```bash
cd ~
rm -rf k8s-hello-world/
```
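Removing the folder deletes only the local manifest; the k3d cluster itself keeps running in Docker. A minimal sketch for removing it as well, assuming the cluster name `hello-world-cluster` used earlier (`delete_cluster` is a hypothetical wrapper around the `k3d cluster delete` command):

```bash
# Delete a k3d cluster by name; defaults to the cluster created in this guide.
delete_cluster() {
  k3d cluster delete "${1:-hello-world-cluster}"
}
```

Run `delete_cluster`, then `k3d cluster list` to confirm the cluster is gone.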
