Kubernetes Deployment

Introduction

The following guide will help you deploy the Vody Color Classification service on a Kubernetes cluster. Kubernetes orchestrates containers, scales them, and manages workloads in a distributed environment.

Prerequisites

Ensure you have kubectl installed and configured to interact with your Kubernetes cluster.

Ensure your Kubernetes cluster has GPU nodes if you want to leverage GPU computation.

Deployment Steps

  1. Create a Kubernetes Deployment
    • First, we'll describe the deployment in a YAML file:
apiVersion: apps/v1  
kind: Deployment  
metadata:  
  name: vody-color-classification  
spec:  
  replicas: 2  
  selector:  
    matchLabels:  
      app: vody-color-classification  
  template:  
    metadata:  
      labels:  
        app: vody-color-classification  
    spec:  
      containers:  
      - name: color-classification  
        image: registry.vody.ai/color-classification:latest  
        ports:  
        - containerPort: 8080  
        resources:  
          limits:  
            nvidia.com/gpu: 1

This deployment:

  • Creates 2 replicas of the service.
  • Requests 1 GPU per pod via the nvidia.com/gpu resource limit, which requires GPU nodes with the NVIDIA device plugin installed.

Save the above content in a file named vody-color-classification-deployment.yaml.

Apply the deployment with:

kubectl apply -f vody-color-classification-deployment.yaml
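
Before exposing the service, you may want to verify that the rollout succeeded. Two standard kubectl checks (the label selector matches the one in the manifest above):

kubectl rollout status deployment/vody-color-classification
kubectl get pods -l app=vody-color-classification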
  2. Expose the Service
    • To make the service accessible, create a service that exposes the deployment:
kubectl expose deployment vody-color-classification --type=LoadBalancer --port=8080

This will expose the service on port 8080, and if your cluster supports it, will also provision a cloud LoadBalancer.
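
As an alternative to kubectl expose, you can manage the Service declaratively. A minimal equivalent manifest might look like this (save it as, e.g., vody-color-classification-service.yaml and apply it the same way as the deployment):

apiVersion: v1
kind: Service
metadata:
  name: vody-color-classification
spec:
  type: LoadBalancer
  selector:
    app: vody-color-classification
  ports:
  - port: 8080
    targetPort: 8080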

To get the external IP (after a few minutes):

kubectl get svc vody-color-classification

The service will be accessible via http://<EXTERNAL-IP>:8080, where <EXTERNAL-IP> is the address shown in the EXTERNAL-IP column.
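
Once the external IP is assigned, you can sanity-check connectivity with curl. The exact path and response depend on the service's API, so adjust accordingly:

curl http://<EXTERNAL-IP>:8080/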

If your cluster doesn't support LoadBalancer, you can use --type=NodePort and then access the service via any node's IP in the cluster.
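
A NodePort sketch (Kubernetes assigns a port in the 30000-32767 range by default; the second command shows which one was assigned):

kubectl expose deployment vody-color-classification --type=NodePort --port=8080
kubectl get svc vody-color-classification -o jsonpath='{.spec.ports[0].nodePort}'

The service is then reachable at http://<any-node-IP>:<nodePort>.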

Advantages of Kubernetes Deployment

  • High Availability: Kubernetes ensures that a specified number of replicas for your application are maintained. If a pod or even an entire node fails, new instances are automatically created.
  • Scaling: With Kubernetes, you can easily scale up or scale down your service based on demand.
  • Rolling Updates & Rollbacks: Deploy updates without downtime and easily rollback if necessary.
  • Service Discovery & Load Balancing: Kubernetes gives services stable DNS names and distributes network traffic across the pods.
  • Automatic Bin Packing: Kubernetes automatically places containers based on their resource requirements and constraints, without sacrificing availability.
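
The scaling and rolling-update advantages above map to one-line kubectl commands. For example (the v2 image tag is illustrative):

kubectl scale deployment vody-color-classification --replicas=4
kubectl set image deployment/vody-color-classification color-classification=registry.vody.ai/color-classification:v2
kubectl rollout undo deployment/vody-color-classification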

Conclusion

Kubernetes provides an enhanced level of orchestration, management, and scaling capabilities compared to standalone Docker deployments. With Kubernetes, developers can ensure high availability, resilience, and flexibility of their services in a production environment.