Kubernetes Deployment
Introduction
This guide walks you through deploying the Vody Color Classification service on a Kubernetes cluster. Kubernetes orchestrates containers, scales them, and manages workloads in a distributed environment.
Prerequisites
Ensure you have kubectl installed and configured to interact with your Kubernetes cluster.
Ensure your Kubernetes cluster has GPU nodes if you want to leverage GPU acceleration.
Deployment Steps
- Create a Kubernetes Deployment
- First, we'll describe the deployment in a YAML file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vody-color-classification
spec:
  replicas: 2
  selector:
    matchLabels:
      app: vody-color-classification
  template:
    metadata:
      labels:
        app: vody-color-classification
    spec:
      containers:
      - name: color-classification
        image: registry.vody.ai/color-classification:latest
        ports:
        - containerPort: 8080
        resources:
          limits:
            nvidia.com/gpu: 1
This deployment:
- Creates 2 replicas of the service.
- Requires nodes with GPUs, limiting each pod to 1 GPU.
- Save the above content in a file named vody-color-classification-deployment.yaml.
Apply the deployment with:
kubectl apply -f vody-color-classification-deployment.yaml
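The manifest above does not define health checks, so Kubernetes will route traffic to a pod as soon as its container starts, even if the model is still loading. As a sketch, probes could be added under the container entry; note that the /healthz path below is an assumed endpoint, not a documented part of the service, so adjust it to whatever the image actually serves:

```
# Add under the container entry in the Deployment manifest.
# /healthz is an assumed health endpoint; change it to match the image.
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 15
```

With these in place, a pod that is still initializing receives no traffic, and a pod that stops responding is restarted automatically.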
- Expose the Service
- To make the service accessible, create a service that exposes the deployment:
kubectl expose deployment vody-color-classification --type=LoadBalancer --port=8080
This exposes the service on port 8080 and, if your cluster supports it, provisions a cloud load balancer.
To get the external IP (after a few minutes):
kubectl get svc vody-color-classification
The service will be accessible via http://<EXTERNAL-IP>:8080.
If your cluster doesn't support LoadBalancer, you can use --type=NodePort and then access the service via any node's IP in the cluster.
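kubectl expose is imperative; the equivalent Service can also be declared in YAML so it can be versioned alongside the Deployment manifest. A minimal sketch:

```
apiVersion: v1
kind: Service
metadata:
  name: vody-color-classification
spec:
  type: LoadBalancer   # or NodePort if no cloud load balancer is available
  selector:
    app: vody-color-classification
  ports:
  - port: 8080
    targetPort: 8080
```

Apply it with kubectl apply -f, exactly as with the Deployment.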
Advantages of Kubernetes Deployment
- High Availability: Kubernetes ensures that a specified number of replicas for your application are maintained. If a pod or even an entire node fails, new instances are automatically created.
- Scaling: With Kubernetes, you can easily scale up or scale down your service based on demand.
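Manual scaling is a single command (kubectl scale deployment vody-color-classification --replicas=4). For demand-based scaling, a HorizontalPodAutoscaler can be sketched as below; this assumes the metrics-server is installed in the cluster, and uses CPU utilization as a stand-in signal since GPU-based autoscaling requires custom metrics:

```
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: vody-color-classification
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: vody-color-classification
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu               # CPU is an illustrative assumption; GPU
      target:                 # metrics need a custom metrics adapter
        type: Utilization
        averageUtilization: 70
```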
- Rolling Updates & Rollbacks: Deploy updates without downtime and easily rollback if necessary.
- Service Discovery & Load Balancing: Kubernetes distributes network traffic across the pods to balance the load.
- Automatic Bin Packing: Kubernetes automatically places containers based on their resource requirements and constraints, without sacrificing availability.
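The scheduler's bin packing works from resource requests, and the Deployment manifest above sets only a GPU limit. Adding CPU and memory requests gives the scheduler the information it needs to place pods efficiently; the values below are illustrative assumptions, not tuned figures for this service:

```
# Replaces the resources block in the container spec.
# Values are illustrative; measure the service's actual usage first.
resources:
  requests:
    cpu: "500m"
    memory: "1Gi"
    nvidia.com/gpu: 1
  limits:
    memory: "2Gi"
    nvidia.com/gpu: 1
```

Note that for extended resources like nvidia.com/gpu, the request and limit must be equal.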
Conclusion
Kubernetes provides an enhanced level of orchestration, management, and scaling capabilities compared to standalone Docker deployments. With Kubernetes, developers can ensure high availability, resilience, and flexibility of their services in a production environment.