Google Cloud Vertex AI Endpoint


Vertex AI, a part of Google Cloud's suite, is designed to simplify the deployment, scaling, and management of machine learning (ML) projects. In this guide, we'll deploy the Vody Color Classification service to Vertex AI.


Prerequisites

  • Make sure you have the gcloud command-line tool installed and initialized.
  • Ensure you have the google-cloud-sdk and google-cloud-sdk-ai-platform components installed.
  • Enable the Vertex AI API for your GCP project.
  • Ensure the Docker image for your model is hosted on Google Artifact Registry or another container registry that Vertex AI can access.
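
The prerequisite checks above can be scripted so a deployment script fails fast. A minimal sketch (the function name and the default tool list are illustrative) that reports which required CLI tools are missing from PATH:

```python
import shutil

def missing_tools(tools=("gcloud", "docker")):
    """Return the subset of required CLI tools not found on PATH."""
    return [t for t in tools if shutil.which(t) is None]

# An empty list means every prerequisite binary is available.
print(missing_tools())
```

Note this only confirms the binaries exist; `gcloud init` and API enablement still have to be done once by hand.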

Deployment Steps

  1. Push the Docker Image to Google Artifact Registry
    • If your image isn't in Artifact Registry yet:
# Create Repository
gcloud artifacts repositories create vody-models --repository-format=docker --location=us-central1
gcloud auth configure-docker us-central1-docker.pkg.dev

# Tag the Docker image
docker tag [LOCAL_IMAGE] us-central1-docker.pkg.dev/[YOUR_PROJECT_ID]/vody-models/color-classification:latest

# Push the image to Artifact Registry
docker push us-central1-docker.pkg.dev/[YOUR_PROJECT_ID]/vody-models/color-classification:latest

Replace [YOUR_PROJECT_ID] with your GCP project ID and [LOCAL_IMAGE] with the local tag of the Vody Color Classification image.
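
The same Artifact Registry path appears in several commands below, so it can help to assemble it in one place. A small sketch (the function name is illustrative; the region, repository, and image defaults mirror the commands above):

```python
def artifact_image_uri(project_id, region="us-central1",
                       repo="vody-models", image="color-classification",
                       tag="latest"):
    """Assemble the full Artifact Registry image URI used by docker tag/push."""
    return f"{region}-docker.pkg.dev/{project_id}/{repo}/{image}:{tag}"

print(artifact_image_uri("my-project"))
# us-central1-docker.pkg.dev/my-project/vody-models/color-classification:latest
```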

  2. Configure Vertex AI (these steps can take ~15 minutes)
    • First, you'll need to create an endpoint resource:
gcloud ai endpoints create \
  --project=[YOUR_PROJECT_ID] \
  --region=us-central1 \
  --display-name=vody-color-classification

Replace [YOUR_PROJECT_ID] with your GCP project ID.
This will give you an Endpoint ID. Note it down.

Now, upload the model to Vertex:

gcloud ai models upload \
  --project=[YOUR_PROJECT_ID] \
  --region=us-central1 \
  --display-name=vody-color-classification \
  --container-image-uri=us-central1-docker.pkg.dev/[YOUR_PROJECT_ID]/vody-models/color-classification:latest \
  --container-ports=8080 \
  --container-predict-route="/invocations" \
  --container-health-route="/ping"

Replace [YOUR_PROJECT_ID] with your GCP project ID.
This will give you a Model ID. Note it down.
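
If the upload runs from a script or CI pipeline rather than by hand, the flags above can be assembled programmatically. A standard-library sketch (the helper name and defaults are assumptions; the flags mirror the gcloud command above):

```python
def build_upload_cmd(project_id, image_uri,
                     display_name="vody-color-classification",
                     region="us-central1"):
    """Return the argv list for `gcloud ai models upload`."""
    return [
        "gcloud", "ai", "models", "upload",
        f"--project={project_id}",
        f"--region={region}",
        f"--display-name={display_name}",
        f"--container-image-uri={image_uri}",
        "--container-ports=8080",
        "--container-predict-route=/invocations",
        "--container-health-route=/ping",
    ]

# Pass the list to subprocess.run(...) to execute the upload.
print(build_upload_cmd(
    "my-project",
    "us-central1-docker.pkg.dev/my-project/vody-models/color-classification:latest"))
```

Building an argv list (rather than a shell string) avoids quoting issues when project or image names are substituted in.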

Lastly, deploy the model to the endpoint:

gcloud ai endpoints deploy-model [ENDPOINT_ID] \
  --project=[YOUR_PROJECT_ID] \
  --region=us-central1 \
  --model=[MODEL_ID] \
  --display-name=vody-color-classification \
  --traffic-split=0=100 \
  --machine-type="g2-standard-4"

Replace [ENDPOINT_ID] and [MODEL_ID] with the IDs noted in the previous steps.

  3. Send Requests to the Vertex AI Endpoint
    • After deploying the model, use the endpoint ID to send prediction requests:
gcloud ai endpoints predict [ENDPOINT_ID] \
  --region=us-central1 \
  --json-request=request.json

Replace [ENDPOINT_ID] with the endpoint ID from the model deployment. The request.json should contain the input data for the model in the appropriate format.
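
The exact request schema depends on how the Vody container parses its input. The sketch below assumes the standard Vertex AI `{"instances": [...]}` envelope with a hypothetical `image_url` field, and also shows the REST URL that `gcloud ai endpoints predict` calls behind the scenes:

```python
import json

def write_request(path, instances):
    """Write a Vertex AI prediction payload to a request.json file."""
    payload = {"instances": instances}
    with open(path, "w") as f:
        json.dump(payload, f)
    return payload

def predict_url(project_id, endpoint_id, region="us-central1"):
    """REST prediction URL for a deployed Vertex AI endpoint."""
    return (f"https://{region}-aiplatform.googleapis.com/v1/"
            f"projects/{project_id}/locations/{region}/"
            f"endpoints/{endpoint_id}:predict")

# "image_url" is an illustrative field name; match your container's input schema.
write_request("request.json", [{"image_url": "https://example.com/shirt.jpg"}])
print(predict_url("my-project", "1234567890"))
```

Sending directly to the REST URL (with an OAuth bearer token) is equivalent to the gcloud command and is often more convenient from application code.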

Advantages of Deploying on Vertex AI

  • Integrated Platform: Vertex AI provides a unified and integrated platform for all stages of ML workflows.
  • Scalability: Endpoints scale automatically with request volume, within the replica limits you configure.
  • Version Management: Easily manage and roll back to different versions of your model.
  • Monitoring and Logging: Integrated with Cloud Monitoring and Cloud Logging for real-time monitoring and logging of your deployed models.
  • Optimized Costs: With Vertex AI, you only pay for what you use. No need to reserve resources.
  • Support for Custom Containers: Deploy custom containers like the Vody Color Classification service seamlessly.


Deploying on Vertex AI provides a scalable and managed environment for ML services on Google Cloud. By leveraging its capabilities, developers and data scientists can focus on improving models and features without the overhead of infrastructure management.