GKE Nodepool Add Labels Without overwriting existing labels

GKE has a feature to add node labels to all nodes in a nodepool. GKE applies the label both to nodes already running in the cluster and to newly added nodes.

You can use the feature like this:

gcloud container node-pools update my-node-pool \
  --cluster my-cluster --labels sam …
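Since --labels on node-pools update replaces the entire label map, one way to avoid wiping existing labels is to read the current ones first and merge before updating. A minimal sketch, assuming describe prints the current label map semicolon-separated with the value() format (label names below are illustrative):

```shell
# Assumption: gcloud prints the label map as e.g. env=prod;team=ml via:
#   existing=$(gcloud container node-pools describe my-node-pool \
#     --cluster my-cluster --format="value(config.labels)")
existing="env=prod;team=ml"   # illustrative stand-in for the describe output
new_label="tier=batch"
# Merge and convert to the comma-separated form that --labels expects
merged=$(printf '%s;%s' "$existing" "$new_label" | tr ';' ',')
echo "$merged"                # env=prod,team=ml,tier=batch
# Then update the pool with the merged set:
#   gcloud container node-pools update my-node-pool \
#     --cluster my-cluster --labels "$merged"
```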

Continue reading »

GKE list tainted nodepools with a specific taint

A use case during upgrades involved listing all the node pools that have scaled back down to 0 and carry a specific taint. This blog post shows the commands you can use to get this information.

List the GKE nodepools that have been tainted with key=upgrade …
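As a sketch of the kind of filtering involved (the pool names are illustrative, and the JSON shape is an assumption about what gcloud returns with --format=json):

```shell
# Sample of the JSON shape returned by:
#   gcloud container node-pools list --cluster my-cluster --format=json
cat > /tmp/pools.json <<'EOF'
[
  {"name": "default-pool", "config": {}},
  {"name": "upgrade-pool",
   "config": {"taints": [{"key": "upgrade", "value": "true", "effect": "NO_SCHEDULE"}]}}
]
EOF
# Print only the node pools carrying a taint with key "upgrade"
jq -r '.[] | select(.config.taints[]?.key == "upgrade") | .name' /tmp/pools.json
# prints: upgrade-pool
```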

Continue reading »

3 tips for GKE ML/batch workloads

There has been an influx of large batch and ML training workloads on GKE. I've personally had the pleasure of working with one of those workloads. The things that batch and ML workloads often require from GKE are the following:

  • Minimize pod disruptions since pods often can't simply be restarted …
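One common way to reduce autoscaler-driven disruptions, for instance, is the standard cluster-autoscaler safe-to-evict annotation (the pod name below is illustrative):

```shell
# Tell the cluster autoscaler not to evict this pod when it considers
# scaling down the node the pod is running on
kubectl annotate pod my-training-pod \
  cluster-autoscaler.kubernetes.io/safe-to-evict="false"
```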

Continue reading »

GKE Safely Drain a Nodepool without pod disruptions

GKE/K8s wasn't originally designed for workloads that spin up single pods and want those pods to stay up and running on the same node for a very long time. That doesn't mean those kinds of workloads aren't running on GKE. In fact, there are large GKE ML/batch platform workloads running …
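The usual building blocks for this are cordon and drain; a rough sketch, with the pool name illustrative and the exact drain flags depending on your workloads:

```shell
POOL=my-node-pool
# All nodes in the pool, selected via the GKE-managed node label
NODES=$(kubectl get nodes -l "cloud.google.com/gke-nodepool=${POOL}" -o name)
# Cordon everything first so no new pods land on the pool
for node in $NODES; do
  kubectl cordon "$node"
done
# Then drain nodes one at a time; DaemonSet pods are left in place
for node in $NODES; do
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
done
```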

Continue reading »

Deploying a Weaviate cluster on GKE

Weaviate has great docs on how to deploy on K8s using Helm; however, this guide is specifically focused on an end-to-end deployment of Weaviate on GKE with replication turned on. The following topics will be covered:

  • Creating and configuring your GKE cluster
  • Deploying Weaviate with Helm
  • Tweaking the Weaviate helm …
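The Helm part of the deployment looks roughly like this (release name, namespace, and the replica count are illustrative, not taken from the guide):

```shell
helm repo add weaviate https://weaviate.github.io/weaviate-helm
helm repo update
# Run more than one replica so replication can be enabled
helm upgrade --install weaviate weaviate/weaviate \
  --namespace weaviate --create-namespace \
  --set replicas=3
```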

Continue reading »

GKE GPU timesharing and resource quotas experiment

Only have a few GPUs but want your end users to think you have many? Then GKE GPU timesharing might just be the feature for you to save costs on underutilized GPUs. In this blog post you will learn:

  1. Creating a GKE nodepool with timesharing enabled …
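Creating such a nodepool is roughly a single gcloud call (pool name, machine type, GPU type, and the client count below are illustrative):

```shell
gcloud container node-pools create gpu-timeshare-pool \
  --cluster my-cluster \
  --machine-type n1-standard-4 \
  --accelerator "type=nvidia-tesla-t4,count=1,gpu-sharing-strategy=time-sharing,max-shared-clients-per-gpu=4"
```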

Continue reading »

GKE move system services (kube-dns, calico) to dedicated nodepool

GKE by default deploys kube-dns and other system services to any of your nodepools. This is probably fine for most cases, but certain use cases might require preventing system services from running on the same nodes where your applications are running. This blog post provides instructions on how …
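One way to get this separation, as a sketch: leave a dedicated pool untainted for system services and taint the application pools, since system pods won't tolerate a custom taint (pool names and taint key are illustrative):

```shell
# Untainted pool: system services like kube-dns can land here
gcloud container node-pools create system-pool \
  --cluster my-cluster --num-nodes 2
# Tainted pool: system pods won't tolerate the taint, so only your own
# workloads (with a matching toleration) schedule here
gcloud container node-pools create app-pool \
  --cluster my-cluster \
  --node-taints dedicated=app:NoSchedule
```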

Continue reading »

GKE docker registry with HTTP proxy

You are at one of those places that requires you to use a proxy to access your company-wide Docker registry. Sometimes HTTP proxies are used to supposedly improve security or to work around IP-based rate limits. Well, good luck, you're in for a ride on how to do this …
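For Docker-based nodes, the standard mechanism is a systemd drop-in that sets the proxy environment for the Docker daemon. Proxy host and port below are illustrative, and on GKE you would typically have to apply this via a privileged DaemonSet, since the nodes are managed:

```shell
mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/http-proxy.conf <<'EOF'
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:3128"
Environment="HTTPS_PROXY=http://proxy.example.com:3128"
Environment="NO_PROXY=localhost,127.0.0.1,metadata.google.internal"
EOF
systemctl daemon-reload
systemctl restart docker
```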

Continue reading »

GKE custom OSS K8s cluster autoscaler

Update 2023-03-27: Added instructions for clusters using Workload Identity

This blog post describes how to deploy your own K8s cluster autoscaler instead of the cluster autoscaler that's bundled with GKE. This can be helpful in the rare case that the bundled GKE cluster autoscaler doesn't work for you.
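The upstream autoscaler ships a Helm chart, so the deployment is roughly of this shape (the MIG name and size limits are illustrative, and the exact values depend on your cluster):

```shell
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm upgrade --install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set cloudProvider=gce \
  --set "autoscalingGroups[0].name=my-mig" \
  --set "autoscalingGroups[0].minSize=0" \
  --set "autoscalingGroups[0].maxSize=10"
```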

Note that …

Continue reading »

Custom DNS entry with KubeDNS stubdomain

An example use case that I've seen is where you have a K8s service exposed on the ClusterIP and you want to make that service accessible over a domain name that you don't control.

You can follow these steps to set this up:

  1. Deploy CoreDNS with custom DNS …
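The stubdomain part of the setup is a kube-dns ConfigMap entry pointing the domain at your CoreDNS service's ClusterIP (the domain and IP below are illustrative):

```shell
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  stubDomains: |
    {"example.internal": ["10.0.0.10"]}
EOF
```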

Continue reading »