Deep Dive into Cilium Multi-cluster

April 08, 2019

Let’s review some of the use cases for connecting multiple Kubernetes clusters before we dive into the implementation details. High availability is the most obvious use case for most. It involves operating Kubernetes clusters in multiple regions or availability zones and running replicas of the same services in each cluster.

Upon failure, requests can fail over to other clusters. The failure scenario covered in this use case is not primarily the complete unavailability of an entire region or failure domain. A more likely scenario is a temporary unavailability of resources, or a misconfiguration, that leaves one cluster unable to run or scale particular services.
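To make the failover idea concrete, here is a minimal sketch of a backend selection policy that prefers healthy replicas in the local cluster and falls back to healthy replicas in other clusters. It is an illustration only, not Cilium's implementation: the `Backend` type, the `pick_backends` function, the prefer-local policy, and the cluster names and addresses are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """A hypothetical service replica running in some cluster."""
    cluster: str
    address: str
    healthy: bool


def pick_backends(backends, local_cluster):
    """Return healthy local backends, or healthy remote ones if no local backend is healthy."""
    healthy = [b for b in backends if b.healthy]
    local = [b for b in healthy if b.cluster == local_cluster]
    # Fail over to the other clusters only when the local cluster has nothing usable.
    return local if local else healthy


# Hypothetical state: every replica in cluster-1 is down, cluster-2 still serves.
backends = [
    Backend("cluster-1", "10.0.1.5", healthy=False),
    Backend("cluster-1", "10.0.1.6", healthy=False),
    Backend("cluster-2", "10.0.2.7", healthy=True),
]
selected = pick_backends(backends, "cluster-1")
```

In this state, `selected` contains only the `cluster-2` replica. Once a `cluster-1` replica becomes healthy again, the same call returns local backends only.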

The initial trend of Kubernetes-based platforms was to build large, multi-tenant Kubernetes clusters. It is becoming increasingly common to build individual clusters per tenant, or to build clusters for different categories of services, e.g. different levels of security sensitivity. However, some services such as secrets management, logging, monitoring, or DNS are often still shared between all clusters.

This avoids the operational overhead of maintaining these services in each tenant cluster. The primary motivation of this model is isolation between the tenant clusters. To maintain that isolation, tenant clusters are connected to the shared services clusters but not to other tenant clusters.

The operational complexity of running stateful and stateless services is very different. Stateless services are simple to scale, migrate, and upgrade. Running a cluster entirely with stateless services keeps the cluster nimble and agile, and makes migration from one cloud provider to another easy.

Source: cilium.io