You may want to set up multiple Kubernetes clusters, both to have clusters in different regions to be nearer to your users, and to tolerate failures and/or invasive maintenance. This document describes some of the issues to consider when making a decision about doing so.
Note that at present, Kubernetes does not offer a mechanism to aggregate multiple clusters into a single virtual cluster. However, we plan to do this in the future.
On IaaS providers such as Google Compute Engine or Amazon Web Services, a VM exists in a zone or availability zone. We suggest that all the VMs in a Kubernetes cluster should be in the same availability zone, because:
It is okay to have multiple clusters per availability zone, though on balance we think fewer is better. Reasons to prefer fewer clusters are:
Reasons to have multiple clusters include:
The selection of the number of Kubernetes clusters may be a relatively static choice, only revisited occasionally. By contrast, the number of nodes in a cluster and the number of pods in a service may be change frequently according to load and growth.
To pick the number of clusters, first, decide which regions you need to be in to have adequate latency to all your end users, for services that will run
on Kubernetes (if you use a Content Distribution Network, the latency requirements for the CDN-hosted content need not
be considered). Legal issues might influence this as well. For example, a company with a global customer base might decide to have clusters in US, EU, AP, and SA regions.
Call the number of regions to be in
Second, decide how many clusters should be able to be unavailable at the same time, while still being available. Call
the number that can be unavailable
U. If you are not sure, then 1 is a fine choice.
If it is allowable for load-balancing to direct traffic to any region in the event of a cluster failure, then
you need at least the larger of
U + 1 clusters. If it is not (e.g you want to ensure low latency for all
users in the event of a cluster failure), then you need to have
R * (U + 1) clusters
U + 1 in each of
R regions). In any case, try to put each cluster in a different zone.
Finally, if any of your clusters would need more than the maximum recommended number of nodes for a Kubernetes cluster, then you may need even more clusters. Kubernetes v1.0 currently supports clusters up to 100 nodes in size, but we are targeting 1000-node clusters by early 2016.
When you have multiple clusters, you would typically create services with the same config in each cluster and put each of those service instances behind a load balancer (AWS Elastic Load Balancer, GCE Forwarding Rule or HTTP Load Balancer) spanning all of them, so that failures of a single cluster are not visible to end users.