Deploy Kubernetes and Persistent Volumes on a vSAN Stretched Cluster

You can deploy a generic Kubernetes cluster and persistent volumes on vSAN stretched clusters. You can deploy multiple Kubernetes clusters with different storage requirements in the same vSAN stretched cluster.
When you plan to configure a Kubernetes cluster on a vSAN stretched cluster, consider the following items:
  • A generic Kubernetes cluster does not enforce the same storage policy on the node VMs and on the persistent volumes. The vSphere administrator is responsible for correctly configuring, assigning, and using storage policies within the Kubernetes clusters.
  • Use a VM storage policy with the same replication and site affinity settings for all storage objects in the Kubernetes cluster. Use the same storage policy for all node VMs, including control plane and worker nodes, and for all persistent volumes.
  • The topology feature cannot be used to provision a volume that belongs to a specific fault domain within the vSAN stretched cluster. See the sketch after this list for the pattern to avoid.
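For illustration only, the unsupported pattern is a topology-aware StorageClass that pins volumes to one zone. This is a minimal sketch; the StorageClass name and the zone value zone-a are hypothetical placeholders, and the topology key shown is the one used by recent vSphere Container Storage Plug-in releases, so verify it against your plug-in version.

# Unsupported on vSAN stretched clusters: topology-aware provisioning
# that pins volumes to a single fault domain. Shown only as the pattern
# to avoid; "topology-pinned-sc" and "zone-a" are hypothetical placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: topology-pinned-sc
provisioner: csi.vsphere.vmware.com
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.csi.vmware.com/k8s-zone
        values:
          - zone-a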
vSAN stretched clusters support file volumes backed by vSAN file shares. For more information, see Provisioning File Volumes with vSphere Container Storage Plug-in.
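As a minimal sketch, assuming a storage policy named vsan-stretched-policy (a placeholder for the policy you create for the stretched cluster), an RWX file volume can be requested as follows:

# Sketch: file volume (RWX) provisioning through the vSphere Container
# Storage Plug-in. "vsan-stretched-policy" is a hypothetical policy name;
# replace it with your own stretched cluster storage policy.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsan-file-sc
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "vsan-stretched-policy"
  csi.storage.k8s.io/fstype: nfs4   # requests a file volume backed by a vSAN file share
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany                 # RWX: mountable by pods on multiple nodes
  resources:
    requests:
      storage: 10Gi
  storageClassName: vsan-file-sc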
  1. Set up your vSAN stretched cluster.
    1. Create a vSAN stretched cluster.
      For more information, search for vSAN stretched cluster on the VMware vSAN Documentation site.
    2. Turn on DRS on the stretched cluster.
    3. Turn on vSphere HA.
      Make sure to set up Host Monitoring.
    4. Enable host monitoring and configure the host failure response, the response for host isolation, and VM monitoring.
      VMware recommends that you disable VM Component Protection (VMCP) when all node VMs and volumes are deployed on the vSAN datastore:
      • Set Datastore with PDL to Disabled.
      • Set Datastore with APD to Disabled.
  2. Create a VM storage policy that complies with the vSAN stretched cluster requirements.
    1. Configure Site disaster tolerance.
      Select Dual site mirroring to have data mirrored at both sites of the stretched cluster.
      The screenshot shows options available for Site disaster tolerance.
    2. Specify Failures to tolerate.
      For the stretched cluster, this setting defines the number of disk or host failures a storage object can tolerate at each site. To tolerate n failures with mirroring, each site requires 2n + 1 fault domains, or hosts. For example, tolerating one failure (n = 1) requires three hosts per site. RAID-1 mirroring provides better performance. RAID-5 and RAID-6 achieve failure tolerance using parity blocks, which provides better space efficiency. These options are available only on all-flash clusters.
      The screenshot shows options available for Failures to tolerate.
    3. Enable Force provisioning.
      The screenshot shows the Force provisioning option on the Advanced Policy Rules tab.
  3. Create VM-Host affinity rules to place Kubernetes nodes on a specific primary or secondary site, such as Site-A.
    The screenshot shows the VM/Host Rule dialog box.
    For information about affinity rules, see Create a VM-Host Affinity Rule in the vSphere Resource Management documentation.
  4. Deploy Kubernetes node VMs using the vSAN stretched cluster storage policy.
  5. Create a storage class that references the vSAN stretched cluster storage policy.
  6. Deploy persistent volumes using the vSAN stretched cluster storage class.
    A sketch of a matching storage class and claim follows these steps.
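As a minimal sketch of steps 5 and 6, assuming the stretched cluster storage policy is named vsan-stretched-policy (a hypothetical name standing in for the policy created in step 2), the storage class and a claim could look like this:

# Sketch for steps 5 and 6: a StorageClass bound to the stretched cluster
# storage policy, and a PVC that provisions a volume with it.
# "vsan-stretched-policy" is a hypothetical policy name from step 2.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsan-stretched-sc
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "vsan-stretched-policy"  # must match the policy created in step 2
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce                 # block volume
  resources:
    requests:
      storage: 5Gi
  storageClassName: vsan-stretched-sc

Apply both objects with kubectl apply, then reference the claim from your pod specifications.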
Depending on your needs and environment, you can use one of the following deployment scenarios when deploying your Kubernetes cluster and workloads on the vSAN stretched cluster.

Deployment 1

In this deployment, the control plane and worker nodes are placed on the primary site, but are flexible enough to fail over to the other site if the primary site fails. You deploy HA Proxy on the primary site. This is also known as an Active-Passive deployment because only one site of the stretched vSAN cluster is used to deploy VMs.
If you plan to use file volumes (RWX volumes), configure the vSAN file service domain to place file servers on the active, or preferred, site. This reduces cross-site traffic latency and delivers better performance for applications that use file volumes.

Requirements for Deployment 1

  • Node Placement:
      • The control plane and worker nodes are on the primary site. They are flexible enough to fail over to the other site if the primary site fails.
      • HA Proxy is on the primary site.
  • Failures to Tolerate: At least FTT=1
  • DRS: Enabled
  • Site Disaster Tolerance: Dual site mirroring
  • Storage Policy Force Provisioning: Enabled
  • vSphere HA: Enabled

Potential Failover Scenarios for Deployment 1

The following scenarios describe potential failover behavior when you use the Deployment 1 model.
  • Several ESXi hosts fail on the primary site.
      • Kubernetes node VMs move from the unavailable hosts to available hosts within the primary site.
      • If a worker node needs to be restarted, pods running on that node can be rescheduled and recreated on another node.
      • If the control plane node needs to be restarted, the existing application workload is not affected.
  • The entire primary site and all hosts on the site fail.
      • Kubernetes node VMs move from the primary site to the secondary site.
      • You experience complete downtime until the node VMs restart on the secondary site.
  • Several hosts fail on the secondary site.
      • The failure does not affect the Kubernetes cluster because the entire cluster is at the primary site.
  • The entire secondary site and all hosts on the site fail.
      • The failure does not affect the Kubernetes cluster because the entire cluster is at the primary site.
      • Replication for storage objects stops because the secondary site is not available.
  • An intersite network failure occurs.
      • The failure does not affect the Kubernetes cluster because the entire cluster is at the primary site.
      • Replication for storage objects stops because the secondary site is not available.

Deployment 2

With this model, you place the control plane nodes on the primary site, while worker nodes can be spread across the primary and secondary sites. You deploy HA Proxy on the primary site.

Requirements for Deployment 2

  • Node Placement:
      • The control plane nodes are on the primary site.
      • Worker nodes are spread across the primary and secondary sites.
      • HA Proxy is on the primary site.
  • Failures to Tolerate: At least FTT=1
  • DRS: Enabled
  • Site Disaster Tolerance: Dual site mirroring
  • Storage Policy Force Provisioning: Enabled
  • vSphere HA: Enabled

Potential Failover Scenarios for Deployment 2

The following scenarios describe potential failover behavior when you deploy a Kubernetes cluster using the Deployment 2 model.
  • Several ESXi hosts fail on the primary site.
      • Kubernetes node VMs move from the unavailable hosts to available hosts within the same site. If resources are not available, they move to the other site.
      • If a worker node needs to be restarted, pods running on that node might be rescheduled and recreated on another node.
      • If the control plane node needs to be restarted, the existing application workload is not affected.
  • The entire primary site and all hosts on the site fail.
      • Kubernetes control plane node VMs, and any worker nodes present on the primary site, move from the primary site to the secondary site.
      • Expect control plane downtime until the control plane nodes restart on the secondary site.
      • Expect partial downtime for pods running on the worker nodes on the primary site.
      • Pods deployed on the worker nodes on the secondary site are not affected.
  • Several hosts fail on the secondary site.
      • Node VMs and pods running on the node VMs restart on another host.
  • The entire secondary site and all hosts on the site fail.
      • The Kubernetes control plane is unaffected.
      • Kubernetes worker nodes from the secondary site move to the primary site.
      • Pods deployed on the worker nodes on the secondary site are affected. They restart along with the node VMs.
  • An intersite network failure occurs.
      • The Kubernetes control plane is unaffected.
      • Kubernetes worker nodes move to the primary site.
      • Pods deployed on the worker nodes on the secondary site are affected. They restart along with the node VMs.

Deployment 3

In this deployment model, you place two control plane nodes on the primary site and one control plane node on the secondary site. Deploy HA Proxy on the primary site. Worker nodes can be on either site.

Requirements for Deployment 3

You can use this deployment model if you have equal resources at both the primary, or preferred, fault domain and the secondary, or non-preferred, fault domain, and you want to use hardware located at both fault domains. Because both fault domains run some workload, this deployment model helps with faster recovery in case of a complete site failure.
This model requires specific DRS policy rules: one rule to specify affinity between the two control plane nodes and the primary site, and another rule for affinity between the third control plane node and the secondary site.
  • Node Placement:
      • Two control plane nodes are on the primary site.
      • One control plane node is on the secondary site.
      • HA Proxy is on the primary site.
      • Worker nodes are on either site.
  • Failures to Tolerate: At least FTT=1
  • DRS: Enabled
  • Site Disaster Tolerance: Dual site mirroring
  • Storage Policy Force Provisioning: Enabled
  • vSphere HA: Enabled

Potential Failover Scenarios for Deployment 3

The following scenarios describe potential failover behavior when you use the Deployment 3 model.
  • Several ESXi hosts fail on the primary site.
      • Affected nodes restart on the available hosts on the primary site.
      • If both control plane nodes on the primary site are on the failed hosts, the control plane is down until both control plane nodes recover on the available hosts on the primary site.
      • While nodes are restarting on available hosts, pods might be rescheduled and recreated on available nodes.
  • The entire primary site and all hosts on the site fail.
      • Node VMs move from the primary site to the secondary site.
      • Expect downtime until the node VMs restart on the secondary site.
  • Several hosts fail on the secondary site.
      • Node VMs and pods running on the node VMs restart on another host.
      • If the control plane node on the secondary site is affected, the Kubernetes control plane remains unaffected. Kubernetes remains accessible through the two control plane nodes on the primary site.
  • The entire secondary site and all hosts on the site fail.
      • The control plane node and worker nodes from the secondary site migrate to the primary site.
      • Pods deployed on the worker nodes on the secondary site are affected. They restart along with the node VMs.
  • An intersite network failure occurs.
      • The Kubernetes control plane is unaffected.
      • Kubernetes nodes move to the primary site.
      • Pods deployed on the worker nodes on the secondary site are affected. They restart along with the node VMs.

Upgrade Kubernetes and Persistent Volumes on vSAN Stretched Clusters

If you already have Kubernetes deployments on a vSAN datastore, you can upgrade your deployments after enabling vSAN stretched clusters on the datastore.
  1. Edit the existing VM storage policy used for provisioning volumes and node VMs on the vSAN cluster to add the stretched cluster parameters.
  2. Apply the updated storage policy to all objects.
  3. Apply the updated storage policy to the persistent volumes that have the Out of date status.