Shared Storage Design Decisions for the Management Domain
Use this design decision list for reference related to shared storage, vSAN principal storage, and NFS supplemental storage in an environment with a single or multiple VMware Cloud Foundation instances. The design also considers whether an instance contains a single or multiple availability zones.

After you set up the physical storage infrastructure, the configuration tasks for most design decisions are automated in VMware Cloud Foundation. You must perform the configuration manually only for a limited number of decisions, as noted in the design implication.

For full design details, see Shared Storage Design for the Management Domain.
vSAN Deployment Specification
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-CFG-001 | Ensure that the storage I/O controller that runs the vSAN disk groups has a minimum queue depth of 256. | Storage controllers with lower queue depths can cause performance and stability problems when running vSAN. vSAN ReadyNode servers are configured with the right queue depths for vSAN. | Limits the number of compatible I/O controllers that can be used for storage. |
VCF-MGMT-VSAN-CFG-002 | Do not use the storage I/O controllers that run vSAN disk groups for another purpose. | Running non-vSAN disks, for example, VMFS, on a storage I/O controller that runs a vSAN disk group can impact vSAN performance. | If non-vSAN disks are required in ESXi hosts, you must have an additional storage I/O controller in the host. |
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-CFG-003 | Configure vSAN in all-flash configuration in the default management cluster. | Meets the performance needs of the default management cluster. Using high-speed magnetic disks in a hybrid vSAN configuration is supported and can provide satisfactory performance. | All vSAN disks must be flash disks, which might cost more than magnetic disks. |
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-CFG-004 | Provide the default management cluster with a minimum of 13.72 TB of raw capacity for vSAN. | The management virtual machines require at least 4.4 TB of raw storage (before setting FTT to 1) and 8.8 TB when using the default vSAN storage policy. By allocating at least 13.72 TB, initially 30% of the space is reserved for vSAN internal operations and 20% of the space is free for additional growth of management virtual machines. | If you scale the environment out with more workloads, additional storage is required in the management domain. |
VCF-MGMT-VSAN-CFG-005 | Ensure that at least 30% of free space is always available on the vSAN datastore. | When vSAN reaches 80% usage, a rebalance task starts, which can be resource-intensive. | Increases the amount of available storage needed. |
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-CFG-006 | Provide the default cluster in the management domain with a minimum of 19.86 TB of raw capacity for vSAN. | The management virtual machines require at least 6.36 TB of raw storage (before setting FTT to 1) and 12.73 TB when using the default vSAN storage policy. By allocating at least 19.86 TB, initially 30% of the space is reserved for vSAN internal operations and 20% of the space is free for additional growth of management virtual machines. NFS is used as supplemental shared storage for some management components, for example, for backups and log archives. | If you scale the environment out with more workloads, additional storage is required in the management domain. |
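The raw-capacity targets in the two tables above follow a consistent pattern: take the post-FTT footprint (two full copies with FTT set to 1), then add 30% for vSAN internal operations and 20% for growth. The sketch below reproduces that arithmetic; the formula is inferred from the figures in the tables, not taken from an official sizing tool.

```python
def vsan_raw_capacity_tb(post_ftt_tb: float,
                         slack: float = 0.30, growth: float = 0.20) -> float:
    """Raw capacity target: post-FTT footprint plus vSAN operational slack and growth headroom."""
    return post_ftt_tb * (1 + slack) * (1 + growth)

# Post-FTT footprints from the tables: 4.4 TB x 2 copies = 8.8 TB (single availability zone)
# and 12.73 TB for the stretched-cluster sizing.
print(round(vsan_raw_capacity_tb(8.8), 2))    # 13.73 -> the 13.72 TB target, up to rounding
print(round(vsan_raw_capacity_tb(12.73), 2))  # 19.86 -> the 19.86 TB target
```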
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-CFG-007 | The default management cluster requires a minimum of 4 ESXi hosts to support vSAN. | | The availability requirements for the management cluster might cause underutilization of the cluster's ESXi hosts. |
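The four-host minimum matches a general vSAN RAID-1 rule of thumb (background, not stated in the table above): tolerating n host failures requires 2n + 1 hosts for the data replicas and witness component, and one additional host gives the cluster spare capacity to rebuild data during maintenance or after a host failure.

```python
def min_vsan_hosts(ftt: int, spare_hosts: int = 1) -> int:
    """2 * FTT + 1 hosts for replicas and quorum, plus spare hosts for rebuild capacity."""
    return 2 * ftt + 1 + spare_hosts

print(min_vsan_hosts(1))  # 4 -> the minimum in VCF-MGMT-VSAN-CFG-007
```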
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-CFG-008 | To support a vSAN stretched cluster, the default management cluster requires a minimum of 8 ESXi hosts (4 in each availability zone). | | The capacity of the additional 4 hosts is not added to the capacity of the cluster. They are used only to provide additional availability. |
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-CFG-009 | Configure vSAN with a minimum of two disk groups per ESXi host. | Reduces the size of the fault domain and spreads the I/O load over more disks for better performance. | Multiple disk groups require more disks in each ESXi host. |
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-CFG-010 | For the caching tier in each disk group, use a flash-based drive of at least 600 GB. | Provides enough cache for both hybrid and all-flash vSAN configurations to buffer I/O and ensure disk group performance. Additional space in the cache tier does not increase performance. | Larger flash disks can increase the initial host cost. |
VCF-MGMT-VSAN-CFG-011 | Allocate at least 2.3 TB of flash-based drives for the capacity tier in each disk group. | Provides enough capacity for the management virtual machines with a minimum of 30% overhead and 20% growth when the number of primary failures to tolerate is 1. | None. |
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-CFG-012 | Allocate at least 3.31 TB of flash-based drives for the capacity tier in each disk group. | Provides enough capacity for the management virtual machines with a minimum of 30% overhead and 20% growth when the number of primary failures to tolerate is 1. | None. |
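As a cross-check, the capacity-tier figures are consistent with the raw-capacity targets earlier in this section. The host and disk-group counts come from the decisions above; treating the stretched-cluster target as a per-availability-zone requirement is an assumption, based on each zone holding a full mirror of the data.

```python
def capacity_tier_raw_tb(hosts: int, disk_groups_per_host: int, drive_tb: float) -> float:
    """Raw capacity contributed by the capacity tier across a set of hosts."""
    return hosts * disk_groups_per_host * drive_tb

print(capacity_tier_raw_tb(4, 2, 2.30))  # 18.4 TB  >= 13.72 TB target (single availability zone)
print(capacity_tier_raw_tb(4, 2, 3.31))  # 26.48 TB >= 19.86 TB target (per availability zone)
```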
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-CFG-013 | Use the default VMware vSAN storage policy. | Provides the level of redundancy that is needed in the management cluster and the level of performance that is sufficient for the individual management components. | You might need additional policies for third-party virtual machines hosted in these clusters because their performance or availability requirements might differ from what the default VMware vSAN policy supports. |
VCF-MGMT-VSAN-CFG-014 | Leave the default virtual machine swap file as a sparse object on VMware vSAN. | Sparse virtual swap files consume capacity on vSAN only as they are accessed. As a result, you can reduce the consumption on the vSAN datastore if virtual machines do not experience memory over-commitment, which would require the use of the virtual swap file. | None. |
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-CFG-015 | Add the following setting to the default vSAN storage policy: Secondary Failures to Tolerate = 1. | Provides the necessary protection for virtual machines in each availability zone, with the ability to recover from an availability zone outage. | You might need additional policies if third-party virtual machines are to be hosted in these clusters because their performance or availability requirements might differ from what the default VMware vSAN policy supports. |
VCF-MGMT-VSAN-CFG-016 | Configure two fault domains, one for each availability zone. Assign each host to its respective availability zone fault domain. | Fault domains are mapped to availability zones to provide logical host separation and ensure that a copy of vSAN data is always available even when an availability zone goes offline. | Additional raw storage is required when the secondary failures to tolerate option and fault domains are enabled. |
vSAN Network Design
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-NET-001 | Use the existing vSphere Distributed Switch instances in the default management cluster. | Provides guaranteed performance for vSAN traffic in a contention-free network by using existing networking components. | All traffic paths are shared over common uplinks. |
VCF-MGMT-VSAN-NET-002 | Configure jumbo frames on the VLAN for vSAN traffic. | | Every device in the network must support jumbo frames. |
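Jumbo frames are enabled by raising the MTU on the distributed switch that carries the vSAN VLAN, and the setting can be checked or applied programmatically. A minimal pyVmomi sketch follows; the vCenter address, credentials, and the switch name sfo-m01-cl01-vds01 are hypothetical placeholders.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab convenience; validate certificates in production
si = SmartConnect(host="vcenter.example.local", user="administrator@vsphere.local",
                  pwd="***", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.DistributedVirtualSwitch], True)
    dvs = next(d for d in view.view if d.name == "sfo-m01-cl01-vds01")  # hypothetical name
    view.Destroy()
    if dvs.config.maxMtu < 9000:
        spec = vim.dvs.VmwareDistributedVirtualSwitch.ConfigSpec()
        spec.configVersion = dvs.config.configVersion  # required for reconfiguration
        spec.maxMtu = 9000                             # jumbo frames for vSAN traffic
        dvs.ReconfigureDvs_Task(spec)
finally:
    Disconnect(si)
```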
vSAN Witness Design
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-WTN-001 | Deploy a vSAN witness appliance in a location that is not local to the ESXi hosts in any of the availability zones. | The witness appliance maintains quorum for the stretched cluster. Deploying it in a third location keeps it available if either availability zone goes offline. | A third physically separate location is required. Such a location must have a vSphere environment. Another VMware Cloud Foundation instance in a separate physical location might be an option. |
VCF-MGMT-VSAN-WTN-002 | Deploy a medium-size witness appliance. | A medium-size witness appliance supports up to 500 virtual machines, which is sufficient for high availability of the management components of the SDDC. | The vSphere environment at the witness location must satisfy the resource requirements of the witness appliance. |
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-VSAN-WTN-003 | Connect the first VMkernel adapter of the vSAN witness appliance to the management network in the witness site. | Connects the witness appliance to the vCenter Server instance and ESXi hosts in both availability zones. | The management networks in both availability zones must be routed to the management network in the witness site. |
VCF-MGMT-VSAN-WTN-004 | Configure the vSAN witness appliance to use the first VMkernel adapter, that is, the management interface, for vSAN witness traffic. | Separates the witness traffic from the vSAN data traffic. | The management networks in both availability zones must be routed to the management network in the witness site. |
VCF-MGMT-VSAN-WTN-005 | Place witness traffic on the management VMkernel adapter of all the ESXi hosts in the management domain. | Separates the witness traffic from the vSAN data traffic. | The management networks in both availability zones must be routed to the management network in the witness site. |
VCF-MGMT-VSAN-WTN-006 | Allocate a statically assigned IP address and host name to the management adapter of the vSAN witness appliance. | Simplifies maintenance and tracking, and implements a DNS configuration. | Requires precise IP address management. |
VCF-MGMT-VSAN-WTN-007 | Configure forward and reverse DNS records for the vSAN witness appliance, assigning the records to the child domain for the VMware Cloud Foundation instance. | Enables connecting the vSAN witness appliance to the management domain vCenter Server by FQDN instead of IP address. | You must provide DNS records for the vSAN witness appliance. |
VCF-MGMT-VSAN-WTN-008 | Configure time synchronization by using an internal NTP time source for the vSAN witness appliance. | Prevents failures in the stretched cluster configuration that are caused by time mismatch between the vSAN witness appliance, the ESXi hosts in both availability zones, and the management domain vCenter Server. | |
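The DNS records from VCF-MGMT-VSAN-WTN-007 are easy to validate before stretching the cluster. A short Python check follows; the witness FQDN is a hypothetical example.

```python
import socket

fqdn = "vsan-witness01.sfo.example.local"  # hypothetical witness appliance FQDN
ip = socket.gethostbyname(fqdn)            # forward (A) record lookup
rev_name, _, _ = socket.gethostbyaddr(ip)  # reverse (PTR) record lookup
assert rev_name.lower().rstrip(".") == fqdn.lower(), "PTR record does not match the A record"
print(f"{fqdn} -> {ip} -> {rev_name}")
```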
NFS Deployment Specification
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-NFS-CFG-001 | Ensure that at least 20% of free space is always available on all non-vSAN datastores. | If a datastore runs out of free space, applications and services in the management domain that run on the NFS datastores fail. | Monitoring and capacity management must be proactive operations. |
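The 20% free-space floor can be audited through the vSphere API. A minimal pyVmomi sketch, assuming a service-instance connection si established as in the jumbo frames example above:

```python
from pyVmomi import vim

# 'si' is an existing pyVim.connect.SmartConnect service instance (see the MTU example)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.Datastore], True)
for ds in view.view:
    if ds.summary.type == "vsan":          # audit only the non-vSAN datastores
        continue
    pct_free = ds.summary.freeSpace / ds.summary.capacity * 100
    if pct_free < 20:
        print(f"{ds.name}: only {pct_free:.1f}% free, below the 20% floor")
view.Destroy()
```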
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-NFS-CFG-002 | Use NFS version 3 for all NFS datastores. | You cannot use Storage I/O Control with NFS version 4.1 datastores. | NFS version 3 does not support Kerberos authentication. |
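Mounting an export with NFS version 3 corresponds to creating a NAS datastore of type NFS (rather than NFS41). A minimal pyVmomi sketch, where host is an already-retrieved vim.HostSystem and the server address, export path, and datastore name are hypothetical:

```python
from pyVmomi import vim

spec = vim.host.NasVolume.Specification()
spec.remoteHost = "nfs01.example.local"   # hypothetical NFS array address
spec.remotePath = "/exports/m01-mgmt"     # hypothetical export path
spec.localPath = "sfo-m01-nfs01"          # datastore name shown in vCenter
spec.accessMode = "readWrite"
spec.type = "NFS"                         # NFS version 3, per VCF-MGMT-NFS-CFG-002

host.configManager.datastoreSystem.CreateNasDatastore(spec)
```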
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-NFS-CFG-003 | Use 10K SAS drives for the NFS volumes. | 10K SAS drives provide a balance between performance and capacity. You can use faster drives. vStorage API for Data Protection-based backups require high-performance datastores to meet backup SLAs. | 10K SAS drives are more expensive than other alternatives. |
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-NFS-CFG-004 | Select an array that supports vStorage APIs for Array Integration (VAAI) over NAS (NFS). | VAAI offloads storage operations, such as file cloning, to the array, which reduces the load on the ESXi hosts. | Not all arrays support VAAI over NFS. For arrays that do, you must install a plug-in from the array vendor to enable VAAI over NFS. |
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-NFS-CFG-005 | Use a dedicated NFS volume to support image-level backup requirements. | The backup and restore process is I/O intensive. Using a dedicated NFS volume ensures that the process does not impact the performance of other management components. | Dedicated volumes add management overhead to storage administrators. Dedicated volumes might use more disks, according to the array and type of RAID. |
VCF-MGMT-NFS-CFG-006 | Use a shared volume for other management component datastores. | Non-backup-related management applications can share a common volume because of the lower I/O profile of these applications. | Enough storage space for shared volumes and their associated application data must be available. |
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-NFS-CFG-007 | For each export, limit access to only the application virtual machines or ESXi hosts that require the ability to mount the storage. | Limiting access helps ensure the security of the underlying data. | Securing exports individually can introduce operational overhead. |
Decision ID | Design Decision | Design Justification | Design Implication |
---|---|---|---|
VCF-MGMT-NFS-CFG-008 | Enable Storage I/O Control with the default values on all supplemental NFS datastores. | Ensures that all virtual machines on a datastore receive an equal amount of I/O capacity. | Virtual machines that use more I/O are throttled so that other virtual machines can access the datastore. Throttling occurs only when I/O contention arises on the datastore. |
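Whether Storage I/O Control is enabled can be audited through each datastore's iormConfiguration property. A minimal pyVmomi sketch, again assuming an existing connection si as in the earlier examples:

```python
from pyVmomi import vim

# 'si' is an existing pyVim.connect.SmartConnect service instance (see the MTU example)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.Datastore], True)
for ds in view.view:
    if ds.summary.type == "NFS":           # the supplemental NFS v3 datastores
        iorm = ds.iormConfiguration
        enabled = bool(iorm and iorm.enabled)
        print(f"{ds.name}: Storage I/O Control enabled = {enabled}")
view.Destroy()
```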