Clustering - Installing and Configuring
Do all platform VMs have to be on the same L2/L3 segment?
No. However, it is best to keep all platform nodes on a common network with low latency between nodes, because many of the distributed components replicate data among the nodes, and high latency can cause system performance and stability issues.
Can a cluster be upgraded using the in-product upgrade feature?
Online upgrades are not supported for clusters up to and including release 3.7. From release 3.8 onward, a cluster can be upgraded using the online upgrade method.
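As a rough illustration of the version rule above, the following sketch checks whether a release number falls at or after 3.8. The function names and the assumption that versions are dotted numeric strings such as "3.8" are hypothetical, not part of the product.

```python
def parse_version(version: str) -> tuple:
    """Split a dotted version string into (major, minor); a missing minor counts as 0."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def supports_online_cluster_upgrade(version: str) -> bool:
    """Return True for release 3.8 and later, per the rule stated in this FAQ."""
    return parse_version(version) >= (3, 8)


print(supports_online_cluster_upgrade("3.7"))  # False
print(supports_online_cluster_upgrade("3.8"))  # True
```

Comparing (major, minor) tuples avoids the string-comparison pitfall where "3.10" would sort before "3.8".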
What happens if there is a failure during the cluster creation process?
It is a best practice to take a backup of the primary platform and collector (proxy) VMs before starting the cluster creation process. If there is a failure, delete the secondary platform nodes and recover the primary platform and collector VMs from the backup.
- If you are unable to take a backup using EMC Avamar or VMware VDP, take snapshots of the platform and collector VMs while the VMs are switched off.
- Snapshots are not recommended in production environments, and VMware recommends against running VMs with snapshots for more than 3 days.
- Do not treat snapshots as a backup.
What happens to the existing data and configuration when I expand a single-node deployment to a cluster?
All data and configuration are retained without any change. The data remains accessible after the cluster is created.
Can you have platform VMs in different regions?
No. The platform nodes must be co-located in the same site. The collector servers can be geo-distributed.
Can the platform be hosted on vSAN stretch clusters (2 data centers …)?
Yes. vSAN clusters within the same data center or across data centers still ensure a level of I/O performance similar to local storage.
Can we host cluster nodes on different vSAN clusters?
Yes. Different nodes of a platform cluster can be hosted on different underlying datastores.
Do you need to back up platform nodes?
Yes. Backups must be taken using VMware-recommended snapshot/backup technologies.
How do I estimate the bandwidth between the collector VM in one region and the platform VM cluster in another region?
In some large deployments, we have seen this number range from 1 Mbps to 20 Mbps. A lot of deduplication and compression happens in the collector VM before data is sent to the platform VM.
How much network traffic will there be between cluster nodes?
Traffic usually depends on the size of the cluster and the type of data center environment.
For installations with 30-50k VMs:
- Between cluster nodes: approx. 50-400 Mbps
- Between collector and platform: approx. 100 Kbps-15 Mbps
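To relate the approximate figures above to a planned network link, a small headroom calculation can help. The sketch below is only illustrative: the dictionary keys, the function name, and the example link capacities are hypothetical, and the ranges are simply the approximate values quoted in this FAQ for installations with 30-50k VMs.

```python
# Approximate traffic ranges quoted in this FAQ, converted to Mbps.
# Keys are hypothetical labels chosen for this sketch.
TRAFFIC_RANGES_MBPS = {
    "cluster_nodes": (50.0, 400.0),
    "collector_to_platform": (0.1, 15.0),  # 100 Kbps = 0.1 Mbps
}


def link_headroom(path: str, link_capacity_mbps: float) -> float:
    """Return link capacity divided by the upper end of the quoted range.

    A result below 1.0 means the link is smaller than the quoted peak.
    """
    _, upper = TRAFFIC_RANGES_MBPS[path]
    return link_capacity_mbps / upper


print(link_headroom("cluster_nodes", 1000.0))         # 2.5
print(link_headroom("collector_to_platform", 30.0))   # 2.0
```

Sizing against the upper end of each range is a conservative choice; actual traffic depends on the size of the cluster and the type of environment, as noted above.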
What is the maximum admissible latency between nodes in a cluster?
The platform nodes have to be co-located in the same site, so the latency is minimal. If the platform nodes are hosted on vSAN stretch clusters (two data centers), the vSAN clusters within or across the data centers ensure a level of I/O performance similar to local storage. Applications running in the data centers, such as VMware Aria Operations for Networks, work fine. You can host different nodes of a platform cluster on different underlying datastores, but you must ensure that all the platform VMs in a cluster are co-located within the same site.
What is the maximum admissible latency between the collector VMs in one region and the platform VM cluster in another region?
You can have geo-distributed proxies (collector VMs) in your setup. The collector VM connects to the platform VM over HTTPS, so it can tolerate high latencies, on the order of a few seconds.
VMware Aria Operations for Networks supports a maximum of 10 nodes in a cluster (30,000 VMs with flows or 50,000 VMs without flows).
What should be the size of the collector/platform VM?
Use the large brick configuration. Refer to the installation guide.
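The cluster limits quoted in this FAQ (10 nodes, 30,000 VMs with flows or 50,000 VMs without flows) can be expressed as a quick planning check. This is a minimal sketch; the constant and function names are hypothetical, and it does not replace the sizing tables in the installation guide.

```python
# Limits as stated in this FAQ; names are hypothetical.
MAX_NODES = 10
MAX_VMS_WITH_FLOWS = 30_000
MAX_VMS_WITHOUT_FLOWS = 50_000


def within_cluster_limits(nodes: int, vms: int, flows_enabled: bool) -> bool:
    """Check a planned deployment against the maximum cluster figures."""
    vm_limit = MAX_VMS_WITH_FLOWS if flows_enabled else MAX_VMS_WITHOUT_FLOWS
    return 1 <= nodes <= MAX_NODES and 0 <= vms <= vm_limit


print(within_cluster_limits(10, 40_000, flows_enabled=True))   # False
print(within_cluster_limits(10, 40_000, flows_enabled=False))  # True
```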