Cluster Capacity Dashboard

The Cluster Capacity dashboard includes the ESXi host and resource pools as they impact cluster capacity.

Design Considerations

See Capacity Dashboards for common design considerations among all the dashboards for capacity management.

How to Use the Dashboard

The Cluster Capacity dashboard is layered, gradually providing details as you work top-down in the dashboard.

Overall Analysis

The three bar charts, Clusters by Capacity Remaining, Clusters by Time Remaining, and Clusters by VM Remaining, summarize the overall situation. Use the first two charts together to identify when you need to add capacity to address growth. Time remaining uses the historical growth in a cluster to forecast when more capacity is needed. This lets you operate more efficiently by confirming that you have enough capacity now and by planning proactively for additional capacity. The third bar chart, Clusters by VM Remaining, completes the context, because different clusters can have different VM sizes.
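The idea of forecasting time remaining from historical growth can be illustrated with a minimal sketch. It fits a straight line to daily utilization values and estimates how many days remain before the line reaches usable capacity. This is a simplified linear model for illustration only, not the forecasting method the product uses; the function name `days_remaining` and the sample values are hypothetical.

```python
def days_remaining(daily_usage, usable_capacity):
    """Estimate days until usage reaches usable_capacity with a linear fit.

    daily_usage: one utilization value per day, oldest first (hypothetical input).
    Returns None when there are fewer than two points or the trend is not rising.
    """
    n = len(daily_usage)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    # Ordinary least-squares slope and intercept over day indexes 0..n-1.
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope = sxy / sxx
    if slope <= 0:
        return None  # flat or declining usage never reaches capacity
    intercept = mean_y - slope * mean_x
    # Day index at which the fitted line crosses usable capacity.
    crossing_day = (usable_capacity - intercept) / slope
    return max(0.0, crossing_day - (n - 1))


# Usage grows by 2 units per day from 100; capacity is 128 units.
print(days_remaining([100, 102, 104, 106, 108], 128))
```

A cluster with high capacity remaining but a steep slope can still have little time remaining, which is why the two charts are read together.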

For a large environment, a heat map is helpful. The three heat maps are Time Remaining, Capacity Remaining, and VM Remaining. If your cluster sizes are not standardized, create another heat map, and use the number of ESXi hosts to show the size difference.

Cluster Analysis

The Clusters Capacity widget provides a table with details. The number of ESXi hosts is color coded because smaller clusters have a relatively higher overhead. Select a cluster from the table to display its capacity details automatically.

Performance

Ensure that the performance of the cluster meets your SLAs.

Utilization

The next two charts, Memory Workload (%) and CPU Workload (%), show values relative to your usable capacity. Utilization is displayed over three months rather than one week, and as a daily average rather than an hourly average, so you can focus on the overall trend. For memory, the focus is on consumed memory and not active memory.
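To make the daily-average view concrete, the following sketch rolls hourly samples up into one value per day and expresses each as a percentage of usable capacity. The data shape and the name `daily_average_workload` are assumptions made for illustration; they do not reflect how the product stores or computes its metrics.

```python
from collections import defaultdict
from datetime import datetime


def daily_average_workload(samples, usable_capacity):
    """Average hourly samples per day and express them relative to usable capacity.

    samples: iterable of (datetime, consumed) pairs (hypothetical input shape).
    Returns a dict mapping each date to workload in percent, in date order.
    """
    if usable_capacity <= 0:
        raise ValueError("usable_capacity must be positive")
    buckets = defaultdict(list)
    for timestamp, consumed in samples:
        buckets[timestamp.date()].append(consumed)
    return {
        day: 100.0 * (sum(values) / len(values)) / usable_capacity
        for day, values in sorted(buckets.items())
    }


hourly = [
    (datetime(2024, 1, 1, 0), 40.0),
    (datetime(2024, 1, 1, 1), 60.0),
    (datetime(2024, 1, 2, 0), 100.0),
]
for day, percent in daily_average_workload(hourly, 200.0).items():
    print(day, percent)
```

Averaging per day smooths out short hourly peaks, which suits a three-month trend view better than a one-week troubleshooting view.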

Allocation

You can view the trend of the three components, CPU, disk, and memory, together on the Overcommit Ratio chart. In general, CPU overcommit should be the highest, followed by disk (because of thin provisioning). Memory overcommit tends to be near one because of its nature as a cache.
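As a rough illustration of what an overcommit ratio expresses, the sketch below divides the amount allocated to VMs by the physical amount available for each resource. The function name, the dictionary keys, and the sample figures are hypothetical, and the product may define its ratio differently (for example, against usable rather than total capacity).

```python
def overcommit_ratios(allocated, physical):
    """Return allocated / physical for every resource listed in `physical`.

    allocated, physical: dicts keyed by resource name (hypothetical input shape).
    A ratio above 1.0 means more has been allocated than physically exists.
    """
    ratios = {}
    for resource, capacity in physical.items():
        if capacity <= 0:
            raise ValueError(f"physical capacity for {resource} must be positive")
        ratios[resource] = allocated.get(resource, 0) / capacity
    return ratios


# Hypothetical cluster: vCPUs vs. cores, GB of memory, TB of disk.
allocated = {"cpu": 256, "memory": 1024, "disk": 30}
physical = {"cpu": 64, "memory": 1024, "disk": 20}
print(overcommit_ratios(allocated, physical))
```

In this made-up example CPU is overcommitted the most, disk less so, and memory sits at one, which matches the ordering described above.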

Use the line chart in the Allocation widget to see the trend. The data is averaged hourly.

In the VM Count widget, the trend line of the number of VMs over time helps you spot whether many VMs have been newly provisioned. If the number of VMs is increasing but demand remains low, it is a sign of potential future demand.

Reservation

Reservation can impact the efficiency of your cluster. Your cluster could be low on capacity because of real workload or just because of reservation. If your cluster sizes vary, complement the reservation number with a relative value. Once you have a standardized number, you can visualize it on a heat map.
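The relative value mentioned above can be as simple as the reservation expressed as a percentage of total capacity. The following sketch assumes that definition; the function name and the cluster figures are hypothetical.

```python
def reservation_percent(reserved, total_capacity):
    """Express a reservation as a percentage of total capacity."""
    if total_capacity <= 0:
        raise ValueError("total_capacity must be positive")
    return 100.0 * reserved / total_capacity


# Two hypothetical clusters with the same absolute memory reservation (GB).
clusters = {
    "small-cluster": {"reserved": 128, "total": 512},
    "large-cluster": {"reserved": 128, "total": 2048},
}
for name, c in clusters.items():
    print(name, reservation_percent(c["reserved"], c["total"]))
```

The same 128 GB reservation takes a quarter of the small cluster but only a small fraction of the large one, which is why a percentage compares better across clusters of different sizes.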

ESXi Analysis

Good cluster capacity does not mean that there is no issue at the ESXi level. Imbalance is a common problem, especially in large clusters and stretched clusters.

The ESXi Hosts in a Cluster table displays all the member ESXi hosts. The color coding reflects the imbalance, so you can see it clearly.
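One simple way to quantify imbalance across member hosts is the spread between the busiest and the least busy host. The sketch below uses that measure as an assumption for illustration; the source does not state how the table's color coding is derived, and the host names and values are hypothetical.

```python
def workload_spread(host_workloads):
    """Return the gap between the highest and lowest host workload (%).

    host_workloads: dict mapping host name to workload percent (hypothetical input).
    """
    values = list(host_workloads.values())
    if not values:
        raise ValueError("at least one host is required")
    return max(values) - min(values)


hosts = {"esx-01": 80.0, "esx-02": 40.0, "esx-03": 60.0}
print(workload_spread(hosts))
```

A cluster can average 60% and look healthy while one host runs at 80% and another at 40%, which is the situation the per-host table is meant to expose.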

The 99th percentile Performance column takes the 99th percentile value of the ESXi Performance (%) metric.
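For readers unfamiliar with percentiles, the sketch below computes one with the nearest-rank method: sort the samples and pick the value below which the given share of samples falls. The nearest-rank choice is an assumption for illustration; the product's exact percentile calculation is not documented here, and the function name is hypothetical.

```python
import math


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile of values using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    # Rank is 1-based: the smallest position covering pct percent of samples.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical samples: 97 healthy readings and three poor ones.
samples = [100] * 97 + [20, 15, 10]
print(percentile_nearest_rank(samples, 99))
```

A percentile is less sensitive to a single extreme sample than a minimum or maximum, so it gives a steadier picture of how a host performed over the period.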

Select an ESXi host to view the details. Both the CPU Workload (%) and the Memory Workload (%) trend line charts show whether demand is steady, cyclical, rising, or declining. The trend is as important as the present value, so view trends over a longer time. Utilization is displayed over three months rather than one week, and as a daily average rather than an hourly average. The focus is on memory consumed and not memory active. Memory consumed is the total memory consumed, so it includes the memory consumed by VMkernel. Both total and usable utilization for memory and CPU are displayed, providing the absolute amount of capacity.

VM Analysis

Use the VMs in the selected Cluster or Host table to analyze the cause of low capacity remaining and to identify which VMs are impacting infrastructure resources such as CPU, memory, and disk space. The table lists the VMs in either the cluster or the host. When you select one of the VMs, additional relevant information is displayed.

If many large VMs are running low on capacity, you can stop provisioning new VMs until you upsize the existing ones.


VMware Aria Operations 8.17.1
