Cluster Capacity Dashboard
The Cluster Capacity dashboard includes the ESXi host and resource pools as they impact cluster capacity.
Design Considerations
See Capacity Dashboards for common design considerations among all the dashboards for capacity management.
How to Use the Dashboard
The Cluster Capacity dashboard is layered, gradually providing details as you work top-down in the dashboard.
Overall Analysis
- The three bar charts, Clusters by Capacity Remaining, Clusters by Time Remaining, and Clusters by VM Remaining, summarize the overall situation. Use the first two charts together to identify when you need to add capacity to address growth. Time remaining uses historical growth in a cluster to forecast when more capacity is needed. This lets you operate more efficiently by making sure you have enough capacity now and by planning capacity additions proactively. The third bar chart, Clusters by VM Remaining, completes the context, because different clusters can have different VM sizes.
- For a large environment, a heat map is helpful. The three heat maps are Time Remaining, Capacity Remaining, and VM Remaining. If your cluster sizes are not standardized, create another heat map, and use the number of ESXi hosts to show the size difference.
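As an illustration of how a time-remaining figure can follow from historical growth, the sketch below fits a straight line to daily utilization samples and projects when that line reaches usable capacity. It is a simplified assumption, not the product's forecasting engine: the function name `days_remaining`, the sample values, and the plain least-squares fit are all hypothetical.

```python
def days_remaining(daily_usage, usable_capacity):
    """Estimate days until a linear trend through daily_usage reaches usable_capacity.

    daily_usage: one utilization sample per day, oldest first (hypothetical input).
    Returns None when the fitted trend is flat or declining.
    """
    n = len(daily_usage)
    if n < 2:
        raise ValueError("need at least two daily samples")

    # Ordinary least-squares fit of usage against the day index 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope = sxy / sxx
    if slope <= 0:
        return None  # no growth, so the trend never reaches capacity

    intercept = mean_y - slope * mean_x
    day_full = (usable_capacity - intercept) / slope
    # Count from the most recent sample; never report a negative value.
    return max(0.0, day_full - (n - 1))


# Example: usage grows by 1 GHz per day toward 100 GHz of usable capacity.
print(days_remaining([50, 51, 52, 53, 54], usable_capacity=100))
```

A cluster with plenty of capacity remaining can still have little time remaining if its growth slope is steep, which is why the two charts are read together.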
Cluster Analysis
- The Clusters Capacity widget provides a table with details. The number of ESXi hosts is color coded, because smaller clusters have a relatively higher overhead. Select a cluster from the table to view its capacity details, which are displayed automatically.
- Performance: Ensure that the performance of the cluster meets your SLAs.
- Utilization: The next two charts, Memory Workload (%) and CPU Workload (%), show values relative to your usable capacity. Utilization is displayed over three months rather than one week. The daily average is displayed rather than the hourly average, so you can focus on the overall trend. For memory, the focus is on consumed memory rather than active memory.
- Allocation: You can view the trend of the three components (CPU, disk, and memory) together on the Overcommit Ratio chart. In general, your CPU overcommit should be the highest, followed by disk (because of thin provisioning). Memory overcommit tends to be near 1 because memory acts as a cache. Use the line chart in the Allocation widget to see the trend. The data is averaged hourly. In the VM Count widget, the trend line of the number of VMs over time helps you spot whether many VMs have been newly provisioned. If the number of VMs is increasing but demand remains low, it is a sign of potential future demand.
- Reservation: Reservation can impact the efficiency of your cluster. Your cluster could be low on capacity because of real workload or just because of reservations. If your cluster sizes vary, complement the reservation number with a relative value. Once you have a standardized number, you can visualize it on a heat map.
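To make the overcommit guidance concrete, the sketch below computes an overcommit ratio per resource as allocated amount divided by usable capacity, then checks the ordering described above (CPU highest, then disk, then memory). The field names, the sample figures, and the exact ratio definitions are assumptions for illustration; the dashboard's own metrics may be calculated differently.

```python
from dataclasses import dataclass


@dataclass
class ClusterAllocation:
    # All fields are hypothetical inputs, not product metric names.
    vcpus_allocated: int        # total vCPUs configured on VMs
    physical_cores: int         # usable physical cores in the cluster
    vm_memory_gb: float         # total memory configured on VMs
    usable_memory_gb: float     # usable physical memory
    disk_provisioned_tb: float  # total provisioned (thin) disk space
    usable_disk_tb: float       # usable datastore capacity


def overcommit_ratios(c):
    """Return allocated / usable for each resource (1.0 means no overcommit)."""
    return {
        "cpu": c.vcpus_allocated / c.physical_cores,
        "memory": c.vm_memory_gb / c.usable_memory_gb,
        "disk": c.disk_provisioned_tb / c.usable_disk_tb,
    }


def follows_expected_order(ratios):
    """True when CPU overcommit >= disk overcommit >= memory overcommit."""
    return ratios["cpu"] >= ratios["disk"] >= ratios["memory"]


cluster = ClusterAllocation(384, 128, 2048, 2048, 150, 100)
ratios = overcommit_ratios(cluster)
print(ratios, follows_expected_order(ratios))
```

A ratio that departs from this ordering is not necessarily a problem, but it is a reasonable prompt to look at the Allocation trend for that resource.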
ESXi Analysis
- Good cluster capacity does not mean that there are no issues at the ESXi level. Imbalance is a common problem, especially in large clusters and stretched clusters.
- The ESXi Hosts in a Cluster table displays all the member ESXi hosts. The color coding reflects any imbalance, so you can see it clearly.
- The 99th percentile Performance column takes the 99th percentile value of the ESXi Performance (%) metric.
- Select an ESXi host to view the details. Both the CPU Workload (%) and the Memory Workload (%) trend line charts show whether demand is steady, cyclical, rising, or declining. The trend is as important as the present value, so view trends over a longer time. Utilization is displayed over three months rather than one week, and the daily average is displayed rather than the hourly average. The focus is on memory consumed rather than memory active. Memory consumed is the total memory consumed, so it includes the memory consumed by VMkernel. Both total and usable utilization for memory and CPU are displayed, which provides the absolute amount of capacity.
VM Analysis
- Use the VMs in the selected Cluster or Host table to analyze the cause of low capacity remaining and to see which VMs are impacting infrastructure resources such as CPU, memory, and disk space. The table lists the VMs in either the cluster or the host. When you select one of the VMs, additional relevant information is displayed.
- If many large VMs are running low on capacity, you can stop provisioning until you upsize the existing VMs.
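The 99th percentile Performance column mentioned under ESXi Analysis summarizes a metric by its 99th percentile rather than by its maximum or average. The sketch below shows one common way to compute such a value, the nearest-rank method. The method choice, the function name `percentile_nearest_rank`, and the sample data are assumptions, since the exact calculation used by the dashboard is not stated here.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method.

    samples: non-empty list of metric values (hypothetical input).
    pct: integer percentile between 1 and 100.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# 99 ordinary samples plus one isolated spike.
samples = [10] * 99 + [95]
print(percentile_nearest_rank(samples, 99), max(samples))
```

In this sample the single spike sets the maximum but does not move the 99th percentile, which illustrates why a high percentile can be a steadier summary than a peak value.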