Cluster Capacity Dashboard

The Cluster Capacity dashboard includes the ESXi hosts and resource pools because they impact cluster capacity.

Design Considerations

See Capacity Dashboards for the design considerations that are common to all capacity management dashboards.

How to Use the Dashboard

The Cluster Capacity dashboard is layered and provides progressively more detail as you work from top to bottom.
Overall Analysis
  • The three bar charts, Clusters by Capacity Remaining, Clusters by Time Remaining, and Clusters by VM Remaining, summarize the overall situation. Use the first two charts together to identify when you need to add capacity to address growth. Time remaining uses the historical growth in a cluster to forecast when more capacity is needed. This helps you operate more efficiently, because you can confirm that you have enough capacity now and plan proactively for adding capacity. The third bar chart, Clusters by VM Remaining, completes the context, because different clusters can have different VM sizes.
    For a large environment, a heat map is helpful. The three heat maps are Time Remaining, Capacity Remaining, and VM Remaining. If your cluster sizes are not standardized, create another heat map that uses the number of ESXi hosts to show the size difference.
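The idea of forecasting time remaining from historical growth can be illustrated with a minimal sketch. It assumes a simple straight-line fit through daily utilization samples; the forecasting method the product actually uses is not described here, and the function name `days_until_full` and the sample values are hypothetical.

```python
def days_until_full(daily_usage, usable_capacity):
    """Estimate the days until a linear trend reaches usable capacity.

    daily_usage: one utilization sample per day, oldest first.
    usable_capacity: capacity in the same unit as the samples.
    Returns None when there is too little data or no upward trend.
    Illustrative assumption only, not the product's algorithm.
    """
    n = len(daily_usage)
    if n < 2:
        return None

    # Ordinary least-squares fit of usage against the day index.
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope = sxy / sxx
    if slope <= 0:
        return None  # flat or declining demand: no exhaustion date

    # Extrapolate from the fitted value for the most recent day.
    intercept = mean_y - slope * mean_x
    fitted_today = intercept + slope * (n - 1)
    return max(0.0, (usable_capacity - fitted_today) / slope)


# Hypothetical cluster: consumed memory in GB grows by 2 GB per day.
print(days_until_full([100, 102, 104, 106, 108], usable_capacity=128))
```

A cluster can show plenty of capacity remaining and still have little time remaining if its growth is steep, which is why the two charts are read together.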
Cluster Analysis
  • The Clusters Capacity widget provides a table with details. The number of ESXi hosts is color coded because smaller clusters have a relatively higher overhead. Select a cluster from the table to display its capacity details automatically.
    Performance
    Ensure that the performance of the cluster meets your SLAs.
    Utilization
    The next two charts, Memory Workload (%) and CPU Workload (%), show values relative to your usable capacity. Utilization is displayed over three months rather than one week, and as a daily average rather than an hourly average, so that you can focus on the overall trend. For memory, the focus is on consumed memory, not active memory.
    Allocation
    You can view the trend of the three components, CPU, disk, and memory, together on the Overcommit Ratio chart. In general, CPU overcommit should be the highest, followed by disk (because of thin provisioning). Memory overcommit tends to be near one because of its nature as a cache.
    Use the line chart in the Allocation widget to see the trend. The data is averaged hourly.
    In the VM Count widget, the trend line of the number of VMs over time helps you spot whether many VMs have been newly provisioned. If the number of VMs is increasing but demand remains low, it is a sign of potential demand in the future.
    Reservation
    Reservation can impact the efficiency of your cluster. Your cluster could be low on capacity because of real workload or just because of reservations. If your cluster sizes vary, complement the reservation number with a relative value. Once you have a standardized number, you can visualize it on a heat map.
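The overcommit ratio and the relative reservation value described above are both simple ratios against capacity. The following sketch shows the arithmetic under the assumption that overcommit is allocated resources divided by usable capacity; the exact denominators that the dashboard uses are not stated here, and the function names and figures are hypothetical.

```python
def overcommit_ratio(allocated, usable_capacity):
    """Return allocated resources divided by usable capacity.

    A value of 1.0 means no overcommit. The choice of usable capacity
    as the denominator is an assumption made for this illustration.
    """
    if usable_capacity <= 0:
        raise ValueError("usable_capacity must be positive")
    return allocated / usable_capacity


def reservation_percent(reserved, usable_capacity):
    """Express a reservation as a percentage of usable capacity,
    so that clusters of different sizes can be compared."""
    if usable_capacity <= 0:
        raise ValueError("usable_capacity must be positive")
    return 100.0 * reserved / usable_capacity


# Hypothetical cluster: 384 vCPUs allocated on 96 usable cores,
# and 256 GB of memory reserved out of 1024 GB usable.
print(overcommit_ratio(384, 96))       # 4.0
print(reservation_percent(256, 1024))  # 25.0
```

Because the percentage is independent of cluster size, it is the kind of standardized number that can be plotted on a heat map across clusters.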
  • ESXi Analysis
    Good cluster capacity does not mean that there are no issues at the ESXi level. Imbalance is a common problem, especially in large clusters and stretched clusters.
    The ESXi Hosts in a Cluster table displays all the member ESXi hosts. The color coding reflects the imbalance, so you can see it clearly.
    The 99th percentile Performance column takes the 99th percentile value of the ESXi Performance (%) metric.
    Select an ESXi host to view the details. Both the CPU Workload (%) and the Memory Workload (%) trend line charts show whether demand is steady, cyclical, rising, or declining. The trend is as important as the present value, so view trends over a longer time. Utilization is displayed over three months rather than one week, and as a daily average rather than an hourly average. The focus is on memory consumed, not memory active. Memory consumed is the total memory consumed, so it includes the memory consumed by VMkernel. Both total and usable figures for memory and CPU are displayed, which gives you the absolute amount of capacity.
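To make the 99th percentile column and the notion of imbalance concrete, here is a small sketch. It assumes the nearest-rank percentile definition and measures imbalance as the spread between the busiest and least busy host; the method the product uses for either is not stated here, and all names and sample values are hypothetical.

```python
import math


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile of values using the nearest-rank method.

    Nearest-rank is one of several common definitions and is an
    assumption made for this illustration.
    """
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def workload_spread(host_workload):
    """Return the gap between the busiest and least busy host, in percent."""
    return max(host_workload.values()) - min(host_workload.values())


# 100 hypothetical metric samples with the values 1 through 100.
samples = list(range(1, 101))
print(percentile_nearest_rank(samples, 99))  # 99

# Hypothetical CPU workload (%) per ESXi host in one cluster.
hosts = {"esx-01": 82, "esx-02": 35, "esx-03": 41}
print(workload_spread(hosts))  # 47
```

A percentile is less sensitive to a single extreme sample than the maximum, and a large spread between hosts points to imbalance even when the cluster average looks healthy.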
  • VM Analysis
    Use the VMs in the selected Cluster or Host table to analyze the cause of low capacity remaining and to identify which VMs are impacting infrastructure resources such as CPU, memory, and disk space. The table lists the VMs in either the selected cluster or the selected host. When you select one of the VMs, additional relevant information is displayed.
    If many large VMs are running low on capacity, you can stop provisioning new VMs until you have upsized the existing VMs.