Install Prometheus and Grafana for Monitoring

Last Updated February 10, 2025

This topic gives an overview of the Prometheus and Grafana packages and explains how to install them in Tanzu Kubernetes Grid (TKG) workload clusters to provide monitoring services for those clusters.

  • Prometheus is an open-source systems monitoring and alerting toolkit. It can collect metrics from target clusters at specified intervals, evaluate rule expressions, display the results, and trigger alerts if certain conditions arise. For more information about Prometheus, see the Prometheus Overview. The Tanzu Kubernetes Grid implementation of Prometheus includes Alertmanager, which you can configure to notify you when certain events occur.
  • Grafana is open-source visualization and analytics software. It allows you to query, visualize, alert on, and explore your metrics no matter where they are stored. In other words, Grafana provides you with tools to turn your time-series database (TSDB) data into high-quality graphs and visualizations. For more information about Grafana, see What is Grafana?.

To install the packages, see Install Prometheus in Workload Clusters Deployed by a Standalone Management Cluster and Install Grafana in Workload Clusters Deployed by a Standalone Management Cluster.

As of v2.5, TKG does not support clusters on AWS or Azure. See End of Support for TKG Management and Workload Clusters on AWS and Azure in the Tanzu Kubernetes Grid v2.5 Release Notes.

Prometheus, Alertmanager, and Grafana Components, Configuration, Data Values

Prometheus collects metrics from configured targets at given intervals, evaluates rule expressions, and displays the results.
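
As an illustration of a rule expression, the following fragment is a minimal sketch of an alerting rule, assuming it is supplied through the prometheus.config.alerting_rules_yml data value that appears (empty) in the sample prometheus-data-values.yaml file later in this topic. The group name, alert name, severity label, and threshold are hypothetical and are not part of the package defaults.

```yaml
prometheus:
  config:
    alerting_rules_yml: |
      groups:
      - name: example.rules          # hypothetical group name
        rules:
        - alert: TargetDown          # hypothetical alert name
          # "up" is recorded by Prometheus for every scrape target;
          # a value of 0 means the most recent scrape failed.
          expr: up == 0
          # Fire only if the condition holds continuously for 5 minutes.
          for: 5m
          labels:
            severity: warning        # hypothetical label used for routing
          annotations:
            summary: "Scrape target {{ $labels.instance }} is down"
```

Prometheus evaluates the expression at each evaluation interval and, once the condition has held for the duration given in for, sends the alert to Alertmanager.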

Alertmanager handles the alerts that the Prometheus server sends when an alerting condition is observed to be true, and routes notifications to the configured receivers.
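
The sample data values later in this topic define a receiver named default-receiver that has no notification integration attached. The fragment below is a hedged sketch of how a webhook integration could be added to that receiver through the alertmanager.config.alertmanager_yml data value. The webhook URL is a hypothetical placeholder, not an endpoint provided by TKG.

```yaml
alertmanager:
  config:
    alertmanager_yml: |
      global: {}
      receivers:
      - name: default-receiver
        webhook_configs:
        # Hypothetical endpoint; replace with a service that accepts
        # Alertmanager webhook POST requests.
        - url: http://alert-webhook.example.internal:8080/alerts
      route:
        group_interval: 5m
        group_wait: 10s
        receiver: default-receiver
        repeat_interval: 3h
```

The route settings are copied from the sample file; only the webhook_configs entry is new.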

Grafana lets you query, visualize, alert on, and explore your metrics no matter where they are stored.

The following diagram shows how the monitoring components on a cluster interact.

[Diagram: Prometheus can send alerts through Alertmanager and can act as a data source for Grafana.]

Prometheus Components

The Prometheus package installs the containers listed in the following table on a TKG cluster. The package pulls the containers from the VMware public registry specified in Package Repository.

| Container | Resource Type | Replicas | Description |
|---|---|---|---|
| prometheus-alertmanager | Deployment | 1 | Handles alerts sent by client applications such as the Prometheus server. |
| prometheus-cadvisor | DaemonSet | 5 | Analyzes and exposes resource usage and performance data from running containers. |
| prometheus-kube-state-metrics | Deployment | 1 | Monitors node status and capacity; replica-set compliance; pod, job, and cronjob status; and resource requests and limits. |
| prometheus-node-exporter | DaemonSet | 5 | Exporter for hardware and OS metrics exposed by kernels. |
| prometheus-pushgateway | Deployment | 1 | Service that allows you to push metrics from jobs that cannot be scraped. |
| prometheus-server | Deployment | 1 | Provides core functionality, including scraping, rule processing, and alerting. |
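
The prometheus-server container scrapes the targets defined in its scrape configuration. As a sketch of how a workload might opt in to scraping, assuming the kubernetes-pods job from the sample prometheus-data-values.yaml file later in this topic is in effect, the pod below carries the three prometheus.io annotations that the job's relabeling rules read. The pod name, namespace, image, and port are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                  # hypothetical workload
  namespace: default
  annotations:
    # Kept by the "keep" rule that matches the scrape annotation against "true".
    prometheus.io/scrape: "true"
    # Copied into __metrics_path__ by the path relabeling rule.
    prometheus.io/path: "/metrics"
    # Combined with the pod address to form __address__ (pod-ip:8080).
    prometheus.io/port: "8080"
spec:
  containers:
  - name: app
    image: registry.example.com/example-app:1.0   # hypothetical image
    ports:
    - containerPort: 8080
```

With that job, pods that lack prometheus.io/scrape: "true" are dropped from the target list. If prometheus.io/path is omitted, Prometheus keeps its default metrics path, /metrics.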

Prometheus Data Values

Below is an example prometheus-data-values.yaml file.

Note the following:

  • Ingress is enabled (ingress: enabled: true).
  • Ingress is configured for URLs ending in /alertmanager/ (alertmanager_prefix:) and / (prometheus_prefix:).
  • The FQDN for Prometheus is prometheus.system.tanzu (virtual_host_fqdn:).
  • Supply your own custom certificate in the ingress section (tls.crt, tls.key, ca.crt).
  • The PVC for Alertmanager is 2 GiB. Supply the storageClassName for the default storage policy.
  • The PVC for Prometheus is 20 GiB. Supply the storageClassName for the vSphere storage policy.
namespace: prometheus-monitoring
alertmanager:
  config:
    alertmanager_yml: |
      global: {}
      receivers:
      - name: default-receiver
      templates:
      - '/etc/alertmanager/templates/*.tmpl'
      route:
        group_interval: 5m
        group_wait: 10s
        receiver: default-receiver
        repeat_interval: 3h
  deployment:
    replicas: 1
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    updateStrategy: Recreate
  pvc:
    accessMode: ReadWriteOnce
    storage: 2Gi
    storageClassName: default
  service:
    port: 80
    targetPort: 9093
    type: ClusterIP
ingress:
  alertmanager_prefix: /alertmanager/
  alertmanagerServicePort: 80
  enabled: true
  prometheus_prefix: /
  prometheusServicePort: 80
  tlsCertificate:
    ca.crt: |
      -----BEGIN CERTIFICATE-----
      MIIFczCCA1ugAwIBAgIQTYJITQ3SZ4BBS9UzXfJIuTANBgkqhkiG9w0BAQsFADBM
      ...
      w0oGuTTBfxSMKs767N3G1q5tz0mwFpIqIQtXUSmaJ+9p7IkpWcThLnyYYo1IpWm/
      ZHtjzZMQVA==
      -----END CERTIFICATE-----
    tls.crt: |
      -----BEGIN CERTIFICATE-----
      MIIHxTCCBa2gAwIBAgITIgAAAAQnSpH7QfxTKAAAAAAABDANBgkqhkiG9w0BAQsF
      ...
      YYsIjp7/f+Pk1DjzWx8JIAbzItKLucDreAmmDXqk+DrBP9LYqtmjB0n7nSErgK8G
      sA3kGCJdOkI0kgF10gsinaouG2jVlwNOsw==
      -----END CERTIFICATE-----
    tls.key: |
      -----BEGIN PRIVATE KEY-----
      MIIJRAIBADANBgkqhkiG9w0BAQEFAASCCS4wggkqAgEAAoICAQDOGHT8I12KyQGS
      ...
      l1NzswracGQIzo03zk/X3Z6P2YOea4BkZ0Iwh34wOHJnTkfEeSx6y+oSFMcFRthT
      yfFCZUk/sVCc/C1a4VigczXftUGiRrTR
      -----END PRIVATE KEY-----
  virtual_host_fqdn: prometheus.system.tanzu
kube_state_metrics:
  deployment:
    replicas: 1
  service:
    port: 80
    targetPort: 8080
    telemetryPort: 81
    telemetryTargetPort: 8081
    type: ClusterIP
node_exporter:
  daemonset:
    hostNetwork: false
    updatestrategy: RollingUpdate
  service:
    port: 9100
    targetPort: 9100
    type: ClusterIP
prometheus:
  pspNames: "vmware-system-restricted"
  config:
    alerting_rules_yml: |
      {}
    alerts_yml: |
      {}
    prometheus_yml: |
      global:
        evaluation_interval: 1m
        scrape_interval: 1m
        scrape_timeout: 10s
      rule_files:
      - /etc/config/alerting_rules.yml
      - /etc/config/recording_rules.yml
      - /etc/config/alerts
      - /etc/config/rules
      scrape_configs:
      - job_name: 'prometheus'
        scrape_interval: 5s
        static_configs:
        - targets: ['localhost:9090']
      - job_name: 'kube-state-metrics'
        static_configs:
        - targets: ['prometheus-kube-state-metrics.prometheus.svc.cluster.local:8080']

      - job_name: 'node-exporter'
        static_configs:
        - targets: ['prometheus-node-exporter.prometheus.svc.cluster.local:9100']

      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name
      - job_name: kubernetes-nodes-cadvisor
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - replacement: kubernetes.default.svc:443
          target_label: __address__
        - regex: (.+)
          replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor
          source_labels:
          - __meta_kubernetes_node_name
          target_label: __metrics_path__
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      - job_name: kubernetes-apiservers
        kubernetes_sd_configs:
        - role: endpoints
        relabel_configs:
        - action: keep
          regex: default;kubernetes;https
          source_labels:
          - __meta_kubernetes_namespace
          - __meta_kubernetes_service_name
          - __meta_kubernetes_endpoint_port_name
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      alerting:
        alertmanagers:
        - scheme: http
          static_configs:
          - targets:
            - alertmanager.prometheus.svc:80
        - kubernetes_sd_configs:
            - role: pod
          relabel_configs:
          - source_labels: [__meta_kubernetes_namespace]
            regex: default
            action: keep
          - source_labels: [__meta_kubernetes_pod_label_app]
            regex: prometheus
            action: keep
          - source_labels: [__meta_kubernetes_pod_label_component]
            regex: alertmanager
            action: keep
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_probe]
            regex: .*
            action: keep
          - source_labels: [__meta_kubernetes_pod_container_port_number]
            regex:
            action: drop
    recording_rules_yml: |
      groups:
        - name: kube-apiserver.rules
          interval: 3m
          rules:
          - expr: |2
              (
                (
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[1d]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[1d]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[1d]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[1d]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[1d]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[1d]))
            labels:
              verb: read
            record: apiserver_request:burnrate1d
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[1h]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[1h]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[1h]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[1h]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[1h]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[1h]))
            labels:
              verb: read
            record: apiserver_request:burnrate1h
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[2h]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[2h]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[2h]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[2h]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[2h]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[2h]))
            labels:
              verb: read
            record: apiserver_request:burnrate2h
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[30m]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[30m]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[30m]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[30m]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[30m]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[30m]))
            labels:
              verb: read
            record: apiserver_request:burnrate30m
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[3d]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[3d]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[3d]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[3d]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[3d]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[3d]))
            labels:
              verb: read
            record: apiserver_request:burnrate3d
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[5m]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[5m]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[5m]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[5m]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[5m]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[5m]))
            labels:
              verb: read
            record: apiserver_request:burnrate5m
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[6h]))
                  -
                  (
                    (
                      sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[6h]))
                      or
                      vector(0)
                    )
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[6h]))
                    +
                    sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[6h]))
                  )
                )
                +
                # errors
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET",code=~"5.."}[6h]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[6h]))
            labels:
              verb: read
            record: apiserver_request:burnrate6h
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[1d]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[1d]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[1d]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[1d]))
            labels:
              verb: write
            record: apiserver_request:burnrate1d
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[1h]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[1h]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[1h]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[1h]))
            labels:
              verb: write
            record: apiserver_request:burnrate1h
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[2h]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[2h]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[2h]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[2h]))
            labels:
              verb: write
            record: apiserver_request:burnrate2h
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[30m]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[30m]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[30m]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[30m]))
            labels:
              verb: write
            record: apiserver_request:burnrate30m
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[3d]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[3d]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[3d]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[3d]))
            labels:
              verb: write
            record: apiserver_request:burnrate3d
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[5m]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[5m]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[5m]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[5m]))
            labels:
              verb: write
            record: apiserver_request:burnrate5m
          - expr: |2
              (
                (
                  # too slow
                  sum(rate(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[6h]))
                  -
                  sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",le="1"}[6h]))
                )
                +
                sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE",code=~"5.."}[6h]))
              )
              /
              sum(rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[6h]))
            labels:
              verb: write
            record: apiserver_request:burnrate6h
          - expr: |
              sum by (code,resource) (rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"LIST|GET"}[5m]))
            labels:
              verb: read
            record: code_resource:apiserver_request_total:rate5m
          - expr: |
              sum by (code,resource) (rate(apiserver_request_total{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[5m]))
            labels:
              verb: write
            record: code_resource:apiserver_request_total:rate5m
          - expr: |
              histogram_quantile(0.99, sum by (le, resource) (rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET"}[5m]))) > 0
            labels:
              quantile: "0.99"
              verb: read
            record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
          - expr: |
              histogram_quantile(0.99, sum by (le, resource) (rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"POST|PUT|PATCH|DELETE"}[5m]))) > 0
            labels:
              quantile: "0.99"
              verb: write
            record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
          - expr: |2
              sum(rate(apiserver_request_duration_seconds_sum{subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod)
              /
              sum(rate(apiserver_request_duration_seconds_count{subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod)
            record: cluster:apiserver_request_duration_seconds:mean5m
          - expr: |
              histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod))
            labels:
              quantile: "0.99"
            record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
          - expr: |
              histogram_quantile(0.9, sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod))
            labels:
              quantile: "0.9"
            record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
          - expr: |
              histogram_quantile(0.5, sum(rate(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",subresource!="log",verb!~"LIST|WATCH|WATCHLIST|DELETECOLLECTION|PROXY|CONNECT"}[5m])) without(instance, pod))
            labels:
              quantile: "0.5"
            record: cluster_quantile:apiserver_request_duration_seconds:histogram_quantile
        - interval: 3m
          name: kube-apiserver-availability.rules
          rules:
          - expr: |2
              1 - (
                (
                  # write too slow
                  sum(increase(apiserver_request_duration_seconds_count{verb=~"POST|PUT|PATCH|DELETE"}[30d]))
                  -
                  sum(increase(apiserver_request_duration_seconds_bucket{verb=~"POST|PUT|PATCH|DELETE",le="1"}[30d]))
                ) +
                (
                  # read too slow
                  sum(increase(apiserver_request_duration_seconds_count{verb=~"LIST|GET"}[30d]))
                  -
                  (
                    (
                      sum(increase(apiserver_request_duration_seconds_bucket{verb=~"LIST|GET",scope=~"resource|",le="0.1"}[30d]))
                      or
                      vector(0)
                    )
                    +
                    sum(increase(apiserver_request_duration_seconds_bucket{verb=~"LIST|GET",scope="namespace",le="0.5"}[30d]))
                    +
                    sum(increase(apiserver_request_duration_seconds_bucket{verb=~"LIST|GET",scope="cluster",le="5"}[30d]))
                  )
                ) +
                # errors
                sum(code:apiserver_request_total:increase30d{code=~"5.."} or vector(0))
              )
              /
              sum(code:apiserver_request_total:increase30d)
            labels:
              verb: all
            record: apiserver_request:availability30d
          - expr: |2
              1 - (
                sum(increase(apiserver_request_duration_seconds_count{job="kubernetes-apiservers",verb=~"LIST|GET"}[30d]))
                -
                (
                  # too slow
                  (
                    sum(increase(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope=~"resource|",le="0.1"}[30d]))
                    or
                    vector(0)
                  )
                  +
                  sum(increase(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="namespace",le="0.5"}[30d]))
                  +
                  sum(increase(apiserver_request_duration_seconds_bucket{job="kubernetes-apiservers",verb=~"LIST|GET",scope="cluster",le="5"}[30d]))
                )
                +
                # errors
                sum(code:apiserver_request_total:increase30d{verb="read",code=~"5.."} or vector(0))
              )
              /
              sum(code:apiserver_request_total:increase30d{verb="read"})
            labels:
              verb: read
            record: apiserver_request:availability30d
          - expr: |2
              1 - (
                (
                  # too slow
                  sum(increase(apiserver_request_duration_seconds_count{verb=~"POST|PUT|PATCH|DELETE"}[30d]))
                  -
                  sum(increase(apiserver_request_duration_seconds_bucket{verb=~"POST|PUT|PATCH|DELETE",le="1"}[30d]))
                )
                +
                # errors
                sum(code:apiserver_request_total:increase30d{verb="write",code=~"5.."} or vector(0))
              )
              /
              sum(code:apiserver_request_total:increase30d{verb="write"})
            labels:
              verb: write
            record: apiserver_request:availability30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="LIST",code=~"2.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="GET",code=~"2.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="POST",code=~"2.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PUT",code=~"2.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PATCH",code=~"2.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="DELETE",code=~"2.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="LIST",code=~"3.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="GET",code=~"3.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="POST",code=~"3.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PUT",code=~"3.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PATCH",code=~"3.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="DELETE",code=~"3.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="LIST",code=~"4.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="GET",code=~"4.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="POST",code=~"4.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PUT",code=~"4.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PATCH",code=~"4.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="DELETE",code=~"4.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="LIST",code=~"5.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="GET",code=~"5.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="POST",code=~"5.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PUT",code=~"5.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="PATCH",code=~"5.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code, verb) (increase(apiserver_request_total{job="kubernetes-apiservers",verb="DELETE",code=~"5.."}[30d]))
            record: code_verb:apiserver_request_total:increase30d
          - expr: |
              sum by (code) (code_verb:apiserver_request_total:increase30d{verb=~"LIST|GET"})
            labels:
              verb: read
            record: code:apiserver_request_total:increase30d
          - expr: |
              sum by (code) (code_verb:apiserver_request_total:increase30d{verb=~"POST|PUT|PATCH|DELETE"})
            labels:
              verb: write
            record: code:apiserver_request_total:increase30d
    rules_yml: |
      {}
  deployment:
    configmapReload:
      containers:
        args:
          - --volume-dir=/etc/config
          - --webhook-url=http://127.0.0.1:9090/-/reload
    containers:
      args:
        - --storage.tsdb.retention.time=42d
        - --config.file=/etc/config/prometheus.yml
        - --storage.tsdb.path=/data
        - --web.console.libraries=/etc/prometheus/console_libraries
        - --web.console.templates=/etc/prometheus/consoles
        - --web.enable-lifecycle
    replicas: 1
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    updateStrategy: Recreate
  pvc:
    accessMode: ReadWriteOnce
    storage: 20Gi
    storageClassName: default
  service:
    port: 80
    targetPort: 9090
    type: ClusterIP
pushgateway:
  deployment:
    replicas: 1
  service:
    port: 9091
    targetPort: 9091
    type: ClusterIP

Prometheus Configuration Parameters (Standalone MC)

The following table lists configuration parameters of the Prometheus package and describes their default values.

You can set the following configuration values in your prometheus-data-values.yaml file.

To review the configuration parameters and default values that the Prometheus package supports before you edit your data values file, retrieve its values schema:

tanzu package available get prometheus.tanzu.vmware.com/2.45.0+vmware.2-tkg.2 -n AVAILABLE-PACKAGE-NAMESPACE --values-schema

| Parameter | Description | Type | Default |
|---|---|---|---|
| namespace | Namespace where Prometheus will be deployed. | String | tanzu-system-monitoring |
| prometheus.deployment.replicas | Number of Prometheus replicas. | Integer | 1 |
| prometheus.deployment.containers.args | Prometheus container arguments. You can configure this parameter to change retention time. For information about configuring Prometheus storage parameters, see the Prometheus documentation. Note: Longer retention times require more storage capacity than shorter retention times. You might need to increase the persistent volume claim size if you significantly increase the retention time. | List | n/a |
| prometheus.deployment.containers.resources | Prometheus container resource requests and limits. | Map | {} |
| prometheus.deployment.podAnnotations | The Prometheus deployment's pod annotations. | Map | {} |
| prometheus.deployment.podLabels | The Prometheus deployment's pod labels. | Map | {} |
| prometheus.deployment.configMapReload.containers.args | Configmap-reload container arguments. | List | n/a |
| prometheus.deployment.configMapReload.containers.resources | Configmap-reload container resource requests and limits. | Map | {} |
| prometheus.service.type | Type of service to expose Prometheus. Supported values: ClusterIP. | String | ClusterIP |
| prometheus.service.port | Prometheus service port. | Integer | 80 |
| prometheus.service.targetPort | Prometheus service target port. | Integer | 9090 |
| prometheus.service.labels | Prometheus service labels. | Map | {} |
| prometheus.service.annotations | Prometheus service annotations. | Map | {} |
| prometheus.pvc.annotations | Storage class annotations. | Map | {} |
| prometheus.pvc.storageClassName | Storage class to use for the persistent volume claim. By default this is null and the default provisioner is used. | String | null |
| prometheus.pvc.accessMode | Access mode for the persistent volume claim. Supported values: ReadWriteOnce, ReadOnlyMany, ReadWriteMany. | String | ReadWriteOnce |
| prometheus.pvc.storage | Storage size for the persistent volume claim. | String | 150Gi |
| prometheus.config.prometheus_yml | For information about the global Prometheus configuration, see the Prometheus documentation. | YAML file | prometheus.yaml |
| prometheus.config.alerting_rules_yml | For information about the Prometheus alerting rules, see the Prometheus documentation. | YAML file | alerting_rules.yaml |
| prometheus.config.recording_rules_yml | For information about the Prometheus recording rules, see the Prometheus documentation. | YAML file | recording_rules.yaml |
| prometheus.config.alerts_yml | Additional Prometheus alerting rules are configured here. | YAML file | alerts_yml.yaml |
| prometheus.config.rules_yml | Additional Prometheus recording rules are configured here. | YAML file | rules_yml.yaml |
| alertmanager.deployment.replicas | Number of Alertmanager replicas. | Integer | 1 |
| alertmanager.deployment.containers.resources | Alertmanager container resource requests and limits. | Map | {} |
| alertmanager.deployment.podAnnotations | The Alertmanager deployment's pod annotations. | Map | {} |
| alertmanager.deployment.podLabels | The Alertmanager deployment's pod labels. | Map | {} |
| alertmanager.service.type | Type of service to expose Alertmanager. Supported values: ClusterIP. | String | ClusterIP |
| alertmanager.service.port | Alertmanager service port. | Integer | 80 |
| alertmanager.service.targetPort | Alertmanager service target port. | Integer | 9093 |
| alertmanager.service.labels | Alertmanager service labels. | Map | {} |
| alertmanager.service.annotations | Alertmanager service annotations. | Map | {} |
| alertmanager.pvc.annotations | Storage class annotations. | Map | {} |
| alertmanager.pvc.storageClassName | Storage class to use for the persistent volume claim. By default this is null and the default provisioner is used. | String | null |
| alertmanager.pvc.accessMode | Access mode for the persistent volume claim. Supported values: ReadWriteOnce, ReadOnlyMany, ReadWriteMany. | String | ReadWriteOnce |
| alertmanager.pvc.storage | Storage size for the persistent volume claim. | String | 2Gi |
| alertmanager.config.alertmanager_yml | For information about the global YAML configuration for Alertmanager, see the Prometheus documentation. | YAML file | alertmanager_yml |
| kube_state_metrics.deployment.replicas | Number of kube-state-metrics replicas. | Integer | 1 |
| kube_state_metrics.deployment.containers.resources | kube-state-metrics container resource requests and limits. | Map | {} |
| kube_state_metrics.deployment.podAnnotations | The kube-state-metrics deployment's pod annotations. | Map | {} |
| kube_state_metrics.deployment.podLabels | The kube-state-metrics deployment's pod labels. | Map | {} |
| kube_state_metrics.service.type | Type of service to expose kube-state-metrics. Supported values: ClusterIP. | String | ClusterIP |
| kube_state_metrics.service.port | kube-state-metrics service port. | Integer | 80 |
| kube_state_metrics.service.targetPort | kube-state-metrics service target port. | Integer | 8080 |
| kube_state_metrics.service.telemetryPort | kube-state-metrics service telemetry port. | Integer | 81 |
| kube_state_metrics.service.telemetryTargetPort | kube-state-metrics service target telemetry port. | Integer | 8081 |
| kube_state_metrics.service.labels | kube-state-metrics service labels. | Map | {} |
| kube_state_metrics.service.annotations | kube-state-metrics service annotations. | Map | {} |
| node_exporter.daemonset.replicas | Number of node-exporter replicas. | Integer | 1 |
| node_exporter.daemonset.containers.resources | node-exporter container resource requests and limits. | Map | {} |
| node_exporter.daemonset.hostNetwork | Host networking requested for this pod. | Boolean | false |
| node_exporter.daemonset.podAnnotations | The node-exporter DaemonSet's pod annotations. | Map | {} |
| node_exporter.daemonset.podLabels | The node-exporter DaemonSet's pod labels. | Map | {} |
| node_exporter.service.type | Type of service to expose node-exporter. Supported values: ClusterIP. | String | ClusterIP |
| node_exporter.service.port | node-exporter service port. | Integer | 9100 |
| node_exporter.service.targetPort | node-exporter service target port. | Integer | 9100 |
| node_exporter.service.labels | node-exporter service labels. | Map | {} |
| node_exporter.service.annotations | node-exporter service annotations. | Map | {} |
| pushgateway.deployment.replicas | Number of pushgateway replicas. | Integer | 1 |
| pushgateway.deployment.containers.resources | pushgateway container resource requests and limits. | Map | {} |
| pushgateway.deployment.podAnnotations | The pushgateway deployment's pod annotations. | Map | {} |
| pushgateway.deployment.podLabels | The pushgateway deployment's pod labels. | Map | {} |
| pushgateway.service.type | Type of service to expose pushgateway. Supported values: ClusterIP. | String | ClusterIP |
| pushgateway.service.port | pushgateway service port. | Integer | 9091 |
| pushgateway.service.targetPort | pushgateway service target port. | Integer | 9091 |
| pushgateway.service.labels | pushgateway service labels. | Map | {} |
| pushgateway.service.annotations | pushgateway service annotations. | Map | {} |
| cadvisor.daemonset.replicas | Number of cadvisor replicas. | Integer | 1 |
| cadvisor.daemonset.containers.resources | cadvisor container resource requests and limits. | Map | {} |
| cadvisor.daemonset.podAnnotations | The cadvisor DaemonSet's pod annotations. | Map | {} |
| cadvisor.daemonset.podLabels | The cadvisor DaemonSet's pod labels. | Map | {} |
| ingress.enabled | Activate or deactivate ingress for Prometheus and Alertmanager. | Boolean | false |
| ingress.virtual_host_fqdn | Hostname for accessing Prometheus and Alertmanager. | String | prometheus.system.tanzu |
| ingress.prometheus_prefix | Path prefix for Prometheus. | String | / |
| ingress.alertmanager_prefix | Path prefix for Alertmanager. | String | /alertmanager/ |
| ingress.prometheusServicePort | Prometheus service port to proxy traffic to. | Integer | 80 |
| ingress.alertmanagerServicePort | Alertmanager service port to proxy traffic to. | Integer | 80 |
| ingress.tlsCertificate.tls.crt | Optional certificate for ingress if you want to use your own TLS certificate. A self-signed certificate is generated by default. Note: tls.crt is a key and not nested. | String | Generated cert |
| ingress.tlsCertificate.tls.key | Optional certificate private key for ingress if you want to use your own TLS certificate. Note: tls.key is a key and not nested. | String | Generated cert key |
| ingress.tlsCertificate.ca.crt | Optional CA certificate. Note: ca.crt is a key and not nested. | String | CA certificate |
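
As an illustration of how the retention and storage parameters relate, the following sketch of a prometheus-data-values.yaml fragment lengthens the retention time and enlarges the persistent volume claim to match. The 60d and 200Gi values are assumed for the example and are not sizing recommendations. The remaining arguments repeat the ones from the example data values earlier in this topic, on the assumption that a list supplied here replaces the default list rather than merging with it.

```yaml
# Sketch only: retention period and PVC size are assumed example values.
prometheus:
  deployment:
    containers:
      args:
        # Assumed value; the earlier example in this topic uses 42d.
        - --storage.tsdb.retention.time=60d
        # The remaining arguments are repeated unchanged from that example.
        - --config.file=/etc/config/prometheus.yml
        - --storage.tsdb.path=/data
        - --web.console.libraries=/etc/prometheus/console_libraries
        - --web.console.templates=/etc/prometheus/consoles
        - --web.enable-lifecycle
  pvc:
    # Assumed value; longer retention needs more capacity.
    storage: 200Gi
```

Whether an existing persistent volume claim can be resized in place depends on the storage class, so it is safer to choose the size before the first installation.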

Prometheus Server Configuration Parameters

You can set the following fields in the Prometheus Server ConfigMap.

| Parameter | Description | Type | Default |
|---|---|---|---|
| evaluation_interval | Frequency to evaluate rules. | duration | 1m |
| scrape_interval | Frequency to scrape targets. | duration | 1m |
| scrape_timeout | How long until a scrape request times out. | duration | 10s |
| rule_files | Rule files specifies a list of globs. Rules and alerts are read from all matching files. | yaml file | n/a |
| scrape_configs | A list of scrape configurations. | list | n/a |
| job_name | The job name assigned to scraped metrics by default. | string | n/a |
| kubernetes_sd_configs | List of Kubernetes service discovery configurations. | list | n/a |
| relabel_configs | List of target relabel configurations. | list | n/a |
| action | Action to perform based on regex matching. | string | n/a |
| regex | Regular expression against which the extracted value is matched. | string | n/a |
| source_labels | The source labels select values from existing labels. | string | n/a |
| scheme | Configures the protocol scheme used for requests. | string | n/a |
| tls_config | Configures the scrape request's TLS settings. | string | n/a |
| ca_file | CA certificate to validate API server certificate with. | filename | n/a |
| insecure_skip_verify | Disable validation of the server certificate. | boolean | n/a |
| bearer_token_file | Optional bearer token file authentication information. | filename | n/a |
| replacement | Replacement value against which a regex replace is performed if the regular expression matches. | string | n/a |
| target_label | Label to which the resulting value is written in a replace action. | string | n/a |
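
To show where these fields sit relative to each other, here is a minimal sketch of a prometheus.yml file. It is modeled on the upstream Prometheus example for scraping the Kubernetes API server, not copied from this package's default configuration. The rule file paths are assumptions, and the service account paths are the standard in-cluster locations.

```yaml
# Sketch only: field placement, not the package's shipped configuration.
global:
  evaluation_interval: 1m
  scrape_interval: 1m
  scrape_timeout: 10s
rule_files:
  - /etc/config/alerting_rules.yml    # assumed path
  - /etc/config/recording_rules.yml   # assumed path
scrape_configs:
  - job_name: kubernetes-apiservers
    kubernetes_sd_configs:
      - role: endpoints
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    relabel_configs:
      # Keep only the endpoints of the default/kubernetes service on its https port.
      - source_labels:
          - __meta_kubernetes_namespace
          - __meta_kubernetes_service_name
          - __meta_kubernetes_endpoint_port_name
        action: keep
        regex: default;kubernetes;https
```

The values of multiple source_labels are joined with a semicolon before the regex is applied, which is why the expression contains two separators.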

Alert Manager Configuration Parameters

You can set the following fields in the Alert Manager ConfigMap.

| Parameter | Description | Type | Default |
|---|---|---|---|
| resolve_timeout | ResolveTimeout is the default value used by Alertmanager if the alert does not include EndsAt. | duration | 5m |
| smtp_smarthost | The SMTP host through which emails are sent. | string | n/a |
| slack_api_url | The Slack webhook URL. | string | global.slack_api_url |
| pagerduty_url | The PagerDuty URL to send API requests to. | string | global.pagerduty_url |
| templates | Files from which custom notification template definitions are read. | file path | n/a |
| group_by | Group the alerts by label. | string | n/a |
| group_interval | How long to wait before sending a notification about new alerts that are added to a group. | duration | 5m |
| group_wait | How long to initially wait to send a notification for a group of alerts. | duration | 30s |
| repeat_interval | How long to wait before sending a notification again if it has already been sent successfully for an alert. | duration | 4h |
| receivers | A list of notification receivers. | list | n/a |
| severity | Severity of the incident. | string | n/a |
| channel | The channel or user to send notifications to. | string | n/a |
| html | The HTML body of the email notification. | string | n/a |
| text | The text body of the email notification. | string | n/a |
| send_resolved | Whether or not to notify about resolved alerts. | boolean | n/a |
| email_configs | Configurations for email integration. | list | n/a |
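
The fields above can be combined as in the following minimal sketch of an alertmanager.yml file. It is an illustration, not the package default: the receiver name, Slack channel, webhook URL, SMTP host, and email addresses are all hypothetical placeholders.

```yaml
# Sketch only: every hostname, URL, address, and name below is a placeholder.
global:
  resolve_timeout: 5m
  smtp_smarthost: smtp.example.com:587
  smtp_from: alertmanager@example.com
  slack_api_url: https://hooks.slack.com/services/REPLACE/WITH/WEBHOOK
route:
  receiver: default-receiver
  # Alerts that share these label values are batched into one notification.
  group_by:
    - alertname
    - namespace
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
receivers:
  - name: default-receiver
    slack_configs:
      - channel: "#alerts"
        send_resolved: true
    email_configs:
      - to: oncall@example.com
        send_resolved: true
```

A single receiver may carry several integrations, so one route is enough to notify both Slack and email in this sketch.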

Prometheus Pod Annotations

Pod annotations allow fine-grained control of the scraping process. These annotations must be part of the pod metadata. They have no effect if set on other objects such as Services or DaemonSets.

| Pod Annotation | Description |
|---|---|
| prometheus.io/scrape | The default configuration scrapes all pods. If set to false, this annotation excludes the pod from the scraping process. |
| prometheus.io/path | If the metrics path is not /metrics, define it with this annotation. |
| prometheus.io/port | Scrape the pod on the indicated port instead of the pod's declared ports (default is a port-free target if none are declared). |

The following DaemonSet manifest instructs Prometheus to scrape all of its pods on port 9102.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: weave
  labels:
    app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '9102'
    spec:
      containers:
      - name: fluentd-elasticsearch
        image: gcr.io/google-containers/fluentd-elasticsearch:1.20
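
For context, annotation-driven scraping of this kind is usually implemented with relabel rules in a pod scrape job. The following sketch is modeled on the commonly used upstream Prometheus Kubernetes example and is not taken from this package's shipped configuration. The job name is hypothetical, and the drop rule reflects the behavior described above, where pods are scraped unless the annotation is set to false.

```yaml
# Sketch only: one way the three pod annotations can map to relabel rules.
scrape_configs:
  - job_name: kubernetes-pods   # hypothetical job name
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # prometheus.io/scrape: "false" removes the pod from the target list.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: drop
        regex: "false"
      # prometheus.io/path overrides the default /metrics path.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        regex: (.+)
        target_label: __metrics_path__
      # prometheus.io/port replaces the port part of the scrape address.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: '([^:]+)(?::\d+)?;(\d+)'
        replacement: $1:$2
        target_label: __address__
```

Annotation names are sanitized in the discovery labels, which is why prometheus.io/port appears as prometheus_io_port.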

Grafana Components

The Grafana package installs the container listed in the following table on the cluster. For more information, see https://grafana.com/. The Grafana package pulls the container from the VMware public registry specified in Package Repository.

| Container | Resource Type | Replicas | Description |
|---|---|---|---|
| Grafana | Deployment | 2 | Data visualization |

Grafana Data Values

Below is an example grafana-data-values.yaml file with the following customizations:

  • Ingress is enabled (ingress: enabled: true).
  • Ingress is configured for URLs ending in / (prefix:).
  • The FQDN for Grafana is grafana.system.tanzu (virtual_host_fqdn:).
  • The PVC for Grafana is 2Gi and is created under the default vSphere storageClass.
  • The secret values admin_user and admin_password are the credentials for the Grafana UI, and both must be base64 encoded. In the example, “admin” is used for each and base64 encoded; use secure credentials for your installation.
namespace: grafana-dashboard
grafana:
  pspNames: "vmware-system-restricted"
  deployment:
    replicas: 1
    updateStrategy: Recreate
  pvc:
    accessMode: ReadWriteOnce
    storage: 2Gi
    storageClassName: default
  secret:
    admin_user: YWRtaW4=
    admin_password: YWRtaW4=
    type: Opaque
  service:
    port: 80
    targetPort: 3000
    type: LoadBalancer
ingress:
  enabled: true
  prefix: /
  servicePort: 80
  virtual_host_fqdn: grafana.system.tanzu

Grafana Configuration Parameters (Standalone MC)

The following table lists configuration parameters of the Grafana package and describes their default values.

You can set the following configuration values in your grafana-data-values.yaml file.

To review the configuration parameters and default values that the Grafana package supports before you edit your data values file, retrieve its values schema:

tanzu package available get grafana.tanzu.vmware.com/10.0.1+vmware.2-tkg.1 -n AVAILABLE-PACKAGE-NAMESPACE --values-schema

| Parameter | Description | Type | Default |
|---|---|---|---|
| namespace | Namespace where Grafana will be deployed. | String | tanzu-system-dashboards |
| grafana.deployment.replicas | Number of Grafana replicas. | Integer | 1 |
| grafana.deployment.containers.resources | Grafana container resource requests and limits. | Map | {} |
| grafana.deployment.k8sSidecar.containers.resources | k8s-sidecar container resource requests and limits. | Map | {} |
| grafana.deployment.podAnnotations | The Grafana deployment's pod annotations. | Map | {} |
| grafana.deployment.podLabels | The Grafana deployment's pod labels. | Map | {} |
| grafana.service.type | Type of service to expose Grafana. Supported values: ClusterIP, NodePort, LoadBalancer. For vSphere, set this to NodePort. | String | LoadBalancer |
| grafana.service.port | Grafana service port. | Integer | 80 |
| grafana.service.targetPort | Grafana service target port. | Integer | 3000 |
| grafana.service.labels | Grafana service labels. | Map | {} |
| grafana.service.annotations | Grafana service annotations. | Map | {} |
| grafana.config.grafana_ini | For information about Grafana configuration, see Grafana Configuration Defaults in GitHub. | Config file | grafana.ini |
| grafana.config.datasource_yaml | For information about datasource config, see the Grafana documentation. | String | prometheus |
| grafana.config.dashboardProvider_yaml | For information about dashboard provider config, see the Grafana documentation. | YAML file | provider.yaml |
| grafana.pvc.annotations | Storage class annotations. | Map | {} |
| grafana.pvc.storageClassName | Storage class to use for the persistent volume claim. By default this is null and the default provisioner is used. | String | null |
| grafana.pvc.accessMode | Access mode for the persistent volume claim. Supported values: ReadWriteOnce, ReadOnlyMany, ReadWriteMany. | String | ReadWriteOnce |
| grafana.pvc.storage | Storage size for the persistent volume claim. | String | 2Gi |
| grafana.secret.type | Secret type defined for Grafana dashboard. | String | Opaque |
| grafana.secret.admin_user | Base64-encoded username to access Grafana dashboard. Defaults to YWRtaW4=, which is equivalent to admin in plain text. | String | YWRtaW4= |
| grafana.secret.admin_password | Base64-encoded password to access Grafana dashboard. Defaults to YWRtaW4=, which is equivalent to admin in plain text. | String | YWRtaW4= |
| ingress.enabled | Activate or deactivate ingress for Grafana. | Boolean | true |
| ingress.virtual_host_fqdn | Hostname for accessing Grafana. | String | grafana.system.tanzu |
| ingress.prefix | Path prefix for Grafana. | String | / |
| ingress.servicePort | Grafana service port to proxy traffic to. | Integer | 80 |
| ingress.tlsCertificate.tls.crt | Optional certificate for ingress if you want to use your own TLS certificate. A self-signed certificate is generated by default. Note: tls.crt is a key and not nested. | String | Generated cert |
| ingress.tlsCertificate.tls.key | Optional certificate private key for ingress if you want to use your own TLS certificate. Note: tls.key is a key and not nested. | String | Generated cert private key |
| ingress.tlsCertificate.ca.crt | Optional CA certificate. Note: ca.crt is a key and not nested. | String | CA certificate |
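
The note that tls.crt, tls.key, and ca.crt are keys and not nested is easiest to see in YAML. The following sketch of a grafana-data-values.yaml fragment uses the NodePort service type mentioned for vSphere and supplies a custom certificate. The certificate and key bodies are placeholders that you would replace with your own PEM data, and the target port is taken from the example data values earlier in this topic.

```yaml
# Sketch only: certificate contents are placeholders, not working PEM data.
grafana:
  service:
    type: NodePort
    port: 80
    targetPort: 3000
ingress:
  enabled: true
  virtual_host_fqdn: grafana.system.tanzu
  prefix: /
  servicePort: 80
  tlsCertificate:
    # Each of the next three keys is one literal key that contains a dot.
    # Do not write them as "tls:" with a nested "crt:" or "key:" entry.
    tls.crt: |
      -----BEGIN CERTIFICATE-----
      REPLACE-WITH-YOUR-CERTIFICATE
      -----END CERTIFICATE-----
    tls.key: |
      -----BEGIN PRIVATE KEY-----
      REPLACE-WITH-YOUR-PRIVATE-KEY
      -----END PRIVATE KEY-----
    ca.crt: |
      -----BEGIN CERTIFICATE-----
      REPLACE-WITH-YOUR-CA-CERTIFICATE
      -----END CERTIFICATE-----
```

Omitting the tlsCertificate block leaves the package to generate a self-signed certificate, as described in the table.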