# Monitoring and Alerting
## Stack Overview
| Component | Version | Namespace |
|---|---|---|
| Prometheus Operator | v80.4.2 | monitoring |
| Grafana | (bundled) | monitoring |
| Portworx metrics | Integrated | portworx |
| Autopilot | Integrated | portworx |
## Grafana Access
Grafana is accessible via ingress. Credentials are managed through Palette variables in the cluster profile.
### TLS
The prometheus-operator manifest includes an `issuer-selfsigned` resource for TLS certificate generation. Because the resulting certificate is self-signed, browser warnings are expected.
## Portworx Metrics
Portworx exports metrics directly to Prometheus when `exportMetrics: true` is set in the Portworx pack configuration.
Autopilot uses these Prometheus metrics to make automated storage scaling decisions (e.g., expanding volumes when usage exceeds thresholds).
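As an illustration of how such a threshold could be expressed, the sketch below writes an `AutopilotRule` manifest to a file for review. It is a hedged example, not this cluster's actual configuration: the rule name, the `app: example` selector, the 80% threshold, and the `scalepercentage` and `maxsize` values are all assumptions, and the metric names `px_volume_usage_bytes` and `px_volume_capacity_bytes` should be checked against what your Portworx version exports.

```bash
# Write a sample AutopilotRule to a temporary file; nothing is applied here.
RULE_FILE="${TMPDIR:-/tmp}/autopilot-volume-resize.yaml"

cat > "$RULE_FILE" <<'EOF'
apiVersion: autopilot.libopenstorage.org/v1alpha1
kind: AutopilotRule
metadata:
  name: volume-resize            # hypothetical name
spec:
  # Hypothetical selector: PVCs carrying this label are managed by the rule.
  selector:
    matchLabels:
      app: example
  conditions:
    expressions:
      # Trigger when a volume is more than 80% full (assumed threshold).
      - key: "100 * (px_volume_usage_bytes / px_volume_capacity_bytes)"
        operator: Gt
        values:
          - "80"
  actions:
    - name: openstorage.io.action.volume/resize
      params:
        scalepercentage: "50"    # assumed: grow by 50% of the current size
        maxsize: "400Gi"         # assumed: upper bound for automatic growth
EOF

echo "Wrote ${RULE_FILE}"
```

After reviewing the file, it could be applied with `kubectl apply -f "$RULE_FILE"`.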
## Key Metrics
### VM Status
Track the phase distribution of all VMIs:
Filter by specific phase:
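A minimal sketch of how these two checks could be run from a shell through the Prometheus HTTP API. It assumes Prometheus is reachable at `http://localhost:9090` (for example via the port-forward described under Checking Prometheus Targets) and that your KubeVirt version exposes a `kubevirt_vmi_phase_count` metric with a `phase` label. The metric name, the label value casing, and the helper names are assumptions, not taken from this document.

```bash
# Assumption: Prometheus is reachable locally, e.g. through a port-forward.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Hypothetical helper: run an instant PromQL query via the Prometheus HTTP API.
prom_query() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Phase distribution of all VMIs (metric name is an assumption).
vmi_phase_distribution() {
  prom_query 'sum by (phase) (kubevirt_vmi_phase_count)'
}

# Number of VMIs in a single phase; pass the phase as the first argument.
vmi_phase_count() {
  prom_query "sum(kubevirt_vmi_phase_count{phase=\"$1\"})"
}
```

For example, `vmi_phase_distribution` returns the JSON result for all phases, and `vmi_phase_count running` narrows it to one phase.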
### Migration Performance
Measure time from migration creation to completion:
Track migration success/failure rates:
```promql copy
kubevirt_vmi_migrations_in_pending_phase
kubevirt_vmi_migrations_in_scheduling_phase
kubevirt_vmi_migrations_in_running_phase
```
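The three gauges listed above count migrations that are currently in each phase. The sketch below shows one way to query them, together with duration and outcome queries, from a shell. Only the three `kubevirt_vmi_migrations_in_*_phase` names come from this document; the histogram `kubevirt_vmi_migration_phase_transition_time_from_creation_seconds`, its `phase="Succeeded"` label value, and the `kubevirt_vmi_migration_succeeded` and `kubevirt_vmi_migration_failed` metrics are assumptions to verify against your KubeVirt version, as is the local Prometheus URL.

```bash
PROM_URL="${PROM_URL:-http://localhost:9090}"   # assumption: local port-forward

# Hypothetical helper around the Prometheus instant-query API.
prom_query() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Total migrations currently in flight, summed over the three phase gauges.
migrations_in_flight() {
  prom_query 'sum(kubevirt_vmi_migrations_in_pending_phase) + sum(kubevirt_vmi_migrations_in_scheduling_phase) + sum(kubevirt_vmi_migrations_in_running_phase)'
}

# 95th percentile of time from creation to completion over the last hour
# (histogram name and label value are assumptions).
migration_duration_p95() {
  prom_query 'histogram_quantile(0.95, sum by (le) (rate(kubevirt_vmi_migration_phase_transition_time_from_creation_seconds_bucket{phase="Succeeded"}[1h])))'
}

# Succeeded and failed migration counts (metric names are assumptions).
migration_outcomes() {
  prom_query 'sum(kubevirt_vmi_migration_succeeded)'
  prom_query 'sum(kubevirt_vmi_migration_failed)'
}
```

A persistently non-zero result from `migrations_in_flight` while `migration_outcomes` does not change may indicate migrations that are stuck in the pending or scheduling phase.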
### Storage
Portworx cluster available disk space:
Per-volume usage:
Storage Alerts
Set alerts for when `px_cluster_disk_available_bytes` drops below 20% of total capacity or when an individual volume exceeds 80% usage. Autopilot can expand volumes automatically, but its actions should still be monitored.
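The two thresholds could be expressed as a `PrometheusRule`, sketched below by writing a manifest to a file for review. This is a hedged example: the rule and alert names, the `for` duration, and the severity label are hypothetical, and the metrics `px_cluster_disk_total_bytes`, `px_volume_usage_bytes`, and `px_volume_capacity_bytes` are assumptions that should be confirmed in your Prometheus instance. Depending on the `ruleSelector` of your Prometheus resource, the manifest may also need matching labels before it is picked up.

```bash
# Write a sample PrometheusRule to a temporary file; nothing is applied here.
RULE_FILE="${TMPDIR:-/tmp}/portworx-storage-alerts.yaml"

cat > "$RULE_FILE" <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: portworx-storage-alerts   # hypothetical name
  namespace: monitoring
spec:
  groups:
    - name: portworx-storage
      rules:
        # Fires when less than 20% of disk capacity is still available.
        - alert: PortworxClusterDiskLow
          expr: px_cluster_disk_available_bytes / px_cluster_disk_total_bytes < 0.20
          for: 10m
          labels:
            severity: warning
        # Fires when a volume is more than 80% full.
        - alert: PortworxVolumeUsageHigh
          expr: px_volume_usage_bytes / px_volume_capacity_bytes > 0.80
          for: 10m
          labels:
            severity: warning
EOF

echo "Wrote ${RULE_FILE}"
```

After reviewing the file, it could be applied with `kubectl apply -f "$RULE_FILE"`.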
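### Node Health
Detect nodes with memory or disk pressure. The `== 1` comparison keeps only nodes where the condition is currently true; without it, the selector also returns healthy nodes with a value of 0:
```promql copy
kube_node_status_condition{condition="MemoryPressure", status="true"} == 1
kube_node_status_condition{condition="DiskPressure", status="true"} == 1
```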
## Useful Grafana Dashboards
The Prometheus Operator deployment includes several pre-configured dashboards:
- Kubernetes / Compute Resources / Cluster - Overall cluster CPU and memory usage.
- Kubernetes / Compute Resources / Node - Per-node resource consumption.
- Node Exporter / Nodes - Hardware-level metrics (disk I/O, network, CPU).
Tip
Import the KubeVirt Grafana dashboard for VM-specific metrics visualization. Search for "KubeVirt" in the Grafana dashboard marketplace.
## Checking Prometheus Targets
Verify all expected targets are being scraped:
```bash copy
kubectl port-forward svc/prometheus-operated 9090:9090 -n monitoring
```
Then open `http://localhost:9090/targets` in a browser.
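The same check can be scripted against the Prometheus HTTP API while the port-forward is running. The sketch below is a hedged example: the helper names are hypothetical, the URL assumes the local port-forward on port 9090, and the filtering step assumes `jq` is installed.

```bash
PROM_URL="${PROM_URL:-http://localhost:9090}"   # assumption: local port-forward

# Fetch all active scrape targets as raw JSON.
prom_targets() {
  curl -s "${PROM_URL}/api/v1/targets?state=active"
}

# Print job, scrape URL, and last error for every target that is not "up".
# Requires jq; prints nothing when every target is healthy.
unhealthy_targets() {
  prom_targets | jq -r '.data.activeTargets[]
    | select(.health != "up")
    | [.labels.job, .scrapeUrl, .lastError]
    | @tsv'
}
```

Running `unhealthy_targets` with no output suggests that every active target was scraped successfully on its last attempt.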
---
## Alertmanager
Check active alerts:
```bash copy
kubectl port-forward svc/alertmanager-operated 9093:9093 -n monitoring
```
Then open `http://localhost:9093/#/alerts` in a browser.
List the configured alerting rules via CLI (this shows rule definitions, not the alerts that are currently firing):
```bash copy
kubectl get prometheusrules -n monitoring
```
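`kubectl get prometheusrules` returns `PrometheusRule` objects rather than alert state. To see which alerts are active right now, one option is to query the Alertmanager v2 API through the port-forward on port 9093. The sketch below is a hedged example: the helper names are hypothetical, the URL assumes the local port-forward, and the summary step assumes `jq` is installed.

```bash
AM_URL="${AM_URL:-http://localhost:9093}"   # assumption: local port-forward

# Fetch active alerts that are neither silenced nor inhibited, as raw JSON.
firing_alerts() {
  curl -s "${AM_URL}/api/v2/alerts?active=true&silenced=false&inhibited=false"
}

# Print alert name, severity, and start time, one alert per line (requires jq).
firing_alerts_summary() {
  firing_alerts | jq -r '.[] | [.labels.alertname, .labels.severity, .startsAt] | @tsv'
}
```

For example, `firing_alerts_summary` prints one tab-separated line per active alert, and prints nothing when no alerts are active.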