Prometheus cheatsheet
Service and UI
| Check | Description |
|---|---|
systemctl status prometheus | Service status |
ss -tlnp | grep 9090 | UI and API port |
http://HOST:9090/targets | Scrape target health |
http://HOST:9090/graph | PromQL query UI |
http://HOST:3000 | Grafana (default port) |
promtool
| Command | Description |
|---|---|
promtool check config prometheus.yml | Validate main config |
promtool check rules rules/*.yml | Validate alert/recording rules |
promtool query instant http://localhost:9090 'up' | Test PromQL via API |
Useful PromQL
up # 1 = target up, 0 = down
rate(http_requests_total[5m]) # per-second rate
histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
Scrape config snippet
scrape_configs:
- job_name: prometheus
static_configs:
- targets: ['localhost:9090']
- job_name: node
static_configs:
- targets: ['node1:9100', 'node2:9100']
- job_name: myapp
metrics_path: /metrics
static_configs:
- targets: ['app:8080']
Reload and API
curl -X POST http://localhost:9090/-/reload
curl -s 'http://localhost:9090/api/v1/query?query=up' | jq .
curl -s 'http://localhost:9090/api/v1/targets' | jq '.data.activeTargets[].health'
Grafana quick setup
# Grafana UI → Connections → Data sources → Prometheus
# URL: http://prometheus:9090 (or localhost:9090)
# Save & Test → Explore or Dashboard → New panel → PromQL query
# Example panel: rate of HTTP 5xx
sum(rate(http_requests_total{status=~"5.."}[5m])) by (job)
Alert rule example
groups:
- name: example
rules:
- alert: InstanceDown
expr: up == 0
for: 5m
labels:
severity: critical
annotations:
summary: "Target down"
Self-monitoring metrics
| Metric | Description |
|---|---|
prometheus_tsdb_storage_blocks_bytes | TSDB disk use |
prometheus_rule_evaluation_failures_total | Broken rules |
scrape_duration_seconds | Slow scrapes |
prometheus_target_scrapes_exceeded_sample_limit_total | Cardinality blow-up |
Pro tips
- No data in Grafana? Check Prometheus
/targetsfirst — then the panel query up == 0is the fastest “is scraping working?” query- Use
rate()on counters, not rawincreasefor graphs - Limit label cardinality — never put unbounded IDs on metrics
- Validate config with
promtool check configbefore reload
Practice scenarios
Hands-on Prometheus scenarios on live Linux VMs: prometheus