Prometheus Metrics
Every collector Portunus exposes, with cardinality budget notes.
portunus-server exposes Prometheus metrics on
metrics_listen (default 127.0.0.1:7081). The endpoint is
loopback-pinned — scrape from a sibling Prometheus on the same host or
a sidecar.
curl -s http://127.0.0.1:7081/metrics
The same payload is also available at /v1/metrics on the operator
HTTP listener, gated by superadmin RBAC. The Web UI dashboard reads
that path so it doesn't have to cross listeners.
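As a rough illustration of consuming the endpoint, the sketch below parses the Prometheus text exposition format into (name, labels, value) tuples. The helper names parse_metrics and fetch_metrics are hypothetical and not part of Portunus, and the parser only handles the simple sample lines shown on this page.

```python
import re
import urllib.request

# Matches `name{labels} value` or `name value`; a trailing timestamp is ignored.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, value) from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and # HELP / # TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url="http://127.0.0.1:7081/metrics"):
    """Scrape the loopback listener (default address from this page)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Calling fetch_metrics() on the same host as portunus-server should return every series in the default payload; a real deployment would normally let Prometheus do the scraping instead.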
Cardinality budget
Per-rule collectors emit one row per live rule (labels
{client, rule, owner}). Per-port and per-target detail surface
only on demand via ?per_port=true / ?per_target=true — never as
default /metrics series.
When a rule is removed, most per-rule rows are removed with it. The
cumulative byte counters (portunus_rule_bytes_in_total /
portunus_rule_bytes_out_total) are the exception: they are retained
after removal, following Prometheus counter conventions.
Server-level
portunus_clients_connected
portunus_auth_failures_total{reason}
portunus_operator_requests_total{outcome, reason}
portunus_audit_buffer_drops_total
portunus_audit_durable_writer_lag_seconds
portunus_store_busy_total
On portunus_operator_requests_total, outcome ∈ {allow, deny}; reason
is "ok" on allow or the static RbacError::code() string on deny
(bounded label set).
portunus_audit_durable_writer_lag_seconds is the age of the oldest
entry in the durable-audit hand-off queue (0 when idle).
portunus_store_busy_total counts SQLITE_BUSY events mapped to a
transient store error; it should stay near zero.
Per-rule TCP
portunus_rule_bytes_in_total{client, rule, owner}
portunus_rule_bytes_out_total{client, rule, owner}
portunus_rule_active_connections{client, rule, owner}
portunus_rule_dns_failures_total{client, rule, owner}
portunus_rule_target_failovers_total{client, rule, owner}
portunus_rule_target_failovers_total emits one row per multi-target
rule (counting Healthy↔Failed transitions); single-target rules never
emit a row.
Per-rule UDP
portunus_rule_udp_datagrams_in_total{client, rule, owner}
portunus_rule_udp_datagrams_out_total{client, rule, owner}
portunus_rule_active_flows{client, rule, owner}
portunus_rule_flows_dropped_overflow_total{client, rule, owner}
TLS SNI routing (v0.9+)
portunus_tls_sni_route_total{client, rule, owner, result}
portunus_tls_sni_listener_miss_total{client, port}
portunus_tls_sni_listener_parse_failures_total{client, port}
portunus_tls_sni_routes_active
result ∈ {exact, wildcard, fallback}. A connection whose SNI
matches no rule (and has no fallback) is counted on
portunus_tls_sni_listener_miss_total instead of
portunus_tls_sni_route_total.
SNI peek histogram (v0.10+)
portunus_tls_client_hello_peek_duration_seconds_bucket{client, port, le}
portunus_tls_client_hello_peek_duration_seconds_sum
portunus_tls_client_hello_peek_duration_seconds_count
Finite buckets up to 3 s; observations above 3 s increment _count and
le="+Inf" without bumping le="3". Only emitted for SNI-mode listeners.
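The bucket behaviour described above can be sketched with a small cumulative histogram. This is an illustration only: the PeekHistogram class and every bound except the 3 s top bucket are assumptions, not the bounds Portunus actually uses.

```python
import bisect

# Hypothetical finite bucket bounds in seconds; only the 3 s top bound
# comes from this page.
BOUNDS = [0.01, 0.1, 1.0, 3.0]


class PeekHistogram:
    """Cumulative histogram in the Prometheus style (le is inclusive)."""

    def __init__(self, bounds=BOUNDS):
        self.bounds = list(bounds)
        self.buckets = [0] * len(self.bounds)  # one counter per finite le
        self.count = 0    # same value as the le="+Inf" bucket
        self.sum = 0.0

    def observe(self, seconds):
        self.count += 1
        self.sum += seconds
        # Index of the first bucket whose upper bound is >= the observation.
        first = bisect.bisect_left(self.bounds, seconds)
        # Cumulative semantics: bump that bucket and every larger one.
        # An observation above 3 s yields an empty range, so only
        # count (and therefore +Inf) moves.
        for i in range(first, len(self.bounds)):
            self.buckets[i] += 1
```

With these assumed bounds, a 4.2 s peek raises count and the implicit +Inf bucket while leaving the le="3" counter unchanged, which matches the description above.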
Rate limiting (v0.11+)
portunus_rate_limit_reject_total{client, rule, owner, reason}
portunus_rate_limit_throttle_seconds_total{client, rule, owner, direction}
portunus_rate_limit_active_connections{client, rule, owner}
Reject reasons: conn_concurrent, conn_rate, udp_flow_rate,
owner_concurrent, owner_conn_rate, owner_udp_flow_rate.
Per-rule rows carry the rule id in rule and the owner in owner.
Owner-aggregated rows (cross-rule totals for an owner) set rule=""
and keep owner populated. This applies to all three collectors,
including portunus_rate_limit_active_connections, which also emits an
owner-aggregate row with rule="". Slice with {rule!=""} for per-rule
rows or {rule=""} for the owner aggregate.
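For consumers that read the scraped samples directly rather than through PromQL, the same slicing can be sketched as below. The function name split_rate_limit_rows and the (labels, value) input shape are assumptions made for illustration.

```python
def split_rate_limit_rows(samples):
    """Split (labels, value) pairs into per-rule and owner-aggregate rows.

    A row with rule="" (or no rule label, which Prometheus treats the
    same way) is an owner aggregate; everything else is a per-rule row.
    """
    per_rule, per_owner = [], []
    for labels, value in samples:
        if labels.get("rule", "") != "":
            per_rule.append((labels, value))
        else:
            per_owner.append((labels, value))
    return per_rule, per_owner
```

Summing per-rule rows and owner-aggregate rows together would double count, so keep the two lists separate in any downstream total.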
Traffic quotas (v0.13+)
portunus_traffic_quota_bytes_used{user, client}
portunus_traffic_quota_bytes_limit{user, client}
portunus_traffic_quota_exhausted{user, client}
portunus_traffic_quota_period_resets_total{user, client}
portunus_traffic_quota_exhausted_total{user, client}
These are keyed by {user, client}, not owner, and track the
per-(user, client) monthly byte budget. bytes_used is the cumulative
bytes consumed in the current period and bytes_limit is the budget.
portunus_traffic_quota_exhausted is a gauge (1 while the quota is
currently exhausted, else 0). period_resets_total counts period
boundary rollovers, and exhausted_total counts the first exhaustion
within each period.
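One way to turn the two gauges into a single utilisation figure is sketched below. The function name quota_usage is hypothetical, and treating a non-positive limit as "no budget configured" is an assumption, not documented Portunus behaviour.

```python
def quota_usage(bytes_used, bytes_limit):
    """Return the fraction of the current period's budget consumed.

    Returns None when the limit is not positive, which this sketch
    assumes means that no budget is configured.
    """
    if bytes_limit <= 0:
        return None
    return bytes_used / bytes_limit
```

A value at or above 1.0 should coincide with portunus_traffic_quota_exhausted reading 1 for the same {user, client} pair.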
Useful queries
# Top 5 rules by ingress bytes/sec over last 5m
topk(5, sum by (rule) (rate(portunus_rule_bytes_in_total[5m])))
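# Rejects/sec per active connection, per rule (active_connections is a
# gauge, so it is averaged over the window rather than passed to rate())
sum by (client, rule, owner) (rate(portunus_rate_limit_reject_total{rule!=""}[5m])) /
sum by (client, rule, owner) (avg_over_time(portunus_rule_active_connections[5m]))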
# Throttle wall-clock per rule (per direction)
rate(portunus_rate_limit_throttle_seconds_total[5m])
# Auth failures by reason
sum by (reason) (rate(portunus_auth_failures_total[5m]))