Monitoring with Prometheus

uhub can export runtime metrics in the Prometheus text format so you can scrape, graph and alert on your hub.

The metrics endpoint is served on the same port as ADC; there is no separate listener. The hub demultiplexes ADC, TLS and HTTP on server_port, so an HTTP GET for the metrics path returns the metrics document while ADC clients keep using the same port. The endpoint is available from uhub 0.7.0 and is off by default.

1. Enable the endpoint

Add these directives to uhub.conf and restart the hub:

metrics_enable = yes
metrics_token  = "a-long-random-secret"
metrics_path   = "/metrics"

A non-empty metrics_token is required — if no token is set the endpoint stays disabled even with metrics_enable = yes. Scrapers must present the token in an Authorization: Bearer <token> header; any request without it, or to a different path, is rejected. Generate a token with, for example:

openssl rand -hex 32
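
If openssl is not at hand, a token of the same shape can be produced with Python's standard library. This is only a sketch; the helper name make_metrics_token is made up for illustration and is not part of uhub.

```python
import secrets


def make_metrics_token(num_bytes: int = 32) -> str:
    """Return a random hex token; 32 bytes yields 64 hex characters."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Print a line that can be pasted into uhub.conf.
    print(f'metrics_token  = "{make_metrics_token()}"')
```

Either way, treat the result like a password and keep it out of version control.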

The token is a shared secret. On a hub with tls_enable = yes the same port also answers HTTPS, so scrape over https:// to keep the token confidential. Over plain HTTP the token travels in clear text — only acceptable over loopback or a trusted private network, which is exactly the localhost case below.

2. Check it by hand

With the hub running locally on the default port (1511), confirm the endpoint answers before pointing Prometheus at it:

curl -H "Authorization: Bearer a-long-random-secret" \
     http://localhost:1511/metrics

You should see a block of metrics in the Prometheus exposition format:

# HELP uhub_users Currently logged-in users.
# TYPE uhub_users gauge
uhub_users 3
# HELP uhub_uptime_seconds Seconds since the hub started.
# TYPE uhub_uptime_seconds gauge
uhub_uptime_seconds 4210
...

A missing or wrong token returns 403 Forbidden; a request for any path other than metrics_path returns 404 Not Found; anything but GET returns 405 Method Not Allowed.
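
The same check can be scripted. The sketch below uses only the Python standard library; the function names fetch_metrics and parse_metrics are illustrative, and the parser is deliberately simplified: it ignores # HELP and # TYPE lines and assumes samples carry no trailing timestamp.

```python
import urllib.request


def fetch_metrics(url: str, token: str, timeout: float = 5.0) -> str:
    """GET the metrics document, sending the token as a Bearer credential."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    # urlopen raises urllib.error.HTTPError for 4xx answers such as 403, 404 or 405.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text: str) -> dict:
    """Map each sample (metric name plus any label set) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or HELP/TYPE comment
        key, _, value = line.rpartition(" ")
        if not key:
            continue  # not a "name value" pair
        samples[key] = float(value)
    return samples
```

For the local hub above, parse_metrics(fetch_metrics("http://localhost:1511/metrics", "a-long-random-secret")) should return a dictionary containing uhub_users and the other metrics.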

3. Point Prometheus at it

Install Prometheus (apt install prometheus, brew install prometheus, or the tarball from prometheus.io) and add a scrape job to prometheus.yml. The bearer token goes in the job configuration:

scrape_configs:
  - job_name: uhub
    metrics_path: /metrics
    scheme: http
    authorization:
      type: Bearer
      credentials: "a-long-random-secret"
    static_configs:
      - targets: ["localhost:1511"]

Start Prometheus with that configuration:

prometheus --config.file=prometheus.yml

Open http://localhost:9090, go to Status → Targets, and the uhub target should read UP. You can now query any metric — try uhub_users or rate(uhub_chat_messages_total[5m]) in the expression browser.

If your hub runs with TLS, set scheme: https instead. For a self-signed certificate add tls_config: with insecure_skip_verify: true (or point ca_file at your certificate) under the same job.

What is exposed

Every metric name is prefixed with uhub_. Counters only ever increase (use rate() to graph them); gauges reflect the value at scrape time.
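
To see why a counter is graphed through rate(), it can help to work out a per-second rate from two scrapes by hand. The sketch below is a simplification and not what Prometheus runs: the real rate() uses every sample in the range and extrapolates to the window edges. The function name counter_rate is made up for illustration.

```python
def counter_rate(v_old: float, t_old: float, v_new: float, t_new: float) -> float:
    """Per-second increase of a counter between two scrapes (times in seconds)."""
    if t_new <= t_old:
        raise ValueError("samples must be in increasing time order")
    increase = v_new - v_old
    if increase < 0:
        # The counter went down, so the hub restarted and began counting from zero.
        increase = v_new
    return increase / (t_new - t_old)
```

For example, a counter that goes from 100 to 160 over 60 seconds gives counter_rate(100, 0, 160, 60) == 1.0, that is, one event per second.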

Users and shares

Metric                                  Type   Meaning
uhub_users                              gauge  Currently logged-in users.
uhub_users_peak                         gauge  Peak number of logged-in users.
uhub_users_max                          gauge  Configured max_users.
uhub_users_ipv4 / uhub_users_ipv6       gauge  Users connected over IPv4 / IPv6.
uhub_users_active / uhub_users_passive  gauge  Users that do / do not advertise a routable address.
uhub_users_by_credential                gauge  Users by class, labelled credential="guest|registered|bot|operator|admin".
uhub_shared_bytes / uhub_shared_files   gauge  Total bytes / files shared by connected users.

Activity counters

Metric                                                         Type     Meaning
uhub_logins_total                                              counter  Successful user logins.
uhub_login_failures_total                                      counter  Rejected or failed login attempts.
uhub_logouts_total                                             counter  Logged-in users that disconnected.
uhub_chat_messages_total                                       counter  Public chat messages accepted for routing.
uhub_private_messages_total                                    counter  Private messages accepted for routing.
uhub_searches_total / uhub_search_results_total                counter  Search requests accepted / results relayed.
uhub_connect_requests_total / uhub_rev_connect_requests_total  counter  Active (CTM) / passive (RCM) connect requests.
uhub_broadcasts_total / uhub_feature_casts_total               counter  Messages broadcast to all / feature-cast to subscribers.

Network and TLS

Metric                                                                Type       Meaning
uhub_uptime_seconds                                                   gauge      Seconds since the hub started.
uhub_net_tx_bytes_total / uhub_net_rx_bytes_total                     counter    Total bytes transmitted / received.
uhub_net_tx_rate_bytes / uhub_net_rx_rate_bytes                       gauge      Current send / receive rate (bytes/sec); _peak variants hold the maxima.
uhub_send_queue_bytes                                                 gauge      Total bytes queued for sending across all users.
uhub_connections_accepted_total / _closed_total / _errors_total       counter    Connection lifecycle counts.
uhub_tls_accept_total / _connect_total / _close_total / _error_total  counter    TLS handshake and error counts.
uhub_event_loop_seconds                                               histogram  Event-loop processing time per iteration (excludes the idle poll wait); a direct measure of main-thread saturation.

Useful queries

A few starting points for the Prometheus expression browser or a Grafana panel:

  • uhub_users — current online users, and uhub_users / uhub_users_max for hub fullness.
  • rate(uhub_chat_messages_total[5m]) — chat messages per second.
  • rate(uhub_login_failures_total[5m]) — a rising rate can indicate a brute-force attempt.
  • rate(uhub_net_tx_bytes_total[1m]) — outbound throughput in bytes/sec.
  • histogram_quantile(0.99, rate(uhub_event_loop_seconds_bucket[5m])) — 99th-percentile event-loop latency; watch this to spot the hub becoming CPU-bound.
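
The last query is an estimate, not an exact percentile: the histogram only records how many iterations fell at or below each bucket boundary. The sketch below shows the general idea of interpolating a quantile from cumulative buckets. It is a simplified illustration, not Prometheus's implementation, and the bucket boundaries in the example are invented, not the ones uhub exports.

```python
import math


def estimate_quantile(q: float, buckets: list) -> float:
    """Estimate a quantile from cumulative (upper_bound, count) buckets.

    The highest bucket is expected to have upper bound +Inf, as in the
    Prometheus exposition format.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations yet
    rank = q * total
    lower_bound, count_below = 0.0, 0.0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            if math.isinf(upper_bound):
                # Past the last finite boundary: report that boundary.
                return lower_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - count_below) / (cumulative - count_below)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, count_below = upper_bound, cumulative
    return lower_bound
```

With hypothetical buckets [(0.001, 50), (0.01, 90), (0.1, 100), (math.inf, 100)], the 99th percentile lands in the 0.01 to 0.1 bucket, so the precision of the result depends on how finely the buckets are spaced around the values you care about.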

See the configuration reference for the full description of the metrics_enable, metrics_token and metrics_path directives.