Observability
Shell-Chain exposes Prometheus metrics, structured tracing, and HTTP health probes out of the box. This document describes each available signal and how to consume it.
See also: Testnet Operator Guide · JSON-RPC API Reference · Storage Profiles
Metrics HTTP Server
Configure the observability server in node.toml:
[metrics]
enabled = true
listen_addr = "0.0.0.0:9000" # default
Or via CLI:
shell-node --metrics-addr 0.0.0.0:9000
Endpoints
| Path | Method | Description |
|---|---|---|
| /metrics | GET | Prometheus text exposition (v0.0.4) |
| /health, /healthz | GET | Liveness probe: always 200 when the process is up |
| /ready, /readyz | GET | Readiness probe: 503 until the first block is imported |
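To make the /metrics endpoint concrete, here is a minimal Python sketch that fetches the text exposition and reads the unlabelled samples into a dictionary. The base URL `http://127.0.0.1:9000` and the helper names `parse_unlabelled_samples` and `fetch_metrics` are assumptions for illustration, not part of Shell-Chain. A real deployment would normally let a Prometheus server do the scraping.

```python
import urllib.request
from typing import Dict


def parse_unlabelled_samples(text: str) -> Dict[str, float]:
    """Collect `name value` samples, skipping comments and labelled series."""
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # "# HELP" / "# TYPE" lines are comments; series with labels contain "{".
        if not line or line.startswith("#") or "{" in line:
            continue
        fields = line.split()
        # fields[2], if present, is an optional timestamp and is ignored here.
        samples[fields[0]] = float(fields[1])
    return samples


def fetch_metrics(base_url: str = "http://127.0.0.1:9000") -> Dict[str, float]:
    """Fetch /metrics over HTTP and parse it (base_url is an assumed address)."""
    with urllib.request.urlopen(f"{base_url}/metrics", timeout=5) as resp:
        return parse_unlabelled_samples(resp.read().decode("utf-8"))
```

Calling `fetch_metrics()["shell_block_height"]` against a running node would return the current tip height as a float. Labelled series such as the RPC counters are deliberately skipped to keep the parser short.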
Health Probes
/healthz (liveness)
Always returns HTTP 200 when the node process is running.
{
  "status": "ok",
  "version": "0.18.0",
  "block_height": 12345,
  "peer_count": 4,
  "syncing": false
}
/readyz (readiness)
Returns HTTP 200 once the node has imported at least one block. Returns HTTP 503 before that.
Ready:
{ "ready": true }
Not ready (HTTP 503):
{ "ready": false, "reason": "node has not imported any blocks yet" }
These endpoints are designed for Kubernetes / Docker health checks:
livenessProbe:
  httpGet:
    path: /healthz
    port: 9000
  initialDelaySeconds: 5
readinessProbe:
  httpGet:
    path: /readyz
    port: 9000
  initialDelaySeconds: 10
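Outside an orchestrator, the same readiness contract can be polled from a script, for example before starting a dependent service. The sketch below is a Python illustration under assumptions: the function names, the retry count, and the delay are hypothetical, and only the status codes and JSON fields shown above are taken from this document.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable, Optional, Tuple

Readiness = Tuple[bool, Optional[str]]


def interpret_readyz(status: int, body: bytes) -> Readiness:
    """Map an HTTP status and JSON body from /readyz to (ready, reason)."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:  # covers invalid JSON and invalid UTF-8
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    ready = status == 200 and payload.get("ready") is True
    reason = None if ready else payload.get("reason", f"HTTP {status}")
    return ready, reason


def probe_readyz(base_url: str, timeout: float = 2.0) -> Readiness:
    """Issue one GET /readyz; a 503 is reported as not ready, not as an error."""
    try:
        with urllib.request.urlopen(f"{base_url}/readyz", timeout=timeout) as resp:
            return interpret_readyz(resp.status, resp.read())
    except urllib.error.HTTPError as err:
        # urllib raises on 503, but the JSON body is still readable.
        return interpret_readyz(err.code, err.read())
    except OSError as err:  # URLError, refused connections, timeouts
        return False, f"unreachable: {err}"


def wait_until_ready(
    base_url: str,
    attempts: int = 30,
    delay: float = 2.0,
    probe: Callable[[str], Readiness] = probe_readyz,
) -> bool:
    """Poll until the node reports ready or the attempts are exhausted."""
    for attempt in range(attempts):
        ready, _reason = probe(base_url)
        if ready:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

The `probe` parameter exists only so the polling loop can be exercised without a live node. In normal use, `wait_until_ready("http://127.0.0.1:9000")` is enough.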
Prometheus Metrics Reference
All metrics are prefixed shell_.
Chain
| Metric | Type | Description |
|---|---|---|
| shell_block_height | Gauge | Current canonical chain tip height |
| shell_blocks_imported_total | Counter | Cumulative blocks imported since startup |
| shell_block_production_duration_seconds | Histogram | Wall-clock time to produce one block |
| shell_txs_received_total | Counter | Cumulative transactions admitted to mempool |
| shell_tx_pool_size | Gauge | Pending transactions in mempool |
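Counters such as shell_blocks_imported_total only become useful once they are turned into a rate. Because they count since startup, a node restart makes the value drop back toward zero. The following Python sketch shows one simple way to derive a per-second rate from two scrapes while tolerating such a reset. The function name and the sample numbers are illustrative assumptions, and Prometheus's own `rate()` function is the more robust choice in practice.

```python
def counter_rate(previous: float, current: float, elapsed_seconds: float) -> float:
    """Per-second increase of a counter between two scrapes.

    If the counter went backwards, assume the process restarted and the
    counter began again from zero, so `current` is the whole increase.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    increase = current - previous
    if increase < 0:
        increase = current
    return increase / elapsed_seconds


# Two scrapes 15 seconds apart (hypothetical values).
blocks_per_second = counter_rate(previous=100, current=130, elapsed_seconds=15)
```

Any increase that happened between the last scrape and the restart is lost with this approach, so the result is a lower bound across a reset.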
Network
| Metric | Type | Labels | Description |
|---|---|---|---|
| shell_peer_count | Gauge | none | Connected libp2p peers |
| shell_libp2p_errors_total | Counter | error_kind | libp2p transport/protocol errors |
Consensus (wPoA)
| Metric | Type | Labels | Description |
|---|---|---|---|
| shell_epoch_number | Gauge | none | Current wPoA epoch |
| shell_validator_active_count | Gauge | none | Active validators in current epoch |
| shell_validator_weight | Gauge | validator | Proposer weight per validator address |
| shell_consensus_rounds_total | Counter | outcome | Consensus rounds by outcome (committed, timeout, skip) |
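Labelled counters like shell_consensus_rounds_total are easiest to read as proportions. As a rough illustration, the Python sketch below extracts the `outcome` label from raw exposition text and computes each outcome's share of all rounds since startup. The regular expressions handle only simple label values without escaped quotes, and the function name is a hypothetical helper.

```python
import re
from typing import Dict

# Matches e.g.: shell_consensus_rounds_total{outcome="committed"} 940
SAMPLE_RE = re.compile(r'^shell_consensus_rounds_total\{([^}]*)\}\s+(\S+)')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def outcome_shares(exposition: str) -> Dict[str, float]:
    """Return the fraction of consensus rounds per `outcome` label value."""
    counts: Dict[str, float] = {}
    for line in exposition.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            continue
        labels = dict(LABEL_RE.findall(match.group(1)))
        outcome = labels.get("outcome")
        if outcome is not None:
            counts[outcome] = counts.get(outcome, 0.0) + float(match.group(2))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {outcome: count / total for outcome, count in counts.items()}
```

A rising `timeout` share is usually more informative than the raw counter value, since the counter itself only ever grows.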
RPC
| Metric | Type | Labels | Description |
|---|---|---|---|
| shell_rpc_requests_total | Counter | method, status | RPC calls by method and HTTP status |
| shell_rpc_duration_seconds | Histogram | method | RPC handler latency |
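Latency percentiles are not stored in a Prometheus histogram. They are estimated from cumulative bucket counts. The Python sketch below shows that estimation using linear interpolation inside the matching bucket, the same general approach as PromQL's `histogram_quantile`. The bucket boundaries and counts are made-up values, since this document does not list the buckets Shell-Chain actually uses.

```python
import math
from typing import List, Tuple


def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must include the +Inf bucket, whose count is the total.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(buckets)
    total = ordered[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in ordered:
        if count >= rank:
            if math.isinf(upper_bound):
                # Quantile falls past the last finite bound; report that bound.
                return lower_bound
            if count == lower_count:
                return lower_bound
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical cumulative buckets for one RPC method, bounds in seconds.
example_buckets = [(0.005, 50), (0.01, 80), (0.05, 95), (0.1, 100), (math.inf, 100)]
p95 = estimate_quantile(0.95, example_buckets)
```

The estimate can never be more precise than the bucket layout allows, which is worth keeping in mind when reading p99 panels.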
Account Abstraction (v0.18.0)
| Metric | Type | Description |
|---|---|---|
| shell_aa_bundles_received_total | Counter | AA bundle transactions admitted to mempool |
| shell_aa_bundles_executed_total | Counter | AA bundles executed (including reverted) |
| shell_aa_inner_calls_total | Counter | Total inner calls executed across all bundles |
| shell_aa_sponsored_txs_total | Counter | Transactions where paymaster paid gas |
Structured Tracing
Shell-Chain uses the tracing crate for structured, async-aware logging.
Control the log level via environment variable:
# Levels: error, warn, info, debug, trace
RUST_LOG=info shell-node
# Per-crate control
RUST_LOG=shell_chain=debug,shell_rpc=info,libp2p=warn shell-node
Spans are instrumented on:
- RPC handler entry/exit (method, params summary, latency)
- Block production tick (height, tx count, duration)
- Consensus step transitions (epoch, round, outcome)
- EVM execution (tx hash, gas used, status)
- Mempool admission (tx hash, type, outcome)
JSON logging
For production deployments, enable structured JSON output:
[logging]
format = "json" # default: "pretty"
This produces newline-delimited JSON for log aggregators (Loki, Elasticsearch, etc.):
{"timestamp":"2026-04-24T12:00:00Z","level":"INFO","target":"shell_rpc","message":"RPC request","method":"eth_blockNumber","latency_ms":0.4}
Grafana Dashboard
A starter Grafana dashboard JSON is included in shell-chain at
infra/grafana/shell-chain-dashboard.json. To import:
- Open Grafana → Dashboards → Import
- Upload shell-chain-dashboard.json
- Set the Prometheus data source to the Prometheus server that scrapes your node's /metrics endpoint
The starter dashboard includes panels for:
- Block production rate
- Mempool size over time
- Peer count
- RPC latency by method (p50 / p95 / p99)
- Consensus round outcomes
- AA bundle execution rate
Added in shell-chain v0.18.0.