The numbers we had to fix
A Dilithium3 transaction carries:
- 3,309 bytes of signature
- ~1,952 bytes of public key (for first-time senders)
- ~170 bytes of metadata
- Plus the actual transfer payload
A worst-case 30M-gas, 2-second block carrying 1,428 simple transfers is ~7.76 MB of raw payload. Multiplied by the 31.5 M blocks in a year that is ~244 TB/year per archive node.
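The arithmetic behind those two figures can be reproduced in a few lines. This is a back-of-the-envelope sketch, not the benchmark harness: it ignores the transfer payload itself, assumes 21,000 gas per simple transfer (which is how 30M gas yields 1,428 transfers), and takes the 31.5 M blocks-per-year figure as given.

```python
# Worked arithmetic for the worst-case block, using the figures quoted above.
SIGNATURE_BYTES = 3_309        # Dilithium3 signature
PUBKEY_BYTES = 1_952           # Dilithium3 public key (first-time sender)
METADATA_BYTES = 170           # approximate per-transaction metadata
GAS_LIMIT = 30_000_000         # worst-case block gas
GAS_PER_TRANSFER = 21_000      # assumption: cost of one simple transfer
BLOCKS_PER_YEAR = 31_500_000   # figure as stated in the text

transfers_per_block = GAS_LIMIT // GAS_PER_TRANSFER
bytes_per_tx = SIGNATURE_BYTES + PUBKEY_BYTES + METADATA_BYTES
bytes_per_block = bytes_per_tx * transfers_per_block
bytes_per_year = bytes_per_block * BLOCKS_PER_YEAR

print(f"transfers per block: {transfers_per_block}")
print(f"per transaction:     {bytes_per_tx} B")
print(f"per block:           {bytes_per_block / 1e6:.2f} MB")
print(f"per year:            {bytes_per_year / 1e12:.0f} TB")
```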
That is not a chain anyone runs. We had to shrink the storage layer to something a normal operator can afford to keep, without sacrificing the ability of audit nodes and explorers to serve full history when they want to.
The answer is two pieces of work shipped in M15 and M16: a three-way column-family pruning model and a storage-profile selector that tells the rest of the network what each node is actually keeping.
v0.15: Three-way block storage
Each block in Shell Chain v0.15+ is split across three column families with three different lifetimes:
b/<hash> — StrippedBlock (header + tx detail, no signatures) ← permanent
w/<hash> — WitnessBundle (Dilithium3 signatures + pubkeys) ← shed after proof
pa/<hash> — ProofAmendment (STARK aggregate of all w/ entries) ← permanent
The lifecycle:
- Block is sealed → b/, w/, header, all written.
- A prover node generates a STARK proof of every signature in w/ → broadcasts a ProofAmendment → all peers store it as pa/.
- After proof_replacement_grace blocks (default 0), w/ is dropped.
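The lifecycle can be illustrated with a toy in-memory model. This is a sketch only: the `BlockStore` class, its method names, and the choice to count the grace period from the height at which the proof arrived are assumptions made for illustration, not the node's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class BlockStore:
    """Toy stand-in for the three column families (plain dicts, not RocksDB)."""
    b: dict = field(default_factory=dict)          # StrippedBlock, permanent
    w: dict = field(default_factory=dict)          # WitnessBundle, shed after proof
    pa: dict = field(default_factory=dict)         # ProofAmendment, permanent
    proved_at: dict = field(default_factory=dict)  # block hash -> height the proof arrived

    def seal(self, block_hash, stripped, witness):
        """Step 1: a sealed block writes both b/ and w/."""
        self.b[block_hash] = stripped
        self.w[block_hash] = witness

    def store_proof(self, block_hash, proof, height):
        """Step 2: a received ProofAmendment is stored under pa/."""
        self.pa[block_hash] = proof
        self.proved_at[block_hash] = height

    def shed(self, height, grace=0):
        """Step 3: drop w/ entries whose proof is at least `grace` blocks old."""
        dropped = []
        for block_hash, at in self.proved_at.items():
            if block_hash in self.w and height - at >= grace:
                del self.w[block_hash]
                dropped.append(block_hash)
        return dropped


store = BlockStore()
store.seal("0xabc", stripped=b"header+txs", witness=b"sigs+pubkeys")
print(store.shed(height=10))            # no proof yet, nothing is dropped
store.store_proof("0xabc", proof=b"stark", height=10)
print(store.shed(height=10, grace=0))   # witness is dropped, b/ and pa/ remain
```

Note that `shed` never touches `b` or `pa`, which mirrors the point that only the witness is ephemeral.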
b/ is permanent because explorers and eth_getTransactionByHash need
transaction detail forever. pa/ is permanent because it is the
cryptographic certificate that the now-deleted w/ was valid. Only w/
is ephemeral — and w/ was 80% of the block weight.
The three compression layers
The 3-way column-family split is the A3 layer. Two more layers compose underneath:
| Layer | Mechanism | Saving |
|---|---|---|
| A1 | Zstd compression on cold column families (b/, pa/) | 8–15% |
| A2 | Public-key dedup — store each pubkey once per sender, reference by hash | ~34% at 95% sender repeat rate |
| A3 | STARK aggregation + witness shedding | up to ~7.1× per batch (see STARK post) |
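The A2 figure follows from the per-transaction sizes quoted earlier. The sketch below is a rough estimate under stated assumptions: it treats a repeat sender as carrying no inline public key, and the `reference_bytes` parameter (the size of the hash reference that replaces the key) is hypothetical and defaults to zero.

```python
SIGNATURE_BYTES = 3_309
PUBKEY_BYTES = 1_952
METADATA_BYTES = 170


def dedup_saving(repeat_rate: float, reference_bytes: int = 0) -> float:
    """Fraction of raw transaction bytes saved by storing each pubkey once.

    repeat_rate: share of transactions whose sender's pubkey is already stored.
    reference_bytes: hypothetical size of the reference left in place of the key.
    """
    raw = SIGNATURE_BYTES + PUBKEY_BYTES + METADATA_BYTES
    saved = repeat_rate * (PUBKEY_BYTES - reference_bytes)
    return saved / raw


print(f"{dedup_saving(0.95):.1%}")                      # no reference overhead
print(f"{dedup_saving(0.95, reference_bytes=32):.1%}")  # with a 32-byte reference
```

Either way the result lands close to the ~34% in the table.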
End-to-end, the worst-case 7.76 MB raw block becomes:
- ~1.5 MB if you keep the proof and the witness for forensic purposes
- ~425 KB if you let the witness shed normally — ~18× reduction
Source of truth: docs/BENCHMARKS.md.
Why three CFs and not one big blob
A single-CF block would force a re-write of the whole block when the
signatures are pruned. With three CFs, the prune is a CF delete on
w/<hash> — a constant-time operation in RocksDB that does not touch the
permanent data.
It also makes the storage model legible to a third party: du -sh on each
CF directly answers "how much of my disk is signatures vs. proofs vs.
transactions?"
v0.16: Storage profiles
v0.15 made the storage cheap. v0.16 made it operationally legible.
A single CLI flag now tells the node what it is:
shell-node run --storage-profile archive # keep b/, w/, pa/ for all blocks
shell-node run --storage-profile full # keep b/ and pa/ permanently; w/ shed
shell-node run --storage-profile light # keep only recent N blocks; back-fill on demand
What each profile means in practice
| Profile | b/ | w/ | pa/ | Disk (1y) | Use case |
|---|---|---|---|---|---|
| archive | All | All | All | ~244 TB worst-case | Audit nodes, explorer back-ends |
| full | All | None* | All | ~14 TB worst-case | Standard validators, most operators |
| light | Last N | None | Last N | ~150 GB | Wallets, indexers, RPC providers |
*full keeps w/ for the proof grace window only — operationally that is
typically a few minutes to a few hours.
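The table reduces to a small retention predicate. The following is a minimal sketch of that rule, assuming hypothetical names (`Profile`, `retained`, `recent_n`, `in_grace`); it is not the node's implementation.

```python
from enum import Enum


class Profile(Enum):
    ARCHIVE = "archive"
    FULL = "full"
    LIGHT = "light"


def retained(profile, cf, age, recent_n, in_grace=False):
    """Return True if column family `cf` ("b", "w" or "pa") is kept for a block.

    age: how many blocks behind the chain head the block is.
    recent_n: the light profile's retention window, in blocks.
    in_grace: whether the block is still inside the proof grace window.
    """
    if profile is Profile.ARCHIVE:
        return True
    if profile is Profile.FULL:
        # b/ and pa/ are permanent; w/ survives only during the grace window.
        return cf in ("b", "pa") or (cf == "w" and in_grace)
    # LIGHT: b/ and pa/ for the most recent N blocks, never w/.
    return cf in ("b", "pa") and age < recent_n


print(retained(Profile.FULL, "w", age=5, recent_n=1000))                 # False
print(retained(Profile.FULL, "w", age=5, recent_n=1000, in_grace=True))  # True
print(retained(Profile.LIGHT, "pa", age=2000, recent_n=1000))            # False
```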
Storage capability advertisement
The new StorageCapability message in the P2P layer announces each node's
profile and the historical block range it can serve. When a light node
needs an old block to answer an eth_getBlockByNumber call, it asks its
peers; only full and archive peers respond.
This means light nodes can join the network without hand-configuring a "trusted RPC fallback" — the network self-organises.
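As an illustration of how a light node might choose whom to ask, here is a sketch of peer selection over received advertisements. The field names on `StorageCapability` (`peer_id`, `profile`, `first_block`, `last_block`) and the `peers_for_block` helper are assumptions; the actual wire format is not described in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageCapability:
    # Hypothetical fields: the post only says the message carries the
    # node's profile and the historical block range it can serve.
    peer_id: str
    profile: str       # "archive", "full" or "light"
    first_block: int   # oldest block the peer can serve (inclusive)
    last_block: int    # newest block the peer can serve (inclusive)


def peers_for_block(adverts, number):
    """Peers with a full or archive profile whose range covers the block."""
    return [
        a.peer_id
        for a in adverts
        if a.profile in ("full", "archive")
        and a.first_block <= number <= a.last_block
    ]


adverts = [
    StorageCapability("peer-a", "archive", 0, 500_000),
    StorageCapability("peer-b", "full", 100_000, 500_000),
    StorageCapability("peer-c", "light", 499_000, 500_000),
]
print(peers_for_block(adverts, 50_000))    # only the archive peer covers this
print(peers_for_block(adverts, 499_500))   # archive and full, never light
```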
Auto back-fill
If a node was started in full mode and the operator later switches to
archive, a back-fill protocol pulls the missing historical bodies from
peers without disturbing the hot path. The back-fill rate is rate-limited
and runs at idle priority, so a back-filling node does not hurt its
forwarding latency.
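The post does not say how the back-fill rate limit is implemented. One common way to cap background transfer is a token bucket, sketched below as an illustration only; the `TokenBucket` class and its parameters are hypothetical, and the clock is passed in explicitly to keep the example deterministic.

```python
class TokenBucket:
    """Byte budget that refills at `rate` bytes/second, capped at `burst` bytes."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = burst   # start with a full budget
        self.last = now

    def try_take(self, n, now):
        """Spend `n` bytes of budget if available; return whether it was granted."""
        # Refill for the time elapsed since the last call, capped at burst.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if n <= self.tokens:
            self.tokens -= n
            return True
        return False


bucket = TokenBucket(rate=1_000_000, burst=2_000_000)
print(bucket.try_take(2_000_000, now=0.0))  # True: the initial burst is available
print(bucket.try_take(1, now=0.0))          # False: budget is exhausted
print(bucket.try_take(1_000_000, now=1.0))  # True: one second of refill
```

A back-fill loop would call `try_take` before requesting each historical body and defer the request when it returns False, leaving bandwidth for live forwarding.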
Operational impact (measured)
We ran the full network with a mix of profiles for 8 weeks before tagging v0.16. Observed steady-state disk growth on Apple-M-series VMs at a sustained 50 tx/s gossip rate:
| Profile | Daily growth | After 30 days |
|---|---|---|
| archive | ~67 GB/day | 2.0 TB |
| full | ~3.9 GB/day | 117 GB |
| light | constant ~120 GB | 120 GB |
The 17× difference between archive and full is what makes operating a
post-quantum chain practical. The constant footprint of light is
what enables a Chrome-extension wallet to ship without a back-end.
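The 30-day column and the 17× ratio follow directly from the measured daily growth, as this quick check shows. It uses only the numbers in the table and assumes growth stays linear over the window.

```python
# Daily growth per profile, in GB, as measured above.
daily_gb = {"archive": 67.0, "full": 3.9}

after_30_days_gb = {name: rate * 30 for name, rate in daily_gb.items()}
ratio = daily_gb["archive"] / daily_gb["full"]

print(f"archive after 30 days: {after_30_days_gb['archive'] / 1000:.1f} TB")
print(f"full after 30 days:    {after_30_days_gb['full']:.0f} GB")
print(f"archive / full:        {ratio:.0f}x")
```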
What is not yet shipped
- L3 trie pruning: the ability to drop historical state trie nodes once the proof chain certifies the current state root. Refcount infrastructure (refs/<node_hash>) is in place; the cut-over waits on recursive STARK proving (M19+).
- Cross-tier read coalescing: when an eth_getLogs call spans hot, warm and cold, we currently issue separate reads. Coalescing is a ~30% wall-clock win on benchmark traces.
- Snapshot streams for fast sync: partially in v0.16; the light → full upgrade path still requires a full back-fill, not a snapshot delta.
Why this matters for investors and operators
A blockchain whose nodes cannot be operated affordably becomes centralised — only large entities can keep history. Centralisation is exactly the failure mode we built Shell Chain to prevent.
The 3-way storage model and the storage-profile system are not glamorous features. They are the difference between a chain that decentralises sustainably and one that does not.
Read more: STARK Signature Aggregation · Benchmarks · Operator Guide