
Storage Architecture: 3-Way Block Pruning & Storage Profiles (v0.15–v0.16)

Shell Chain Team
storage · architecture · scalability · performance

The numbers we had to fix

A Dilithium3 transaction carries:

  • 3,309 bytes of signature
  • ~1,952 bytes of public key (for first-time senders)
  • ~170 bytes of metadata
  • Plus the actual transfer payload

A worst-case 30M-gas, 2-second block carrying 1,428 simple transfers is ~7.76 MB of raw payload (1,428 × ~5,431 bytes of signature, public key and metadata). Multiplied by the 31.5 M blocks in a year, that is ~244 TB/year per archive node.
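The arithmetic behind those figures can be reproduced in a few lines. This is a sketch under stated assumptions: the 21,000-gas cost of a simple transfer is the usual EVM figure and is assumed here rather than quoted from the post, the transfer payload itself is left out, and the blocks-per-year count is taken from the text as given.

```python
# Per-transaction byte counts from the list above.
SIGNATURE_BYTES = 3_309
PUBKEY_BYTES = 1_952          # first-time senders only
METADATA_BYTES = 170

BLOCK_GAS_LIMIT = 30_000_000
GAS_PER_TRANSFER = 21_000     # assumption: EVM-style simple transfer cost
# Figure quoted in the text. 31.5 M is roughly the number of seconds in a
# year, so it implies about one block per second.
BLOCKS_PER_YEAR = 31_500_000


def transfers_per_block() -> int:
    """How many simple transfers fit in one worst-case block."""
    return BLOCK_GAS_LIMIT // GAS_PER_TRANSFER


def raw_block_bytes() -> int:
    """Signature + pubkey + metadata for every transfer, payload excluded."""
    per_tx = SIGNATURE_BYTES + PUBKEY_BYTES + METADATA_BYTES
    return transfers_per_block() * per_tx


def raw_year_terabytes() -> float:
    """Annual raw volume for a node that keeps everything."""
    return raw_block_bytes() * BLOCKS_PER_YEAR / 1e12


print(transfers_per_block(), raw_block_bytes(), round(raw_year_terabytes(), 1))
```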

That is not a chain anyone runs. We had to cut the storage layer down to something a normal operator can afford to keep, without sacrificing the ability of audit nodes and explorers to serve full history when they want to.

The answer is two pieces of work shipped in M15 and M16: a three-way column-family pruning model and a storage-profile selector that tells the rest of the network what each node is actually keeping.


v0.15: Three-way block storage

Each block in Shell Chain v0.15+ is split across three column families, two permanent and one ephemeral:

b/<hash>  — StrippedBlock      (header + tx detail, no signatures)   ← permanent
w/<hash>  — WitnessBundle      (Dilithium3 signatures + pubkeys)     ← shed after proof
pa/<hash> — ProofAmendment     (STARK aggregate of all w/ entries)   ← permanent

The lifecycle:

  1. Block is sealed → b/, w/, header, all written.
  2. A prover node generates a STARK proof of every signature in w/ → broadcasts a ProofAmendment → all peers store it as pa/.
  3. After proof_replacement_grace blocks (default 0), w/ is dropped.

b/ is permanent because explorers and eth_getTransactionByHash need transaction detail forever. pa/ is permanent because it is the cryptographic certificate that the now-deleted w/ was valid. Only w/ is ephemeral — and w/ was 80% of the block weight.
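The lifecycle above can be modelled with three dictionaries standing in for the column families. This is a minimal in-memory sketch, not the node's implementation: the class and method names are hypothetical, and only `proof_replacement_grace` (here `grace_blocks`) comes from the post.

```python
from dataclasses import dataclass, field


@dataclass
class BlockStore:
    """In-memory stand-in for the b/, w/ and pa/ column families."""
    grace_blocks: int = 0                          # mirrors proof_replacement_grace
    b: dict = field(default_factory=dict)          # StrippedBlock, permanent
    w: dict = field(default_factory=dict)          # WitnessBundle, shed after proof
    pa: dict = field(default_factory=dict)         # ProofAmendment, permanent
    proved_at: dict = field(default_factory=dict)  # block hash -> height the proof arrived

    def seal(self, block_hash, stripped, witness):
        """Step 1: a sealed block writes both its body and its witness."""
        self.b[block_hash] = stripped
        self.w[block_hash] = witness

    def apply_amendment(self, block_hash, proof, current_height):
        """Step 2: store the proof that certifies the witness."""
        if block_hash not in self.b:
            raise KeyError(f"unknown block {block_hash!r}")
        self.pa[block_hash] = proof
        self.proved_at[block_hash] = current_height

    def prune(self, current_height):
        """Step 3: drop witnesses whose proof is at least grace_blocks old."""
        shed = [h for h, at in self.proved_at.items()
                if h in self.w and current_height - at >= self.grace_blocks]
        for h in shed:
            del self.w[h]          # b/ and pa/ are never touched
        return shed
```

With the default grace of 0, a witness is eligible for shedding as soon as its proof has been stored; a witness without a proof is never dropped.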

The three compression layers

The 3-way column-family split is the A3 layer. Two more layers compose underneath:

Layer  Mechanism                                                                Saving
A1     Zstd compression on cold column families (b/, pa/)                       8–15%
A2     Public-key dedup: store each pubkey once per sender, reference by hash   ~34% at 95% sender repeat rate
A3     STARK aggregation + witness shedding                                     up to ~7.1× per batch (see STARK post)
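The A2 figure can be sanity-checked against the per-transaction sizes quoted at the top of the post. The sketch below stores each public key once under its hash and estimates the saving; SHA-256 and the `PubkeyStore` name are assumptions made for illustration, and the estimate ignores the bytes spent on the reference itself.

```python
import hashlib

PUBKEY_BYTES = 1_952
TX_BYTES = 3_309 + 1_952 + 170     # signature + pubkey + metadata


class PubkeyStore:
    """Content-addressed pubkey table: one copy per distinct key."""

    def __init__(self):
        self._by_hash = {}

    def put(self, pubkey: bytes) -> bytes:
        """Store the key if it is new and return the reference to embed in the tx."""
        digest = hashlib.sha256(pubkey).digest()
        self._by_hash.setdefault(digest, pubkey)
        return digest

    def get(self, digest: bytes) -> bytes:
        return self._by_hash[digest]

    def stored_bytes(self) -> int:
        return sum(len(k) for k in self._by_hash.values())


def dedup_saving(repeat_rate: float) -> float:
    """Fraction of per-tx bytes saved when repeat_rate of txs reuse a known key."""
    return repeat_rate * PUBKEY_BYTES / TX_BYTES


store = PubkeyStore()
key = bytes(PUBKEY_BYTES)
ref_a, ref_b = store.put(key), store.put(key)   # second put stores nothing new
print(ref_a == ref_b, store.stored_bytes(), round(dedup_saving(0.95), 3))
```

At a 95% repeat rate the estimate comes out at roughly 0.34, in line with the table.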

End-to-end, the worst-case 7.76 MB raw block becomes:

  • ~1.5 MB if you keep the proof and the witness for forensic purposes
  • ~425 KB if you let the witness shed normally — ~18× reduction

Source of truth: docs/BENCHMARKS.md.

Why three CFs and not one big blob

A single-CF block would force a re-write of the whole block when the signatures are pruned. With three CFs, the prune is a single key delete on w/<hash>: in RocksDB that is a small tombstone write, with the space reclaimed at compaction, and it does not touch the permanent data.

It also makes the storage model legible to a third party: du -sh on each CF directly answers "how much of my disk is signatures vs. proofs vs. transactions?"


v0.16: Storage profiles

v0.15 made the storage cheap. v0.16 made it operationally legible.

A single CLI flag now tells the node what it is:

shell-node run --storage-profile archive   # keep b/, w/, pa/ for all blocks
shell-node run --storage-profile full      # keep b/ and pa/ permanently; w/ shed
shell-node run --storage-profile light     # keep only recent N blocks; back-fill on demand
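One way to picture what the flag selects is a mapping from profile name to retention policy. The sketch below is illustrative only: `Retention`, `parse_profile`, `retention_for` and the `recent_blocks` parameter are hypothetical names, and the post does not state the value of N for light nodes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class StorageProfile(Enum):
    ARCHIVE = "archive"
    FULL = "full"
    LIGHT = "light"


@dataclass(frozen=True)
class Retention:
    keep_witness: bool        # keep w/ beyond the proof grace window?
    window: Optional[int]     # None = every block; N = only the most recent N


def parse_profile(flag_value: str) -> StorageProfile:
    # Enum lookup by value raises ValueError for an unknown profile name.
    return StorageProfile(flag_value.strip().lower())


def retention_for(profile: StorageProfile, recent_blocks: int) -> Retention:
    """Translate a profile into what the node keeps on disk."""
    if profile is StorageProfile.ARCHIVE:
        return Retention(keep_witness=True, window=None)
    if profile is StorageProfile.FULL:
        return Retention(keep_witness=False, window=None)
    # light: recent blocks only; older data is back-filled on demand
    return Retention(keep_witness=False, window=recent_blocks)
```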

What each profile means in practice

Profile  b/      w/     pa/     Disk (1y)           Use case
archive  All     All    All     ~244 TB worst-case  Audit nodes, explorer back-ends
full     All     None*  All     ~14 TB worst-case   Standard validators, most operators
light    Last N  None   Last N  ~150 GB             Wallets, indexers, RPC providers

*full keeps w/ for the proof grace window only — operationally that is typically a few minutes to a few hours.
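The worst-case disk column follows from the per-block sizes quoted earlier. A rough check, using the post's rounded figures as inputs (so the results are approximate, and the light profile is omitted because N is not stated):

```python
RAW_BLOCK_BYTES = 7_760_000      # ~7.76 MB worst-case raw block
SHED_BLOCK_BYTES = 425_000       # ~425 KB once the witness is shed
BLOCKS_PER_YEAR = 31_500_000     # figure quoted in the post


def yearly_tb(bytes_per_block: int) -> float:
    """Annual disk use in terabytes for a given retained block size."""
    return bytes_per_block * BLOCKS_PER_YEAR / 1e12


archive_tb = yearly_tb(RAW_BLOCK_BYTES)
full_tb = yearly_tb(SHED_BLOCK_BYTES)
reduction = RAW_BLOCK_BYTES / SHED_BLOCK_BYTES

print(round(archive_tb, 1), round(full_tb, 1), round(reduction, 1))
```

The archive result matches the table's ~244 TB. The full result lands between 13 and 14 TB, in the same range as the table's ~14 TB, and the ratio is the ~18× reduction quoted above.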

Storage capability advertisement

The new StorageCapability message in the P2P layer announces each node's profile and the historical block range it can serve. When a light node needs an old block to answer an eth_getBlockByNumber call, it asks its peers; only full and archive peers respond.

This means light nodes can join the network without hand-configuring a "trusted RPC fallback" — the network self-organises.
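A sketch of how a light node might use these advertisements to pick peers for a historical block. The field names are hypothetical; the post only says the message carries the profile and the block range a node can serve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageCapability:
    peer_id: str
    profile: str         # "archive", "full" or "light"
    first_block: int     # lowest height the peer can serve (inclusive)
    last_block: int      # highest height the peer can serve (inclusive)

    def can_serve(self, height: int) -> bool:
        return self.first_block <= height <= self.last_block


def peers_for_block(adverts, height):
    """Peers worth asking: full or archive nodes whose range covers the height."""
    return [a.peer_id for a in adverts
            if a.profile in ("full", "archive") and a.can_serve(height)]


adverts = [
    StorageCapability("peer-a", "archive", 0, 900_000),
    StorageCapability("peer-b", "full", 0, 900_000),
    StorageCapability("peer-c", "light", 890_000, 900_000),
]
print(peers_for_block(adverts, 895_000))   # peer-c covers the height but is light
```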

Auto back-fill

If a node was started in full mode and the operator later switches to archive, a back-fill protocol pulls the missing historical data (the w/ bundles a full node has already shed) from peers without disturbing the hot path. Back-fill is rate-limited and runs at idle priority, so a back-filling node does not hurt its forwarding latency.
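Rate-limited, idle-priority fetching is commonly built on a token bucket. The post does not say which mechanism the node uses, so the sketch below is one possible shape under that assumption, with hypothetical names and with time passed in explicitly to keep it deterministic.

```python
class TokenBucket:
    """Classic token bucket: refills at a fixed rate up to a burst cap."""

    def __init__(self, rate_per_sec: float, burst: float):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = burst
        self.last = 0.0

    def try_take(self, cost: float, now: float) -> bool:
        """Refill for the elapsed time, then spend cost if the budget allows."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


def backfill_step(bucket, pending, node_is_idle, now, cost=1.0):
    """Fetch at most one pending item, only when idle and within budget."""
    if not pending or not node_is_idle:
        return None                  # hot path has priority
    if not bucket.try_take(cost, now):
        return None                  # over budget, try again later
    return pending.pop(0)
```

Checking idleness before touching the bucket means a busy node neither fetches nor spends budget, which is the property the paragraph above describes.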


Operational impact (measured)

We ran the network with a mix of profiles for 8 weeks before tagging v0.16. Observed steady-state disk growth on Apple M-series VMs at a sustained 50 tx/s gossip rate:

Profile  Daily growth      After 30 days
archive  ~67 GB/day        2.0 TB
full     ~3.9 GB/day       117 GB
light    constant ~120 GB  120 GB

The 17× difference between archive and full is what makes operating a post-quantum chain practical. The constant footprint of light is what enables a Chrome-extension wallet to ship without a back-end.
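The 30-day column and the 17× ratio follow directly from the daily rates. A small projection helper, using only the measured figures from the table (the function name is hypothetical):

```python
DAILY_GROWTH_GB = {"archive": 67.0, "full": 3.9}   # measured, from the table
LIGHT_CONSTANT_GB = 120.0                          # light does not grow


def disk_after(profile: str, days: int) -> float:
    """Projected disk use in GB after a number of days at steady state."""
    if profile == "light":
        return LIGHT_CONSTANT_GB
    return DAILY_GROWTH_GB[profile] * days


archive_vs_full = DAILY_GROWTH_GB["archive"] / DAILY_GROWTH_GB["full"]

for name in ("archive", "full", "light"):
    print(name, round(disk_after(name, 30)))
print(round(archive_vs_full, 1))
```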


What is not yet shipped

  • L3 trie pruning — the ability to drop historical state trie nodes once the proof chain certifies the current state root. Refcount infrastructure (refs/<node_hash>) is in place; the cut-over waits on recursive STARK proving (M19+).
  • Cross-tier read coalescing — when an eth_getLogs call spans hot, warm and cold, we currently issue separate reads. Coalescing is a ~30% wallclock win on benchmark traces.
  • Snapshot streams for fast sync — partially in v0.16; the light → full upgrade path still requires a full back-fill, not a snapshot delta.

Why this matters for investors and operators

A blockchain whose nodes cannot be operated affordably becomes centralised — only large entities can keep history. Centralisation is exactly the failure mode we built Shell Chain to prevent.

The 3-way storage model and the storage-profile system are not glamorous features. They are the difference between a chain that decentralises sustainably and one that does not.


Read more: STARK Signature Aggregation · Benchmarks · Operator Guide