The Post-Quantum Signature Problem
Dilithium3 (standardised as ML-DSA-65) signatures are 3,309 bytes each. Public keys are 1,952 bytes (stored once per new sender). A transaction from a new sender therefore carries 5,261 bytes of authentication data alone. By contrast, an Ethereum ECDSA signature is 65 bytes.
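The per-transaction arithmetic can be checked with a small sketch. The constant and function names below are illustrative, not taken from the Shell Chain codebase; only the byte sizes come from the text.

```rust
/// Sizes quoted in this post, in bytes.
pub const DILITHIUM3_SIG_BYTES: usize = 3_309;
pub const DILITHIUM3_PUBKEY_BYTES: usize = 1_952;
pub const ECDSA_SIG_BYTES: usize = 65;

/// Authentication bytes for a transaction from a new sender
/// (signature plus public key).
pub fn auth_bytes_new_sender() -> usize {
    DILITHIUM3_SIG_BYTES + DILITHIUM3_PUBKEY_BYTES
}

/// How many times larger that payload is than one ECDSA signature.
pub fn overhead_vs_ecdsa() -> f64 {
    auth_bytes_new_sender() as f64 / ECDSA_SIG_BYTES as f64
}
```

Under these figures a new-sender transaction carries roughly 81 times the authentication data of an ECDSA-signed one.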
For a 30M-gas, 2-second block carrying the maximum ~1,428 simple transfers, that is ~7.76 MB of raw authentication payload per block. Storing that forever on every node is not viable. We had to build a way to prove the signatures were valid, then throw the signatures away.
The Insight: Signatures Are Ephemeral, Proofs Are Permanent
A Dilithium3 signature exists for one purpose: proving that the private key holder authorised this transaction. Once a proof system can independently certify "all N signatures in block B were valid," the original signatures become redundant.
This is what Shell Chain's STARK aggregation does:
Block sealed with N Dilithium3 signatures (stored as WitnessBundle)
↓
Prover runs SigBatchCircuit over all N signatures
↓
ProofAmendment{SigBatchProof} (constant size regardless of N)
↓
Original WitnessBundle deleted from storage tier
Combined with the upstream layers — A1 Zstd at the column-family level (8–15% saving) and A2 public-key dedup (~34% saving at 95% sender repeat rate) — the full pipeline takes a worst-case 7.76 MB raw block to ~425 KB pruned, an end-to-end ~18× reduction. The STARK proof itself is retained permanently; the witness data is shed once the proof arrives.
Numbers throughout this post come from docs/BENCHMARKS.md at v0.15.0+. Treat that file as the source of truth, not this post.
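As a quick sanity check on the headline ratio, the reduction factor is just raw size over pruned size. This sketch assumes decimal units (1 MB = 10^6 bytes, 1 KB = 10^3 bytes); the function names are illustrative.

```rust
/// Reduction factor: how many times smaller the pruned block is.
pub fn reduction_factor(raw_bytes: f64, pruned_bytes: f64) -> f64 {
    raw_bytes / pruned_bytes
}

/// Fraction of the raw block that is no longer stored.
pub fn fraction_saved(raw_bytes: f64, pruned_bytes: f64) -> f64 {
    1.0 - pruned_bytes / raw_bytes
}
```

With 7.76 MB raw and 425 KB pruned, the factor comes out a little above 18, which matches the rounded ~18× figure.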
Why STARKs Over SNARKs
We evaluated several proof systems before settling on STARKs.
The trusted setup problem
Most practical SNARKs (Groth16, PLONK, Marlin) require a trusted setup ceremony — a multi-party computation that produces public parameters. If the ceremony is compromised, a malicious prover can forge proofs indistinguishable from valid ones. This is an operational and trust assumption we wanted to avoid entirely.
STARKs have no trusted setup. Security relies only on collision-resistant hash functions — which, unlike elliptic curve discrete log, have a clear post-quantum security story (double the output size for 128-bit PQ security).
The transparency and auditability argument
A STARK's proof system is fully transparent. The AIR (Algebraic Intermediate Representation) that encodes the computation is public and can be independently audited. There's no "toxic waste" from a setup ceremony that must be destroyed and trusted to have been destroyed.
For a chain whose threat model explicitly includes well-funded adversaries (including nation-state actors preparing for the post-quantum era), eliminating setup ceremony risk is worth the larger proof size.
Proof size vs. verify time
SNARKs have smaller proofs (typically 128–256 bytes for Groth16) and fast on-chain verification. STARKs are larger (Winterfell proofs are in the ~10–20 KB band) with slower verification.
For Shell Chain's use case — off-chain batch verification, not on-chain
per-tx verification — this tradeoff is acceptable. The proof is verified once
by each peer when it receives the ProofAmendment; it doesn't need to be
re-verified by an EVM contract on every query.
FRI and concrete security
Winterfell uses FRI (Fast Reed-Solomon Interactive Oracle Proof of Proximity) as its polynomial commitment scheme (PCS). FRI's concrete security is well understood: it relies on the random oracle model and on the hardness of certain proximity problems for Reed-Solomon codes, for which no quantum speedup beyond a square-root (Grover) factor is known.
By choosing a STARK over a SNARK, we get:
- No trusted setup
- Post-quantum soundness (given a large enough field)
- Transparent, auditable proof system
- No pairing-based cryptography (pairing-friendly curves have unclear PQ security)
The Implementation: SigBatchCircuit on Winterfell
Circuit design
The SigBatchCircuit is a Winterfell AIR that encodes the following statement:
"For each of the N (pubkey, message, signature) tuples in this batch, the Dilithium3 verification algorithm outputs 'valid'."
The trace layout:
| Column | Content |
|---|---|
| 0..k-1 | Expanded signature state (verification intermediate values) |
| k | Boolean: signature i is valid |
| k+1 | Running accumulator (batch_root) |
At each row, the AIR enforces the Dilithium3 polynomial multiplications and
modular reductions. The final row's batch_root column is the public output
(committed to in the ProofAmendment).
SigBatchProof structure
pub struct SigBatchProof {
    pub version: u8,
    pub batch_root_bytes: [u8; 16], // final accumulator (public output)
    pub n_sigs: usize,
    pub proof_bytes: Vec<u8>,       // raw Winterfell Proof
}
batch_root_bytes is a 128-bit value. The Winterfell base field is F_p where
p = 2^64 - 2^32 + 1 (the 64-bit Goldilocks prime), so the 16 bytes span two
base-field elements.
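For readers unfamiliar with this field, here is a minimal sketch of arithmetic modulo p using plain `u128` intermediates. It does not use Winterfell's own field types. The `fold` accumulator (acc' = acc * alpha + item) is a hypothetical illustration of a running accumulator, not the batch_root construction used by SigBatchCircuit.

```rust
/// p = 2^64 - 2^32 + 1, the prime quoted above.
pub const P: u64 = 0xFFFF_FFFF_0000_0001;

/// Addition modulo p; the u128 intermediate cannot overflow.
pub fn add(a: u64, b: u64) -> u64 {
    ((a as u128 + b as u128) % (P as u128)) as u64
}

/// Multiplication modulo p; the product of two u64 values fits in u128.
pub fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % (P as u128)) as u64
}

/// Hypothetical running accumulator: acc' = acc * alpha + item (mod p).
/// Illustrative only; the real batch_root rule is defined by the AIR.
pub fn fold(items: &[u64], alpha: u64) -> u64 {
    items.iter().fold(0, |acc, &x| add(mul(acc, alpha), x))
}
```

A useful identity for checking an implementation: since 2^64 ≡ 2^32 - 1 (mod p), `mul(1 << 32, 1 << 32)` should equal `(1 << 32) - 1`.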
ProofAmendment broadcast
pub struct ProofAmendment {
    pub version: u8,
    pub block_hash: ShellHash,
    pub block_number: u64,
    pub proof: SigBatchProof,
    pub prover: Address,
    pub prover_signature: Bytes, // Dilithium3 sig over (block_hash ‖ block_number ‖ batch_root)
}
The prover_signature binds the amendment to a registered prover: any node can
check that it was produced by that prover and not injected by an attacker.
Whether the block's signatures were actually valid is established by the STARK
proof itself, not by the prover's signature.
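The signed message is described above as block_hash ‖ block_number ‖ batch_root. A minimal sketch of building that byte string follows. It assumes a 32-byte ShellHash and a big-endian encoding of block_number. Neither is confirmed by this post, and the function name is hypothetical.

```rust
/// Builds the bytes the prover signs: block_hash ‖ block_number ‖ batch_root.
/// Assumptions (not confirmed by the post): 32-byte hash, big-endian u64.
pub fn prover_signing_payload(
    block_hash: &[u8; 32],
    block_number: u64,
    batch_root: &[u8; 16],
) -> Vec<u8> {
    let mut out = Vec::with_capacity(32 + 8 + 16);
    out.extend_from_slice(block_hash);
    out.extend_from_slice(&block_number.to_be_bytes());
    out.extend_from_slice(batch_root);
    out
}
```

Fixed-width fields keep the concatenation unambiguous: no two distinct (hash, number, root) triples can produce the same payload.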
Verification on the receiving peer
Receive ProofAmendment
│
├─ Check prover ∈ ProverRegistry
├─ Verify prover_signature (Dilithium3)
└─ Verify SigBatchProof via verify_sig_batch()
│
├─ Reconstruct public inputs from block's WitnessBundle
├─ Run Winterfell verifier (FRI + DEEP-ALI)
└─ Check batch_root matches claimed value
If all checks pass: store pa/<hash>, delete w/<hash>.
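The order of checks above can be sketched as a short-circuiting function. The three closures stand in for the registry lookup, the Dilithium3 check and `verify_sig_batch()` (including the batch_root comparison). The error type and function name are hypothetical.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum AmendmentError {
    UnknownProver,
    BadProverSignature,
    InvalidProof,
}

/// Runs the checks cheapest-first. Each closure is only called if the
/// previous check passed, so the STARK verifier runs last.
pub fn check_amendment(
    prover_registered: impl FnOnce() -> bool,
    prover_signature_ok: impl FnOnce() -> bool,
    stark_proof_ok: impl FnOnce() -> bool,
) -> Result<(), AmendmentError> {
    if !prover_registered() {
        return Err(AmendmentError::UnknownProver);
    }
    if !prover_signature_ok() {
        return Err(AmendmentError::BadProverSignature);
    }
    if !stark_proof_ok() {
        return Err(AmendmentError::InvalidProof);
    }
    Ok(())
}
```

Only an `Ok(())` result should lead to storing pa/<hash> and deleting w/<hash>.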
The Storage Architecture
The STARK pipeline is the third compression layer in Shell Chain's end-to-end block-storage stack. Its data lives in three column families, two permanent and one temporary:
b/<hash> — StrippedBlock (TX detail) ← permanent
w/<hash> — WitnessBundle (PQ signatures) ← deleted after proof
pa/<hash> — ProofAmendment (STARK proof) ← permanent
This means:
- Explorers always have full transaction history (from b/)
- Verifiers always have cryptographic proof of block validity (from pa/)
- Disk space is recovered by shedding WitnessBundle once the corresponding ProofAmendment arrives
For the wider tiering story — hot/warm/cold and the Zstd column-family layer — see the dedicated Storage Architecture post.
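The three column families and their lifetimes can be modelled with an in-memory sketch. Plain `HashMap`s stand in for the real storage engine, payloads are opaque byte vectors, and the type and method names are hypothetical.

```rust
use std::collections::HashMap;

pub type Hash = [u8; 32];

/// In-memory stand-in for the three column families.
#[derive(Default)]
pub struct BlockStore {
    pub blocks: HashMap<Hash, Vec<u8>>,     // b/<hash>, permanent
    pub witnesses: HashMap<Hash, Vec<u8>>,  // w/<hash>, deleted after proof
    pub amendments: HashMap<Hash, Vec<u8>>, // pa/<hash>, permanent
}

impl BlockStore {
    /// Stores a freshly sealed block together with its witness bundle.
    pub fn seal(&mut self, hash: Hash, stripped: Vec<u8>, witness: Vec<u8>) {
        self.blocks.insert(hash, stripped);
        self.witnesses.insert(hash, witness);
    }

    /// Stores a verified amendment and sheds the witness.
    /// Returns false if the block is unknown.
    pub fn attach_proof(&mut self, hash: Hash, amendment: Vec<u8>) -> bool {
        if !self.blocks.contains_key(&hash) {
            return false;
        }
        self.amendments.insert(hash, amendment);
        self.witnesses.remove(&hash);
        true
    }
}
```

This sketch deletes the witness immediately, which corresponds to a grace window of zero blocks.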
Asynchronous Proving: Design Choices
Why async?
Dilithium3 verification is fast (~1 ms per signature on modern hardware). STARK proof generation for a non-trivial batch takes significantly longer — on the order of seconds depending on hardware and circuit complexity.
Block production cannot wait for proofs. The chain would stall.
Instead, Shell Chain separates consensus time from proof time:
- Block is sealed and propagated immediately (using native signature verification)
- A prover node generates the proof in the background
- ProofAmendment is broadcast and attached after the fact
The proof doesn't need to be ready before the next block. It just needs to arrive before the block is pruned from witness storage.
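The separation of consensus time from proof time can be sketched with a background thread and two channels from the standard library. The struct names and `spawn_prover` are hypothetical, and proof generation is replaced by a placeholder.

```rust
use std::sync::mpsc;
use std::thread;

pub struct SealedBlock {
    pub number: u64,
    pub n_sigs: usize,
}

pub struct Amendment {
    pub block_number: u64,
    pub n_sigs: usize,
}

/// Spawns a background prover. The caller keeps sealing blocks and sending
/// them on the returned sender; amendments arrive later on the receiver.
pub fn spawn_prover() -> (mpsc::Sender<SealedBlock>, mpsc::Receiver<Amendment>) {
    let (block_tx, block_rx) = mpsc::channel::<SealedBlock>();
    let (proof_tx, proof_rx) = mpsc::channel::<Amendment>();
    thread::spawn(move || {
        // The loop ends when every block sender has been dropped.
        for block in block_rx {
            // Placeholder for the (slow) STARK proof generation.
            let amendment = Amendment {
                block_number: block.number,
                n_sigs: block.n_sigs,
            };
            if proof_tx.send(amendment).is_err() {
                break; // nobody is listening for proofs any more
            }
        }
    });
    (block_tx, proof_rx)
}
```

Sending on the channel never blocks the block producer, which is the property the design relies on.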
The grace window
proof_replacement_grace (default: 0 blocks) controls how long to keep the
WitnessBundle after a proof arrives. For production use with STARK nodes,
the default is immediate deletion. For forensic or debugging use:
[pruning]
proof_replacement_grace = 302400 # keep for ~7 days at the 2-second block time
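Because the grace window is expressed in blocks, the value for a given retention period depends on the block time. A small sketch of the conversion and of the resulting deletion rule follows. The function names are hypothetical, and the rule (delete once the head is at least `grace` blocks past the block at which the proof arrived) is one plausible reading of the setting.

```rust
/// Number of blocks produced during `retention_secs` at a fixed block time.
pub fn grace_blocks(retention_secs: u64, block_time_secs: u64) -> u64 {
    assert!(block_time_secs > 0, "block time must be non-zero");
    retention_secs / block_time_secs
}

/// True once the chain head is at least `grace` blocks past the block
/// height at which the proof arrived.
pub fn witness_deletable(proof_arrived_at: u64, head: u64, grace: u64) -> bool {
    head >= proof_arrived_at.saturating_add(grace)
}
```

Seven days is 604,800 seconds, so the window is 604,800 blocks at one block per second and 302,400 blocks at one block every two seconds.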
Proving priority
For prover nodes catching up after downtime:
[prover]
proving_priority = "latest-first" # prove newest blocks first
This ensures recent blocks get proofs quickly even if historical blocks are waiting — useful when disk pressure from WitnessBundles is more urgent than archival completeness.
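The effect of the setting on a backlog can be sketched as a sort over pending block numbers. Only "latest-first" appears in this post; the `OldestFirst` variant and the function name are hypothetical, added for contrast.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ProvingPriority {
    LatestFirst,
    OldestFirst, // hypothetical alternative, for contrast
}

/// Returns the backlog in the order a prover would work through it.
pub fn order_backlog(mut pending: Vec<u64>, priority: ProvingPriority) -> Vec<u64> {
    pending.sort_unstable();
    if priority == ProvingPriority::LatestFirst {
        pending.reverse();
    }
    pending
}
```

With latest-first, a prover returning from downtime clears the newest WitnessBundles first and works backwards through history.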
Benchmarks (v0.15.0, A3 layer alone)
Measured on commodity Apple-M-series hardware, single prover process:
| Metric | Value |
|---|---|
| Peak compression (5-tx batch) | 7.1× |
| Sustained throughput | 157 proofs/sec |
| Mean latency | 6.4 ms |
| p99 latency | 18.7 ms |
| Soak duration (continuous) | 6 h 04 min |
| Proofs generated during soak | 3,403,200 |
| Failures during soak | 0 |
| Prover RAM | 312 MB |
| Prover CPU | 38% (single core) |
Source: docs/BENCHMARKS.md#a3-stark-signature-aggregation-v0150.
End-to-end (A1 Zstd + A2 dedup + A3 STARK), the 7.76 MB worst-case raw block becomes ~1.5 MB with the proof retained, or ~425 KB once the proof window allows shedding the witness — the ~18× number quoted on the marketing site.
What's Next
Recursive proofs
The current SigBatchProof proves one block at a time. A natural extension is
recursive aggregation: prove that a proof of block N and a proof of block N+1
are both valid, producing a single proof for both blocks. This would reduce
proof storage further and enable efficient light client verification.
L3 trie pruning
The STARK-proven block history makes it possible to trustlessly prune state
trie nodes: a light client can verify the current state root is correct by
checking the STARK proof chain from genesis without replaying every transaction.
We've implemented the refcount infrastructure (refs/<node_hash>) and will
enable this once the proof chain is sufficiently mature.
Multi-prover networks
Multiple registered provers compete to submit ProofAmendments first. The
first valid proof wins; duplicates are discarded. This creates an organic
proving market without special incentive mechanisms.
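The first-valid-proof-wins rule can be sketched with a map keyed by block hash. The function name, the 32-byte hash and the use of a string as prover identity are all assumptions for illustration.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Records the first valid proof for a block and rejects everything else.
/// Returns true only if this submission was accepted.
pub fn accept_first(
    accepted: &mut HashMap<[u8; 32], String>,
    block_hash: [u8; 32],
    prover: &str,
    proof_valid: bool,
) -> bool {
    if !proof_valid {
        return false; // invalid proofs never occupy the slot
    }
    match accepted.entry(block_hash) {
        Entry::Vacant(slot) => {
            slot.insert(prover.to_string());
            true
        }
        Entry::Occupied(_) => false, // duplicate: an earlier proof already won
    }
}
```

Checking validity before touching the map matters: an invalid submission must not block a later valid one.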
Read more: Block Pruning & Compression · Prover Guide · Benchmarks