The NaPTAN change stream, made to misbehave on purpose.

A working model of a data-platform ingestion pipeline — validation, dedup, exactly-once apply, backpressure, and an atomic backfill cutover. Inject bad records and duplicates, push the rate past capacity, and trigger a backfill while the stream is live. The guarantee at the bottom proves nothing was lost or double-applied. Runs entirely in your browser.

Rate 80 eps

Inject invalid 15% · Inject duplicates 20%

Source

events ingested

→

Validate

rejected at gate

→

Dedup

duplicates dropped

→

Apply · exactly-once

committed to store

Backpressure queue

0 buffered

Backfill cutover — building shadow version

Live stream keeps arriving and buffers against backpressure; nothing is applied to the live store until the shadow is ready and the pointer swaps atomically.

throughput · eps

apply p50 · 0 ms

apply p95 · 0 ms

store size · unique

peak queue depth

Exactly-once: applied (0) = unique valid keys (0) · 0 lost · 0 double-applied

Event log · most recent

Validation is the gate

Bad records — impossible coordinates, invalid status transitions — are rejected before anything downstream sees them. On NaPTAN, a bad record published to an unbounded consumer set is unrecallable, so the only safe place to be strict is before the commit.
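
As a minimal sketch of the gate described above, assuming a hypothetical StopEvent shape and a hypothetical table of allowed status transitions (neither is taken from the NaPTAN schema or from this demo's source), a validator might return a rejection reason before anything reaches the commit path:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical transition table: the statuses and the allowed moves between
# them are illustrative assumptions, not the real NaPTAN rules.
ALLOWED_TRANSITIONS = {
    None: {"pending", "active"},                    # first sighting of a stop
    "pending": {"pending", "active", "inactive"},
    "active": {"active", "inactive"},
    "inactive": {"inactive", "active"},
}


@dataclass(frozen=True)
class StopEvent:
    key: str       # idempotency token (hypothetical field name)
    stop_id: str
    lat: float
    lon: float
    status: str


def validate(event: StopEvent, previous_status: Optional[str]) -> Optional[str]:
    """Return a rejection reason, or None if the event may go downstream."""
    # Range check; NaN fails both comparisons and is rejected as well.
    if not (-90.0 <= event.lat <= 90.0 and -180.0 <= event.lon <= 180.0):
        return "impossible coordinates"
    # Unknown previous statuses map to an empty set, so they reject too.
    if event.status not in ALLOWED_TRANSITIONS.get(previous_status, set()):
        return f"invalid status transition {previous_status!r} -> {event.status!r}"
    return None
```

The function only decides; the caller counts the rejection and stops the event there, so a rejected record never touches dedup or apply.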

Exactly-once, not at-least-once

Streams redeliver. The pipeline keys on an idempotency token: a key it has already applied is counted as a duplicate and dropped, never applied twice. The guarantee line stays green because applied = unique valid keys.
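
The idea can be sketched in a few lines. This is an in-memory illustration under stated assumptions: IdempotentStore and its field names are hypothetical, and a real store would need to commit the token and the write in the same transaction for the guarantee to survive a crash.

```python
class IdempotentStore:
    """Applies each idempotency token at most once (in-memory sketch)."""

    def __init__(self) -> None:
        self.records: dict[str, str] = {}    # stop_id -> latest payload
        self.applied_keys: set[str] = set()  # tokens already applied
        self.applied = 0
        self.duplicates = 0

    def apply(self, key: str, stop_id: str, payload: str) -> bool:
        """Apply the event unless its token was seen before."""
        if key in self.applied_keys:
            # Redelivery: count it and drop it, never apply it twice.
            self.duplicates += 1
            return False
        self.records[stop_id] = payload
        self.applied_keys.add(key)
        self.applied += 1
        return True
```

Feeding the same token twice leaves applied unchanged and raises duplicates by one, which is the invariant the guarantee line checks: applied equals the number of unique valid keys.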

Backpressure is a feature

When arrivals outrun apply capacity, events buffer instead of dropping. The queue depth is the honest signal of how far behind you are — and the thing you alert on, long before data loss.
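
A rough discrete-tick sketch of that behaviour follows. BackpressureQueue and capacity_per_tick are hypothetical names, and peak depth is measured after arrivals land and before the tick drains, which is one reasonable convention among several.

```python
from collections import deque


class BackpressureQueue:
    """Buffers events when arrivals exceed per-tick apply capacity."""

    def __init__(self, capacity_per_tick: int) -> None:
        self.capacity_per_tick = capacity_per_tick
        self.buffer: deque = deque()
        self.applied: list = []
        self.peak_depth = 0

    def tick(self, arrivals: list) -> int:
        """Accept this tick's arrivals, apply up to capacity, return depth."""
        self.buffer.extend(arrivals)  # nothing is dropped on overload
        self.peak_depth = max(self.peak_depth, len(self.buffer))
        for _ in range(min(self.capacity_per_tick, len(self.buffer))):
            self.applied.append(self.buffer.popleft())  # FIFO order kept
        return len(self.buffer)
```

The value returned by tick is the depth an alert would watch: it grows while arrivals outrun capacity and drains back to zero once they stop, with every event eventually applied in arrival order.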

Backfills are migrations

Reprocessing history is the riskiest routine a platform runs. Here it's an atomic cutover: build the corrected version in the shadow, keep buffering live events, then swap the pointer in one step. Zero loss, zero double-apply — read the essay.
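
The cutover can be sketched as follows, assuming a single-process store where the "pointer" is just an attribute reference. VersionedStore and its method names are hypothetical, and a production system would swap a version pointer in its own storage layer rather than a Python attribute.

```python
class VersionedStore:
    """Shadow-build-then-swap backfill (single-process sketch)."""

    def __init__(self, initial: dict) -> None:
        self._live = dict(initial)  # the version readers see
        self._shadow = None         # corrected version under construction
        self._pending = []          # live events buffered during backfill

    def read(self, stop_id: str):
        return self._live.get(stop_id)

    def begin_backfill(self, corrected_history: dict) -> None:
        self._shadow = dict(corrected_history)

    def on_live_event(self, stop_id: str, payload: str) -> None:
        if self._shadow is not None:
            # Backfill in progress: buffer, do not touch the live version.
            self._pending.append((stop_id, payload))
        else:
            self._live[stop_id] = payload

    def cutover(self) -> None:
        if self._shadow is None:
            raise RuntimeError("no backfill in progress")
        # Replay buffered live events onto the shadow, in arrival order.
        for stop_id, payload in self._pending:
            self._shadow[stop_id] = payload
        # The swap itself is one reference assignment.
        self._live = self._shadow
        self._shadow = None
        self._pending = []
```

Until cutover runs, readers keep seeing the old version and the buffered events are invisible; afterwards they see the corrected history plus every event that arrived during the build.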

This is a faithful simulation of the mechanisms, running in your browser — a discrete-event model, not a production broker. The same patterns (idempotent apply keyed on a token, a validation gate before commit, queue-depth backpressure, shadow-build-then-atomic-swap backfills) are how I approach the real NaPTAN ingestion pipeline. A runnable docker-compose reference implementation is a separate project.