Context
NaPTAN is the National Public Transport Access Nodes dataset: every bus stop, rail station entrance, ferry berth and tram platform in Great Britain, identified, located and classified. It is open data. Journey planners, transit apps, local authorities and other government systems consume it directly, with no registration and therefore no list of who integrates it.
That openness is the whole job. You cannot survey who depends on a field before you change it, because you cannot see them. The dataset is a public API to the country's transport geography, and the contract is implicit, permanent, and held by strangers.
The hard part
The hard part is not the volume of records — it is that backwards-compatibility is an invariant, not a preference. A breaking change ships to consumers who never agreed to a migration window and cannot be reached to coordinate one. There is no rollback for data that has already been downloaded and trusted.
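One way to make "invariant, not preference" concrete is a check that compares the published field list against a proposed one and refuses anything a consumer could trip over. The sketch below is illustrative only: the Field type, the field names and the three rules (no removals, no type changes, no weakening of a required field) are my assumptions for the example, not NaPTAN's actual schema or tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical description of one published field."""
    name: str
    type: str
    required: bool


def breaking_changes(old, new):
    """Return the changes in `new` that could break a consumer of `old`.

    Added fields are treated as safe; removals, type changes and
    required-to-optional downgrades are treated as breaking.
    """
    new_by_name = {f.name: f for f in new}
    problems = []
    for f in old:
        g = new_by_name.get(f.name)
        if g is None:
            problems.append(f"removed: {f.name}")
        elif g.type != f.type:
            problems.append(f"type changed: {f.name} {f.type} -> {g.type}")
        elif f.required and not g.required:
            problems.append(f"no longer guaranteed: {f.name}")
    return problems


# Field names below are made up for illustration.
published = [Field("stop_id", "str", True), Field("latitude", "float", True)]
proposed = [
    Field("stop_id", "str", True),
    Field("latitude", "str", True),   # type change: breaking
    Field("bearing", "str", False),   # addition: safe
]
print(breaking_changes(published, proposed))
# ['type changed: latitude float -> str']
```

Run as a gate before publication, an empty list means the release is additive only. Anything else blocks the release rather than starting a migration nobody downstream can be asked to join.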
The second hard part is quality. A single bad record — an impossible coordinate, an invalid status transition, a stop quietly marked active years after it was removed — propagates into every downstream journey plan. Bad data fails quietly, in someone else's system, long after it left yours.
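Two of those failure modes can be caught before a record leaves the building. The following is a minimal sketch, not the production validator: the record keys, the status names, the allowed-transition table and the rough bounding box for Great Britain are all assumptions chosen for the example.

```python
# Rough bounding box for Great Britain (assumed values, deliberately generous).
GB_LAT = (49.0, 61.5)
GB_LON = (-9.0, 2.5)

# Hypothetical status model: which statuses may follow which.
ALLOWED = {
    "pending": {"active", "inactive"},
    "active": {"inactive"},
    "inactive": {"active"},
}


def validate(record, previous_status=None):
    """Return a list of problems found in one stop record (empty if none)."""
    errors = []

    lat, lon = record.get("latitude"), record.get("longitude")
    in_bounds = (
        isinstance(lat, (int, float))
        and isinstance(lon, (int, float))
        and GB_LAT[0] <= lat <= GB_LAT[1]
        and GB_LON[0] <= lon <= GB_LON[1]
    )
    if not in_bounds:
        errors.append(f"coordinate outside assumed bounds: ({lat}, {lon})")

    status = record.get("status")
    if status not in ALLOWED:
        errors.append(f"unknown status: {status!r}")
    elif (
        previous_status is not None
        and status != previous_status
        and status not in ALLOWED.get(previous_status, set())
    ):
        errors.append(f"invalid status transition: {previous_status} -> {status}")

    return errors


stop = {"latitude": 51.5, "longitude": -0.12, "status": "active"}
print(validate(stop))                                   # []
print(validate({**stop, "latitude": 0.0}))
# ['coordinate outside assumed bounds: (0.0, -0.12)']
print(validate({**stop, "status": "pending"}, previous_status="active"))
# ['invalid status transition: active -> pending']
```

The third case in the paragraph, a stop still marked active long after removal, cannot be caught by a per-record rule like this. It needs evidence from outside the record, which is why it is the quietest failure of the three.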
Architecture
Key decisions
Related writing
I wrote more about the discipline behind this in Modernising a National Dataset and Designing for the Hard Part.