November 2024 | 7 min read | Delivery, High-Pressure Systems

Shipping When the Clock Is Real

Lessons from building vaccination operations software during COVID, where the deadline was not a sprint boundary but a queue of people in a car park.

Most software deadlines are negotiable, even when everyone insists they are not. The launch slips, the quarter absorbs it, the world keeps turning. Then occasionally you work on something where the deadline is a physical fact in the world. I led frontend engineering for mass COVID-19 vaccination operations in Sydney, and the thing that reorganised how I think about delivery was simple: the clock was not a project artifact. It was real people, standing in a real line, on a real morning, and the system had to work before they arrived because they were arriving regardless.

That changes the engineering, not just the stress level. When the operation cannot be paused, your software inherits a different physics. Below are the things that stuck.

A normal bug costs you a retry. In an operation with people physically queuing, a bug costs you a queue, and a queue costs you trust you do not get back.

Make the failure mode boring

The first instinct under pressure is to make the software more capable. The correct instinct is to make its failures more boring. In an environment where people are physically present and the operation cannot stop, the worst outcome is not a missing feature but an ambiguous state: a screen that has frozen in a way nobody on the floor can interpret, a record that might or might not have saved, a flow that dead-ends with no obvious next action for a clinician who is not a software person and should never have to be.

So we designed for the floor, not the demo. Every important screen had to answer, instantly and to a tired human, two questions: did the last thing I did actually happen, and what do I do now? That meant unglamorous choices. Optimistic feedback that was honest about uncertainty rather than pretending a save had succeeded. Clear, recoverable states instead of clever ones. A bias toward letting an operator proceed manually when the system was unsure, rather than blocking the whole line on a single uncertain record. A frozen queue is a failure of engineering judgement, not just of code.
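One way to picture the two questions is as an explicit state model, where every save is in exactly one named state and each state maps to a plain answer for the operator. This is a minimal sketch, not the system we shipped: the `SaveState` type, the `describeSave` function, and the wording of the messages (including the paper fallback) are all hypothetical, chosen only to illustrate the idea.

```typescript
// Hypothetical model: a save is always in exactly one of these states,
// so the UI never has to guess whether something "probably" worked.
type SaveState =
  | { kind: "saved"; savedAt: string }
  | { kind: "pending" }
  | { kind: "unconfirmed"; reason: string }
  | { kind: "failed"; reason: string };

// What the operator sees: what happened, what to do now,
// and whether the line is allowed to keep moving.
interface OperatorMessage {
  happened: string;
  nextAction: string;
  canProceed: boolean;
}

function describeSave(state: SaveState): OperatorMessage {
  switch (state.kind) {
    case "saved":
      return {
        happened: `Saved at ${state.savedAt}.`,
        nextAction: "Call the next person.",
        canProceed: true,
      };
    case "pending":
      return {
        happened: "Saving, not yet confirmed.",
        nextAction: "Wait a moment.",
        canProceed: false,
      };
    case "unconfirmed":
      // Honest about uncertainty: do not claim success, do not block the line.
      return {
        happened: `Could not confirm the save (${state.reason}).`,
        nextAction: "Note it manually and continue. This entry will be checked later.",
        canProceed: true,
      };
    case "failed":
      return {
        happened: `Not saved (${state.reason}).`,
        nextAction: "Note it manually and continue. Re-enter it when the system recovers.",
        canProceed: true,
      };
  }
}
```

The design choice worth noticing is that the uncertain and failed states still let the operator proceed. Only the brief pending state asks for a pause, and no state leaves the screen without a next action.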

Resilience is a frontend concern too

There is a habit of thinking about resilience as a backend property: retries, failover, idempotency, the server-side machinery. In a live operation the frontend is where resilience either becomes real or becomes a lie, because the frontend is the only part the operator actually experiences. The backend can be heroically robust and it will not matter if the screen in front of a clinician goes blank when the connection wobbles.

Venues are not data centres. Connectivity is uneven, devices are whatever was available, and the people using the software are concentrating on the person in front of them, not on your app. So the client had to assume the network was a privilege, not a guarantee. Hold state locally and reconcile; do not assume the round trip will succeed. Degrade visibly and predictably rather than failing silently. Never put an operator in a position where the safe thing to do and the thing the UI allows are different things. The measure of a resilient frontend in that setting is not how well it runs when everything is fine. It is how it behaves in the ninety seconds when something is not, because those ninety seconds are when it matters.
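To make "hold state locally and reconcile" concrete, here is a minimal in-memory sketch. It assumes the server can report which write ids it has acknowledged. The `LocalLedger` class, its method names, and the status strings are hypothetical, and a real client would also persist the pending list to device storage, which is omitted here.

```typescript
// Hypothetical shape of a write the client has accepted from the operator.
interface LocalWrite {
  id: string;      // client-generated id, so retries can be de-duplicated
  payload: string;
}

class LocalLedger {
  private pending: LocalWrite[] = [];

  // Accept the write locally first. Recording the same id twice is a no-op,
  // so a double tap or a retry cannot create a duplicate entry.
  record(write: LocalWrite): void {
    const known = this.pending.some((w) => w.id === write.id);
    if (!known) this.pending.push(write);
  }

  // Reconcile against the ids the server says it has. Anything not yet
  // acknowledged stays pending; nothing is dropped on assumption.
  reconcile(acknowledgedIds: string[]): LocalWrite[] {
    this.pending = this.pending.filter(
      (w) => acknowledgedIds.indexOf(w.id) === -1
    );
    return this.pending.slice();
  }

  // Degrade visibly: the operator can always see what is still waiting.
  statusLabel(online: boolean): string {
    const n = this.pending.length;
    if (n === 0) return online ? "All records synced" : "Offline, nothing waiting";
    return online
      ? `Syncing ${n} record(s)`
      : `Offline, ${n} record(s) held on this device`;
  }
}
```

The point of the sketch is the ordering: the write is held locally before any network call, and it only leaves the pending list when the server confirms it, never because a request was merely sent.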

Pressure is not an excuse to abandon craft; it is the reason for it

The cliché is that under pressure you cut corners, you go fast and dirty, you fix it later. I came out of that work believing close to the opposite. Pressure is exactly when discipline pays for itself, because pressure is when you have the least time to recover from a mess you made. The corner you cut on a calm Tuesday is an annoyance. The corner you cut before a live operation is the thing that breaks while there is a queue, when there is no slack to debug and no one free to help.

What actually let us move fast was the boring stuff done beforehand. Tight, legible code that a teammate could change at speed without fear. Clear boundaries so a problem in one area stayed in one area. The relentless removal of ambiguity from states, because ambiguity is what costs you when you are tired and the clock is real. Speed under pressure is not recklessness rewarded. It is preparation cashed in.

The deeper lesson outlasted the project. High-pressure delivery does not call for heroics: the all-nighter, the last-minute save. Those are usually symptoms that the real work was skipped earlier. What it calls for is calm: a system whose behaviour you can predict, whose failures you have already imagined, whose worst day you have rehearsed. When the queue forms and the morning starts, you do not want to be brave. You want to be unsurprised. That is the whole job, and it is the kind of work I want to be brought into.