
AddyOsmani.com - Bias Toward Action

Real speed comes from tight feedback loops and safety nets, not courage. Fast teams use more information than slow teams - they just don't wait for perfect information before shipping.

Bias toward action isn't about being reckless or brave. It's about taking the smallest step that produces real feedback while knowing exactly how you'll recover when it breaks. Fast teams don't skip thinking - they have tight feedback loops and the infrastructure to make safe things easy. Research on fast decision-making shows that quick-moving teams actually use more information than slow teams, not less. They just don't wait for perfect information. They make decisions with about 70% of what they want to know, ship something small, and learn from what happens.

The difference between fast and slow teams is infrastructure, not courage. Fast teams have feature flags for instant rollbacks, monitoring that actually tells them when things break, practiced rollback procedures, and small changes that are easy to debug. Slow teams are stuck because every deploy feels risky - and it is risky, because they lack safety nets. Most decisions are reversible if you design them that way. The highest-leverage move is often making a one-way door into a two-way door by adding a feature flag, starting with 1% of traffic, or dual-writing to both systems.
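
As a rough sketch of what one of those two-way doors looks like in code - the store interfaces and flag name below are illustrative, not from the post - dual-writing behind a flag keeps the old path authoritative while the new path can be switched off instantly:

```typescript
// Hypothetical sketch: turning a one-way data migration into a two-way door
// by dual-writing to the old and new stores behind a flag.

interface UserStore {
  saveUser(id: string, data: Record<string, unknown>): Promise<void>;
}

interface Deps {
  flags: { isEnabled(name: string): boolean };
  oldStore: UserStore;
  newStore: UserStore;
}

async function writeUser(
  id: string,
  data: Record<string, unknown>,
  deps: Deps
): Promise<void> {
  // The old path stays the source of truth until the new one is proven out.
  await deps.oldStore.saveUser(id, data);

  // Dual-write behind a flag: turning it off reverts to the old behavior
  // immediately, with no data migration to unwind.
  if (deps.flags.isEnabled("dual-write-new-user-store")) {
    await deps.newStore.saveUser(id, data);
  }
}
```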

Error budgets replace philosophical debates about speed versus reliability with objective thresholds. Your service has an SLO - say 99.9% uptime. That 0.1% is your error budget, your allowed breakage. The rule is simple: if you're within budget, keep shipping. If you've blown your budget, stop and fix reliability first. This completely changes the conversation from arguing about whether to slow down to just looking at the numbers. About 70% of outages come from changes, which is why you need mechanisms that make change safer.
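
The arithmetic is simple enough to live in a dashboard or a script. A minimal sketch, assuming a 99.9% availability SLO over a rolling window and example request counts (the figures are placeholders for whatever your monitoring actually reports):

```typescript
// Error-budget check for an availability SLO.
const SLO = 0.999;                  // target availability (99.9%)
const totalRequests = 10_000_000;   // requests served in the window (example)
const failedRequests = 7_200;       // errors in the same window (example)

const errorBudget = totalRequests * (1 - SLO); // 10,000 allowed failures
const budgetRemaining = errorBudget - failedRequests;

if (budgetRemaining > 0) {
  console.log(`Within budget (${budgetRemaining} failures to spare): keep shipping.`);
} else {
  console.log("Budget exhausted: pause feature work and fix reliability first.");
}
```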

The practical approach looks like this: define what broken means before you start, put the change behind a feature flag, ship to production with the flag off, then turn it on for 1% of traffic and watch your metrics. If things look good, ramp to 5%, then 25%, then 50%, then 100%. If anything breaks at any point, flip the flag off. You're shipping to production multiple times, but the blast radius of being wrong stays tiny. Compare this to spending three weeks testing in staging, then doing a big-bang deploy to 100% of production where everyone sees breakage and rollback is a production emergency. Which one is actually safer?
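
A ramp like that usually comes down to a deterministic bucketing function. This sketch assumes a flag service that stores a rollout percentage per flag; hashing the flag name together with the user ID keeps each user's experience stable as the percentage grows:

```typescript
import { createHash } from "node:crypto";

// Deterministic bucket in [0, 100) derived from flag + user, so ramping from
// 1% to 5% only adds users - nobody flips back and forth between variants.
function inRollout(flagName: string, userId: string, rolloutPercent: number): boolean {
  const digest = createHash("sha256").update(`${flagName}:${userId}`).digest();
  const bucket = digest.readUInt32BE(0) % 100;
  return bucket < rolloutPercent;
}

// Ramp schedule from the post: 1% -> 5% -> 25% -> 50% -> 100%.
// If metrics regress at any step, set the percentage back to 0 - that is the
// "flip the flag off" rollback.
const useNewCheckout = inRollout("new-checkout", "user-42", 1);
```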

Feature flags are incredibly useful but they have a cost - they introduce combinatorial state complexity. Two flags create four states, three flags create eight states, ten flags create 1,024 states you can't possibly test. Treat flags like inventory with owners, expiry dates, clear purposes, and removal tasks. The goal isn't to ship fearlessly - it's to ship with well-founded confidence that being wrong won't be catastrophic. Real slowness comes from large change batches, flaky tests, poor observability, complicated rollbacks, and alert noise that trains people to ignore warnings.
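
Treating flags as inventory is easiest when they're declared in one place with the metadata spelled out. A minimal sketch - the fields and the expiry check are illustrative assumptions, not something the post prescribes:

```typescript
// One record per flag, so ownership and removal are never ambiguous.
interface FlagRecord {
  name: string;
  owner: string;
  purpose: string;
  expires: string;       // ISO date after which the flag must be removed
  removalTicket: string; // tracking issue for deleting the flag
}

const flagInventory: FlagRecord[] = [
  {
    name: "new-checkout",
    owner: "payments-team",
    purpose: "Canary rollout of the rewritten checkout flow",
    expires: "2025-03-01",
    removalTicket: "PAY-1234",
  },
];

// Flags past their expiry date are debt: surface them loudly, e.g. in CI.
const expired = flagInventory.filter((f) => new Date(f.expires) < new Date());
if (expired.length > 0) {
  console.warn("Expired flags to remove:", expired.map((f) => f.name).join(", "));
}
```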

Source: addyosmani.com
#deployment #continuous-delivery #sre #reliability #feature-flags #incident-response #monitoring #error-budgets #risk-management #testing

Problems this helps solve:

Process inefficiencies · Decision-making · Technical debt · Team performance

Explore more resources

Check out the full stdlib collection for more frameworks, templates, and guides to accelerate your technical leadership journey.