Move Fast and Break Things: The Misunderstood Facebook Principle
February 13, 2021
The Facebook principle of "Move Fast and Break Things" has been widely ridiculed and even blamed for undermining democracy. In 2014, Mark Zuckerberg updated it to "Move Fast with Stable Infra", but the original is just too catchy for people to forget. I think the original principle contains valuable lessons. Here's why.
First, "Move Fast..." was intended as a software development principle. Mark's reasoning was: "unless you are breaking stuff, you are not moving fast enough." It was not an evil plan to flout laws, smash democracy, or fracture society. Imagine you're a software engineer who has never shipped a bug. Is it possible to move faster and still ship bug-free code? How would you know your own limits or potential? During my time at Facebook, I did not witness any cases where "Move Fast..." was used to compel engineers to ship broken software.
Second, "Move Fast..." encourages experimentation. It asks engineers to take calculated risks as part of a tight feedback loop. If you Broke Something, you would not be dragged in front of a tribunal for career-ending cross-examination. This environment encouraged experimentation, especially when iterating toward product-market fit.
Third, "Move Fast..." is an effective way to build teams that rarely break things and fix broken things quickly. Consider every possible variation of the original motto:
- Move Fast and Break Things
- Move Slow and Break Things
- Move Fast and Break Nothing
- Move Slow and Break Nothing
Unfortunately, it's impossible to Break Nothing. All but the simplest software contains bugs, no matter how slowly and carefully the developer moves. Some bugs are never discovered and others have bankrupted companies, but nobody can practically deliver defect-free software.
Then there are intentional "breaks"; for example, scheduled deprecations and experimental features are usually not considered "defects" regardless of whether they violate the user's expectations.
Just as there is an optimal amount of dirt in your house, there is an optimal number of defects in your software.
Now that we have eliminated the "Break Nothing" options, we are left to choose between:
- Move Fast and Break Things
- Move Slow and Break Things
"Fast" and "slow" refer to how long it takes to identify and deliver features to users, measurable as lead time. While "velocity vs quality" is often couched as a tradeoff, in my experience they are positively correlated. My hypothesis is that lead time and defect rate can both be improved across individuals and teams through deliberate practice.
Imagine you are a novice tennis player. Each time you serve the ball, it becomes invisible as soon as it leaves the racket. Three months later, you receive a spreadsheet summarizing your serving performance. Now imagine implementing a feature and waiting three months for users to start using it. Reducing the time between action and feedback is critical for deliberate practice and improvement.
In other words, "Move Fast..." can help teams move faster and break fewer things.
In the book Art and Fear, we see the same phenomenon:
The ceramics teacher announced on opening day that he was dividing the class into two groups. All those on the left side of the studio, he said, would be graded solely on the quantity of work they produced, all those on the right solely on its quality. His procedure was simple: on the final day of class he would bring in his bathroom scales and weigh the work of the "quantity" group: fifty pounds of pots rated an "A", forty pounds a "B", and so on. Those being graded on "quality", however, needed to produce only one pot — albeit a perfect one — to get an "A". Well, came grading time and a curious fact emerged: the works of highest quality were all produced by the group being graded for quantity.
Another benefit of Breaking Things in a tight feedback loop is the practice and experience of Fixing Broken Things. In the ceramics class example, imagine students from both groups sharing the shop pottery wheel, sculpting medium, and kiln, all of which fail ten percent of the time. Which group is more likely to deliver completed work despite the unreliable tools and infrastructure? As long as Breaking Things quickly leads to Fixing Things, teams that "Move Fast..." gain the skills and experience necessary for reducing Time to Resolution.
What if a software company figured out how to "Break Nothing" and maintained a spotless track record through processes and culture? Imagine joining this company and learning the consequences of being the first person to break something. Knowing the high cost of failure, what risks would you shy away from? What ideas and experiments would you never pursue? Are you willing to pay the cost of perfection?
Are there times when you should slow down and try extra hard to not break something? Absolutely. Are there times when you should optimize for speed and tolerate a few bugs? Yes. Never blindly follow a principle.
Despite its unpopularity, "Move Fast and Break Things" remains a valuable engineering principle to me. Breaking things is inevitable, but building tight feedback loops can reduce defect rate and time to resolution while improving velocity.
Thanks to James Gill and Jeff Morrison for reviewing drafts of this post.