How human systems fail

Notes from the paper "How Complex Systems Fail" by Richard I. Cook, MD. Insights from years of observing health systems.

🧠 Any system that has managed to survive will have evolved many layers of defense. In human systems this looks like backups, training, organizations, institutions, regulations, policies, procedures, and certifications.

🔥 Don’t rip out redundant layers just because they are “inefficient”! They are likely still around because they defend system resilience. There is a fundamental tradeoff between efficiency and resilience.

🤯 Catastrophe requires multiple failures. Complex systems evolve to become resilient to small failures. There is never a single point of failure. It takes a cascade. There is no such thing as a "root cause".
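
A rough way to see why layers matter, and why removing one is costlier than it looks. This is a toy sketch, not something from the paper: the layer names and the 10% failure rates are made up, and it assumes layers fail independently, which real systems rarely do (failures are emergent and often correlated). Under those assumptions, catastrophe means every layer failing at once, so the probabilities multiply.

```python
from math import prod


def catastrophe_probability(layer_failure_probs):
    """Chance that every defensive layer fails at once.

    Assumes layers fail independently (a simplification for illustration).
    """
    return prod(layer_failure_probs)


# Hypothetical layers, each failing 10% of the time (made-up numbers).
layers = {"backup": 0.1, "training": 0.1, "procedure": 0.1, "review": 0.1}

with_all_layers = catastrophe_probability(layers.values())

# Rip out the "inefficient" review layer and recompute.
trimmed = {name: p for name, p in layers.items() if name != "review"}
without_review = catastrophe_probability(trimmed.values())

print(f"all layers:     {with_all_layers:.4f}")
print(f"without review: {without_review:.4f}")
```

In this toy model, dropping a single layer makes catastrophe about ten times more likely, even though that layer never looked like it was doing anything on a normal day.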

🦠 Failures are always latent in a complex system. You can’t get rid of them. Reasons: (1) $$$, eliminating them all costs too much (2) they change constantly (3) you don’t always know what they are.

🛠 All complex systems are broken. But they work.

👮‍♀️ People are always in proximity to failure points in human systems. They were put there for a reason: the failure mode is probably too complex for a policy or procedure to fix. You need a person to defend it.

🙃 All practitioners balance 2 roles: produce / defend. When the system works, we complain that they are wasting production resources on defense. When the system fails, we complain that they ignored defense because they were greedy for production.

🎲 Every critical decision about a system is a gamble. After a failure, it is suddenly clear that the decision was a gamble. What we don’t see is that successes were gambles too.

☁️ Perfect efficiency is not a good idea. The relationship between production efficiency, resources, costs, and risks in a system is intentionally fuzzy. Fuzziness = optionality. Fuzziness makes space for the system to adapt.

🙋‍♀️ People are the adaptable element in a system.

💥 Failure in a complex system is emergent. Safety in a complex system is also emergent.

🙅‍♀️ When systems break, we tend to pin the blame on a single cause, usually a person. The fixes we put in place are usually end-of-chain. They add complexity but don’t usually solve the problem, because failure in a complex system is emergent.

⚛️ System safety is not a feature, it is not a component, it is not a department, it is not a job title, it cannot be purchased, it can’t be manufactured. People continuously create safety.


Complex human systems often evolve from simple systems.

Complex human systems are broken and inefficient. Yet when you tear up an evolved mess and impose order, you frequently fail through Second System Syndrome.

"We must build on a clear site" is the failure mode described in Seeing Like a State.