Learning from the real
Back in Skirting the edge of disaster, I pointed out that most complex systems that have been running in production for a while (e.g. a legacy software system, the London Underground, humanity) have the form that they do as a result of the following two rules:
- If an important system is broken, we fix it.
- If a system has spare capacity, we place more demands on it.
As you can tell from the title of the linked piece, I was mostly framing this as a bad thing: the consequence of being at a stable balance of these two rules is that the system is constantly almost, but not quite, broken, because every time it gets too far from broken it gets loaded more, and every time it breaks it gets fixed.
But there’s another side to this that I think is interesting: If you encounter a reasonably stable complex system in the wild (and you do, very regularly), this reliably means:
- It does a lot of things, because capabilities kept accreting over time, and they’re often not things you would have thought of without that infrastructure because it changes what’s easy and cheap to do.
- Every problem that would have stopped it from doing those things has been solved.
A lot of time, work, and ingenuity have often gone into getting it to that state, much of which would have been impossible to figure out from first principles, because you had to try things and see what happened before you could even learn which problems needed solving, let alone how to solve them.
Gall’s law is: “A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.”
But this then leads to the question: OK, smart guy, how do I make a working simple system?
And I think the answer is often this: Go find a complex system that does what you want, and study it until you understand it well enough to extract a simple system from it.