We are a problem-solving company first, specialised in HPC – building software close to the processor. The more projects we finish, the clearer it becomes that without our problem-solving skills we could not tackle the complexity of GPUs and CPU clusters. While I normally keep private how we work and how we continuously improve ourselves, it would be good to share a bit more, so both new customers and new recruits know what to expect from the team.
Black boxes will never be transparent
Assumption is the mother of all mistakes
Eugene Lewis Fordsworthe
A colleague put “Assumptions is the mother of all fuckups” on the wall, because we should assume that we assume. The problem is that we want full control and faster decisions, and assumptions conveniently fill in all these scary unknowns.
So if you use the words “solving a problem”, you visualise the problem disappearing. Other descriptions like “patching a problem” (which is often more accurate) are seen as bad or incomplete. Therefore I prefer describing problems as black boxes that only get smaller or bigger; they do not become more or less transparent, and they never disappear. This is both visual and more accurate.
Black boxes are the unknown knowns and unknown unknowns
Donald Rumsfeld handled assumptions by sorting knowledge into knowns and unknowns, which gives four categories (with my translation in parentheses):
|Known knowns (facts)|Known unknowns (unanswered questions)|
|Unknown knowns (assumptions)|Unknown unknowns (missing questions)|
Here the second row describes what is inside the black box – it is an illusion that you will ever know everything. Rumsfeld probably based this on a classic quote:
The more you know the more you know you don’t know
Practical example: porting code
Let’s start with reverse engineering code, before moving on to porting. Reverse engineering is often seen as a very special task, where nobody knows what the code is actually doing. Documentation was never written, sources were lost, people left – it’s one huge black box. This sounds like a horror story to many managers, but the skills needed for it are exactly the skills required to port code from the CPU to the GPU. One of those skills is black box thinking.
The reason that even the world’s clearest code contains black boxes is that developers make assumptions about anything they don’t fully master. And we can all assume that nobody can master everything, as that would require predicting the future. So when making a flow diagram of the code, you see the data flow from black box to black box, where each box has a list of observations around the known knowns and known unknowns.
Some black boxes are a bit larger than others, because the team thinks the chances are higher that something there is not fully under control but cannot specify what – something with unexpected input, whatever that could be. In such a case we simply decide to keep the box large. There is also a list of known unknowns (assumptions from the original developer) that we haven’t tested yet, but that we know from experience tend to go wrong. We assume the list is incomplete, and when we get feedback from benchmarks, tests or the customer, we already have ideas about where to look for a solution.
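To make the flow diagram idea a bit more concrete, here is a minimal sketch in Python of how such a network of black boxes could be written down. It is only an illustration, not our actual tooling: the `BlackBox` class, the `size` score, the `suspects` helper and the example pipeline (`load`, `filter`, `reduce`, `log`) are all hypothetical names made up for this post.

```python
from dataclasses import dataclass, field


@dataclass
class BlackBox:
    """One node in the flow diagram. All fields are illustrative assumptions."""
    name: str
    known_knowns: list[str] = field(default_factory=list)    # facts we verified
    known_unknowns: list[str] = field(default_factory=list)  # open questions, untested assumptions
    size: int = 1                                             # gut feeling: larger = less under control
    feeds: list[str] = field(default_factory=list)           # boxes that receive this box's output


def suspects(boxes: dict[str, BlackBox], failing: str) -> list[str]:
    """Return the failing box plus everything upstream of it, largest box first."""
    upstream = {failing}
    changed = True
    while changed:  # keep walking the data flow backwards until nothing new is found
        changed = False
        for box in boxes.values():
            if box.name not in upstream and any(t in upstream for t in box.feeds):
                upstream.add(box.name)
                changed = True
    # Larger boxes first; the name breaks ties so the order is stable.
    return sorted(upstream, key=lambda n: (-boxes[n].size, n))


# Hypothetical pipeline: load -> filter -> reduce, with an unrelated "log" box.
boxes = {
    "load": BlackBox("load", known_knowns=["reads binary input"],
                     known_unknowns=["endianness of the input?"], size=3, feeds=["filter"]),
    "filter": BlackBox("filter", size=1, feeds=["reduce"]),
    "reduce": BlackBox("reduce", known_unknowns=["does summation order change the result?"], size=2),
    "log": BlackBox("log", size=5),
}

print(suspects(boxes, "reduce"))
```

When a benchmark or test fails at `reduce`, the helper lists the boxes the data passed through, starting with the one the team trusts least. The point is not the code itself but that the size of each box and its list of known unknowns are written down before the feedback arrives.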
If we had only a list of problems (known knowns), we would have finished the project earlier, but the result would not be as good. With black boxes, the first question is automatically “what did we assume wrong?”.
A small change
Just describe a problem as ‘a network of black boxes’. The rest will follow through. It’s as simple as that!