‘In God we trust, all others bring data.’ This statement is generally attributed to W. Edwards Deming, way back in the 1950s.
It is as true now as it was then, with some pretty significant caveats.
You need to know the provenance of any data you choose to use. Data is just data: it can be managed, coalesced, manipulated, misrepresented, and outright lied about, so be careful.
Some things to remember:
- Data is mostly history, and the future is rarely the same as the past. Data that is really a forecast should not be called data; it is a ‘best guess,’ or ‘wishful thinking,’ or ‘what the boss told me to say,’ or ‘I need this to keep my job,’ or a thousand other things.
- Data is always incomplete, no matter how complete you think it is. There is always some missing context that could give it greater, or even different, meaning. Part of the challenge is being able to make a decision without being overwhelmed by a tsunami of data and developing a form of ‘common sense blindness’.
- Data provenance is always useful to know. What is the source of the data? When it comes from customers, it may be more useful than when it comes from a supplier trying to sell you something. When you see claims like ‘This lotion has been scientifically proven to cure male pattern baldness in 92.5% of cases’, you know the ‘scientific test’ was done on 10 hairy blokes from the local footy side.
- Data is objective, but the analysis of data is not. It is subject to a host of human emotions and contexts, and can be interpreted in a number of ways, depending on the analyser’s mood, experience, domain knowledge and a host of other things.
- Data ages quickly. What is pretty right now might not be so right in a year’s time.
- The world is full of conflicting data. The challenge is to know which pieces to believe and use, and which to discard. Anticipating the actions of your competitor with the same set of data is a very useful exercise. Put yourself in their position and ask yourself, ‘What would I do now?’
- Data can distract. We are visual animals, and visuals are a powerful way of communicating, so be sure that what those fancy graphs communicate is actually what the data says.
- Data should be able to tell us what is causation, and what is simply random correlation. These may be the two most confused concepts, and are certainly amongst the most used red herrings.
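
The point about the baldness lotion and the 10 blokes from the footy side can be put into numbers. What follows is only a rough sketch, assuming a made-up trial in which 9 of 10 participants improved (the figures are invented for illustration, not taken from any real study). It uses the Wilson score interval, one common way of estimating how far the true rate could sit from a headline percentage; the function name `wilson_interval` is my own.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical trials with the same headline rate (90%) but different sizes.
small_low, small_high = wilson_interval(9, 10)
large_low, large_high = wilson_interval(900, 1000)

print(f"10 participants:   {small_low:.0%} to {small_high:.0%}")
print(f"1000 participants: {large_low:.0%} to {large_high:.0%}")
```

With only 10 participants the plausible range runs from roughly 60% to 98%, while the same headline rate from 1000 participants narrows to about 88% to 92%. Same claim, very different provenance.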
Data should be one of the foundations of all our decision making, yet we rarely have all we need. We are therefore forced to make often difficult choices with limited data, implement those decisions, measure the impacts, and adjust tactically as we learn. It pays to understand what you are relying on when you make those choices.
Cartoon header: courtesy www.XKCD.com