Disadvantages of the measures
This article warns, among other things, of the risk of falling in love with measurement systems that seem very precise (for example, because they report results with decimals) but have other drawbacks.
Sometimes, not having a measure can be better than having a bad measure. And sometimes, not measuring can be better than measuring and causing a perverse effect, just the opposite of the desired effect.
The advantages of measurements
The advantages of having measurements are evident, so this article focuses on the inconveniences. However, it is fair to first recognize the advantages. Measurements provide us with knowledge, which can be used to achieve our goals, for example, happiness.
The knowledge provided by the measurements has different aspects:
· The qualitative aspect, which allows us to know what is happening.
· The quantitative aspect, which provides a precise, quantified idea.
· The relative or comparative aspect. Even when measurements contain errors, as long as the errors are consistent from one measurement to the next, we can compare two erroneous measurements and still get a correct idea of the evolution in time, in space, or in any other dimension.
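The point about comparing erroneous measurements can be sketched with a few lines of arithmetic. This is only an illustration under a strong assumption: the error is a constant offset that is identical in both readings. The scale, the `BIAS_KG` value, and the weights are all hypothetical.

```python
def measure(true_value, bias):
    """Reading from a hypothetical instrument whose error is a constant offset."""
    return true_value + bias

BIAS_KG = 2.5  # assumed: the scale always reads 2.5 kg too high

true_before, true_after = 80.0, 77.0
reading_before = measure(true_before, BIAS_KG)
reading_after = measure(true_after, BIAS_KG)

# Each individual reading is wrong...
print(reading_before, reading_after)

# ...but the constant offset cancels when the two readings are subtracted.
true_change = true_after - true_before
measured_change = reading_after - reading_before
print(measured_change == true_change)
```

If the error changed between the two readings (for example, a scale that drifts over time), the offset would no longer cancel and the comparison would be wrong as well.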
Measurements can also cause virtuous, perhaps unforeseen effects.
· Several studies on the influence of lighting on workers’ productivity showed that productivity increased when lighting levels were raised, but also when lighting was reduced, simply because the study was being conducted (Hawthorne effect).
The inconveniences of the measurements
These are different ways in which making or having a measurement can be inconvenient: 1) Cost 2) Error 3) Modification of the measured object (and even of the measure itself) 4) Unwanted side effects 5) Misinterpretation 6) Invisibilization.
1. Cost. Obtaining a measurement has a cost that we would prefer to avoid. This always happens, or is there a case where taking a measure is free?
2. Error. The measurement could be significantly incorrect. If the measurement were very bad, it might have been better not to have it. Sometimes this occurs because of the context in which the measurement is performed.
· A conflict of interest should make us distrust a measure, however precise it may seem. It may be reasonable to distrust studies of the benefits of a plant-based diet conducted by vegans, as well as studies of the benefits of dairy foods financed by livestock companies, studies of the benefits of psychology conducted by psychologists, or studies of the safety of mobile telephony conducted by mobile phone companies.
· Sometimes feedback is collected in situations where open criticism would be uncomfortable (for example, the boss asks us in public for our opinion on the last decision he has made). The results may be very different from those obtained in an anonymous survey.
3. Modification of the measured object (and even of the measure itself). A measurement usually affects the measured element, even if only minimally. In addition, measuring can modify precisely the variable we intend to measure. This is known as the observer effect.
· Contact with a thermometer will modify the temperature of the object we intend to measure.
· A cytology extracts part of the tissue it is intended to study.
· When shooting photons (light) at an object to visualize it, we will be modifying it.
4. Unwanted side effects. The previous examples do not seem very serious, but unfortunately sometimes the measurement has a perverse effect, just the opposite of the desired one:
· Some prenatal diagnostic tests (such as amniocentesis and chorionic villus sampling) are invasive and carry a risk. Although they are performed to detect and avoid certain complications, they can also cause others.
· Installing an office access control system can make employees more punctual, but it can also motivate them to spend hours drinking coffee and chatting, instead of working or visiting customers.
· Speed tests designed to assess the performance of computers, and in particular the performance of CPUs (a fundamental element in a computer), can lead manufacturers to design computers that are not meant to be very fast in everyday use, but to be very fast when the speed test is applied (Goodhart’s law).
· If you have decided that your children go to school by bus, there are a number of possible indicators or KPIs that can give an idea of the quality of the school transportation service, such as punctuality, cleanliness, and road safety. Punctuality can be one of the objectives, and keeping track of it is extremely cheap and accurate, but it can have a perverse effect on safety. No doubt you do not want to give the driver of the school bus a strong incentive to be punctual, as this could encourage him to drive faster and take more risks to make up for delays caused by incidents of any kind.
· Trying to measure happiness seems to frustrate it. Asking someone at a party (or during a romantic or sexual encounter) to assess how much fun they are having can decrease their satisfaction (although in other cases recognizing or giving thanks for positive things can enhance them). “Happiness is found only in little moments of inattention” (João Guimarães Rosa). “Ask yourself whether you are happy, and you cease to be so” (John Stuart Mill). https://en.m.wikipedia.org/wiki/Paradox_of_hedonism
5. Misinterpretation. Measurement can confuse and give the wrong idea of reality, since it measures what it measures, not what it seems to measure:
· If in sex surveys people respond that they have sex four days a week, this does not mean that people have sex four days a week. What this survey tells us for sure is that when you ask them about sex, people say they have sex about four days a week.
· Vitamin D tests are not reliable (https://www.youtube.com/watch?v=I1uoc8ZN0m4)
· Estimated levels of calcium in the body are not reliable. The only good way to make a measurement of calcium in the body is an autopsy.
· Vitamin B12 measurements are controversial.
· In the same way that a telephone survey leaves those individuals who do not have a telephone out of the study, a survey on well-being or suffering leaves out those individuals who cannot answer us.
6. Invisibilization. We usually start by measuring what we can measure well, and we lack motivation to try to measure what we cannot measure well. In this way, measurement makes invisible the elements that are more difficult to measure, even though they could be much more relevant. This increases the risk of ignoring those other elements and even, in some cases, of promoting the idea that they do not exist.
· Many victims of sexual assault (women and men, girls and boys; on the street, in homes, and in prisons) do not report the event and keep it hidden, which makes such assaults difficult to count. Some types or contexts of sexual assault could be severely underestimated because they are so difficult to measure.
· The death toll (number of deaths) in an accident or in a conflict can make suffering, which is much more difficult to measure, invisible. For example, surely 10 individuals who die burned alive in an aviation accident, as a whole, suffer much more than 100 individuals who die from concussion. But suffering is difficult to measure, and attention goes to the number we can easily get. This could lead to wrong priorities, especially if we add to this the tendency of certain things to repel attention.
These inconveniences should make us think twice before discarding anecdotal evidence in favor of other, more formal evidence. In some cases, formal and well-structured evidence may have been obtained in a way that leaves out the reality that the anecdotal evidence is suggesting.
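The survey example under Misinterpretation (individuals who cannot answer are left out) and the idea of invisibilization can be made concrete with a toy calculation. Everything in it is invented for illustration: the scores are hypothetical, and in a real situation the scores of those who cannot answer would be unknown, which is exactly the problem.

```python
from statistics import mean

# Hypothetical population: (well-being score from 0 to 10, able to answer the survey?)
population = [
    (8, True), (7, True), (9, True), (6, True),
    (2, False), (1, False),  # these individuals never appear in the survey
]

# What we would like to know: the mean over everyone.
true_mean = mean(score for score, _ in population)

# What the survey actually measures: the mean over those who can answer.
survey_mean = mean(score for score, can_answer in population if can_answer)

print(f"true mean:   {true_mean}")
print(f"survey mean: {survey_mean}")
```

The survey figure is computed correctly and looks precise, yet it describes only the respondents. Nothing in the survey data itself reveals how many individuals were left out or how they are doing.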
How to avoid the inconveniences of measurements
The drawbacks of the measurements that have been identified are: 1) Cost 2) Error 3) Modification of the measured object (and even of the measure itself) 4) Unwanted side effects 5) Misinterpretation 6) Invisibilization.
To avoid these inconveniences, we can ask ourselves the following questions:
· How much does it cost to get that data? Is it worth it?
· Could the data be incorrect? How? What is the margin of error?
· Is there any other source or evidence that we can consult to cross-check the data? Without overwhelming ourselves, the more, the better. It is said that a sailor must sail with one compass or with three but never with two, but it is not true: it is better to have two compasses than just one. How credible are these other pieces of evidence? Should we discard them entirely?
· Who has obtained or will obtain the data? Is there a conflict of interest? Is there an ideology driving whoever is behind that data?
· Does obtaining the data modify the object we are measuring? How? Is it acceptable?
· Can obtaining the data have other side effects? Which ones? Are they perverse or virtuous effects? For whom?
· What exactly does that data represent? How do I interpret it? And how will others interpret it? Are these interpretations correct? Is the meaning of the data likely to be misunderstood?
· What has been left out when conducting the study? What does the data omit? What can be the effect of that omission?
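The questions about the margin of error and about cross-checking with other sources can be sketched as a small consistency check. This is a minimal sketch, not a proper statistical treatment: `cross_check` is a hypothetical helper, the readings and the tolerance are invented, and the simple average assumes compass headings that are not close to the 0°/360° wrap-around.

```python
def cross_check(readings, tolerance):
    """Return (average, spread, consistent) for readings of the same quantity.

    spread is the gap between the largest and smallest reading;
    consistent is True when that gap is within the given tolerance.
    """
    average = sum(readings) / len(readings)
    spread = max(readings) - min(readings)
    return average, spread, spread <= tolerance

TOLERANCE_DEG = 3.0  # assumed acceptable disagreement between compasses

# One compass: the spread is always zero, so an error can never be noticed.
one = cross_check([92.0], TOLERANCE_DEG)

# Two compasses: we may not know which one is wrong, but we learn that
# something is wrong and roughly how uncertain the heading is.
two = cross_check([92.0, 98.0], TOLERANCE_DEG)

print(one)
print(two)
```

With a single source the check passes trivially, which is the case for having a second compass: disagreement between two sources is itself information about the margin of error.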
The Tyranny of Metrics is largely about this topic. Some of the points made there are that:
Metrics can distort info by:
· Measuring the most easily measurable
· Measuring the simple when the desired outcome is complex
· Measuring inputs rather than outcomes
· Degrading information quality through standardization
Metrics can be gamed by:
· Creaming (e.g. surgeons only taking on the easiest surgeries to improve their stats)
· Improving numbers by lowering standards
· Improving numbers through omission or distortion of data
· Cheating
Negative consequences of metrics include:
· Goal displacement through diversion of effort to what gets measured.
· Promoting short-termism.
· Costs in employee time.
· Diminishing utility.
· Rule cascades. In an attempt to staunch the flow of faulty metrics through gaming, cheating, and goal diversion, organizations institute a cascade of rules.
· Rewarding luck. Measuring outcomes when the people involved have little control over the results is tantamount to rewarding luck.
· Discouraging risk-taking.
· Discouraging innovation.
· Discouraging cooperation and common purpose.
· Degradation of work.
· Costs to productivity.
I guess one question is:
There are definitely times where specific metrics are EV-negative, but I’d hope that can be determined with other metrics. (Like a forecast of the estimated value)