For instance, the worst possible health state would be represented by “11111”.
I think “11111” usually refers to full health (cf. “EQ-5D Value Sets: Inventory, Comparative Review and User Guide” by Szende, Oppe & Devlin, 2007).
As part of a bigger project on descriptive (population) ethics, I’ve been working on a literature review of health economics. It also contains a section on the EQ-5D and its weaknesses. Here are some excerpts:
Problem II: Impossible health states
Another problem is that many health states, such as 22123, are psychologically impossible or at least very implausible. For example, if you have “no problems with performing your usual activities (work, study, housework, family or leisure activities, etc.)”, you can’t simultaneously suffer from “extreme depression”. This is immediately obvious to anyone who has ever suffered from severe depression.
I’d guess that almost as much as 20% of all EQ-5D health states are psychologically impossible. This indicates that the whole system is suboptimal.
Problem III: Using “immediate death”
Another problem is that subjects are often asked to choose between “immediate death” and the alternative scenario. However, immediate death means that the subject is unable to say goodbye to their loved ones or get their affairs in order. Arguably, whether one dies immediately or in, e.g., 3 months can make an enormous difference.
(Incorporating the TTO lead-time approach can easily overcome this problem.)
Anyway, you write:
First, DALYs are biased towards physical health. The instruments used for eliciting them and affective forecasting errors cause mental health to be underrepresented.
I couldn’t agree more.
IMHO, another big problem is the evaluation of states worse than death (SWD) (and states of severe mental illness such as depression arguably belong in this category). For example, most studies don’t even allow for SWD assessments. Furthermore, most researchers transform negative evaluations, limiting them to a lower bound of −1. Assuming that people with a history of mental illness more often evaluate health states indicating severe mental illness as highly negative (i.e. give utilities as lower than −1), then this ex-post transformation causes their judgments to have less influence than the judgments of uninformed people who underestimate the severity of mental illness.
I discuss this problem, as well as other problems, in much greater detail in my doc.
I plan on publishing the doc within the next few months, but if you’re interested I’m happy to send you a link to the current version.
I’d guess that almost as much as 20% of all EQ-5D health states are psychologically impossible. This indicates that the whole system is suboptimal.
I don’t think this follows. If these states are impossible (I don’t disagree) then they’ll never come up in real life, so it won’t matter what people say in the TTOs. As long as people make sensible judgements about the health states that actually occur, it doesn’t matter what they say in impossible ones. I think you should push the fact that they don’t make sensible judgements in general: affective forecasting stuff, etc.
IMHO, another big problem is the evaluation of states worse than death (SWD) (and states of severe mental illness such as depression arguably belong in this category). For example, most studies don’t even allow for SWD assessments. Furthermore, most researchers transform negative evaluations, limiting them to a lower bound of −1. Assuming that people with a history of mental illness more often evaluate health states indicating severe mental illness as highly negative (i.e. give utilities as lower than −1), then this ex-post transformation causes their judgments to have less influence than the judgments of uninformed people who underestimate the severity of mental illness.
Curious. Hmm. IIRC, DALYs and QALYs don’t have a neutral point: 1 is healthy, 0 is dead, but it’s not specified where between 0 and 1 is neutral. Is neutral 0.5? 0? Unless you know where neutral is you can’t specify the minimum point on the scale, because it doesn’t make sense.
Assuming that people with a history of mental illness more often evaluate health states indicating severe mental illness as highly negative (i.e. give utilities as lower than −1)
What would −1 mean here? DALYs and QALYs aren’t well-being scales and can’t straightforwardly be interpreted as such.
As long as people make sensible judgements about the health states that actually occur, it doesn’t matter what they say in impossible ones.
Good point. But I wonder whether they reinterpret the meanings of some of the dimensions of the EQ-5D in order to make sense of some of the health states they are asked to rate.
Unless you know where neutral is you can’t specify the minimum point on the scale, because it doesn’t make sense.
Agree.
What would −1 mean here? DALYs and QALYs aren’t well-being scales and can’t straightforwardly be interpreted as such.
This depends on the study. I’m afraid it will take me a couple of paragraphs to explain the methodology, but I hope you’ll bear with me :)
The literature review by Tilling et al. (2010) concluded that only 8% of all TTO studies even allow for subjects to rate health states as worse than death (i.e. as below 0), so for the vast majority of studies, the minimum point on the scale is indeed 0. I think this is problematic since e.g. health states like 33333 (if they are permanent) are probably worse than death for many, maybe even most people.
Of the few TTO studies that allow for negative values, almost all use the protocols by Torrance et al. (1982) and Dolan (1997). Below is a quote by Tilling et al. (2010) describing these two methods:
“The method developed by Torrance et al. (1982) gives respondents a choice between a scenario of living in full health for ti years followed by the state to be valued for tj years (ti + tj = T), followed by death, and an alternative scenario, which is to die immediately. The value T is fixed (e.g., 10 y). The value of ti (and therefore also the value of tj) is varied until a point of indifference is found between the 2 scenarios. The utility value for that health state is then given by –ti/tj. [… Dolan (1997)] used a method similar to this, but the 1st scenario is to live in the health state to be valued for tj years followed by full health for ti years (i.e., the ordering of the 2 states is reversed).”
These two TTO protocols would, in theory, allow for extremely negative (and even infinitely negative) values. Tilling et al. (2010) explain:
“[...] negative values can be extremely negative. A participant who would not accept any amount of time, however short, in a poor state of health is implying that such a state is infinitely bad.”
How do researchers respond? Again, I’ll quote Tilling et al. (2010, emphasis mine):
“Given the mathematical intractability of dealing with negative infinity (a single value of negative infinity in a sample of respondents would give a mean value of negative infinity), researchers usually censor such responses. Under such censoring, the lower bound is determined by the (relatively arbitrary) choice of the smallest unit of time the TTO procedure will iterate toward.”
In the two most commonly used TTO protocols, the smallest unit of time the TTO procedure iterates toward for SWD is 1 year. With T = 10 years, the most extreme response is then 9 years of full health against 1 year in the state, so the lower bound is −9. (Sometimes the smallest unit of time is 3 months, in which case the lowest possible value is −39.)
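Here is a minimal sketch of where these numbers come from. It assumes the usual QALY convention that a year in full health counts as 1 and being dead as 0, so indifference with immediate death means ti · 1 + tj · h = 0, i.e. h = −ti/tj. The function names and the sample values are my own and purely illustrative; they are not taken from any of the cited papers.

```python
import math

def swd_tto_value(t_full: float, t_state: float) -> float:
    """Value of a state worse than dead at the indifference point.

    t_full is the time in full health (ti), t_state the time in the
    state being valued (tj). Refusing any time in the state (tj = 0)
    implies an infinitely negative value.
    """
    if t_state == 0:
        return -math.inf
    return -t_full / t_state

def lower_bound(total_years: float, smallest_unit: float) -> float:
    """Most negative value once responses are censored at smallest_unit."""
    return swd_tto_value(total_years - smallest_unit, smallest_unit)

print(lower_bound(10, 1))     # -9.0  (9 years full health vs 1 year in state)
print(lower_bound(10, 0.25))  # -39.0 (9.75 years vs 3 months)

# One uncensored "infinitely bad" response drags the sample mean to -inf.
sample = [-0.25, -1.0, swd_tto_value(10, 0)]
naive_mean = sum(sample) / len(sample)
print(naive_mean)             # -inf
```

The last lines show why censoring happens at all: without it, a single respondent who refuses any time in the state makes the sample mean unusable.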
To give a concrete example: the subject is indifferent between A) living for 8 years in full health and for 2 years in health state 33333 and B) dying immediately. Thus, the value of health state 33333 for this subject is −8⁄2 = −4.
Almost all researchers then transform these values such that the lowest possible value is −1. In my view, this is somewhat arbitrary.
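To make the effect of such a transformation concrete, here is a small sketch. The papers quoted in this comment don't say which transformation a given study uses, so the two below (simple truncation at −1, and a monotonic rescaling v / (1 − v)) are only illustrative assumptions, and the two respondents' ratings are made-up numbers.

```python
def truncate(v: float, floor: float = -1.0) -> float:
    """Clip any value below the floor to the floor."""
    return max(v, floor)

def monotonic(v: float) -> float:
    """Rescale a negative raw value v to v / (1 - v), which lies in (-1, 0)."""
    return v / (1.0 - v) if v < 0 else v

def mean(values):
    return sum(values) / len(values)

# Hypothetical ratings of the same health state by two respondents.
informed, uninformed = -9.0, 0.5
raw = [informed, uninformed]

raw_mean = mean(raw)
truncated_mean = mean([truncate(v) for v in raw])
monotonic_mean = mean([monotonic(v) for v in raw])

print(raw_mean)                  # -4.25
print(truncated_mean)            # -0.25
print(round(monotonic_mean, 3))  # -0.2
```

In this toy case the respondent who rates the state at −9 pulls the untransformed mean down to −4.25, but after either transformation the mean sits only slightly below zero, which is the loss of influence I was describing.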
Below are some quotes by Devlin et al. (2011) on the matter:
“Because the elicitation procedure produces such extreme negative values, researchers have responded by doing ex post transformations to bound negative valuations to − 1 in various ways (Lamers, 2007). Crucially, once transformed, the negative numbers for SWD can no longer be interpreted as ‘utility’ scores, measured on the same scale as those for SBD (Patrick et al., 1994). Yet standard practice in calculating QALYs is to treat all values reported in value sets as commensurable. For example, an improvement from − 0.2 (an SWD) to 0, experienced over one year is interpreted as, producing a gain of 0.2 QALYs, and this is treated [...] as identical to an improvement from 0 to 0.2 experienced for one year, whereas the underlying ‘untransformed value’ for the SWD might suggest these two improvements in health are valued quite differently.”
...
“A related issue is whether or not values of negative states should be bounded to −1. It is not obvious why there should be no states worse than −1. For example, the phrase ‘it would have been better if he had never been born’ could truly be applied to people who have undergone torture and other types of brief but extreme suffering. There is no theoretical basis for imposing a limit on the level of disutility associated with these extreme sufferings.”
And here is another quote by Tilling et al. (2010):
“[...] it is not obvious why there should be no states worse than –1. Although it makes data analysis easier to transform values in this fashion, arguably 1 y of extreme pain and discomfort might provide as much disutility as 2 y of full health provides in utility.”
I hope this explains my previous comment.
References:
Devlin, N. J., Tsuchiya, A., Buckingham, K., & Tilling, C. (2011). A uniform time trade off method for states better and worse than dead: Feasibility study of the ‘lead time’ approach. Health Economics, 20(3), 348-361.
Dolan, P. (1997). Modeling Valuations for EuroQol Health States. Medical Care, 35(11), 1095-1108.
Tilling, C., Devlin, N., Tsuchiya, A., & Buckingham, K. (2010). Protocols for Time Tradeoff Valuations of Health States Worse than Dead: A Literature Review. Medical Decision Making, 30(5), 610-619.
Torrance, G. W., Boyle, M. H., & Horwood, S. P. (1982). Application of Multi-Attribute Utility Theory to Measure Social Preferences for Health States. Operations Research, 30(6), 1043-1069.
Great post!
Nitpick:
Some nitpicks in turn!