A Patient’s Perspective of Measuring Health Outcomes

quinngrace Nov 28, 2024, 8:46 PM

18 points

Cause prioritization Systemic change Global health & development Ethics of personal consumption Adjusted life year Research Opinion Cost-effectiveness analysis Measuring and comparing value

CW: This post delves into death, euthanasia, suffering, and discrimination against minorities verging on eugenics, which may be disturbing to some readers.

TL;DR

My main issues with current measurements for health outcomes (particularly QALYs) are:

They are decided by value sets from people who have not experienced the health state they are judging, which varies wildly from what people who experienced the health state judge it to be.
They de-value the cost-effectiveness of treatments in groups where people are very ill.
Maximising these measurements leads to very dark places.
There are no weights to address systemic issues.
They punish the groups in society who need health interventions the most.
They don’t capture the full impact of health interventions.

Possible solutions include creating a new measure and getting governments and policy-makers to adopt alternative measures.

*******

For about a year, I was bedbound. Had I not gotten better, I surely would have perished from my illness. Many of my doctors had given up hope, other than suggesting some supplements. I was in a State Worse than Dead.

I suffered from profound fatigue that made even simple tasks almost impossible. Eventually, I even stopped eating. I slept 22 hours a day. The chest pain was constant and I struggled to breathe so severely that I ran out of breath when talking. My muscles were so weak that I couldn’t get up off the floor without assistance. I struggled to remember things and my attention span dissolved. Light and sound hurt. I couldn’t use my brain like I used to.

I got better, evidently.

Now that I am better, I have more brain power for retrospection. If this was just a one-in-a-million-lost-the-birth-lottery isolated incident kinda thing, I’d move on. But it’s not. It’s a 1 in 77 kind of thing which it seems we’re just neglecting^[1]. Why? Honestly, I’m not sure. But in all likelihood, QALYs played some part, since they are often used to measure health outcomes and the effectiveness of health interventions.

I’m not a big fan of QALYs as a measurement for health outcomes for two main reasons: they can be negative, and they perpetuate existing inequalities.

While I understand that many within the EA community already oppose QALYs, I want to drive home what the actual impact on a person’s life can be and why, even if we as a community move away from them, it’s important to get decision-makers on board too, as a cause area.

The rest of this post will take the form of critiques of current QALY calculations interspersed with my own experiences of such.

It can be negative

One pivotal issue that has thus far not been considered in this debate is that HRQoL can not only be low, but also negative: ‘health states worse than dead’ (SWD) get assigned negative values. Extending the life of a person who lives in a SWD generates negative QALYs.
...
The QALY is defined as the arithmetic product of survival time and HRQoL. HRQoL, in turn, is determined by the health state an individual is living in. This means, ‘measuring’ QALYs usually involves two components: firstly, a set of health states; and secondly, numeric scores that reflect their respective desirability. These values are often also referred to as utilities, social values, preference-, (health-related) quality of life-, or QALY-weights. Customarily they are supposed to reflect the preferences of the general public
...
When it comes to SWD, however, the preferences of the general public are (1) ill-informed, (2) misconstrued, and/or (3) irrelevant. In the following, I shall further elaborate on these three points.
1. Ill-informed: The preferences of the general public do not correspond to patients’ evaluation of their own situations; they should not be confused with a measure of patients’ self-assessed HRQoL.

The number of people who upon hearing about my previous condition have said ‘If it were me, I’d rather die’ is astonishing. People so healthy that they don’t even remember what it feels like to have a common cold are harshly judging my quality of existence. I can’t blame them. A few years back, when researching euthanasia for debating, I remember reading about some cases and thinking ‘Wow, if that were me, I’d want to be euthanised for sure’.

But imagining an experience and living it are galaxies apart.

When ill, I remember reading about MAID in Canada (a euthanasia program). People who were healthier than me had qualified for it, so had I been Canadian I would easily have qualified. It’s terrifying to think that had I been born in Canada, my only ‘treatment’ option would have been death.^[2] Not even at my most ill did I wish for death; all I wanted was to feel like myself again. In fact, I read a study claiming that the average lifespan of someone with my condition was about 47 years, and I remember actually lamenting the loss of life.

If I had to choose between being severely ill for a year or going back in time to high school for a year, I’d take being severely ill hands down. Yes, I hated high school that much and yes, there are things worse than a SWD (in my opinion).

2. Misconstrued: Social value sets do not reflect the general public’s preferences for the allocation of resources.

Participants in health valuation studies are not asked how they prefer resources to be allocated. This would require using a method like the person trade-off, for example, in which participants are asked to make choices about two groups of people, which differ in size and in their health states [37]. Instead, TTO or SG are used, which ask participants to imagine being in a particular health state themselves. Yet, this one type of preferences can not easily be translated into another. Some people may, for example, say that they would rather prefer to be dead, than to be confined to bed [38]. Yet, the very same people will probably consider their preferences being misrepresented, if they led to the evaluation that people who are confined to bed should not be offered life-extending treatments. They may rightly object that this is just not what they meant.

If someone took my statement about my experience of high school being worse than a SWD and used that to justify not improving those high schools, I would most certainly object. Even if we wanted to allocate resources based on societal preferences, QALYs are not the way to do that (nor do they pretend to be).

3. Irrelevant: Even if social value sets would accurately reflect the general public’s preferences, in the context of SWD, those should be considered irrelevant.
It seems rather improbable that members of the general public in the UK, or anywhere else for that matter, would actually support the concept of SWD and their implicit value judgement—which we will discuss in more detail in the next section. However, even if some individuals wanted some other individuals to die earlier rather than later, those preferences should be deemed irrelevant for treatment reimbursement decisions.
While everyone has, of course, the right to consider their own life in a certain health state to be worse than dead and to refuse life-extending treatments, considering someone else’s life in a certain health state worse than dead is morally a completely different issue. To then also prefer that life-extending treatments are withheld from certain (other) individuals, because one prefers them to be dead, would undoubtedly be reprehensible. It would constitute an objectionable preference [11, 23].
(Emphasis added)

The end result of negative QALYs is that treatments that would otherwise be considered cost-effective are no longer evaluated as cost-effective. Schneider goes into great detail to illustrate this in the above-quoted paper, which I recommend reading in its entirety. The darker flipside to this is that letting people in SWD die generates QALYs.^[3] That would surely create perverse incentives to kill off sick people. But that’s just conspiracy talk, surely? No one would actually argue in favour of that, right?

In this paper, we propose and defend three economic arguments for permitting assisted dying. These arguments are not intended to provide a rationale for legalising assisted suicide or euthanasia in and of themselves; rather, they are supplementary arguments that should not be neglected when considering the ethics of assisted dying. The first argument is that permitting assisted dying allows consenting patients to avoid negative quality-adjusted life years, enabling avoidance of suffering. The second argument is that the resources consumed by patients who are denied assisted dying could instead be used to provide additional (positive) quality-adjusted life years for patients elsewhere in the healthcare system who wish to continue living and to improve their quality of life. The third argument is that organ donation may be an additional potential source of quality-adjusted life years in this context. We also anticipate and provide counterarguments to several objections to our thesis. Taken together, the cumulative avoidance of negative quality-adjusted life years and gain in positive quality-adjusted life years suggest that permitting assisted dying would substantially benefit both the small population that seeks assisted suicide or euthanasia, and the larger general population. As such, denying assisted dying is a lose–lose situation for all patients.

Source

A healthcare metric should not be designed in such a way that killing a group of people to prevent them from using further health resources and then extracting their organs is evaluated as a good thing.

In fact, the authors have to specify:

...we stress that avoiding suffering and respecting autonomy remain the primary arguments for assisted dying;

The fact that this needs to be added, because using only QALYs as an evaluation metric would generate a dystopian nightmare, is the very problem.

All of this can be avoided by removing the concept of SWD from QALYs/not having negative QALYs. My next gripe with QALYs cannot be so easily solved.
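To make the arithmetic concrete, here is a minimal Python sketch of the QALY as the quoted definition describes it (survival time multiplied by an HRQoL weight), using the hypothetical weight of −0.2 over 10 years from footnote 3. The function name and the `floor_at_death` option are illustrative assumptions, not part of any standard QALY tool or official methodology.

```python
def qalys(years: float, hrqol_weight: float, floor_at_death: bool = False) -> float:
    """QALYs = survival time x HRQoL weight (1.0 = full health, 0.0 = dead).

    floor_at_death is a hypothetical option: it clamps the weight at 0.0 so
    that no health state can score below death.
    """
    if floor_at_death:
        hrqol_weight = max(hrqol_weight, 0.0)
    return years * hrqol_weight


# Footnote 3's example: a weight of -0.2 held for 10 years.
living = qalys(10, -0.2)               # QALYs scored for staying alive
dying_now = qalys(0, -0.2)             # QALYs scored for dying 10 years early
gain_from_death = dying_now - living   # positive: death is scored as a gain

# With death as the lowest point on the scale, that gain disappears.
living_floored = qalys(10, -0.2, floor_at_death=True)
gain_floored = qalys(0, -0.2, floor_at_death=True) - living_floored
```

Under these assumptions, `gain_from_death` comes out as 2 QALYs and `gain_floored` as 0. A floor at zero only removes the reward for death; how far above zero such states should sit would still depend on the weights chosen.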

It perpetuates existing inequalities

I love watching medical dramas because they offer a fantasy to escape into, one where doctors will go to any length to solve your medical mystery. When you have anything even vaguely unusual, trying to get effective treatment is very difficult.

(Just a note, doctors are only human. They err. In fact, a lot of these mistakes come from systemic problems and biases.)

Things I have been told by actual doctors:

“The best treatment we know of is time, let’s see how you are after 5 years.” This was for a severely disabling condition where, it turns out, earlier treatment makes a huge difference.

“That’s not a real diagnosis, who said you have that?” About another specialist’s diagnosis.

“It’s probably just anxiety.” Then offered no treatment or advice for my probable anxiety. Plot twist, it wasn’t anxiety. This happened multiple times.

“Maybe the reason you feel so sleepy is because you aren’t trying hard enough to stay awake.”—It couldn’t possibly be a common side effect of the medication I was taking, which immediately disappeared when I stopped it. /s

“You’re feeling suicidal? Stop being such a drama queen!”

Women are excluded from a significant portion of medical research, and we keep seeing the knock-on effects. Compounding this issue are illnesses that mostly affect women, especially autoimmune diseases (in particular EDS, fibromyalgia, CFS/ME, and MCAS). They have been around for ages, but sexism in the medical field has meant that these diseases are ignored by doctors and researchers.

Furthermore, even when women aren’t being ignored or excluded, the sex-based differences aren’t adjusted for, creating unfair comparisons.

One domain common to all of the typically used health utility instruments and instruments with which readers might be more familiar such as the SF-36 (Ware & Sherbourne, 1992) is pain. Men and women have been found to respond differently to similar pain stimuli (Kimble et al., 2003) and to report pain differently (Soetanto, Chung, & Wong, 2004). One possible result of these differences is that men and women who report the same level of pain may have different levels of health utility. If women experience greater dysfunction at the same level of pain as reported by men, the effect of relieving the pain among women will be underestimated when using societal preferences, even if it is partially picked up by other functional domains. Another possible gender difference arises if men and women report different levels of pain when the objective pain stimulus is identical (Jackson, Iezzi, Chen, Ebnet, & Eglitis, 2005). If women report higher levels of pain for the same stimulus, relieving the pain will lead to a greater improvement in health utility for women than for men. If a treatment could completely relieve the pain and a societal preference weighted instrument were used to score health utility, this would result in an underestimate of the actual gain for women and a less favorable cost-effectiveness ratio than if a gender-specific health utility measure were used.

Source

If a measure doesn’t take systemic bias into account, it will perpetuate it. (The mechanism for how this happens with QALYs will follow.)

Getting to those doctors, when I was really sick, was a mission. On good days, I could use a wheelchair, but I couldn’t predict when those days would be far enough in advance to reschedule the appointment if need be. Most specialist doctors don’t do home visits either, so I had to come to them, in the process making myself sicker.

Strangely, most healthcare isn’t designed for sick or chronically ill people. It’s designed for healthy people, which is completely counterintuitive. But the more you actually need healthcare, the more barriers you’ll face trying to access it. When you are really, really sick, it often morphs into a disability because the illness starts to impede daily activities.

What if you need a wheelchair? Is your transport and the place you are going to wheelchair accessible? What if you are too sick to get out of bed? Will your doctor make a home visit or do telemedicine? Can you reschedule if you are too sick that day and it would take too much of a toll? Does the doctor know how to deal with multiple medication interactions? Does the doctor know how to deal with co-morbidities or do you need another doctor? If you can’t see, are there braille and screen readers? Can appointments only be made by phone call, or are online bookings an option as well, especially for highly anxious people, autistic people, and people with difficulty speaking?

There’s also a theme developing here: chronically ill and disabled people face numerous barriers to getting treatment^[4].

The result is not getting necessary treatment and having worse outcomes, resulting in earlier death.

This creates a vicious cycle, because having a lower life expectancy means that a person can’t generate as many QALYs as a healthy and/or able-bodied counterpart. So a treatment that improves their quality of life is less likely to get funded, because it can’t generate as many QALYs as a treatment that improves the quality of life of people with long lifespans. The same barriers therefore continue to exist. This phenomenon is not small: in LMICs, having a disability shortens lives by decades on average.
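A rough sketch of the mechanism, using made-up numbers: the same treatment, at the same cost, giving the same quality-of-life improvement, is scored very differently depending on how many years the patient is expected to live. All figures and names below are hypothetical and chosen only for illustration.

```python
def cost_per_qaly(cost: float, weight_gain: float, remaining_years: float) -> float:
    """Cost divided by QALYs gained, where QALYs gained = weight gain x years."""
    return cost / (weight_gain * remaining_years)


COST = 20_000        # hypothetical treatment cost, identical for both groups
WEIGHT_GAIN = 0.25   # hypothetical HRQoL improvement, identical for both groups

# Only the remaining life expectancy differs between the two groups.
long_lived = cost_per_qaly(COST, WEIGHT_GAIN, remaining_years=40)
short_lived = cost_per_qaly(COST, WEIGHT_GAIN, remaining_years=20)

# Halving the remaining years doubles the cost per QALY.
ratio = short_lived / long_lived
```

Under a fixed cost-per-QALY threshold, the treatment for the group with the shorter life expectancy is the one more likely to miss the cut, which is the loop described above.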

As Andrade puts it:

… QALY sets up a rigid social hierarchy, a caste system, so to speak. At the top are those who are eligible for treatment options, and who receive (in the form of subsidies and taxpayers’ funds) privileges from the entire society. At the bottom are those who are not eligible for treatment, and therefore, their lives are cut short. Viciously, this very fact further contributes to their illegibility for treatment, as a shorter life expectancy makes the QALY score ever lower. Ultimately, with QALY, not all people are created equal. They may be formally equal before the law, but some are certainly not equal in terms of their worth to public health officials.
The fundamental ethical shortcoming of QALY is that in its orientation toward numbers, it fails to value life itself. Consider two treatments that use the same amount of funds. In treatment A, the life of a 35-year-old man with a rare disease is saved. In treatment B, the erectile dysfunction of 200 35-year-old patients is successfully treated. Given the large amount of people in scenario B, QALY will probably be higher. Yet, it seems more sensible to admit that even with reduced QALY, choosing A is more ethically appropriate, as a life is being saved.
...
Since QALY relies on life expectancy, young people are privileged with this criterion. And the quality-of-life years also play a role, as young people are also significantly more likely to fully recover from interventions.
This has had the practical implications. For example, NICE has pursued a policy of not including drugs for dementia on the grounds that they are too expensive and do not provide sufficient QALYs, as compared to other treatments, to which funds are allocated.
...
QALY’s discrimination is also salient in terms of race and ethnicity. Groups with higher life expectancy receive preferential treatment, given that they are in a better position to maximize QALY. ²⁷ This preferential treatment often falls along racial lines, with whites being offered treatment ahead of ethnic minorities. ²⁸ To the extent that life expectancy is shorter among ethnic minorities, ²⁹ they are excluded from the priorities.
However, it has been firmly established by research that social variables play a major role in life expectancy,³⁰,³¹,³²,³³,³⁴ and oppression and discrimination are central in this regard. Therefore, a vicious circle comes into effect. Systemic racism in society shortens life expectation of disadvantaged ethnic and racial groups. As per QALY standards, reduced life expectancy takes away priority in the allocation of resources. ³⁵ Lack of medical assistance increases stress ³⁶ and further decreases life expectancy,³⁷ causing minority ethnic groups to be further displaced in the order of priorities in QALY allocation.

There are other reasons to question these sorts of measurements

As mentioned in this post about the HALY+ and this post about HALYs in general:

1. They neglect non-health consequences of health interventions.
2. They rely on poorly-informed judgements of the general public.
3. They fail to acknowledge extreme suffering (and happiness).
4. They capture some but not all spillover effects.
5. They are of little use in prioritising across sectors or cause areas.

Both posts are worth reading in their entirety, especially if the reader is interested in the more technical aspects of measuring health outcomes.

I’d like to just briefly touch on point 1. Measured in QALYs/DALYs, my personal improvement in health is only a fraction of the result of getting treated. (Tooting my own horn incoming): I now earn an income, a portion of which I have publicly pledged to donate for the rest of my life, whereas previously I would have been dependent on others. I also work in a high-impact career deriving insights from digitising clinics in rural Rwanda. None of this is captured by current HALY frameworks, even though it is intertwined with my health state.

In summary, my main issues with current measurements for health outcomes (particularly QALYs) are:

They are decided by value sets from people who have not experienced the health state they are judging, which varies wildly from what people who experienced the health state judge it to be.
They de-value the cost-effectiveness of treatments in groups where people are very ill.
Maximising these measurements leads to very dark places.
There are no weights to address systemic issues.
They punish the groups in society who need health interventions the most.
They don’t capture the full impact of health interventions.

What now?

We should create a new metric that doesn’t fall prey to the above issues. At the very least, it should take patient experience input with weights for systemic issues and have death as the lowest point on the scale, so that no person is more valuable dead than alive.
We should prioritise getting governments and policy-makers to move away from QALYs in favour of alternatives.


^[1] This source uses QALYs to compare different diseases to their relative funding, which is a measurement I criticise heavily throughout this post. However, this source is still useful to show how even under the QALY framework, ME/CFS is a neglected cause in global health and development.
^[2] This policy has come under intense fire from disability activists. Here are some good analyses of why: https://www.aljazeera.com/opinions/2024/2/16/canadas-assisted-dying-regime-should-not-be-expanded-to-include-children https://www.cambridge.org/core/journals/palliative-and-supportive-care/article/realities-of-medical-assistance-in-dying-in-canada/3105E6A45E04DFA8602D54DF91A2F568
^[3] Take the example of a QALY value of −0.2 placed on someone’s health state. If they died 10 years prematurely, they would go up to 0, gaining 2 QALYs. Hopefully, this illustrates the danger of negative QALYs.
^[4] The World Health Organisation even highlights it here: https://www.who.int/news-room/fact-sheets/detail/disability-and-health


MHR🔸 Nov 30, 2024, 1:59 PM
13 points
2 ∶ 0
Thanks for sharing your perspective and experiences here! I think it’s really valuable for EAs with first-hand experience to write about these issues, and I’m really sorry you went through such a difficult time. You might be interested to read this piece I wrote last year about EA and disability based on my own experiences. It includes some discussion of HALYs, though that’s focused more on the history and perception of HALYs rather than the issues you touched on.
Reading your piece, I very much agree with you that the current methods of constructing HALY weights generally have methodological issues and would greatly benefit from more focus on actual experiences of people in the health states in question. I also agree that the naive application of HALYs as units of utility or “goodness” can lead to some very dark places (especially in light of the methodological concerns you mentioned, it frustrates me that EAs often slip into using DALYs as units of utility when the post-2010 weights are specifically intended to only be a measure of health status).
I have a slightly different perspective on a couple of the other issues you mention, and I’d be curious to talk more with you about these.
First, I think that the existence of difficult tradeoffs is inevitable as long as resources are limited, and moving to HALY alternatives won’t necessarily eliminate these challenges. For example, one of the articles you quote mentions NICE guidelines about drugs for dementia. I would really, really like better medications to exist, but my (admittedly not very in-depth) understanding of the field is that the drugs are quite expensive and don’t do much to improve symptoms or alter the course of the disease. As long as we have limited healthcare resources, it’s not clear that there’s an alternate weighting strategy that would justifiably recommend allocating limited resources to buying these medications rather than spending on other health interventions that work better. Here, it seems like the problem is much more the paucity of options and limited resources than the particular weighting scheme.
Another item is the role of HALYs in perpetuating healthcare inequalities. I do agree that there is absolutely a straightforward way in which this is true, but I’ve come to think there’s some more complexity here than I at least initially thought. HALY maximization in some situations encourages improving the wellbeing of people with chronic illnesses/disabilities over extending the lives of able-bodied people. For example, the 2021 GBD disability weights give post-viral chronic fatigue syndromes a weight of 0.22. A policymaker trying solely to maximize DALYs averted would, if such a treatment were available for the same cost, choose to invest in curing five people of post-viral chronic fatigue over saving one fully healthy person’s life (if all the individuals were the same age). I absolutely agree that ME/CFS is underfunded overall, and that there is probably a role for HALYs in that underfunding (in particular, the 2021 GBD disability weights only include values for a small number of recognized post-viral chronic fatigue syndromes, so policymakers may not be able to justify investments in broader ME/CFS research/treatment in terms of HALY benefits). I just think the overall picture here is a bit complicated.
The last item is the existence of states worse than death. I very much agree that deciding at a population level that certain people’s lives are worse than death, then making policy on that basis, can lead to very dark and wrong places. However, I really do think that some people in some cases experience states worse than death, both from my own experiences and from the testimonies of others. In my own life, I have had experiences that were bad enough that I absolutely would have traded off shortening my own life to avoid them (for example, I would have been happy to lose a week of healthy life to avoid the worst moments of a shoulder dislocation). More broadly, I do think we should listen to ill or disabled individuals such as Gloria Taylor who’ve described their own conditions as worse than death and advocated for the right to access medical assistance in dying. I think it would be wrong to say that for individuals who describe their lived experience as worse than death and express a desire to access medical assistance in dying, they have not benefited by being able to fulfill their desire. And moreover, not having a weighting scheme that allows for states worse than death I think risks underemphasizing the suffering involved for certain people in cases of extreme pain. As a result, I worry that such a scheme could lead to underprioritizing interventions that improve quality of life and alleviate pain in favor of interventions that save lives. Again, I think it’s absolutely wrong to decide based on population-level statistics that an individual person’s life is worse for them than death, but I think there’s a balance to walk here and I worry that it’s as bad or worse to not listen when people describe their lived experiences as worse than death based on their own values.
Thanks again for writing your piece. I hope these thoughts are useful!
Camille Nov 29, 2024, 9:35 AM
3 points
0 ∶ 0
Thanks for posting this.

First off, I want to acknowledge that discussing these issues is indeed very difficult. I’m happy that you made it through whatever you had to go through (I could qualify this experience, but I expect any effort on my side to fall short of being helpful), and I’m immensely sorry that you had to face all these different issues, for lack of a better term. I also want to pre-emptively say that I share some of your critiques and don’t want to come off as judging your experience.

However, I have some questions on my mind. I’ll just leave one here, in the hope that it doesn’t come off as insensitive.

I’d be curious to see how switching from QALYs to something else would re-order EA priorities. What would your guess be? Would SWD plausibly be above e.g. malaria prevention?

I’m not requesting anything extremely specific or committed, but I think it would help me paint a more complete picture of the critique, and potentially identify clearer points of disagreement.
- quinngrace Nov 29, 2024, 11:37 AM
  3 points
  1 ∶ 0
  Parent
  Thank you for raising this question, it is certainly not insensitive. Feel free to ask more questions.
  I am also wondering how switching from QALYs would change EA priorities. My guess is that it depends entirely on the weights in the model. I want to do some comparisons with different alternatives to see how they would inform priorities. Some alternatives I would like to test are:
  - DCEA, distributional cost-effectiveness analysis;
  - MCDA, multi-criteria decision analysis;
  - ECEA, extended cost-effectiveness analysis;
  - EBW, equity-based weighting;
  - MP, mathematical programming;
  - stratified CEA;
  from the studies mentioned in this article. I’m not very clued up on alternative CEA measurements, so I was hoping someone more knowledgeable would mention an alternative.
  Once I have done that analysis, I’ll post a follow up to this post and that would clear up that confusion. It’s not something that will happen quickly though.
  The point of this post was to explain gaps in current measurements of health outcomes from the point of view of the tangible day-to-day effects, as well as how what is being measured often doesn’t match reality. I’m not an expert in mathematical models or health economics but I am an expert in being chronically ill, so that’s the lens I was offering.
  One guess is that doing away with negative QALYs would mess with animal welfare calculations because a lot of them rely on negative QALYs. However, if I play devil’s advocate, it could be argued that animals should get a very high weight in terms of historical disenfranchisement, in which case, the calculations would change but I suspect animal welfare would still be one of the top issues.
SummaryBot 28 Nov 2024 22:22 UTC
1 point
0 ∶ 0
Executive summary: Current health outcome measurements like QALYs have serious flaws that discriminate against chronically ill and disabled people, potentially incentivize harmful policies, and fail to capture the full impact of health interventions.
Key points:
1. QALYs allow negative values for “states worse than death,” which can make treatments appear cost-ineffective and potentially justify withholding care from severely ill patients.
2. The system relies on healthy people’s uninformed judgments about health states they haven’t experienced, which often drastically differ from actual patient experiences.
3. Current measurements perpetuate systemic inequalities by devaluing treatments for groups with shorter life expectancies, including disabled people and racial minorities.
4. The metrics fail to capture non-health benefits of treatments (like enabling someone to work and contribute to society) and spillover effects.
5. Proposed solutions include creating new metrics that eliminate negative values, incorporate patient experiences, and add weights for systemic issues.
6. Getting governments and policymakers to adopt alternative measures is crucial for improving healthcare resource allocation.
This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback