Thanks for the link! I was aware of the most recent study, but you prompted me to dig deep and see what they said about their survey methodology.
The most relevant bits I found were sections 4.8 and 4.8.1 in this PDF, which describe multiple surveys done across a bunch of countries.
I'm still not sure where to find actual response counts by country or demographic data on respondents. It's easy to find tons of data on how different health issues are ranked and how common they are, but not a full "factory tour" of how the estimates were put together. I'd still be interested in more data on those points (I have to imagine it's buried somewhere in those 1,800 pages).
+1 to the question. I tried to figure this out a couple of years ago, and all the footnotes and citations kept bottoming out without providing much information.
Yes, for the YLL estimates they combined different datasets to find accurate causes of death disaggregated by age, sex, location, and year. There should be little bias, since the data are objective and "cleaned" using relevant expert knowledge. The authors:
Used vital registration (VR)[1] data and combined them with other sources where these were incomplete (Section 2.2.1, p. 22 of the PDF)[2]
Disaggregated the data by "age, sex, location, year GBD cause" (p. 32 of the PDF) and made various adjustments for misdiagnoses and misclassifications, noise, non-representative data, and shocks, and redistributed the cause-of-death data where it made most sense to them, using various complex modeling methods (Section 2 of the PDF)
Calculated YLL by summing the products of "estimated deaths by the standard life expectancy at age of death"[3]
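As a minimal sketch of that last step (all numbers are invented for illustration, not GBD data):

```python
# Hypothetical sketch of the YLL calculation described above: for each age at
# death, multiply the estimated deaths by the standard life expectancy at that
# age, then sum. All values below are made up for illustration.
standard_life_expectancy = {0: 88.9, 25: 64.5, 50: 40.6, 70: 22.1}  # remaining years
deaths_by_age = {0: 10, 25: 5, 50: 20, 70: 40}  # estimated deaths in one stratum

yll = sum(deaths * standard_life_expectancy[age]
          for age, deaths in deaths_by_age.items())  # about 2907.5 here
```

Note how the 40 deaths at age 70 contribute less than the 20 deaths at age 50, because fewer standard years of life remain at older ages.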
For the YLD estimates, where subjectivity can have a larger influence on the results, the authors also compiled and cleaned data, then estimated incidence[4] and prevalence,[5] then severity using disability weights (DWs) (Section 4 intro, p. 435 of the PDF)
Used hospital visit data (disaggregated by "location, age group, year, and sex", p. 438) to get the incidence and prevalence of diseases/disabilities. Comorbidity correction used a US dataset.
140 non-fatal causes were modeled (of which 11 (79–89) relate to mental health diagnoses) (pp. 478–482)
For each cause, at a few different severity levels, sequelae were specified.[6]
Disability weights were taken from a database (GBD 2019) and matched with the sequelae.
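Putting those pieces together, the YLD computation can be sketched like this (the sequela names, prevalences, and DWs are invented for illustration, not taken from the GBD):

```python
# Hypothetical sketch: YLD for a cause is the prevalence of each sequela
# multiplied by its disability weight, summed over sequelae.
# All values below are illustrative only.
sequelae = [
    {"name": "mild form", "prevalence": 1200, "dw": 0.004},
    {"name": "moderate form", "prevalence": 800, "dw": 0.052},
    {"name": "severe form", "prevalence": 150, "dw": 0.149},
]
yld = sum(s["prevalence"] * s["dw"] for s in sequelae)  # about 68.75 here
```

This makes the role of the DWs concrete: doubling every DW would double the YLD (and hence the non-fatal share of the DALY) for this cause.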
[Section 4.8.1] "For GBD 2010[7] [disability weights] focused on measuring health loss rather than welfare loss" (p. 472). Data were collected in 5 countries (the study samples are claimed to be representative[8]) through in-person computer-assisted interviews and an online survey (advertised in the researchers' networks) (p. 472).
The in-person survey participants were asked a series of questions about which of two persons (with lay descriptions of sequelae drawn at random from a database) is healthier (p. 473)
The introduction to the questions focused on the relative ability to perform activities[9] (p. 473)
The online survey participants were asked questions that also compared two health states, but of two differently-sized groups rather than two individuals (p. 473)
GBD 2013 was conducted online in four European countries (representative age, sex, and education level samples were invited). The survey estimated additional DWs not covered by the 2010 one and re-estimated 30 causes with improved lay descriptions (p. 474)
Regressions were used to convert the set of preferences between (each of two) health states to 0–1 DW values[10] (p. 474). Comorbidity was corrected for using US data (p. 475)
GBD 2019 relied on the earlier DWs but used current cause incidence and prevalence (p. 476)
DALY = YLL + YLD (p. 1431)
The GATHER checklist (pp. 1447â1449) includes methodology transparency, stating known assumptions, sharing data (in easily understandable formats), and discussing limitations.
In short, for each of the listed causes, researchers added the years of life lost and a relatively arbitrary disability burden value to obtain the DALY burden. The data do not report wellbeing, do not include health-unrelated situations, and focus on an objective assessment of respondents' relative abilities to perform tasks rather than on subjective perceptions. The ratios of the disability weights should be accurate, but their valuation relative to death is arbitrary. Thus, it may be that the data are missing the priorities of populations entirely.
I tried figuring out how an adjusted life-year method can be used to estimate population priorities more accurately, and came up (through a series of conversations with EAs and an enumerator in a Kenyan slum, plus 3 trial surveys) with soliciting sincerity and using the Visual Analog Scale method (the time trade-off and standard gamble methods (source) were rejected since people had difficulty with the math).
"vital registration (VR) mortality data - anonymized individual-level records from all deaths reported in each country's VR system occurring between the years of study" (unrelated IHME citation for the definition). Page 1445 of the PDF includes a map of data quality (it correlates with GDP/capita).
Also specified in the GBD Compare FAQ: "an adjustment acknowledging that the VR data are biased compared to other sources of data". However, for "non-VR sources, … data quality can [also] vary widely" (p. 45 of the PDF).
Even though life expectancy increases with age (see, e.g., 1918 in the England and Wales data example), the rate of increase in life expectancy is lower than the rate of increase in age, since YLL "highlights premature deaths by applying a larger weight to deaths that occur in younger age groups" (p. 56 of the PDF).
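To make this concrete with invented numbers (not from the England and Wales data):

```python
# Illustrative (invented) remaining life expectancies by attained age.
remaining_le = {0: 85.0, 40: 48.0, 80: 10.5}

# Expected total age at death rises with attained age (having survived this
# long is informative)...
expected_age_at_death = {age: age + r for age, r in remaining_le.items()}

# ...but the remaining years, which is what YLL counts per death, fall with
# age, so deaths at younger ages receive larger YLL weights.
assert expected_age_at_death[80] > expected_age_at_death[0]
assert remaining_le[80] < remaining_le[0]
```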
Incidence: the number of new cases, or the rate at which new cases occur (IHME terms)
Prevalence: the total number of existing cases (IHME terms)
For example, for HIV/AIDS (severity: Symptomatic HIV), the sequelae are "Has weight loss, fatigue, and frequent infections" (p. 485)
"DWs used in GBD studies before GBD 2010 have been criticized for the method used (ie, person tradeoff), the small elite panel of international public health experts who determined the weights, and the lack of consistency over time as the GBD cause list expanded and additional DWs from a study in the Netherlands were added or others were derived by ad-hoc methods" (p. 472). So, the 1996 source that you cite may be biased.
The design implies that computers were accessible in the study locations. In my small-scale survey in a Kenyan slum, the local enumerator refused to take a smartphone to collect data (and used paper instead) due to security concerns. (Also, enumeration by a computer can motivate experimenter bias, i.e., answers shaped by "how they would be judged by (a traditional) authority", rather than responses based on an examination of inner thoughts and feelings.) Further, the non-response attrition rate was not specified, but "as many as three return visits [or up to seven calls] were made to do the survey at a time when the respondent was available" (p. 472). If attrition is relatively high, selection bias can occur. So, the sample may not be representative and the data may be biased.
"A person's health may limit how well parts of his body or mind work. As a result, some people are not able to do all of the things in life that others may do, and some people are more severely limited than others" (p. 473). This can further bias people toward giving objective answers about how their activities compare to those of others, rather than focusing on their subjective perceptions or sharing what they think about health.
A probit regression that estimated whether the health state was the first (value: 1) or second (value: -1) in a pair (I imagine the probit curve would lie between y = -1 and y = 1) was used to get the relative distances among the health states (p. 474). The probit coefficients associated with each cause were linearly regressed onto the logit-transformed intervals, and then numerical integration was used to get the 0–1 DW values (p. 474). Since no logs of either the dependent or independent variables were used, the calculation was not skewed by converting to percentages. It is possible that the range of the DW spread (within which the relative distances should be accurate) is "stretched" arbitrarily across the 0–1 range, since no comparisons with death (DW = 1) were used. Maybe all of the DWs should actually be multiplied by 0.1, 10, or 0.001?
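A toy sketch of this anchoring worry (the coefficients and anchoring constants below are invented; this is not the GBD code):

```python
import math

def inv_logit(x):
    # The inverse logit maps any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

# Invented probit coefficients expressing the relative severity of four
# hypothetical health states (larger = less healthy).
probit_coef = {"mild": -2.0, "moderate": -0.5, "severe": 1.0, "very severe": 2.5}

# A linear map a + b * coefficient anchors the scale before applying the
# inverse logit; choosing (a, b) differently stretches or shifts all DWs at
# once, which is the arbitrariness worried about above.
a, b = -1.0, 1.0  # assumed anchoring constants
dws = {state: inv_logit(a + b * c) for state, c in probit_coef.items()}

# The ordering of states is preserved for any b > 0, but the absolute values
# (e.g., relative to death at DW = 1) depend entirely on the choice of (a, b).
```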