Question about uncertainty modeling (tagging @Laura Duffy here since she might be the best person to answer it):
How do you think about the different models of welfare capacity that were averaged together to make the mixture model? Is your assumption that one of these models is really the true correct model in all species (and you don't yet know which one it is), or that the different constituent models might each be more or less true for describing the welfare capacity for each individual species?
My context for asking this is in thinking about quantifying the uncertainty for a function that depends on the welfare ranges of two different species (e.g. y = f(welfare range of shrimp, welfare range of pigs)). It's tempting to just treat the welfare ranges of shrimp and pigs as independent variables and to then sample each of them from their respective mixture model distribution. But if we think there's one true model and the mixture model is just reflecting uncertainty as to what that is, the welfare ranges of shrimp and pigs should be treated as correlated variables. One might then obtain an estimate of the uncertainty in y by generating samples as follows:
1. Randomly pick one of the 9 models in the mixture model as the true model
2. Sample the welfare range of both shrimp and pigs from their distributions for the selected constituent model
3. Compute y = f(welfare range of shrimp, welfare range of pigs)
4. Repeat steps 1-3 until the desired # of samples is obtained
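The steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the per-model draws are made-up placeholders (not real welfare range estimates), the function f is a stand-in, and the names `draws_shrimp`, `draws_pigs`, `sample_paired`, and `sample_independent` are hypothetical. Equal model weights are assumed, and the two species are drawn independently of each other once the model is fixed.

```python
import numpy as np

rng = np.random.default_rng(0)
N_MODELS, N_DRAWS = 9, 1000

# Made-up placeholder draws, NOT real welfare range estimates.
# Row m holds draws from constituent model m. Each model gets its own scale,
# so the choice of model shifts both species together.
model_scale = np.linspace(0.05, 1.0, N_MODELS)[:, None]
draws_shrimp = 0.3 * model_scale * rng.uniform(0.5, 1.5, (N_MODELS, N_DRAWS))
draws_pigs = model_scale * rng.uniform(0.5, 1.5, (N_MODELS, N_DRAWS))


def f(wr_shrimp, wr_pigs):
    """Placeholder for whatever function of the two welfare ranges is of interest."""
    return wr_shrimp / wr_pigs


def sample_paired(draws_a, draws_b, n, rng):
    """Steps 1-2: pick one model per sample, then draw both species from it."""
    model = rng.integers(draws_a.shape[0], size=n)   # equal model weights
    col_a = rng.integers(draws_a.shape[1], size=n)   # independent draws
    col_b = rng.integers(draws_b.shape[1], size=n)   # within the chosen model
    return draws_a[model, col_a], draws_b[model, col_b]


def sample_independent(draws_a, draws_b, n, rng):
    """Baseline: each species picks its own model, i.e. independent mixtures."""
    model_a = rng.integers(draws_a.shape[0], size=n)
    model_b = rng.integers(draws_b.shape[0], size=n)
    col_a = rng.integers(draws_a.shape[1], size=n)
    col_b = rng.integers(draws_b.shape[1], size=n)
    return draws_a[model_a, col_a], draws_b[model_b, col_b]


n = 20_000
shrimp_p, pigs_p = sample_paired(draws_shrimp, draws_pigs, n, rng)
shrimp_i, pigs_i = sample_independent(draws_shrimp, draws_pigs, n, rng)

y_paired = f(shrimp_p, pigs_p)   # step 3, repeated n times (step 4)
y_indep = f(shrimp_i, pigs_i)

corr_paired = np.corrcoef(shrimp_p, pigs_p)[0, 1]
corr_indep = np.corrcoef(shrimp_i, pigs_i)[0, 1]
print(f"correlation (paired): {corr_paired:.2f}, (independent): {corr_indep:.2f}")
print(f"std of y (paired): {y_paired.std():.3f}, (independent): {y_indep.std():.3f}")
```

Both samplers leave each species' marginal mixture distribution unchanged; only the joint distribution differs, which is what matters for the uncertainty in y.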
I could also imagine computing the covariance of the different species' welfare ranges and directly generating samples as correlated random variables.
Thanks a bunch for your question, Matt. I can speak to the philosophical side of this; Laura has some practical comments below. I do think you're right that (and in fact our team discussed the possibility that) we ought to be treating the welfare range estimates as correlated variables. However, we weren't totally sure that that's the best way forward, as it may treat the models with more deference than makes sense.
Here's the rough thought. We need to distinguish between (a) philosophical theories about the relationship between the proxies and welfare ranges and (b) models that attempt to express the relationship between proxies and welfare range estimates. We assume that there's some correct theory about the relationship between the proxies and welfare ranges, but while there might be a best model for expressing the relationship between proxies and welfare range estimates, we definitely don't assume that we've found it. In part, this is because of ordinary points about uncertainty. Additionally, it's because the philosophical theories underdetermine the models: lots of models are compatible with any given philosophical theory; so, we just had to choose representative possibilities. (The 1-point-per-proxy and aggregation-by-addition approaches, for instance, are basically justified by appeal to simplicity and ignorance. But, of course, the philosophical theory behind them is compatible with many other scoring and aggregation methods.) So, there's a worry that if we set things up the way you're describing, we're treating the models as though they were the philosophical theories, whereas it might make more sense not to do that and then make other adjustments for practical purposes in specific decision contexts if we're worried about this.
Laura's practical notes on this:
A change like the one you're suggesting would likely decrease the variance in the estimates of f(), since if you assume the welfare ranges are independent variables, you'd get samples where the undiluted experiences model is dominating the welfare range for, say, shrimp, and the neuron count model is dominating the welfare range for pigs. I suggest a quick practical way of dealing with this would be to cut off values of f() below the 2.5th percentile and above the 97.5th percentile.
Or, even better, I suggest sorting the welfare ranges from least to greatest, then using pairs of the ith-indexed welfare ranges for the ith estimate of f(). Since each welfare model is given the same weight, I predict this'll most accurately match up welfare range values from the same welfare model. (e.g. the first 11% will be neuron count welfare ranges, etc.)
Ultimately, however, given all the uncertainty in whether our models are accurately tracking reality, it might not be advisable to reduce the variance as such.
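For concreteness, here is a minimal Python sketch of the two practical suggestions above (percentile trimming and sort-and-pair). The input draws are made-up placeholders rather than real welfare range estimates, the sum used for f() is only a stand-in, and the helper names `trim_central_95` and `rank_correlation` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Made-up placeholder draws, NOT real welfare range estimates.
wr_pigs = rng.lognormal(mean=-1.0, sigma=0.8, size=n)
wr_chickens = rng.lognormal(mean=-1.5, sigma=1.0, size=n)


def trim_central_95(y):
    """Keep only values between the 2.5th and 97.5th percentiles."""
    lo, hi = np.percentile(y, [2.5, 97.5])
    return y[(y >= lo) & (y <= hi)]


def rank_correlation(x, y):
    """Spearman-style rank correlation (assumes no ties in x or y)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return np.corrcoef(rank_x, rank_y)[0, 1]


# Suggestion 1: treat the species as independent, then trim the tails of f().
y_independent = wr_pigs + wr_chickens          # placeholder f()
y_trimmed = trim_central_95(y_independent)

# Suggestion 2: sort each species' draws and pair the i-th with the i-th.
pigs_sorted = np.sort(wr_pigs)
chickens_sorted = np.sort(wr_chickens)
y_sorted = pigs_sorted + chickens_sorted       # placeholder f()

rho_unsorted = rank_correlation(wr_pigs, wr_chickens)
rho_sorted = rank_correlation(pigs_sorted, chickens_sorted)
print(f"rank correlation, unsorted: {rho_unsorted:.2f}, sorted: {rho_sorted:.2f}")
```

Sorting keeps each species' marginal distribution intact but forces the smallest draw for one species to be paired with the smallest draw for the other, and so on up the ranking.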
Thanks, this is great information! The concern you raised regarding distinguishing between philosophical theories and models makes a lot of sense. With that said, I don't currently feel super satisfied with the practical steps you suggested.
On the first note, the impact of the correlation depends on the structure of f. Suppose I'm trying to estimate the total harms of eating chicken/pork, so we have something like y = c1 · (welfare range of pigs) + c2 · (welfare range of chickens). In this case, treating the welfare ranges of chickens and pigs as correlated will increase the variance of y. On the flip side, if we're trying to estimate the welfare impact of switching from eating chicken to eating pork, we have something like y = c3 · (welfare range of chickens) − c4 · (welfare range of pigs). In that case, treating the welfare ranges of pigs and chickens as correlated will decrease the variance of y. Trying to address this in an ad-hoc manner seems like it's pretty challenging.
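The sign flip follows from the standard variance identity for linear combinations. Writing P and C for the welfare ranges of pigs and chickens (shorthand introduced here), and assuming the coefficients c1 through c4 are positive:

```latex
\begin{aligned}
\operatorname{Var}(c_1 P + c_2 C) &= c_1^2 \operatorname{Var}(P) + c_2^2 \operatorname{Var}(C) + 2 c_1 c_2 \operatorname{Cov}(P, C), \\
\operatorname{Var}(c_3 C - c_4 P) &= c_3^2 \operatorname{Var}(C) + c_4^2 \operatorname{Var}(P) - 2 c_3 c_4 \operatorname{Cov}(P, C).
\end{aligned}
```

Under independence the covariance term is zero. With a positive covariance, it adds to the variance of the sum and subtracts from the variance of the difference, so no single adjustment to the spread works for both kinds of f.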
On the second note, I think that's basically treating the welfare capacities of e.g. pigs and chickens as perfectly correlated with one another. That seems extreme to me, since I think a substantial portion of the uncertainty in the welfare ranges is coming from uncertainty as to which traits each species has, not which philosophical theory of welfare is correct.
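One way to make this precise is the law of total covariance. Let M be the randomly selected constituent model and P, C the welfare ranges of pigs and chickens (symbols introduced here). Assuming the two species are drawn independently of each other once the model is fixed:

```latex
\begin{aligned}
\operatorname{Cov}(P, C) &= \underbrace{\mathbb{E}\big[\operatorname{Cov}(P, C \mid M)\big]}_{=\,0 \text{ under the assumption}} + \operatorname{Cov}\big(\mathbb{E}[P \mid M],\, \mathbb{E}[C \mid M]\big), \\
\operatorname{Var}(P) &= \mathbb{E}\big[\operatorname{Var}(P \mid M)\big] + \operatorname{Var}\big(\mathbb{E}[P \mid M]\big).
\end{aligned}
```

Only the between-model term contributes to the covariance, while the within-model term (which includes uncertainty about each species' traits) still contributes to each variance. The resulting correlation is therefore strictly below 1 whenever the within-model variance is nonzero, whereas sort-and-pair forces a rank correlation of exactly 1.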
I come away still thinking that the procedure I suggested seems like the most workable of the approaches mentioned so far. To put a little more rigor to things, here are some examples of plotting the welfare range estimates of chickens and pigs against one another with the different methods (uncorrelated sampling from the respective mixture distributions, sampling from the ordered distributions, and pair-wise sampling from the constituent models). In addition, there are some plots showing the impact of the different sampling methods on some toy analyses of the welfare impact of eating chicken/pork and the impact of switching from eating chicken to eating pork (note that the actual numbers are not intended to be very representative). You can see that the trimming approach only makes sense in the second case, and that the paired sampling from constituent models approach produces distributions in between those for the uncorrelated case and those for the ordered case.
Note that when using the pair-wise sampling from constituent models approach, pigs and chickens are more strongly correlated with one another than many other pairs of species are. Here is what the correlation between chickens and shrimp looks like, for example:
Hey, thanks for this detailed reply!
When I said "practical", I more meant "simple things that people can do without needing to download and work directly with the code for the welfare ranges." In this sense, I don't entirely agree that your solution is the most workable of them (assuming independence probably would be). But I agree: pairwise sampling is the best method if you have the access and ability to manipulate the code! (I also think that the perfect correlation you graphed makes the second suggestion probably worse than just assuming perfect independence, so thanks!)
Yeah that makes complete sense, it was a pain to get the pairwise sampling working.