I’m a developer on the EA Forum (the website you are currently on). You can contact me about forum stuff at will.howard@centreforeffectivealtruism.org or about anything else at w.howard256@gmail.com
Will Howard
Thanks for reporting!
I’ll think about how we could handle this one better. It’s tricky because the doc itself has a title, and then people often rewrite the title as a heading inside the doc, so there isn’t an obvious choice for what to use as the title. But it may be true that the heading case is a lot more common, so we should make that the default.
That was indeed intended as a feature, because a lot of people use blank lines as a paragraph break. We can add that to footnotes too.
I’ll set a reminder to reply here when we’ve done these.
Cosmologist: Well, I’m a little uncomfortable with this, but I’ll give it a shot. I will tentatively say that the odds of doom are higher than 1 in a googol. But I don’t know the order of magnitude of the actual threat. To convey this:
I’ll give a 1% chance it’s between 10^-100 and 10^-99
A 1% chance it’s between 10^-99 and 10^-98
A 1% chance it’s between 10^-98 and 10^-97,
And so on, all the way up to a 1% chance it’s between 1 in 10 and 100%.
I think the root of the problem in this paradox is that this isn’t a very defensible humble/uniform prior, and if the cosmologist were to think it through more they could come up with one that gives a lower p(doom) (or at least, doesn’t look much like the distribution stated initially).
So, I agree with this as a criticism of pop-Bayes in the sense that people will often come up with a quick uniform-prior-sounding explanation for why some unlikely event has a probability that is around 1%, but I think the problem here is that the prior is wrong[1] rather than failing to consider the whole distribution, seeing as a distribution over probabilities collapses to a single probability anyway.
Imo the deeper problem is how to generate the correct prior, which can be a problem due to “pop Bayes”, but also remains when you try to do the actual Bayesian statistics.
Explanation of why I think this is quite an unnatural estimate in this case
Disclaimer: I too have no particular claim on being great at stats, so take this with a pinch of salt
The cosmologist is supposing a model where the universe as it exists is analogous to the result of a single Bernoulli trial, where the “yes” outcome is that the universe is a simulation that will be shut down. Writing this Bernoulli distribution as $\mathrm{Bern}(\theta)$[2], they are then claiming uncertainty over the value of $\theta$. So far so uncontroversial.
They then propose to take the pdf over $\theta$ to be:

$$p(\theta) = \frac{C}{\theta} \quad\quad (A)$$

Where $C$ is a normalisation constant. This is the distribution that results in the property that each OOM has an equal probability[3]. Questions about this:
Is this the appropriate non-informative prior?
Is this a situation where it’s appropriate to appeal to a non-informative prior anyway?
Is this the appropriate non-informative prior?
I will tentatively say that the odds of doom are higher than 1 in a googol. But I don’t know the order of magnitude of the actual threat.
The basis on which the cosmologist chooses this model is an appeal to a kind of “total uncertainty”/non-informative-prior style reasoning, but:
They are inserting a concrete value of $10^{-100}$ as a lower bound
They are supposing the total uncertainty is over the order of magnitude of the probability, which is quite a specific choice
This results in a model where $E[\theta] \approx \frac{1}{100 \ln 10} \approx \frac{1}{230}$ in this case, so the expected probability is very sensitive to this lower bound parameter, which is a red flag for a model that is supposed to represent total uncertainty.
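To spell out where the 1 in 230 figure comes from, using the stated bounds of $10^{-100}$ and $1$:

$$C = \left(\int_{10^{-100}}^{1} \frac{d\theta}{\theta}\right)^{-1} = \frac{1}{100\ln 10}, \qquad E[\theta] = \int_{10^{-100}}^{1} \theta \cdot \frac{C}{\theta}\,d\theta = C\left(1 - 10^{-100}\right) \approx \frac{1}{230}$$

Doubling the number of OOMs covered (i.e. a lower bound of $10^{-200}$) roughly halves this, which is the sensitivity to the lower bound mentioned above.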
There is apparently a generally accepted way to generate non-informative priors for parameters in statistical models, which is to use a Jeffreys prior. The Jeffreys prior[4] for the Bernoulli distribution is:

$$p(\theta) \propto \frac{1}{\sqrt{\theta(1-\theta)}} \quad\quad (B)$$

This doesn’t look much like equation (A) that the cosmologist proposed. There are parameters where the Jeffreys prior is $\propto 1/\sigma$, such as the standard deviation $\sigma$ in the normal distribution, but these tend to be scale parameters that can range from 0 to $\infty$. Using it for a probability does seem quite unnatural when you contrast it with these examples, because a probability has hard bounds at 0 and 1.
Is this a situation where it’s appropriate to appeal to a non-informative prior anyway?
Using the recommended non-informative prior (B), we get that the expected probability is 0.5. Which makes sense for the class of problems concerned with something that either happens or doesn’t, where we are totally uncertain about this.
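To make the 0.5 explicit: (B) is the $\mathrm{Beta}(\tfrac{1}{2},\tfrac{1}{2})$ distribution, whose mean is

$$E[\theta] = \frac{1/2}{1/2 + 1/2} = \frac{1}{2}$$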
I expect the cosmologist would take issue with this as well, and say “ok, I’m not that uncertain”. Some reasons they would be right to take issue are:
A general prior that “out of the space of things that could be the case, most are not the case”[5]; this should update the probability towards 0. And in fact massively so, such that in the absence of any other evidence you should think the probability is vanishingly small, as you would for the question of “Is the universe riding on the back of a giant turtle?”
The reason to consider this simulation possibility in the first place is not just that it is in principle allowed by the known laws of physics, but that there is a specific argument for why it should be the case. This should update the probability away from 0.
The real problem the cosmologist has is uncertainty in how to incorporate the evidence of (2) into a probability (distribution). Clearly they think there is enough to the argument to not immediately reject it out of hand, or they would put it in the same category as the turtle-universe, but they are uncertain about how strong the argument actually is and therefore how much it should update their default-low prior.
...
I think this deeper problem gets related to the idea of non-informative priors in Bayesian statistics via a kind of linguistic collision.
Non-informative priors are about having a model which you have not yet updated based on evidence, so you are “maximally uncertain” about the parameters. In the case of having evidence only in the form of a clever argument, you might think “well I’m very uncertain about how to turn this into a probability, and the thing you do when you’re very uncertain is use a non-informative prior”. You might therefore come up with a model where the parameters have the kind of neat symmetry-based uncertainty that you tend to see in non-informative priors (as the cosmologist did in your example).
I think these cases are quite different though, arguably close to being opposites. In the second (the case of having evidence only in the form of a clever argument), the problem is not a lack of information, but that the information doesn’t come in the form of observations of random variables. It’s therefore hard to come up with a likelihood function based on this evidence, and so I don’t have a good recommendation for what the cosmologist should say instead. But I think the original problem of how they end up with a 1 in 230 probability is due to a failed attempt to avoid this by appealing to a non-informative prior over the order of magnitude.
- ^
There is also a meta problem where the prior will tend to be too high rather than too low, because probabilities can’t go below zero, and this leads to people on average being overly spooked by low probability events
- ^
$\theta$ being the “true probability”. I’m using $\theta$ rather than $p$ because 1) in general parameters of probability distributions don’t need to be probabilities themselves, e.g. the mean of a normal distribution, 2) $\theta$ is a random variable in this case, so talking about the probability of $p$ taking a certain value could be confusing, 3) it’s what is used in the linked Wikipedia article on Jeffreys priors
- ^
- ^
There is some controversy about whether this is the right prior to use, but whatever the right one is, it would give a similar result
- ^
For some things you can make a mutual exclusivity + uncertainty argument for why the probability should be low. E.g. for the case of the universe riding on the back of the turtle you could consider all the other types of animals it could be riding on the back of, and point out that you have no particular reason to prefer a turtle. For the simulation argument and various other cases it’s trickier because they might be consistent with lots of other things, but you can still appeal to Occam’s razor and/or viewing this as an empirical fact about the universe
Ok nested bullets should be working now :)
I have thought this might be quite useful to do. I would guess (people can confirm/correct me) a lot of people have a workflow like:
Edit post in Google doc
Copy into Forum editor, make a few minor tweaks
Realise they want to make larger edits, go back to the Google doc to make these, requiring them to either copy over or merge together the minor tweaks they have made
For this case being able to import/export both ways would be useful. That said, it’s much harder to do the other way (we would likely have to build up the Google doc as a series of edits via the API, whereas in our case we can handle the whole post exported as HTML quite naturally), so I wouldn’t expect us to do this in the near future unfortunately.
Yep images work, and agree that nested bullet points are the biggest remaining issue. I’m planning to fix that in the next week or two.
Edit: Actually I just noticed the cropping issue, images that are cropped in google docs get uncropped when imported. That’s pretty annoying. There is no way to carry over the cropping but we could flag these to make sure you don’t accidentally submit a post with the uncropped images.
You can now import posts directly from Google docs
Plus, internal links to headers[1] will now be mapped over correctly. To import a doc, make sure it is public or shared with “eaforum.posts@gmail.com”[2], then use the widget on the new/edit post page:
Importing a doc will create a new (permanently saved) version of the post, but will not publish it, so it’s safe to import updates into posts that are already published. You will need to click the “Publish Changes” button to update the live post.
Everything that previously worked on copy-paste[3] will also work when importing, with the addition of internal links to headers (which only work when importing).
There are still a few things that are known not to work:
Nested bullet points (these are working now)
Cropped images get uncropped
Bullet points in footnotes (these will become separate un-bulleted lines)
Blockquotes (there isn’t a direct analog of this in Google docs unfortunately)
There might be other issues that we don’t know about. Please report any bugs or give any other feedback by replying to this quick take, you can also contact us in the usual ways.
Appendix: Version history
There are some minor improvements to the version history editor[4] that come along with this update:
You can load a version into the post editor without updating the live post; previously you could only hard-restore versions
The version that is live[5] on the post is shown in bold
Here’s what it would look like just after you import a Google doc, but before you publish the changes. Note that the latest version isn’t bold, indicating that it is not showing publicly:
- ^
Previously the link would take you back to the original doc, now it will take you to the header within the Forum post as you would expect. Internal links to bookmarks (where you link to a specific text selection) are also partially supported, although the link will only go to the paragraph the text selection is in
- ^
Sharing with this email address means that anyone can access the contents of your doc if they have the url, because they could go to the new post page and import it. It does mean they can’t access the comments at least
- ^
I’m not sure how widespread this knowledge is, but previously the best way to copy from a Google doc was to first “Publish to the web” and then copy-paste from this published version. In particular this handles footnotes and tables, whereas pasting directly from a regular doc doesn’t. The new importing feature should be equal to this publish-to-web copy-pasting, so will handle footnotes, tables, images etc. And then it additionally supports internal links
- ^
Accessed via the “Version history” button in the post editor
- ^
For most intents and purposes you can think of “live” as meaning “showing publicly”. There is a bit of a sharp corner in this definition, in that the post as a whole can still be a draft.
To spell this out: There can be many different versions of a post body, only one of these is attached to the post, this is the “live” version. This live version is what shows on the non-editing view of the post. Independently of this, the post as a whole can be a draft or published.
Very reasonable, I think the project is great as is. I just have one more newsletter-related suggestion:
It’s a lot cheaper to collect emails than it is to do the rest of the work related to sending out automated updates, so it could be worth doing that to take advantage of the initial spike in interest (without making any promises as to whether there will be updates). This could just be a link to a Google form on the website if you wanted it to be really simple to implement.
Just about oils specifically:
My best guess is that either freezing or storing them as a liquid in an inert environment could be quite economical, very rough OOM maths:
For frozen storage
Apparently ice costs ~$0.01/kg to produce; this is just the upfront cost of the initial freezing, and oil would be similar or lower because it has a lower heat capacity and a similar freezing point. The cost of keeping it frozen is not as easy to work out, but still I would say “well, the square-cube law takes care of this if you are freezing in large enough quantities”, so I would be quite surprised if it were way more than the $3.7/year depreciation cost (OOM logic: even if it melted and you re-froze it every day, that would only cost $3.65/year).
For liquid storage
You can store crude oil for something like $1.2 to $6[1] per barrel per year, with the cheapest method being salt caverns, though other methods are not way more expensive. A barrel corresponds to about 2 person-years of calories in the form of palm oil, so that works out to $0.6 to $3 per person-year for storage. There are some reasons to think storing an edible oil[2] could be more expensive:
You would probably need to backfill with an inert gas to preserve it
Other food-grade safety related things? Although I think this would be a misguided concern tbh, because it’s intended for a use case where you would otherwise starve, so it only has to not kill you; it could be quite contaminated
And reasons to think it would be cheaper:
You would be storing for a known, very long, amount of time, so you could save on things required for quick access to the oil (like pumps)
It’s easier to handle, less flammable, no noxious gases etc
I would guess the cheaper side would win here, as backfilling with an inert gas doesn’t seem very hard if you have a large enough volume. Apparently oil tankers already do this (not sure about salt cavern storage) so this may be priced in already.
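As a rough sanity check on the per-person-year numbers (a sketch; the density, energy, and daily calorie figures below are my own round-number assumptions, not from the sources quoted above):

```python
# Rough order-of-magnitude check of storage cost per person-year of calories

barrel_litres = 158                 # litres per barrel (from the footnote)
oil_density = 0.9                   # kg/litre, approximate for vegetable oils (assumption)
kcal_per_kg = 9000                  # ~9 kcal per gram of fat (assumption)
kcal_per_person_year = 2000 * 365   # ~2000 kcal/day (assumption)

kg_per_barrel = barrel_litres * oil_density                                   # ~142 kg, matches "~140kg" above
person_years_per_barrel = kg_per_barrel * kcal_per_kg / kcal_per_person_year  # ~1.75, i.e. roughly 2

storage_cost_per_barrel_year = (1.2, 6.0)  # $ per barrel per year (from the quote)
cost_per_person_year = tuple(c / person_years_per_barrel for c in storage_cost_per_barrel_year)

print(round(person_years_per_barrel, 2))   # ~1.75 person-years of calories per barrel
print(cost_per_person_year)                # ~($0.7, $3.4) per person-year, close to the $0.6 to $3 above
```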
- ^
“Global Platts reveal that the price of onshore storage varies between 10 and 50 cents per barrel per month.”, then adjusted based on 158 litres/barrel, corresponding to ~140kg of palm oil, which is 2 person-years worth
- ^
You couldn’t actually use palm oil in this case, because it’s a solid, but e.g. sunflower oil has similar calories/$
This is a great idea, I just submitted a project. I also wrote it up as a post, but your post was what prompted me to write it :)
Something I tend to find with projects like this[1] is that they can be forgotten about after the initial launch because they’re not a destination site so there is no way for people to naturally come back to them. Have you thought about doing a newsletter or similar with an update on the projects that are added? I think it could be fairly infrequent (monthly) and automated and still be quite useful.
Also some minor feedback: submitting didn’t work initially because of some problem with the description field. I removed a url and some line breaks and then it worked.
- ^
i.e. the UnfinishedImpact project, not my idea
The thing that stands out to me as clearly seeming to go wrong is the lack of communication from the board during the whole debacle. Given that the final best guess at the reasoning for their decision seems like something they could have explained[1], it does seem like an own goal that they didn’t try to do so at the time.
They were getting clear pressure from the OpenAI employees to do this, for instance: it was one of the main complaints in the employee letter, and from talking to a couple of OAI employees I’m fairly convinced that this was sincere (i.e. they were just as in the dark as everyone else, and this was at least one of their main frustrations).
I’ve heard a few people make a comparison to other CEO-stepping-down situations, where it’s common for things to be relatively hush-hush and “taking time out to spend with their family”. I think this isn’t a like-for-like comparison, because in those cases it’s usually a mutual agreement between the board and the CEO for them both to save face and preserve the reputation of the company. In the case of a sudden unilateral firing it seems more important to have your reasoning ready to explain publicly (or even privately, to the employees).
It’s possible of course that there are some secret details that explain this behaviour, but I don’t think there’s any reason to be overly charitable in assuming this. If there was some strategic tradeoff that the board members were making it’s hard to see what they were trading off against because they don’t seem to have ended up with anything in the deal[2]. I also don’t find “safety-related secret” explanations that compelling because I don’t see why they couldn’t have said this (that there was a secret, not what it was). Everyone involved was very familiar with the idea that AI safety infohazards might exist so this would have been a comprehensible explanation.
If I put myself in the position of the board members I can much more easily imagine feeling completely out of my depth in the situation that happened and ill-advisedly doubling down on this strategy of keeping quiet. It’s also possible they were getting bad advice to this effect, as lawyers tend to tell you to keep quiet, and there is general advice out there to “not engage with the twitter mob”.
- ^
Several minor fibs from Sam, saying different things to different board members to try and manipulate them. This does technically fit with the “not consistently candid” explanation but that was very cryptic without further clarification and examples
- ^
To frame this the other way, if they had kept quiet and then been given some lesser advisory position in the company afterwards you could more easily reason that some face-saving dealing had gone on
Personally I think the other members are actually the bigger news here, seeing as Sam being added back seemed like a foregone conclusion (or at least, the default outcome, and him not being added back would have been news).
But anyway, my goal was just to link to the post without editorialising too much so that people can discuss it on the forum. For this I think a policy of copying the exact title from the article is good in general.
+1, love Pigeon Hour
I wonder what’s happening with the OpenPhil board seat
I’m pretty sure that’s gone now. I.e. the initial $30m-for-a-board-seat arrangement wasn’t actually legally binding with respect to future members of the board; it was just maintained by who the current members would allow. So now that there are no EA-aligned board members, there is no pressure or obligation to add any.
I could be wrong about this but I’m reasonably confident
You can see the raw data in the final tab, everything is only given one “Focus Area”, and there are some obviously less-than-ideal codings (e.g. “F.R.E.E. — Broiler and Cage-Free Reforms in Romania” is “Farm Animal Welfare” but not “Broiler Chicken Welfare”)
Edit: sorry I didn’t read your question properly, I think the answer to whether it’s appropriate to add up “Farmed Animal Welfare” + “Broiler Chicken Welfare” + “Cage-Free Reforms” etc is yes
Breakdown of Open Philanthropy grants to date
I came across this spreadsheet buried in a comment thread. I don’t know who made it (maybe @MarcusAbramovitch knows) but it’s great. It shows a breakdown of all OP grants by cause area and organisation, updated automatically every day:
I’m anticipating a lot of replies being about how to reduce the aspects of the Forum that make people feel bad about posting, such as harsh criticism, and I think this is good as far as it goes. However I think it’s important to think about why people do things as being a balance between costs and benefits, and also think about how we could make the benefits larger or more salient.
“What makes you stop posting?” could be reframed as “What makes you post in the first place?”, and “What might make it easier?” could be reframed as “What might make you publish posts that were more challenging for you (practically or emotionally)?”
The quality of many forum posts is very high, including from people who are not paid by a research org to write them and have no direct connection to the community (such as these two). So even if you only factor in the time cost, you would still have to suppose some pretty large benefits to explain why people write them.
I have some ideas about what these benefits are:
If you see yourself as a temporarily embarrassed academic who has had to get a proper job as a result of economic forces, posting on the EA Forum (or LessWrong) is about as close as you can get to publishing in an academic journal without actually doing that. Your ideas will be taken seriously by a community of people you respect, and you are actually likely to get more substantive engagement than if you were a non-top-tier academic publishing in a non-top-tier journal. This kind of intellectual discussion is exciting to a lot of people, and is reason enough in itself.
Related to your ideas being taken seriously, they can also steer a community of thousands of people and billions of dollars. It’s reasonably common for this to happen, for instance GiveWell changed how they do their cost effectiveness analyses partly as a result of that post by @Froolow that I linked above. There are lots of good examples from LessWrong too, such as Katja Grace’s Let’s think about slowing down AI which was clearly [citation needed] pivotal in getting that idea recognised as a respectable mainstream position.
You can get ~material benefits~ from posting on the forum. We have found that a lot of people get jobs at least via, and possibly because of, the Forum. Also it might make people relate to you better at least socially, if not professionally, if you have a couple of good Forum posts explaining your best ideas. For people in largely EA social circles this can be a big benefit.
These are all things that seem like they could be leant on to get people to publish more and better content on the Forum, for instance GiveWell did a Change Our Mind contest which clearly increased the second one.
On the issue of criticism specifically, I am a bit less optimistic about this being a lever to pull to get people to post more. I have written before about why I think reassuring people that they won’t be criticised can be wrongheaded[1]. Obviously I think it’s good to make sure criticism is of ideas and not people/their values, and to be polite in a common sense way such as trying to give criticism as a compliment sandwich.
But ultimately it’s not the bread from a sandwich that people remember, and even being criticised for just your ideas but not your values feels bad to most people. But in order to get the benefit of people taking your ideas seriously they do need to be open to criticism, so I think it’s quite difficult to reduce this in practice.
And then on the question of whether reducing “the bad kind of criticism” (i.e. criticism of values/personal attacks) would actually make people post more[2], one bit of evidence that goes against this is that the posts that get the most engagement tend to be “drama” posts where the comments actually involve proportionately more of “the bad kind of criticism”. Obviously there are a lot of confounding factors here but one relevant idea is that “standing up against criticism of your values” can actually feel better/more wholesome than standing up against criticism of your ideas, because if your idea is proven wrong then that’s just a bit embarrassing, whereas your values usually can’t really be proven wrong.
So anyway, overall I think a better argument to make to try and persuade people to post more is less like “Don’t worry you won’t be criticised” and more like “It’s brave to post on the Forum, for the same reason that it’s brave to stand up and talk in front of a group of people. It’s brave because you’re opening yourself up to the very real downside of being criticised, but there are all these great upsides too, so you might just have to take some of the hits if you want to get them”.
And then, from the perspective of people with a community-minded interest[3] in getting more people to post more of their ideas, I think leaning on the benefits side (such as the three things I mentioned) could be at least as effective as leaning on the costs side.
Note: I am a developer on the Forum, but this comment doesn’t necessarily represent the views of the whole team
- ^
TL;DR I think it’s often a false reassurance, and so people will see through it or be wrongly convinced that they won’t be criticised, which is unfair to them if they then are
- ^
As mentioned above, I think it’s good to reduce this for lots of other reasons
- ^
Which includes the actual Forum team, which I work on, but also other users who have the best interests of the community at heart
I’m still trying to work through the maths on this so I won’t respond in much detail until I’ve got further with that, I may end up writing a separate post. I did start off at your position so there’s some chance I will end up there, I find this very confusing to think about.
Some brief comments on a couple of things:
I agree with this, but I don’t think this is our epistemic position, because we can understand all value relative to our own experiences.
I think relative is the operative word here. That is, you experience that a toe stub is 10 times worse than a papercut, and this motivates the development of moral theories that are consistent with this, and rules out ones that are not (e.g. ones that say they are equally bad). But there is an additional bit of parameter fixing that has to happen to get from the theory predicting this relative difference to predicting the absolute amount.
My claim is that at least generally speaking, and I think actually always, theories that are under consideration only predict these relative differences and not the absolute amounts. E.g. if a theory supposes that a certain pain receptor causes suffering when activated, then it might suppose that 10 receptors being activated causes 10 times as much suffering, but it doesn’t say anything about the absolute amount. This is also true of more fundamental theories (e.g. more information processing ⇒ more sentience). I have some ideas about why this is[1], but mainly I can’t think of any examples where this is not the case. If you can think of any then please tell me as that would at least partially invalidate this scale invariance thing (which would be good).
I think you would also say that theories don’t need to predict this overall scale parameter because we can always fix it based on our observations of absolute utility… this is the bit of maths that I’m not clear on yet, but I do currently think this is not true (i.e. the scale parameter does matter still, especially when you have a prior reason to think there would be a difference between the theories).
I agree that directly observing the value of a toe stub, say, under hedonism might not tell you much or anything about its absolute value under non-hedonistic theories of welfare.… However, I think we can say more under variants of closer precise theories.
I was intending to restrict to only theories that fall under hedonism, because I think this is the case where this kind of cross-theory aggregation should work the best. And given that I think this scale invariance problem arises there, it would be even worse when considering more dissimilar theories.
So I was considering only theories where the welfare relevant states are things that feel pretty close to pleasure and pain, and you can be uncertain about how good or bad different states are for common sense reasons[2], but you’re able to tell at least roughly how good/bad at least some states are.
- ^
Mentioned in the previous comment. One is that the prescriptions of utilitarianism have this scale invariance (only distinguish between better/worse), as do the behaviours associated with pleasure/pain (e.g. you can only communicate that something is more/less painful, or [for animals] show an aversion to a more painful thing in favour of a less painful thing).
- ^
E.g. you might not remember them, you might struggle to factor in duration, the states might come along with some non-welfare-relevant experience which biases your recollection (e.g. a painfully bright red light vs a painfully bright green light)
We should fix and normalize relative to the moral value of human welfare, because our understanding of the value of welfare is based on our own experiences of welfare
I used to think this for exactly the same reason, but I now no longer do. The basic reason I changed my mind is the idea that uncertainty in the amount of welfare humans (or chickens) experience is naturally scale invariant. This scale invariance means that observing any particular absolute amount of welfare (by experiencing it directly) shouldn’t update you as to the relative amount of welfare under different theories.
The following is a fairly “heuristic” version of the argument, I spent some time trying to formalise it better but got stuck on the maths, so I’m giving the version that was in my head before I tried that. I’m quite convinced it’s basically true though.
The argument
Consider only theories that allow the most aggregation-friendly version of hedonistic utilitarianism[1]. Under this constraint, the total amount of utility experienced by one or more moral patients is some real quantity that can be expressed in objective units (“hedons”), and this quantity is comparable across the theories that we are allowing. You might imagine that you could consult God as to the utility of various world states and He could say truthfully “ah, stubbing your toe is −1 hedon”. In your post you also suppose that you can measure this amount yourself through direct experience, which I find reasonable.
From the perspective of someone who is unable to experience utility themselves, there is a natural scale invariance to this quantity. This is clearest when considering the “ought” side of the theory: the recommendations of utilitarianism are unchanged if you scale utility up and down by any amount as it doesn’t affect the rank ordering of world states.
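To state the invariance explicitly: for any constant $k > 0$,

$$U(x) \ge U(y) \iff k\,U(x) \ge k\,U(y),$$

so rescaling all utilities by $k$ leaves every ranking of world states, and hence every recommendation, unchanged.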
Another way to get this intuition is to imagine an unfeeling robot that derives the concept of utility from some combination of interviewing moral patients and constructing a first principles theory[2]. It could even get the correct theory, and derive that e.g. breaking your arm is 10 times as bad as stubbing your toe. It would still be in the dark about how bad these things are in absolute terms though. If God told it that stubbing your toe was –1 hedons that wouldn’t mean anything to the robot. God could play a prank on the robot and tell it stubbing your toe was instead –1 millihedons, or even temporarily imbue the robot with the ability to feel pain and expose it to –1 millihedons and say “that’s what stubbing your toe feels like”. This should be equally unsurprising to the robot as being told/experiencing –1 hedon.
My claim is that the epistemic position of all the different theories of welfare are effectively that of this robot. And as a result of this, observing any absolute amount of welfare (utility) under theory A shouldn’t update you as to what the amount would be under theory B, because both theories were consistent with any absolute amount of welfare to begin with. In fact they were “maximally uncertain” about the absolute amount, no amount should be any more or less of a surprise under either theory.
If you had a prior reason to think theory B gives say 5 times the welfare to humans as theory A (importantly in relative terms), then you should still think this after observing the absolute amount yourself, and this is what generates the thorny version of the two envelopes problem. I think there are sensible prior reasons to think there is such a relative difference for various pairs of theories.
For instance, suppose both A and B are essentially “neuron count” theories and agree on some threshold brain complexity for sentience, but then A says “amount of sentience” scales linearly with neuron count whereas B says it scales quadratically. It’s reasonable to think that the amount of welfare in humans is much higher under B, maybe many orders of magnitude higher.
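To put a rough number on that factor (a sketch, assuming the two theories are normalised to agree at the sentience threshold $N_0$, which the example above doesn’t specify):

$$\frac{W_B}{W_A} = \frac{(N/N_0)^2}{N/N_0} = \frac{N}{N_0},$$

so with human neuron counts of order $10^{11}$, the factor could easily be many orders of magnitude.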
Other examples where arguments like this can be made are:
A and B are the same except B has multiple conscious subsystems
A and B are predicting chicken welfare rather than human, and A says they are sentient whereas B says they are not. Clearly B predicts 0 times the welfare of A (equivalently A predicts infinity times the welfare of B)
Putting this in two envelopes terms
If we say we have two theories, 1 and 2, which you might imagine are a human-centric ($P_1$, $\frac{C_1}{H_1}$)[4] and an animal-inclusive ($P_2$, $\frac{C_2}{H_2}$) view (writing $P_i$ for the probability of theory $i$, and $H_i$, $C_i$ for the welfare of a human and a chicken under that theory), then we have:

$$E\left[\frac{C}{H}\right] = P_1 \frac{C_1}{H_1} + P_2 \frac{C_2}{H_2}$$

And

$$E\left[\frac{H}{C}\right] = P_1 \frac{H_1}{C_1} + P_2 \frac{H_2}{C_2}$$

As we are used to seeing.
But as you point out in your post, the quantities $H_1$ and $H_2$ are not necessarily the same (though you argue they should be treated as such), which makes this a nonsensical average of dimensionless numbers. E.g. $H_1$ could be 0.00001 hedons and $H_2$ could be 10 hedons, which would mean we are massively overcounting theory 1. The quantities we actually care about are $E[C]$ and $E[H]$ (dimension-ed numbers in units of hedons), or their ratio $\frac{E[C]}{E[H]}$. We can write these as:

$$E[C] = P_1 \frac{C_1}{H_1} H_1 + P_2 \frac{C_2}{H_2} H_2 \quad\quad (1)$$

$$E[H] = P_1 H_1 + P_2 H_2 \quad\quad (2)$$
This may seem like a roundabout way of writing these down, but remember that what we have from our welfare range estimates are values for $\frac{C_i}{H_i}$, so these can’t be cancelled further and the $H_i$s are the minimum number of parameters we can add to pin down the equations. The ratio is then:

$$\frac{E[C]}{E[H]} = \frac{P_1 \frac{C_1}{H_1} H_1 + P_2 \frac{C_2}{H_2} H_2}{P_1 H_1 + P_2 H_2} \quad\quad (3)$$
I find this easier to think about if the ratios are in terms of a specific theory, e.g. $\frac{H_2}{H_1}$, so you are always comparing what the relative amount of welfare is in theory X vs some definite reference theory. We can rearrange (3) to support this by dividing all the fractions through by $H_1$:

$$\frac{E[C]}{E[H]} = \frac{P_1 \frac{C_1}{H_1} + P_2 \frac{C_2}{H_2} h_2}{P_1 + P_2 h_2} \quad\quad (4)$$

Where $h_2 = \frac{H_2}{H_1}$
Again, maybe this seems incredibly roundabout, but in this form it is more clear that we now only need the ratios not their absolute values. This is good according to the previous claims I have made:
Because of scale invariance, it’s not possible to say anything about the absolute value of $H_1$ (or any $H_i$)
It is possible to reason about the relative welfare values between theories, represented by $h_2$
So under this framing the “solution to the two envelopes problem for moral weights” is that you need to estimate the inter-theoretic welfare ratios for humans (or any reference moral patient), as well as the intra-theoretic ratios between moral patients. I.e. you have to estimate $h_2 = \frac{H_2}{H_1}$ as well as $\frac{C_1}{H_1}$ and $\frac{C_2}{H_2}$ for each theory.
I think this is still quite a big problem because of the potential for arguing that some theories have combinatorially higher welfare than others, thus causing them to dominate even if you put a very low probability on them. The neuron count example above is like this, you could make it even worse by supposing a theory where welfare is exponential in neuron count.
Returning to the human-centric vs animal inclusive toy example
If we say we have two theories, 1 and 2, which you might imagine are a human-centric ($P_1$, $\frac{C_1}{H_1}$)[4] and an animal-inclusive ($P_2$, $\frac{C_2}{H_2}$) view
Adding these numbers into this example we now have:
What should the value of $h_2$ be? Well in this case I think it’s reasonable to suppose $H_1$ and $H_2$ are in fact equal, as we don’t have any principled reason not to, so this still comes out to ~0.001. As in the original version we can flip this around to see if we get a wildly different answer if we make the inter-theoretic comparison be between chickens:

$$\frac{E[H]}{E[C]} = \frac{P_1 \frac{H_1}{C_1} + P_2 \frac{H_2}{C_2} c_2}{P_1 + P_2 c_2}, \quad\quad c_2 = \frac{C_2}{C_1}$$
Now what should $c_2$ be, recalling that theory 1 says chickens are worth very little compared to humans? I think it’s reasonable to say that $C_1$ is also very little compared to $C_2$, since the point of theory 1 is basically to suppose chickens aren’t (or are barely) sentient, and not to say anything about humans. Supposing that none of the difference is explained by humans, we get $c_2 = \frac{C_2/H_2}{C_1/H_1}$, and this also gives $h_2 = 1$, so $\frac{E[H]}{E[C]}$ comes out to ~1000. This is the inverse of the ~0.001 above, as we expect.
Clearly this is just rearranging the same numbers to get the same result, but hopefully it illustrates how explicitly including these ratios makes the two envelope problem that you get by naively inverting the ratios less spooky, because by doing so you are effectively wildly changing the estimates of $h_2$.
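To make the dependence on $h_2$ concrete, here is a small numerical sketch. The probabilities and welfare ratios below are my own illustrative numbers, chosen only to roughly reproduce the ~0.001 and ~1000 figures, not taken from your post:

```python
# Toy illustration of equation (4): E[C]/E[H] as a function of h2 = H2/H1.
# P1, P2 are the credences in the two theories; r1, r2 are the within-theory
# chicken:human welfare ratios C_i/H_i (illustrative values only).
P1, r1 = 0.5, 1e-5   # "human-centric" theory: chickens barely count
P2, r2 = 0.5, 2e-3   # "animal-inclusive" theory

def chicken_to_human(h2):
    """Expected chicken:human welfare ratio, equation (4)."""
    return (P1 * r1 + P2 * r2 * h2) / (P1 + P2 * h2)

print(chicken_to_human(h2=1))      # ~0.001: humans assumed equal across theories
print(1 / chicken_to_human(h2=1))  # ~1000: the consistent inverse, no paradox
print(chicken_to_human(h2=1e6))    # -> ~r2: a theory with vastly more human welfare dominates
```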
I agree with you that there are many cases where for the specific theories under consideration it is right to assume that $H_1$ and $H_2$ are equal (because we have no principled reason not to), but that this is not because we are able to observe welfare directly (even if we suppose that this is possible). And for many pairs of theories we might think $H_1$ and $H_2$ are very different.
(Apologies for switching back and forth between “welfare” and “utility”, I’m basically treating them both like “utility”)
- ^
I think it’s right to start with this case, because it should be the easiest. So if something breaks in this case it is likely to also break once we start trying to include things like non-welfare moral reasons
- ^
“I’ve met a few of those”
- ^
We can label the “true” theory as A, because we only get the chance to experience the true theory (we just don’t know which one it is)
- ^
You could make this actually zero, but I think adding infinity in makes the argument more confusing
PSA: You can now add buttons to posts and comments:
Click here to maximise utility
We would love it if people started adding these to job opportunities to nudge people to apply. To add a button, select some text to make the toolbar appear and then click the “Insert button” icon (see below).
As always, feedback on the design/implementation is very welcome.
This seems like the wrong order of magnitude to apply this logic at: $20mn is close to 1% of the money that OpenPhil has disbursed over its lifetime ($2.8b).