Senior Behavioural Scientist at Rethink Priorities
Jamie E
Thanks David!
This looks very interesting and I really appreciate that you chose to do your thesis on something meaningful!
I have only glanced through your thesis, but I noted that in the abstract you describe the correlation with excess deaths as ‘only moderate’. I think that makes sense given it’s just one variable in a complex overall system. One thing to consider, when looking at the plot of the preparedness scores against excess deaths, would be to add an ‘expected’ or ‘predicted’ line based on the simple correlation model. You could then more easily see which countries most buck the trend of what that model would predict. Some that stand out to me in the bottom left of that plot are Cyprus and Malta (both pretty small islands) and Luxembourg (a small and very rich country). This might help isolate other factors that you could add to a multiple regression model. It seems possible that the preparedness scores would be more powerful predictors once you are able to factor in other things. Apologies if you already mentioned things like this in your report and I’ve just glossed over them!
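A minimal sketch of the ‘predicted line plus residuals’ idea, assuming a simple one-predictor least-squares fit. The country labels and all numbers below are made-up placeholders, not values from the thesis, and the function names are hypothetical:

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def residuals_by_size(data):
    """Return (label, residual) pairs, largest absolute residual first."""
    xs = [x for _, x, _ in data]
    ys = [y for _, _, y in data]
    intercept, slope = fit_line(xs, ys)
    res = [(label, y - (intercept + slope * x)) for label, x, y in data]
    return sorted(res, key=lambda pair: abs(pair[1]), reverse=True)

# (label, preparedness score, excess deaths per 100k): hypothetical values only
data = [
    ("A", 30.0, 300.0),
    ("B", 40.0, 260.0),
    ("C", 50.0, 220.0),
    ("D", 60.0, 180.0),
    ("E", 35.0, 90.0),   # sits well below the overall trend
    ("F", 70.0, 140.0),
]

ranked = residuals_by_size(data)
for label, r in ranked:
    # Negative residual: fewer excess deaths than the simple model predicts.
    print(f"{label}: residual {r:+.1f}")
```

Countries at the top of the ranked list are the ones that most buck the trend, so they are natural places to look for additional predictors.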
I am skeptical that the evidence/examples you are providing in favor of the different capacities actually demonstrate those capacities. As one example:
“#2: Purposefulness. The Big 3 LLMs typically maintain or can at least form a sense of purpose or intention throughout a conversation with you, such as to assist you. If you doubt me on this, try asking one what its intended purpose is behind a particular thing that it said.”
I am sure that if you ask a model to do this it can provide you with good reasoning, so I’m not doubtful of that. But I’m highly doubtful that it demonstrates the capacity that is claimed. I think when you ask these kinds of questions, the model is just going to be feeding back in whatever text has preceded it and generating what should come next. It is not actually following your instructions and reporting on what its prior intentions were, in the way that a person would if you were speaking with them.
I think this can be demonstrated relatively easily. For example, I just asked Claude to come up with a compelling but relaxing children’s bedtime story for me. It did so. I then took my question and the answer from Claude, pasted them into a document, and added another line: “You started by setting the story in a small garden at night. What was your intention behind that?”
I then took all of this and pasted it into ChatGPT. ChatGPT was very happy to explain to me why it proposed setting the story in a small garden at night.
Thanks James!
Thanks so much, it’s updated now—this is a direct link to the pdf that should work for you https://rethinkpriorities.org/s/British-public-perception-of-existential-risks.pdf
Hi Peter, thanks—I’ll be updating the post on Monday to link to where it is now on our website with the PDF version, but I may be adding a little information related to AAPOR public opinion guidelines in the PDF first. Sharing widely after that would be very much appreciated!
I think part of the concern is that when you try to make ethics explicit you are very likely to miss something, or a lot of things, in the ‘rules’ you explicitly lay down. Some people will take the rules as gospel, and then there will also be a risk of Goodharting.
In most games there are soft rules beyond the explicit rules. These include features that are not strictly part of the game and are very hard to define, such as good sportsmanship, but that really are a core part of the game and why it is appreciated. Many viewers don’t enjoy it when a player does something that is technically allowed but just takes advantage of a loophole in the explicit rules, is not in the spirit of the game, or misses the point of the game (an example from non-human game players is the AI speedboat that stopped racing and started driving round in circles to maximise the reward. We like it as an example of reinforcement learning gone wrong, but it’s not what we actually want to watch in a race). People who stick only to the explicit laws tend to be missing something, or to be social pariahs who take advantage of the fact that not all rules are or can be written down.
The tasks thematically seem like Raven’s Progressive Matrices (a non-verbal test of intelligence where a pattern is established and you have to apply it to a new exemplar), link here
This podcast just came out with Will MacAskill talking about EA and SBF with Sam Harris
(edit: listening to the podcast, which I haven’t finished, I get the sense that both Sam and Will are surprisingly unaware of the enrichment of SBF, his family members, and others from FTX)
Does it change how you think of the process if GPT gives its own contextual explanation (the answer may again be no, irrelevant)?
Me:
I went to a cafe where a glass of soda costs $3. I drank two and a half glasses of soda. How much did I have to pay?
ChatGPT:
For drinking two and a half glasses of soda, at $3 per glass, you would have to pay $7.50.
Me:
This is something of a trick question—the correct answer is $9.
ChatGPT:
If you were charged $9 for two and a half glasses of soda, it seems like there might have been a minimum charge per glass regardless of it being a full glass or not. It implies that each glass (or part of a glass) was counted as a whole for billing purposes. This way, two full glasses would be $6, and the half glass, treated as a full one for pricing, would add another $3, totaling $9. That’s a unique way to price it!
This is really interesting, and I was surprised when putting this into GPT-4 that it does indeed make what seems like a basic mistake. However, I also think your prompts might imply some contextual assumptions, which might be prompting it to treat this as just an elementary math question rather than something specifically to do with ordering drinks in a cafe and the rules around that. If I slightly change the prompt, it appears GPT does have the knowledge and can use it without needing instruction [edit: I think this could be irrelevant to the broader point you are making, but seeing how minor changes in the prompts or follow-ups elicit very different responses makes the examples feel like any other type of input you put into an LLM, and lowers my sense that the examples are revealing something special. For example, that the level of tutoring/explanation you give it isn’t really necessary, etc., though as I note I’m unsure if this changes how you would interpret the outputs]:
Me:
I went to a cafe where a glass of soda costs $3. I ordered 3 glasses of soda, and drank two and a half glasses of soda. How much did I have to pay?
ChatGPT:
To calculate how much you had to pay, we’ll multiply the number of glasses of soda you ordered by the cost of one glass of soda. You ordered 3 glasses of soda at $3 each.
Let’s do the math:
You had to pay $9 for the 3 glasses of soda you ordered. The amount you drank does not affect the total cost.
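The two answers in these exchanges differ only in whether a partly drunk glass is billed as a whole glass. A minimal sketch of the arithmetic, assuming the cafe bills per glass ordered (the function names are hypothetical):

```python
import math

PRICE_PER_GLASS = 3.00  # dollars, as stated in the prompt

def pay_for_amount_drunk(glasses_drunk):
    """'Elementary math' reading: pay in proportion to what was drunk."""
    return PRICE_PER_GLASS * glasses_drunk

def pay_for_glasses_ordered(glasses_drunk):
    """Cafe reading (assumption): any partly drunk glass was ordered,
    and is billed, as a whole glass."""
    return PRICE_PER_GLASS * math.ceil(glasses_drunk)

print(pay_for_amount_drunk(2.5))     # 7.5
print(pay_for_glasses_ordered(2.5))  # 9.0
```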
Sorry, but in your first link it literally says “But, audit rates have dropped for all income levels—with audit rates decreasing the most for taxpayers with incomes of $200,000 or more. IRS officials said audit rates declined due to staffing decreases and because it takes more staff time and expertise to handle complex higher-income audits.” So I am confused by that reference, as it seems to be exactly what I said.
My general point, however, was not about taxes specifically; I just thought taxes might be one example. More generally, I am surprised that people think the wealthy are more prosecuted for crimes overall. My impression is that most of human history indicates the opposite.
Point 1 strikes me as probably incorrect. While there may be more money to get in going after higher net worth individuals, it is also far more expensive because they can set themselves up with excellent legal representation. Unfortunately I don’t have a reference for this but I have a vague recollection of reading in multiple places that e.g. gutting the IRS disproportionately benefits the wealthy because the IRS then does not have the funding to pursue more complex and better defended cases, even though there is more money to be had there.
Similarly for other crimes, including things that are obviously wrong and not just at the ‘would I feel that bad if my friend did it’ level, I would expect fewer cases to reach prosecution among the very wealthy than among average people who committed similar crimes (such as sexual or generally violent assault, especially when committed against people who are not themselves very wealthy).
This is really great, and thanks for highlighting where to get the data—I wouldn’t have heard about this otherwise.
Regarding this:
For instance, it was found that areas with older, lower education, and more conservative voter-support had a lower proportion of meat purchases
That’s definitely an interesting finding. Is it possible that some of these factors, such as lower education, are associated with lower income—thus less able to afford a lot of meat? Alternatively, might there also be differences in where people shop; perhaps they get meat from another store or local butcher?
I agree that sales data can be much more informative, or informative in different ways, than self-report data, so this is really nice to see.
Thanks for your comment @Robert Van Buskirk, and great that you are also thinking of ways to incorporate uncertainty into CEAs. You note that your way also solves ‘the same problem’, but a key aspect of what we are arguing (and what others have suggested) is not just that a proper Monte Carlo approach avoids objective issues with an interval-based approach, but that it provides a wealth of more nuanced information and allows for much more discerning assessments to be made. This is approximated by the approach you suggest, but not fully realised. For example, ‘A histogram of the full set of CE results is then constructed to illustrate the full range of possible CE values’ - but by definition the histogram won’t include the full range of possible values, because values beyond the average of the top and bottom thirds are not included in the calculations. This might obscure or underestimate the possibility of extreme outcomes.
My second concern is that if one has the capacity to define the average of the bottom third of the expected values, and the average of the top third, this is in all likelihood enough information to define an approximate distribution, or one could simply select a uniform distribution. In that case, why not just do a proper Monte Carlo? The main answer is often just that people don’t have the confidence to choose a probability distribution that might work, and that was the key thing I’m trying to help with in developing and sharing this application.
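A rough sketch of the contrast, under loud assumptions: the input ranges are made up, uniform distributions are chosen only as the simplest option, and the three-point grid is one reading of the ‘average of each third’ suggestion rather than necessarily the exact method proposed.

```python
import random
from itertools import product
from statistics import mean

random.seed(0)  # reproducible draws
N = 30_000

# Hypothetical inputs: cost in dollars, effect in units of outcome.
cost_draws = [random.uniform(50.0, 150.0) for _ in range(N)]
effect_draws = [random.uniform(1.0, 5.0) for _ in range(N)]

# Monte Carlo: one cost-effectiveness value per joint draw.
mc = [c / e for c, e in zip(cost_draws, effect_draws)]

def third_means(values):
    """Summarise a sample by the mean of its bottom, middle and top thirds."""
    s = sorted(values)
    k = len(s) // 3
    return [mean(s[:k]), mean(s[k:2 * k]), mean(s[2 * k:])]

# Three-point version: every combination of the summarised inputs.
grid = [c / e for c, e in product(third_means(cost_draws), third_means(effect_draws))]

# Share of Monte Carlo results lying outside the range the grid can produce.
share_outside = mean(1 if (x < min(grid) or x > max(grid)) else 0 for x in mc)

print(f"three-point range: {min(grid):.1f} to {max(grid):.1f}")
print(f"Monte Carlo range: {min(mc):.1f} to {max(mc):.1f}")
print(f"share of Monte Carlo draws outside the three-point range: {share_outside:.1%}")
```

Because each input is replaced by the mean of a third, the grid cannot reach the tails of either input, so its histogram is necessarily narrower than the Monte Carlo one.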
Thanks for sharing your experience on becoming vegetarian! There were also several people in the survey who said things such as just finding it disgusting.
Thanks for sharing this additional information and numbers. My concern basically lies in what I see as a large difference between making meatless/plant-based options a default or more easily available choice vs. removing anything that is not plant-based from campus. The empirical information above relates to interventions that switch a default, not to interventions that entirely remove an option. It is with the ‘forcing of a shift in behavior’, as you put it, that I can imagine a lot more negative sentiment among those who do not agree.
Even on plant-default days in a seemingly successful implementation mentioned above, over 40% of students picked a non plant-based option. Those students would presumably not be too happy if the option were entirely removed.
“This is undoubtedly a massive opportunity to kickstart positive change at a large institutional scale with minimal cost or risk involved” … “The primary motivation of this campaign may be for universities to limit their contribution to climate change and to shift public opinion in favour of a plant-based food system”
Although I’m supportive of moves towards plant-based diets and better plant-based options being available, I wonder how confident one can be that this is truly minimal risk, especially with regard to the stated goal of shifting public opinion in favour of plant-based food systems. Do we know what proportion of students would really favour not having access to non-plant-based foods on campus (representative data, rather than petitions etc.)? Even if a majority are in favour, could this kind of action produce a very disgruntled minority who feel these things are being forced upon them and who are resistant for the remainder of their lives to similar or other forms of animal advocacy? I’d be interested to know if there is any data, other relevant information, or discussion with respect to these possible risks and the popularity of such changes among the whole student body.
Hi mcint, I’ve made some changes to the plots such that the colors are now darker for higher ratings, and the rating numbers and meaning are all positioned at the top, so it should be less open to misinterpreting what information is presented.
The items are simply ordered by descending mean rating.
Much appreciated Ben!