I work on AI Grantmaking at Open Philanthropy. Comments here are posted in a personal capacity.
alex lawsen
Inspect is open-source and should be exactly what you’re looking for, given your stated interest in METR.
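(In case it’s useful: a minimal sketch of what defining an eval in Inspect can look like, based on my recollection of the quickstart. The module paths, parameter names, and solver/scorer helpers here may differ from the current release, so treat this as illustrative and check the Inspect docs rather than relying on it.)

```python
# Illustrative sketch only: imports and parameter names follow my recollection
# of the Inspect quickstart and may differ from the current release.
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import includes
from inspect_ai.solver import generate

@task
def hello_world():
    # A single-sample task: prompt the model and score its output
    # by whether it includes the target string.
    return Task(
        dataset=[Sample(input="Just reply with: Hello World", target="Hello World")],
        solver=[generate()],
        scorer=includes(),
    )

# Then run it against a model from the command line, e.g.:
#   inspect eval hello_world.py --model openai/gpt-4o
```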
Why do you think superforecasters who were selected specifically for assigning a low probability to AI x-risk are well described as “a bunch of smart people with no particular reason to be biased”?
For the avoidance of doubt, I’m not upset that the supers were selected in this way; it’s the whole point of the study, was made very clear in the write-up, and was clear to me as a participant. It’s just that “your arguments failed to convince randomly selected superforecasters” and “your arguments failed to convince a group of superforecasters who were specifically selected for confidently disagreeing with you” are very different pieces of evidence.
They weren’t randomly selected, they were selected specifically for scepticism!
The smart people were selected for having a good predictive track record on geopolitical questions with resolution times measured in months, a track record equaled or bettered by several* members of the concerned group. I think this is much weaker evidence of forecasting ability on the kinds of questions discussed than you do.
*For what it’s worth, I’d expect the skeptical group to do slightly better overall on e.g. non-AI GJP questions over the next 2 years; they do have better forecasting track records as a group on this kind of question, it’s just not a stark difference.
The first bullet point of the concerned group summarizing their own position was “non-extinction requires many things to go right, some of which seem unlikely”.
This point was notably absent from the sceptics’ summary of the concerned position.
Both sceptics and concerned agreed that a different important point on the concerned side was that it’s harder to use base rates for unprecedented events with unclear reference classes.
I think these both provide a much better characterisation of the difference than the quote you’re responding to.
I’m still saving for retirement in various ways, including by making pension contributions.
If you’re working on GCR reduction, you can always consider your pension savings a performance bonus for good work :)
I’m not officially part of the AMA but I’m one of the disagreevotes so I’ll chime in.
As someone who’s only recently started, the vibe this post gives (that it would be hard for me to disagree with established wisdom or push the org to do things differently, and that my only role is to ‘just push out more money along the OP party line’) is just miles away from what I’ve experienced.
If anything, I think how much ownership I’ve needed to take for the projects I’m working on has been the biggest challenge of starting the role. It’s one that (I hope) I’m rising to, but it’s hard!
In terms of how open OP is to steering from within, it seems worth distinguishing ‘how likely is a random junior person to substantially shift the worldview of the org’ from ‘what would the experience of that person be like if they tried to’. Luke has, since before I had an offer, repeatedly demonstrated that he wants and values my disagreement in how he reacts to it and acts on it, and that’s something I really appreciate about his management.
I think (1) unfortunately ends up not being true in the intensive farming case. Lots of things are spread by close enough contact that even intense UVC wouldn’t do much (and it would be really expensive).
I wouldn’t expect the attitude of the team to have shifted much in my absence. I learned a huge amount from Michelle, who’s still leading the team, especially about management. To the extent you were impressed with my answers, I think she should take a large amount of the credit.
On feedback specifically, I’ve retained a small (voluntary) advisory role at 80k, and continue to give feedback as part of that, though I also think that the advisors have been deliberately giving more to each other.
The work I mentioned on how we make introductions to others and track the effects of those, including collaborating with CH, was passed on to someone else a couple of months before I left, and in my view the robustness of those processes has improved substantially as a result.
This seems extremely uncharitable. It’s impossible for every good thing to be the top priority, and I really dislike the rhetorical move of criticising someone who says their top priority is X for not caring at all about Y.
In the post you’re replying to, Chana makes the (in my view) virtuous move of actually being transparent about what CH’s top priorities are, a move which I think is unfortunately rare because of dynamics like this. You’ve chosen to interpret this as ‘a decision not to have’ [other nice things that you want], apparently realised that it’s possible the thinking here isn’t actually extremely shallow, but then dismissed the possibility of anyone on the team being capable of non-shallow thinking anyway, for currently unspecified reasons.

Editing this in rather than continuing the thread, as I don’t feel able to do protracted discussion at the moment:
Chana is a friend. We haven’t talked about this post, but that’s going to be affecting my thinking.
She’s also, in my view (which you can discount if you like), unusually capable of deep thinking about difficult tradeoffs, which made the comment expressing skepticism about CH’s depth particularly grating.
More generally, I’ve seen several people I consider friends recently put substantial effort into publicly communicating their reasoning about difficult decisions, and be rewarded for this effort with unhelpful criticism.
All that is to say that I’m probably not best placed to impartially evaluate comments like this, but at the end of the day I re-read it and it still feels like what happened is someone responded to Chana saying “our top priority is X” with “it seems possible that Y might be good”, and I called that uncharitable because I’m really, really sure that that possibility has not escaped her notice.
I’m fairly disappointed with how much discussion I’ve seen recently that either doesn’t bother to engage with ways in which the poster might be wrong, or only engages with weak versions. It’s possible that the “debate” format of the last week has made this worse, though not all of the things I’ve seen were directly part of that.
I think that not engaging at all, and merely presenting one side while saying that’s what you’re doing, is better than presenting and responding only to weak counterarguments, which in turn is better than strawmanning arguments that someone else has presented.
Thank you for all of your work organizing the event, communicating about it, and answering people’s questions. None of these seem like easy tasks!
I’m no longer on the team, but my hot take here is that a good bet is just going to be trying really hard to work out which tools you can use to accelerate/automate/improve your work. This interview with Riley Goodside might be interesting to listen to, not only for tips on how to get more out of AI tools, but also to hear how the work he does in prompting those tools has changed rapidly, and how he’s stayed on the frontier because the things he learned have transferred.
Hey, it’s not a direct answer but various parts of my recent discussion with Luisa cover aspects of this concern (it’s one that frequently came up in some form or other when I was advising), in particular, I’d recommend skimming the sections on ‘trying to have an impact right now’, ‘needing to work on AI immediately’, and ‘ignoring conventional career wisdom’.
It’s not a full answer but I think the section of my discussion with Luisa Rodriguez on ‘not trying hard enough to fail’ might be interesting to read/listen to if you’re wondering about this.
Responding here to parts of the third point not covered by “yep, not everyone needs identical advice, writing for a big audience is hard” (same caveats as the other reply):
“And for years it just meant I ended up being in a role for a bit, and someone suggested I apply for another one. In some cases, I got those roles, and then I’d switch because of a bunch of these biases, and then spent very little time getting actually very good at one thing because I’ve done it for years or something.”—are you sure this is actually bad? If each time you moved to something 10x more effective, and then at some point (even if years later) settled into learning your job and doing it really well, it might still be... good?
No, I don’t think it’s always bad to switch a lot. The scenario you described, where the person in question gets a 1 OOM impact bump per job switch and then also happens to end up in a role with excellent personal fit, is obviously good, though I’m not sure there’s any scenario discussed in the podcast that wouldn’t look good if you made assumptions that generous about it.
Alex, regarding “The next time I’m looking at options is in a couple of years.”—would you endorse this sort of thing for yourself? I mean, I’m guessing it would be a big loss if you weren’t in 80k, and if (now) you weren’t in OP. I do think it would be reasonable to have you take even a whole day of vacation each week in order to make sure you get to OP 1-2 years sooner. [not as a realistic suggestion, I don’t think you could consider career options for a whole day per week, but I’m saying that the value of you doing exploration seems pretty high and would probably even justify that. no? or maybe the OP job had nothing to do with your proactive exploration]
The thing I describe as being my policy in the episode isn’t a hypothetical example; it’s an actual policy (including the fact that the bounds are soft in my case, i.e. I don’t actively look before the time commitment is through, and have a strong default but not an unbreakable rule to turn down other opportunities in the meantime). I think that taking a 20% time hit to look for other things would have been a huge mistake in my case. The OP job had nothing to do with proactive exploration, as I wasn’t looking at the time (though having got through part of the process, I brought the period of exploration I’d planned for winter 2023 forward by a few months, so by the time I got the OP offer I’d already done some assessment of whether other things might be competitive).
My own opinion here is that people are often just pretty bad at considering alternatives. Time spent in considering alternatives just isn’t so effective, so deciding to “only spend X time” doesn’t seem to solve the problem, I think.
I do think that talking to someone like an 80k advisor is a pretty good magic pill for many people. 80k does have a sense of what careers a certain person might get, and also has a sense of “yeah that is actually super useful”, plus 100 other considerations that it’s pretty hard to figure out alone imo. It also overcomes impostor syndrome (people not even considering jobs that seem too senior, regardless of how long they spend thinking) and so on.
I acknowledge this doesn’t scale well.
Not 100% sure I followed this but if what you’re saying is “don’t just sit and think on your own when you decide to do the career exploration thing, get advice from others (including 80k)”, then yes, I think that’s excellent advice. In making my own decision I, among other things:
Spoke to my partner, some close friends, my manager at 80k (Michelle), and my (potential) new manager at Open Phil (Luke)
Wrote and shared a decision doc
Had ‘advising call’ style conversations with three people (to whom I’m extremely grateful), who I asked because I thought they’d make good advisors, and I didn’t want to speak to one of 80k’s actual advisors because that’s a really hard position to put someone in, even though I think they’d have been willing to try to be objective. (I had other conversations with various 80k staff, just not an advising session)
I don’t think it’s worth me going back and forth on specific details, especially as I’m not on the web team (or even still at 80k), but these proposals are different to the first thing you suggested. Without taking a position on whether this structure would overall be an improvement, it’s obviously not the case that just having different sections for different possible users ensures that everyone gets the advice they need.
For what it’s worth, one of the main motivations for this being an after-hours episode, which was promoted on the EA forum and my twitter, is that I think the mistakes are much more common among people who read a lot of EA content and interact with a lot of EAs (which is a small fraction of the 80k website readership). The hope is that people who’re more likely than a typical reader to need the advice are the people most likely to come across it, so we don’t have to rely purely on self-selection.
[I left 80k ~a month ago, and am writing this in a personal capacity, though I showed a draft of this answer to Michelle (who runs the team) before posting and she agrees it provides an accurate representation. Before I left, I was line-managing the 4 advisors, two of whom I also hired.]
Hey, I wanted to chime in with a couple of thoughts on your followup, and then answer the first question (what mechanisms do we have in place to prevent this). Most of the thoughts on the followup can be summarised by ‘yeah, I think doing advising well is really hard’.
Advisors often only have a few pages of context and a single call (sometimes there are follow-ups) to talk about career options. In my experience, this can be pretty insufficient to understand someone’s needs.
Yep, that’s roughly right. Often it’s less than this! Not everyone takes as much time to fill in the preparation materials as it sounds like you did. One of the things I frequently emphasised when hiring for and training advisors was asking good questions at the start of the call to fill in gaps in their understanding, check it with the advisee, and then quickly arrive at a working model that was good enough to proceed with. Even then, this isn’t always going to be perfect. In my experience, advisors tend to do a pretty good job of linking the takes they give to the reasons they’re giving them (where, roughly speaking, many of those reasons will be aspects of their current understanding of the person they’re advising).
the person may feel more pressure to pursue something that’s not a good fit for them
With obvious caveats about selection effects, many of my advisees expressed that they were positively surprised at me relieving this kind of pressure! In my experience advisors spend a lot more time reassuring people that they can let go of some of the pressure they’re perceiving than the inverse (it was, for example, a recurring theme in the podcast I recently released).
if they disagree with the advice given, they may not raise it. For example, they may not feel comfortable raising the issue because of concerns around anonymity and potential career harm, since your advisors are often making valuable connections and sharing potential candidate names with orgs that are hiring.
This is tricky to respond to. I care a lot that advisees are in fact not at risk of being de-anonymised, slandered, or otherwise harmed in their career ambitions as a result of speaking to us, and I’m happy to say that I believe this is the case. It’s possible, of course, for advisees to believe that they are at risk here, and for that reason or several possible other reasons, to give answers that they think advisors want to hear rather than answers that are an honest reflection of what they think. I think this is usually fairly easy for advisors to pick up on (especially when it’s for reasons of embarrassment/low confidence), at which point the best thing for them to do is provide some reassurance about this.
I do think that, at some point, the burden of responsibility is no longer on the advisor. If someone successfully convinces an advisor that they would really enjoy role A, or really want to work on cause Z, because they think that’s what the advisor wants to hear, or they think that’s what will get them recommended for the best roles, or introduced to the coolest people, or whatever, and the advisor then gives them advice that follows from those things being true, I think that advice is likely to be bad advice for that person, and potentially harmful if they follow it literally. I’m glad that advisors are (as far as I can tell) quite hard to mislead in this way, but I don’t think they should feel guilty if they miss some cases like this.
I know that 80K don’t want people to take their advice so seriously, and numerous posts have been written on this topic. However, I think these efforts won’t necessarily negate 1) and 2) because many 80K advisees may not be as familiar with all of 80K’s content or Forum discourse, and the prospect of valuable connections remains nonetheless.
There might be a slight miscommunication here. Several of the posts (and my recent podcast interview) talking about how people shouldn’t take 80k’s advice so seriously are, I think, not really pointing at a situation where people get on a 1on1 call and then take the advisor’s word as gospel, but more at things like reading a website that’s aimed at a really broad audience, and trying to follow it to the letter despite it very clearly being the case that no single piece of advice applies equally to everyone. The sort of advice people get on calls is much more frequently a suggestion of next steps/tests/hypotheses to investigate/things to read than “ok here is your career path for the next 10 years”, along with the reasoning behind those suggestions. I don’t want to uncritically recommend deferring to anyone on important life decisions, but on the current margin I don’t think I’d advocate for advisees taking that kind of advice, expressed with appropriate nuance, less seriously.
OK, but what specific things are in place to catch potential harm?
There are a few things that I think are protective here, some of which I’ll list below, though this list isn’t exhaustive.
Internal quality assurance of calls
The overwhelming majority of calls we have are recorded (with permission), and many of these are shared for feedback with other staff at the organisation (also with permission). To give some idea of scale, I checked some notes and estimated that (including trials, and sitting in on calls with new advisors or triallists) I gave substantive feedback on over 100 calls, the majority of which were in the last year. I was on the high end for the team, though everyone in 80k is able to give feedback, not only advisors.
I would expect anyone listening to a call in this capacity to flag, as a priority, anything that seemed like an advisor saying something harmful, be that because it was false, displayed an inappropriate level of confidence, or was insensitive.
My overall impression is that this happens extremely rarely, and that the bar for giving feedback about this kind of concern was (correctly) extremely low. I’m personally grateful, for example, for some feedback a colleague gave me about how my tone might have been perceived as ‘teacher-y’ on one call I did, and for another case where someone flagged that they thought the advisee might have felt intimidated by the start of the conversation. In both cases, as far as I can remember, the colleague in question thought that the advisee probably hadn’t interpreted the situation in the way they were flagging, but that it was worth being careful in future. I mention this not to indicate that I never made mistakes on calls, but to illustrate why I think it’s unlikely that feedback would miss significant amounts of potentially harmful advice.
Advisee feedback mechanisms
There are multiple opportunities for people we’ve advised to give feedback about all aspects of the process, including specific prompts about the quality of advice they received on the call, any introductions we made, and any potential harms.
Some of these opportunities include the option for the advisee to remain anonymous, and we’re careful to avoid accidentally collecting de-anonymising information, though no system is foolproof. As one example, we don’t give an option to remain anonymous in the feedback form we send immediately after the call (as depending on how many other calls were happening at the time, someone filling it in straight away might be easy to notice), but we do give this option in later follow-up surveys (where the timing won’t reveal identity).
In user feedback, the most common reason given by people who said 1on1 caused them harm is that they were rejected from advising and felt bad/demotivated about that. The absolute numbers here are very low, but there’s an obvious caveat about non-response bias.
On specific investigations/examples
I worked with Community Health on some ways of preventing harm being done by people advisors made introductions to (including, in some cases, stopping introductions).
I spent more than 5 but less than 10 hours, on two occasions, investigating concerns that had been raised to me about (current or former) advisors, and feel satisfied in both cases that our response was appropriate, i.e. that there was not an ongoing risk of harm following the investigation.
Despite my personal bar for taking concerns of this sort seriously being pretty low compared to my guess at the community average (likely because I developed a lot of my intuitions for how to manage such situations during my previous career as a teacher), there were few enough incidents meriting any kind of investigation that I think giving any more details than the above would not be worth the (small) risk of deanonymising those involved. I take promises of confidentiality really seriously (as I hope would be expected for someone in the position advisors have).
Thanks for asking these! Quick reaction to the first couple of questions, I’ll get to the rest later if I can (personal opinions, I haven’t worked on the web team, no longer at 80k etc. etc.):
I don’t think it’s possible to write a single page that gives the right message to every user. Having looked at the pressing problems page, the second paragraph visible on that page is entirely caveat. It also links to an FAQ, multiple parts of which directly talk about whether people should just take the rankings as given. When you then click through to the AI problem profile, the part of the summary box that talks about whether people should work on AI reads as follows: “As a result, the possibility of AI-related catastrophe may be the world’s most pressing problem — and the best thing to work on for those who are well-placed to contribute.”
Frankly, for my taste, several parts of the website already contain more caveats than I would use about ways the advice is uncertain and/or could be wrong, and I think moves in this direction could just as easily be patronising as helpful.
I’ve seen people wear a very wide range of things at the EAGs I’ve been to.