I agree that selecting experts is a challenge, but it seems better to survey credible experts than to exclude that evidence from the decision-making process.
I want to point out that you didn't address my intended point of "how to pick experts". You said you'd survey "credible experts", but who are those? How do you pick them? A more object-level answer would be "by forum karma" (not that I'm saying it's the best answer, but it is more object-level than saying you'd pick the "credible" ones).
Yonatan:
something that might change my mind very quickly is if you’ll give me examples of what “language” you might want to create. Maybe you want a term for “safety washing” as an example [of an example]?
Peter:
I would conceptualize movement building in a broad sense, e.g., as something involving increased contributors, contribution, and coordination: people helping with operations and communication, and people working on it while doing direct work (e.g., by going to conferences, etc.).
Are these examples of things you think it might be useful to add to the language of community building, in the way that "safety washing" might be a useful example?
If so, it still seems too-meta / too-vague.
Specifically, it fails to address the problem I’m trying to point out of differentiating positive community building from negative community building.
And specifically, I think focusing on a KPI like “increased contributors” is the way that AI Safety community building accidentally becomes net negative.
See my original comment:
I think that having metrics for community building that are not strongly grounded in a "good" theory of change (such as metrics for increasing the number of people in the field in general) has extra risk of that kind of failure mode.
However, would it have been helpful to know that 85% of researchers recommend research agenda Y or person X?
TL;DR: No. (I know this is an annoying unintuitive answer)
I wouldn’t be surprised if 85% of researchers think that it would be a good idea to advance capabilities (or do some research that directly advances capabilities and does not have a “full” safety theory of change), and they’ll give you some reason that sounds very wrong to me. I’m assuming you interview anyone who sees themselves as working on “AI Safety”.
[I don’t actually know if this statistic would be true, but it’s an example of the kind of way your survey suggestion might go wrong, imo]
Thanks, that’s helpful to know. It’s a surprise to me though! You’re the first person I have discussed this with who didn’t think it would be useful to know which research agendas were more widely supported.
Just to check, would your intuition change if the people being surveyed were only people who had worked at AI organisations, or if you could filter to see only the aggregate ratings from people you thought were most credible (e.g., these 10 researchers)?
As an aside, I’ll also mention that I think it would be a very helpful and interesting finding if 85% of researchers did think it would be a good idea to advance capabilities (or to do some research that directly advances capabilities and does not have a “full” safety theory of change). That would make me change my mind on a lot of things and probably spark a lot of important debate that wouldn’t otherwise have happened.
I want to point out that you didn't address my intended point of "how to pick experts". You said you'd survey "credible experts", but who are those? How do you pick them? A more object-level answer would be "by forum karma" (not that I'm saying it's the best answer, but it is more object-level than saying you'd pick the "credible" ones).
Sorry. Again, these are early ideas, but the credible experts might be people who have published an AI safety paper, received funding to work on AI, and/or worked at a relevant organisation, etc. Let me know what you think of that as a sample.
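To make that slightly more concrete, here is a rough sketch of how such a sample and the aggregate ratings might be put together. Everything in it (the candidate list, the field names, the agenda labels) is a hypothetical placeholder rather than a real dataset or a settled methodology:

```python
# Rough sketch only: candidate data, field names, and agenda labels are
# hypothetical placeholders, not a real dataset or an agreed methodology.

candidates = [
    {"name": "A", "published_safety_paper": True,  "received_funding": True,  "worked_at_org": False},
    {"name": "B", "published_safety_paper": False, "received_funding": False, "worked_at_org": True},
    {"name": "C", "published_safety_paper": True,  "received_funding": False, "worked_at_org": True},
    {"name": "D", "published_safety_paper": False, "received_funding": False, "worked_at_org": False},
]

def is_credible(person: dict) -> bool:
    """Count someone as a 'credible expert' if any of the proposed signals hold."""
    return (person["published_safety_paper"]
            or person["received_funding"]
            or person["worked_at_org"])

# The sample is everyone who passes the (assumed) inclusion criteria; "D" drops out.
sample = [p["name"] for p in candidates if is_credible(p)]

# Hypothetical survey responses: each respondent lists the research agendas they endorse.
responses = {
    "A": {"agenda_X"},
    "B": {"agenda_X", "agenda_Y"},
    "C": {"agenda_Y"},
}

# Aggregate support within the filtered sample, e.g. "what share endorse agenda_Y?"
endorsing = sum(1 for name in sample if "agenda_Y" in responses.get(name, set()))
share = endorsing / len(sample) if sample else 0.0
print(f"Share of sampled experts endorsing agenda_Y: {share:.0%}")
```

The only point of the sketch is that "credible expert" has to cash out as explicit inclusion criteria before any filtering or aggregation can happen, which is exactly the object-level question you're pushing on.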
Yonatan:
something that might change my mind very quickly is if you’ll give me examples of what “language” you might want to create.
Peter:
I would conceptualize movement building in a broad sense, e.g., as something involving increased contributors, contribution, and coordination: people helping with operations and communication, and people working on it while doing direct work (e.g., by going to conferences, etc.).
Are these examples of things you think it might be useful to add to the language of community building, in the way that "safety washing" might be a useful example?
The bolded terms are broadly examples of things that I want people in the community to conceptualize in similar ways so that we can have better conversations about them (i.e., shared language/understanding). What I mention there and in my posts is just my own understanding, and I’d be happy to revise it or use a better set of shared concepts.
If so, it still seems too-meta / too-vague.
Specifically, it fails to address the problem I’m trying to point out of differentiating positive community building from negative community building.
And specifically, I think focusing on a KPI like “increased contributors” is the way that AI Safety community building accidentally becomes net negative.
See my original comment:
I think that having metrics for community building that are not strongly grounded in a "good" theory of change (such as metrics for increasing the number of people in the field in general) has extra risk of that kind of failure mode.
I agree that the shared language, on its own, will fail to fully address the problem of differentiating positive community building from negative community building.
However, I think it is important to have it because we need shared conceptualisations and understanding of key concepts to be able to productively discuss AI safety movement building.
I therefore see it as something that will be helpful for making progress in differentiating what is most likely to be good or bad AI Safety movement building, regardless of whether that happens via 1-1 discussions, surveys of experts, etc.
Maybe it’s useful to draw an analogy to EA? Imagine that someone wanted to understand what sorts of problems people in EA thought were most important, so that they could work on and advocate for them. Imagine this is before we had many shared concepts: everyone is talking about doing good in different ways (saving lives, reducing suffering or risk, etc.). This person realises that people seem to have very different understandings of doing good/impact, so they try to develop and introduce some shared conceptualisations like cause areas and QALYs. Then they use those concepts to help them explore the community consensus, and use that evidence to help them make better decisions. That’s sort of what I am trying to do here.
Does that make sense? Does it seem reasonable? Open to more thoughts if you have time and interest.