Currently doing local AI safety Movement Building in Australia and NZ.
Chris Leong
Summary: “Imagining and building wise machines: The centrality of AI metacognition” by Johnson, Karimi, Bengio, et al.
Reflections on AI Wisdom, plus announcing Wise AI Wednesdays
Potentially Useful Projects in Wise AI
“Prominence” isn’t static.
My impression is that Holly has intentionally sacrificed a significant amount of influence within EA because she feels that EA is too constraining in terms of what needs to be done to save humanity from AI. So that term would have been much more accurate in the past.
At least in the government sector, time-limited post-employment restrictions are not uncommon. They are intended to avoid the appearance of impropriety as much as actual impropriety itself. In those cases, we don’t trust the departing employee not to use their prior public service for private gain in certain ways.
My argument for this being bad is quite similar to what you’ve written. This is also a massive burning of the commons. It is valuable for forecasting/evals orgs to be able to hire people with a diversity of viewpoints in order to counter bias. It is valuable for folks to be able to share information freely with folks at such forecasting orgs without having to worry about them going off and doing something like this.
However, this only works if those less worried about AI risks who join such a collaboration don’t use the knowledge they gain to cash in on the AI boom in an acceleratory way. Doing so undermines the very point of such a project, namely, to try to make AI go well. Doing so is incredibly damaging to trust within the community.
Sometimes the dollar signs can blind someone and cause them not to consider obvious alternatives. And they will feel that they made the decision for reasons other than the money, but the money nonetheless caused the cognitive distortion that ultimately led to the decision.
I’m not claiming that this happened here. I don’t have any way of really knowing. But it’s certainly suspicious. And I don’t think anything is gained by pretending that it’s not.
I bet the strategic analysis for Mechanize being a good choice (net-positive and positive relative to alternatives) is paper-thin, even given his rough world view.
I can see why you’d find this personally frustrating.
On the other hand, many people in the community, myself included, took certain claims from OpenAI and SBF at face value when it might have been more prudent to be less trusting. I understand that it must be unpleasant to face some degree of distrust due to the actions of others.
And I can see why you’d see your statements as a firm denial, whilst from my perspective, they were ambiguous. For example, I don’t know how to interpret your use of the word “meaningful”, so I don’t actually know what exactly you’ve denied. It may be clear to you because you know what you mean, but it isn’t clear to me.
(For what it’s worth, I neither upvoted nor downvoted the comment you made before this one, but I did disagree vote it.)
If you don’t want to engage, that’s perfectly fine. I’ve written a lot of comments and responding to all of them would take substantial time. It wouldn’t be fair to expect that from you.
That said, labelling asking for clarification “cult-like behaviour” is absurd. On the contrary, not naively taking claims at face value is a crucial defence against cults. Furthermore, implying that someone asking questions must be acting in bad faith is precisely the technique that cult leaders use[1].
I said that the statement left you substantial wiggle room. This was purely a comment about how the statement could have a broad range of interpretations. I did not state, nor mean to imply, that this vagueness was intentional or in bad faith.
[1] That said, people asking questions in bad faith is actually pretty common, so you can’t assume that something is a cult just because they say that their critics are mostly acting in bad faith.
I think Holly just said what a lot of people were feeling and I find that hard to condemn.
“Traitor” is a bit of a strong term, but it’s pretty natural for burning the commons to result in significantly less trust. To be honest, the main reason why I wouldn’t use that term myself is that it reifies individual actions into a permanent personal characteristic, and I don’t have the context to make any such judgments. I’d be quite comfortable with saying that founding Mechanize was a betrayal of sorts, where the “of sorts” clarifies that I’m construing the term broadly.
Glossing over it as being “essentially correct”
This characterisation doesn’t quite match what happened. My comment wasn’t along the lines of, “Oh, it’s essentially correct, close enough is good enough, details are unimportant”; I actually wrote down what I thought a more careful analysis would look like.
They end up legitimizing what is, in fact, a low-effort personal attack without a factual basis
Part of the reason why I’ve been commenting is to encourage folks to make more precise critiques. And indeed, Michael has updated his previous comment in response to what I wrote.
A baseless, false accusation
Is it baseless?
I noticed you wrote: “we do not plan on meaningfully making use”. That provides you with substantial wriggle room. So it’s unclear to me at this stage that your statements being true/defensible would necessitate her statements being false.
I agree that Michael’s framing doesn’t quite work. It’s not even clear to me that OpenPhil, for example, is aiming to “slow down AI development” as opposed to “fund research into understanding AI capability trends better without accidentally causing capability externalities”.
I’ve previously written a critique here, but the TLDR is that Mechanize is a major burning of the commons that damages trust within the Effective Altruism community and creates a major challenge for funders who want to support ideological diversity in forecasting organisations without accidentally causing capability externalities.
Furthermore, we do not plan on meaningfully making use of benchmarks, datasets, or tools that were developed during my previous roles in any substantial capacity at the new startup. We are not relying on that prior work to advance our current mission. And as far as I can tell, we have never claimed or implied otherwise publicly.
This is a useful clarification. I had a weak impression that Mechanize might have been relying on such work.
They seem more consistent with a kind of ideological or tribal backlash to the idea of accelerating AI than with genuine, thoughtful, and evidence-based concerns.
I agree that some of your critics may not have quite been able to hit the nail on the head when they tried to articulate their critiques (it took me substantial effort to figure out what I precisely thought was wrong, as opposed to just ‘this feels bad’), but I believe that the general thrust of their arguments more or less holds up.
“From people who want to slow down AI development”
The framing here could be tighter. It’s more about wanting to be able to understand AI capability trends better without accidentally causing capability externalities.
Note: Matthew’s comment was in the negative just now. Please don’t vote it into the negative; use the disagree button instead. Even though I don’t think Matthew’s defense is persuasive, it deserves to be heard.
I wrote a critique of that article here. TLDR: “It has some strong analysis at points, but unfortunately, it’s undermined by some poor choices of framing/focus that mean most readers will probably leave more confused than when they came”.
“A software singularity is unlikely to occur”
Unlikely enough that you’re willing to bet the house on it? Feels like you’re picking up pennies in front of a steamroller.
I continue to think that accelerating AI is likely good for the world
AI is already going incredibly fast. Why would you want to throw more fuel on the fire?
Is it that you honestly think AI is moving too slowly at the moment (no offense, but that seems crazy to me), or is your worry that current trends are misleading and AI might slow down in the future?
Regarding the latter, I agree that once timelines start to get sufficiently long, there might actually be an argument for accelerating them (but in order to reach AGI before biotech causes a catastrophe, rather than for the more myopic reasons you’ve provided). But if your worry is stagnation, why not actually wait until things appear to have stalled and then perhaps consider doing something like this?
Or why didn’t you just stay at Epoch, which was a much more robust and less fragile theory of action? (Okay, I don’t actually think articles like this are high enough quality to be net-positive, but you were 90% of the way towards having written a really good article. The framing/argument just needed to be a bit tighter, which could have been achieved with another round of revisions).
The main reason not to wait is… missing the opportunity to cash in on the current AI boom.
Short update (TLDR): Mechanize is going straight for automating software engineering.
Matthew’s comment was at −1 just now. I’d like to encourage people not to vote it into the negative. Even though I don’t find his defense at all persuasive, I still think it deserves to be heard.
What I perceive as a measured, academic disagreement
This isn’t merely an “academic disagreement” anymore. You aren’t just writing posts; you’ve actually created a startup. You’re doing things in the space.
As an example, it’s neither incoherent nor hypocritical to let philosophers argue “Maybe existence is negative, all things considered” whilst still cracking down on serial killers. The former is necessary for academic freedom, the latter is not.
The point of academic freedom is to ensure that the actions we take in the world are as well-informed as possible. It is not to create a world without any norms at all.
It appears that advocating for slowing AI development has become a “sacred” value… Such reactions frankly resemble the behavior of a cult
Honestly, this is such a lazy critique. Whenever anyone disagrees with a group, they can always dismiss them as a “cult” or “cult-adjacent”, but this doesn’t make it true.
I think Ozzie’s framing of cooperativeness is much more accurate. The unilateralist’s curse very much applies to differential technology development, so if the community wants to have an impact here, it can’t ignore the issue of “cowboys” messing things up by rowing in the opposite direction, especially when their reasoning seems poor. Any viable community, especially one attempting to drive change, needs to have a solution to this problem.
Having norms isn’t equivalent to being a cult. When Fair Trade started taking off, I shared some of my doubts with some people who were very committed to it. This went poorly. They weren’t open-minded at all, but I wouldn’t run around calling Fair Trade a cult or even cult adjacent. They were just… a regular group.
And if I had run around accusing them of essentially being a “cult” that would have reflected poorly on me rather than on them.
I have been publicly labeled a “sellout and traitor”… simply because I cofounded an AI startup
As I described in my previous comment, the issue is more subtle than this. It’s about the specific context:
This is also a massive burning of the commons. It is valuable for forecasting/evals orgs to be able to hire people with a diversity of viewpoints in order to counter bias. It is valuable for folks to be able to share information freely with folks at such forecasting orgs without having to worry about them going off and doing something like this.
However, this only works if those less worried about AI risks who join such a collaboration don’t use the knowledge they gain to cash in on the AI boom in an acceleratory way. Doing so undermines the very point of such a project, namely, to try to make AI go well. Doing so is incredibly damaging to trust within the community.
I concede that there wasn’t a previous well-defined norm against this, but norms have to get started somehow. And this is how it happens: someone does something, people are like “wtf”, and then, sometimes, a consensus forms that a norm is required.
Looks like Mechanize is choosing to be even more irresponsible than we previously thought. They’re going straight for automating software engineering. Would love to hear their explanation for this.
“Software engineering automation isn’t going fast enough” [1] - oh really?
This seems even less defensible than their previous explanation of how their work would benefit the world.
[1] Not an actual quote.
Great post. Listing concrete examples of orphaned policies makes it much easier for folks to evaluate how much of a priority drafting them should be.
That said, my belief is that it’s actually not just fine for the AI governance community to propose far more policies than get drafted in detail; this is exactly the way that it should be.
Generally, when you have a pipeline, you want filtering to occur at each stage. I have a strong intuition that the impact of policies is quite heavy-tailed, particularly because some policies that initially seem promising might actually turn out to be net-negative, impractical or hard to have any confidence in.
Here are my hot-takes (disclaimer: by hot-takes, I really do mean hot-takes, and I’m not a policy professional!):
• Windfall clause: robustly good, but not on the critical path
• Antitrust waiver: seems robustly good
• Visa reform: hard to determine the sign of due to espionage concerns
• Insurance requirements: hard to determine the sign of due to moral hazard
• Public grant funding: quite hard to make sure this goes to anything useful, that said, UK AISI is distributing grants and I’m quite optimistic about their judgement
• Global crisis hotline: seems robustly good
• Compute monitoring: seems robustly good
• LAW Boycott: hard to determine sign due to unstable equilibrium
• Industry standards: I’m a lot more pessimistic about this than most folks. Very easy for this to create a false sense of security. Unfortunately, at the end of the day, if a company doesn’t care, they don’t care
• Structured access to research: I suspect that the companies will either give it voluntarily or it’ll not be worth the political capital to try to mandate
So 3⁄11 seem robustly good, with another robustly good but not on the critical path.
Question: Are there any organisations focused on taking general policy proposals and developing them into specific proposals for legislation? I could see value in having an organisation specialising in this stage if the majority of governance organisations are just throwing rather general proposals over the wall and hoping someone else will fill in the details.
Sad to see this. I agree that Apart adds something distinctive to the ecosystem (an extremely easy entry point), so it would be a shame to see it disappear.
I wonder whether this is because there’s so much competition for competitive fellowships like MATS (and perhaps even for some of the unpaid AI safety opportunities) that funders feel less need to fund projects earlier in the pipeline?
I’d love to see you expand on this paragraph:
Personally, what feels most missing to me around EA online is leadership/communication about the big issues, some smart+effective moderation (this is really tough), and experimentation on online infrastructure outside the EA Forum (see Discords, online courses, online meetups, maybe new online platforms, etc). I think there’s a lot of work to do here, but would flag that it’s likely pretty hit-or-miss, maybe making it a more difficult ask for funders.
EA used to lean more into moral arguments/criticism back in the day, but most folks, even those who were part of the movement back then, seem to have leaned away from this.
It’s hard to say why exactly, but being confrontational is unpleasant and it’s not clear that it was actually more effective. OGTutzauer makes a good point that a movement trying to raise donations has more incentive to leverage guilt, whilst a movement trying to shift people’s careers has more incentive to focus on being appealing to be part of.
It might also be partly due to the influence of rationalist cultural norms, whilst Moral Ambition seems to have been influenced by both EA and progressivism. (My experience has been that the animal welfare folks, who tend to lean more into progressivism, are the most likely to lean into confrontation.)