Global moratorium on AGI, now (Twitter). Founder of CEEALAR (née the EA Hotel; ceealar.org)
Greg_Colbourn ⏸️
Here’s a positive EA Forum comment about Stop AI from Greg Colbourn. Greg Colbourn created the EA Hotel. He’s been on the EA Forum since 2014 and has 5972 karma. Greg Colbourn has also made a number of positive tweets about Stop AI: example 1, example 2, example 3, example 4.
Is this enough evidence to establish a meaningful connection between EA and Stop AI? I think so, but you may disagree.
FWIW my bio on X now reads: “Ex-EA (left over EA’s anti-Pause, pro-Anthropic stance).”
And it’s pretty clear that Stop AI are committed to non-violence given they expelled Kirchner after he started talking about it!
(And here is something positive that is happening with Stop AI now (and could be really impactful).)
You worry about Anthropic’s money being a corrupting influence, but their whole company is far, far worse than FTX, because of the existential risk it’s subjecting the entire world to. Not only are they continuing in the suicide-omnicide race, they are now leading it!
If you work at Anthropic, please quit and make a big public statement about this. You shouldn’t be waiting for the IPO, because there’s not much point in having millions to donate if we’re all dead in short order (because of how those millions were made). If you own shares, sell them (even pre-IPO, if you can). I no longer consider myself EA because of this issue. Whilst the money and power in EA is pro-Anthropic, I don’t want to associate with it.
Dude it’s basically the whole of Anthropic! And the fact that EAs (mostly) can’t see this is worrying. OP worries about Anthropic’s money being a corrupting influence, but their whole company is far far worse than FTX, because of the existential risk it’s subjecting the entire world to.
But of course the reason for this is corruption of the entire mission by Anthropic shareholding. I really hope that Dustin will see the light in time and do something about it.
The whole AI Safety grantmaking sector is woefully underinvesting in Pause-related initiatives. It’s a tragedy that the world will probably end with so few resources going into pushing for the one solution that might actually prevent it from happening: an international ban (and taboo) on ASI.
I don’t think many people biased in such a way are going to even be particularly aware of it when making arguments, let alone admit to it. It’s mostly a hidden bias. You really don’t want it to be true because of how you think it would affect you if it were.
Thinking AI Risk is among the most important things to work on is one thing. Thinking your life depends on minimising it is another.
Just thinking: surely to be fair, we should be aggregating all the AI results into an “AI panel”? I wonder how much overlap there is between wrong answers amongst the AIs, and what the aggregate score would be?
Right now, as things stand with the scoring, “AGI” in ARC-AGI-2 means “equivalent to the combined performance of a team of 400 humans”, not “(average) human level”.
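As a rough sketch of what an “AI panel” score could look like, here is the same kind of rule applied to models: a task counts as solved if at least `min_solvers` panel members solved it. The model names and per-task results are made up for illustration (not real ARC-AGI-2 data), and `panel_score` is a hypothetical helper, not anything the ARC Prize Foundation publishes.

```python
from typing import Dict, List


def panel_score(results: Dict[str, List[bool]], min_solvers: int = 2) -> float:
    """Fraction of tasks solved by at least `min_solvers` panel members."""
    # Transpose so each tuple holds every member's result on one task.
    per_task = list(zip(*results.values()))
    solved = sum(1 for task in per_task if sum(task) >= min_solvers)
    return solved / len(per_task)


# Hypothetical per-task outcomes for three made-up models on five tasks.
results = {
    "model_a": [True, True, False, False, True],
    "model_b": [True, False, True, False, False],
    "model_c": [False, True, True, False, True],
}

# Individual scores, for comparison with the aggregate.
individual = {name: sum(r) / len(r) for name, r in results.items()}
panel = panel_score(results, min_solvers=2)
```

In this toy data the best single model scores 0.6 while the panel scores 0.8, because the wrong answers only partly overlap. How large that gap would be for real systems depends entirely on how correlated their failures are.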
Ok, I take your point. But no one seems to be actually doing this (seems like it would be possible to do already, for this example; yet it hasn’t been done.)
What do you think good resolution criteria for judging a system as being AGI should be? Most relevant to X-risk concerns would be the ability to do A(G)I R&D as well as top AGI company workers. But then of course we run into the problem of crossing the point of no return in order to resolve the prediction market. And we obviously shouldn’t do that (unless superalignment/control is somehow solved).
The human testers were random people off the street who got paid $115-150 to show up and then an additional $5 per task they solved. I believe the ARC Prize Foundation’s explanation for the 40-point discrepancy is that many of the testers just didn’t feel that motivated to solve the tasks and gave up [my emphasis]. (I vaguely remember this being mentioned in a talk or interview somewhere.)
I’m sceptical of this, given that they were able to earn $5 for every couple of minutes’ work (the time to solve a task). That works out to roughly $150 an hour, which is far above the average hourly wage.
100% is the score for a “human panel”, i.e. a set of at least two humans.
Also seems very remarkable (suspect, in fact) - this would mean almost no overlap between the questions that the humans were getting wrong—i.e. if each human averages 60% right, then for 2 humans to get 100% there can only be 20% of questions where both get it right! I think in practice the panels that score 100% have to contain many more than 2 humans on average.
EDIT: looks like “at least 2 humans” means at least 2 humans solved every problem in the set, out of the 400 humans that attempted them!
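The arithmetic behind the two readings of “at least 2 humans” can be sketched as follows. The 60% per-person figure is the one used above; the independence assumption, and the assumption that all 400 people attempt every task, are simplifications added here for illustration, and both function names are hypothetical.

```python
def both_correct_pct(a_pct: int, b_pct: int, covered_pct: int = 100) -> int:
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|, in percent."""
    return a_pct + b_pct - covered_pct


def prob_at_least_two(p: float, n: int) -> float:
    """Chance that at least 2 of n independent solvers get a task right."""
    p_none = (1 - p) ** n
    p_exactly_one = n * p * (1 - p) ** (n - 1)
    return 1 - p_none - p_exactly_one


# Reading 1: a literal two-person panel that jointly covers 100% of tasks.
overlap = both_correct_pct(60, 60)  # 20

# Reading 2: a task counts if any 2 of the 400 testers solved it.
# ASSUMPTION: independent attempts, each with a 60% success rate.
two_person = prob_at_least_two(0.6, 2)
four_hundred = prob_at_least_two(0.6, 400)
```

Under these assumptions a two-person panel would see both members succeed on a given task only about 36% of the time, whereas with 400 testers the “at least 2 solved it” bar is cleared almost surely. That is why the second reading makes a 100% panel score unremarkable.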
See the quote in the footnote: “a provision that the system not simply be cobbled together as a set of sub-systems specialized to tasks like the above, but rather a single system applicable to many problems.”
the forecasts do not concern a kind of system that would be able to do recursive self-improvements (none of the indicators have anything to do with it)
The indicators are all about being human level at ~every kind of work a human can do. That includes AI R&D. And AIs are already known to think (and act) much faster than humans, and that will only become more pronounced as the AGI improves itself; hence the “rapid recursive self-improvement”.
Even if it takes a couple of years, we would probably cross a point of no return not long after AGI.
None of these indicators actually imply that the “AGI” meeting them would be dangerous or catastrophic to humanity
Thanks for pointing this out. There was indeed a reasoning step missing from the text. Namely: such AGI would be able to automate further AI development, leading to rapid recursive self-improvement to ASI (Artificial Superintelligence). And it is ASI that will be lethally intelligent to humanity (/all biological life). I’ve amended the text.
there is nothing to indicate that such a system would be good at any other task
The whole point of having the 4 disparate indicators is that they have to be done by a single unified system (not specifically trained for only those tasks)[1]. Such a system would implicitly be general enough to do many other tasks. Ditto with the Strong AGI question.
While an ideal adversarial Turing test would be a very difficult task for an AI system, ensuring these ideal conditions is often not feasible. Therefore, I’m certainly going to expect news that AI systems will pass some form of the adversarial test
That is what both the Turing Test questions are all about! (Look at the success conditions in the fine print.)
- ^
Metaculus: “By “unified” we mean that the system is integrated enough that it can, for example, explain its reasoning on an SAT problem or Winograd schema question, or verbally report its progress and identify objects during videogame play. (This is not really meant to be an additional capability of “introspection” so much as a provision that the system not simply be cobbled together as a set of sub-systems specialized to tasks like the above, but rather a single system applicable to many problems.)”
- ^
It’s only 8 months later, and the top score on ARC-AGI-2 is now 54%.
One option, if you want to do a lot more about it than you currently are, is Pause House. Another is donating to PauseAI (US, Global). In my experience, being pro-active about the threat does help.
I have to think holding such a belief is incredibly distressing.
Have you considered that you might be engaging in motivated reasoning because you don’t want to be distressed about this? Also, you get used to it. Humans are very adaptable.
The 10% comes from the linked aggregate of forecasts, from thousands of people’s estimates/bets on Metaculus, Manifold and Kalshi; not the EA community.
I think this is pretty telling. I’ve also had a family member say a similar thing. If your reasoning is (at least partly) motivated by wanting to stay sane, you probably aren’t engaging with the arguments impartially.
I would bet a decent amount of money that you would not, in fact, go crazy. Look to history to see how few people went crazy over the threat of nuclear annihilation in the Cold War (and all the other things C.S. Lewis refers to in the linked quote).
But a lot of informed people do (i.e. an aggregation of forecasts). What would you do if you did believe both of those things?
See also (somewhat ironically), the AI roast:
its primary weakness is underexploring how individual rationalization might systematically lead safety-concerned researchers to converge on similar justifications for joining labs they believe pose existential threats.
They say in their announcement (already linked) that they expelled him. What makes you think he resigned?