Three polls: on timelines and cause prio
Below are a few polls which I've considered running as debate weeks, but thought better of (for now at least).
Timelines
I didn't run this as a debate week because I figured that the debate slider tool isn't the ideal way to map out a forecast.
However, I still think it's an interesting temperature check to run on the community, especially with the publication of AI 2027. For the purposes of this poll, we can use the criteria from this Metaculus poll.
Also, it's no crime to vote based on vibes, leave a comment, and change your mind later.
Bioweapons
Obviously, bioweapons pose a catastrophic risk. But can they be existential? I buy the Parfitian argument that we should disvalue extinction far more than mere catastrophe (and this extends somewhat to other states nearby in value to extinction). But I'm unsure how seriously I should take bio-risks compared to other putative existential risks.
Definitions:
Bioweapons: I'm thinking of engineered pathogens in particular.
Existential risk = a risk of existential catastrophe, where existential catastrophe means "an event which causes the loss of a large fraction of expected value"
Strong longtermism
I wonder where people land on this now that we talk about longtermism less. As a reminder, strong longtermism is the view that "the most important feature of our actions today is their impact on the far future".
Summary of Greaves and MacAskill's paper on the view here.
Disagree on the basis of cluelessness.
Uncertainty about how to reliably affect the long-term future is much worse than uncertainty about our effects on the near term.
I find Hilary Greaves' argument that neartermist interventions are just as unpredictable as longtermist interventions unconvincing, because you could apply the same reasoning to treating a sick person (maybe they'll go on to cause disaster), or to getting out of bed in the morning (maybe I'll go on to cause disaster). This paralysis is not tenable.
This question is already probabilistic, so arguably I should put my vote all the way on the "disagree" side, because I don't think it's more likely than not.
But I also don't think it's that far from a 50% chance either (maybe 40%, although I don't have a strong belief). So my answer is that I weakly disagree.
Did you look at the Metaculus resolution criteria? They seem extremely weak to me; I'd be interested to know which criteria you think o3 (or whatever the best OAI model is) is furthest away from.
To be honest I did not read the post; I just looked at the poll questions. I was thinking of AGI in the way I would define it*, or as the other big Metaculus AGI question defines it. For the "weakly general AI" question, yeah, I think a 50% chance is fair, maybe even higher than 50%.
*I don't have a precise definition, but I think of it as an AI that can do pretty much any intellectual task that an average human can do
Yeah, that's fair. I'm a lot more bullish on getting AI systems that satisfy the linked question's definition than my own one.
Most of my uncertainty is from potentially not understanding the criteria. They seem extremely weak to me:
I wouldn't be surprised if we've already passed this.
I don't think the current systems are able to pass the Turing test yet. Quoting from the Metaculus admins:
"Given evidence from previous Loebner prize transcripts – specifically that the chatbots were asked Winograd schema questions – we interpret the Loebner silver criteria to be an adversarial test conducted by reasonably well informed judges, as opposed to one featuring judges with no or very little domain knowledge."
I'd bet that current models with less than $100,000 of post-training enhancements achieve median human performance on this task.
Seems plausible the Metaculus judges would agree, especially given that that comment is quite old.
Look at the resolution criteria, which are based on the specific Metaculus question; it seems like a very low bar.
I disagree, mostly due to the "should" wording: believing in consequentialism doesn't obligate you to have any particular discount rate or any particular discount function. These are basically free parameters, so discount rates are independent of consequentialism.
I'll just repeat @weeatquince's comment, since he already covered the issue better than I did:
I'm skeptical of Pascal's Muggings
If this includes AI-created/enhanced bioweapons, it seems plausible; without that I'm much less sure, though if there's another few decades of synth bio progress but no AGI, it seems plausible too.
I hope to write about this at length once school ends, but in short, here are the two core reasons I feel AGI in three years is quite implausible:
The models aren't generalizing. LLMs are not stochastic parrots; they are able to learn, but the learning heuristics they adopt seem to be random or imperfect. And no, I don't think METR's newest benchmark is evidence against this.[1]
It is unclear whether models are situationally aware, and currently it seems more likely than not that they do not possess this capability. Laine et al. (2024) show that current models are far below human baselines of situational awareness when tested on MCQ-like questions. I am unsure how models would be able to perform long-term planning (a capability I consider crucial for AGI) without being sufficiently situationally aware.
As Beth Barnes put it, their latest benchmark specifically shows that "there's an exponential trend with doubling time between ~2–12 months on automatically-scoreable, relatively clean + green-field software tasks from a few distributions." Real-world tasks rarely have such clean feedback loops; see Section 6 of METR's RE-bench paper for a thorough list of drawbacks and limitations.
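To make the quoted trend concrete, here is a minimal sketch of what a fixed doubling time implies for task horizons; the one-hour starting horizon and the ~160-hour (roughly one work-month) target below are illustrative assumptions, not METR's measured figures.

```python
import math

def months_to_reach(target_hours, start_hours=1.0, doubling_months=6.0):
    """Months until the task-length horizon reaches target_hours, assuming exponential growth."""
    doublings_needed = math.log2(target_hours / start_hours)
    return doublings_needed * doubling_months

# Compare the fast and slow ends of the quoted ~2-12 month doubling-time range.
for doubling_months in (2, 12):
    months = months_to_reach(target_hours=160, doubling_months=doubling_months)
    print(f"doubling time {doubling_months} mo -> ~{months:.0f} months until ~160-hour tasks")
```

Under these illustrative assumptions, the same quoted trend implies anywhere from roughly one to seven years before month-long tasks are reached, which is part of why the 2–12 month range leaves so much room for disagreement about timelines.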
Fair – can you give some examples of questions you'd use?
While I think AGI by 2028 is reasonably plausible, there are way too many factors that have to go right in order to get AGI by 2028, and this is true even if AI timelines are short.
To be clear, I do agree that if we don't get AGI by the early 2030s at the latest, AI progress will slow down, but I don't have nearly enough credence in the supporting arguments to have my median be in 2028.
I think it's 20% likely based on the model I made.
Note that imo almost all the x-risk from bio routes through AI, and is better thought of as an AI-risk threat model.
Not sure how to interpret this question, but the interpretation that comes to mind is "there is some risk that bioweapons cause extinction", in other words "there is a non-infinitesimal probability that bioweapons cause extinction", in which case yes, that is certainly true.
Or, a slightly stronger interpretation could be "the risk from bioweapons is at least as large as the risk from asteroids", which I am also pretty confident is true.
However people interpret the question is how we should discuss it, but when I was writing it, I was wondering whether bioweapons per se can cause extinction/existential catastrophe. I.e. can bioweapons either:
a) kill everyone
b) kill enough of the population, permanently, such that we can never achieve much as a species.
I'm not sure about the feasibility of either.
It seems like I interpreted this question pretty differently to Michael (and, judging by the votes, to most other people). With the benefit of hindsight, it probably would have been helpful to define what percentage risk the midpoint (between agree and disagree) corresponds to?[1] Sounds like Michael was taking it to mean "literally zero risk" or "1 in 1 million," whereas I was taking it to mean 1 in 30 (to correspond to Ord's Precipice estimate for pandemic x-risk).
(Also, for what it's worth, for my vote I'm excluding scenarios where a misaligned AI leverages bioweapons; I count that under AI risk. (But I am including scenarios where humans misuse AI to build bioweapons.) I would guess that different voters are dealing with this AI-bio entanglement in different ways.)
Though I appreciate that it was better to run the poll as is than to let details like this stop you from running it at all.
This is helpful. If this were actually for a debate week, I'd have made it "more than 5% extinction risk this century" and (maybe) excluded risks from AI.
I'm interpreting this question as "an existential risk that we should be concerned about", which I think has a much weaker case than whether or not bioweapons are an existential risk at all (though I still think the answer is yes).
AI 2027
My current analysis, as well as a lot of other analysis I've seen, suggests AGI is most likely to be possible around 2030.
I think we should focus on short timelines; still, I think they are not the most likely scenario. Most likely, imo, is a delay of maybe two years.
It just makes sense theoretically. In practice it doesn't matter, e.g. RSI and loss of control are near-term risks.
Mainly thinking about A(G)I-engineered bioweapons.
I don't buy the Parfitian argument, so I'm not sure what a binary yes-no about existential risk would mean to me.
I agree with a bunch of the standard arguments against this, but I'll throw in two more that I haven't seen fleshed out as much:
The intuitive definition of AGI includes some physical capabilities (and even definitions that nominally exclude physical capabilities probably necessitate some), and we seem really far behind where I would expect AI systems to be in manipulating physical objects.
AIs make errors in systematically different ways than humans, and often have major vulnerabilities. This means we'll probably want AI that works with humans at every step, and so will want more specialized AI. I don't really buy some arguments that I've seen against this, but I don't know enough to have a super confident rebuttal.
Hmm, it seems like the linked Metaculus poll is actually about a selection of benchmarks being somewhat arbitrarily defined as weakly general intelligence. If I have to go with the poll's resolution, I think there's a much greater chance (I'm not going to look into how difficult the Atari game thing would be yet, so I'm not sure how much greater).
I gave a number of reasons I think AGI by 2030 is extremely unlikely in a post here.
Technically I agree that 100% consequentialists should be strong longtermists, but I think if you are moderately consequentialist, you should only sometimes be a longtermist. When it comes to choosing your career, yes, focus on the far future. When it comes to abandoning family members to squeeze out another hour of work, no. We're humans, not machines.
For me, the strongest arguments against strong longtermism are simulation theory and the youngness paradox (as well as yet-to-be-discovered crucial considerations).[1]
(Also, nitpickily, I'd personally reword this poll from "Consequentialists should be strong longtermists" to "I am a strong longtermist," because I'm not convinced that anyone "should" be anything, normatively speaking.) I also worry about cluelessness, though cluelessness seems just as threatening to neartermist interventions as it does to longtermist ones.
I'm a pretty strong anti-realist, but this is one of the strongest types of "should" for me.
I.e. "If you want to achieve the best consequences, then you should expect the majority of affectable consequences to be in the far future." This seems like the kind of thing that could be true or false on non-normative grounds, and it would normatively ground a "should" if you are already committed to consequentialism. In the sense that believing "I should get to Rome as fast as possible" and "The fastest way to get to Rome is to take a flight" grounds a "should" for "I should take a flight to Rome".