I’m a computational physicist, and I generally donate to global health. I am skeptical of AI x-risk and of big-R Rationalism, and I intend to explain why in great detail.
titotal
Yes, but in doing so the uncertainty in both A and B matters, and showing that A is lower variance than B doesn’t show that E[benefits(A)] > E[benefits(B)]. Even if benefits(B) are highly uncertain and we know benefits(A) extremely precisely, it can still be the case that benefits(B) are larger in expectation. For example, if benefits(A) is exactly 5 while benefits(B) could be anywhere between 0 and 30, B still wins in expectation despite the much larger uncertainty.
If you properly account for uncertainty, you should pick the certain cause over the uncertain one even if a naive EV calculation says otherwise, because you aren’t accounting for the selection process involved in picking the cause. I’m writing an explainer for this, but if I’m reading the optimiser’s curse paper right, a rule of thumb is that if cause A is 10 times more certain than cause B, cause B should be downweighted by a factor of 100 when comparing them.
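To illustrate what I mean by the selection process, here is a minimal toy simulation. The prior and noise levels are numbers I’ve made up purely for illustration, and the shrinkage step reflects my own reading of the factor-of-100 rule as coming from the square of the noise ratio:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy assumptions: both causes have true benefits drawn from the same prior,
# but our estimate of cause B is 10x noisier than our estimate of cause A.
prior_mean, prior_sd = 1.0, 1.0
noise_A, noise_B = 0.1, 1.0

true_A = rng.normal(prior_mean, prior_sd, n)
true_B = rng.normal(prior_mean, prior_sd, n)
est_A = true_A + rng.normal(0.0, noise_A, n)
est_B = true_B + rng.normal(0.0, noise_B, n)

# Naive EV maximiser: pick whichever cause has the higher raw estimate.
pick_B = est_B > est_A
print("Overstatement of B's benefit when B is picked:",
      round(float((est_B[pick_B] - true_B[pick_B]).mean()), 3))
print("Overstatement of A's benefit when A is picked:",
      round(float((est_A[~pick_B] - true_A[~pick_B]).mean()), 3))

# Bayesian correction: shrink each estimate toward the prior in proportion
# to its noise *variance* -- a 10x noisier estimate has 100x the noise
# variance, which is where I read the factor-of-100 rule of thumb as coming from.
def shrink(est, noise_sd):
    w = prior_sd**2 / (prior_sd**2 + noise_sd**2)
    return w * est + (1 - w) * prior_mean

post_A, post_B = shrink(est_A, noise_A), shrink(est_B, noise_B)
pick_B_adj = post_B > post_A
print("Overstatement of B after shrinkage, when B is picked:",
      round(float((post_B[pick_B_adj] - true_B[pick_B_adj]).mean()), 3))
```

The point of the sketch is just that the winner of a naive comparison is systematically flattered by its own noise, and that the correction hits the noisier cause much harder than the precise one.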
Generally, the scientific community is not going around arguing that drastic measures should be taken based on singular novel studies. Mainly, what a single novel study will produce is a wave of new studies on the same subject, to ensure that the results are valid and that the assumptions used hold up to scrutiny. That’s why that claimed room-temperature superconductor was so quickly debunked.
I do not see similar efforts in the AI safety community. The studies by METR are great first forays into difficult subjects, but then I see barely any scrutiny or follow-up by other researchers. And people accept much worse scholarship, like AI2027, at face value for seemingly no reason.
I have experience in both academia and EA now, and I believe that the scholarship and skeptical standards in EA are substantially worse.
Taken literally, “accelerationist” implies that you think the technology isn’t currently progressing fast enough, and that some steps should be taken to make it go faster. This seems a bit odd, because one of your key arguments (that I actually agree with) is that we learn to adapt to technology as it rolls out. But obviously it’s harder to adapt when change is super quick, compared to gradual progress.
How fast do you think AI progress should be going, and what changes should be made to get there?
I think Eric has been strong about making reasoned arguments about the shape of possible future technologies, and helping people to look at things for themselves.
I guess this is kind of my issue, right? He’s been quite strong at putting forth arguments about the shape of the future that were highly persuasive and yet turned out to be badly wrong.[1] I’m concerned that this does not seem to have affected his epistemic authority in these sorts of circles.
You may not be “deferring” to Drexler, but you are singling out his views as singularly important (you have not made similar posts about anybody else[2]). There are hundreds of people discussing AI at the moment, a lot of them with a lot more expertise, and a lot of whom have not been badly wrong about the shape of the future.
Anyway, I’m not trying to discount your arguments either; I’m sure you have found stuff in it valuable. But if this post is making a case for reading Drexler despite him being difficult, I’m allowed to make the counterargument.
Drexler’s previous predictions seem to have gone very poorly. This post evaluated the 30-year predictions made in 1995 by a group of seven futurists, and Drexler came in last, predicting that by 2026 we would have complete Drexlerian nanotech assemblers, be able to reanimate cryonic suspendees, have uploaded minds, and have a substantial portion of our economy outside the solar system.
Given this track record of extremely poor long-term prediction, why should I be interested in the predictions that Drexler makes today? I’m not trying to shit on Drexler as a person (and he has had a positive influence in inspiring scientists), but it seems like his epistemological record is not very good.
I’m broadly supportive of this type of initiative, and it seems like it’s definitely worth a try (the downsides seem low compared to the upsides). However I suspect that, like most apparently good ideas, scrutiny will yield problems.
One issue I can think of: in this analysis, a lot of the competitive advantage for the company arises from the good reputation of the charitable foundation running it. However, running a large company competitively sometimes involves making tough, unpopular decisions, like laying off portions of your workforce. So I don’t think your assumption that the charity-owned company can act exactly like a regular company necessarily holds up: doing so risks eliminating the reputational advantage that the competitive edge depends on.
I have many disagreements, but I’ll focus on one: I think point 2 is in contradiction with points 3 and 4. To put it plainly: the “selection pressures” go away pretty quickly if we don’t have reliable methods of knowing or controlling what the AI will do, or of preventing it from doing noticeably bad stuff. That applies to the obvious stuff, like if an AI tries to prematurely go Skynet, but it also applies to more mundane stuff, like getting an AI to act reliably more than 99% of the time.
I believe that if we manage to control AI enough to make widespread rollout feasible, then it’s pretty likely we’ve already solved alignment well enough to prevent extinction.
I’m not super excited about revisiting the model, to be honest, but I’ll probably take a look at some point.
What I’d really like to see, and what I haven’t noticed from a quick look through the update, is some attempt to prove the validity of the models with reference to actual data. For example, I think METR comes off looking pretty good right now with their exponential model of horizon growth, which has held up for nearly a year post-publication now. The AI2027 model’s prediction of superexponential growth has not. So I think they have to make a pretty strong case for why I should trust the new model.
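As an illustration of the kind of check I have in mind (the numbers below are made up for illustration; the real exercise would use METR’s published time-horizon measurements), one could fit both an exponential trend (linear in log-horizon) and a simple superexponential variant (quadratic in log-horizon) to the earlier data points and see which one better predicts the held-out later ones:

```python
import numpy as np

# Hypothetical horizon lengths (hours) at successive intervals -- illustrative only.
t = np.arange(8.0)
horizon = np.array([0.05, 0.08, 0.12, 0.20, 0.33, 0.52, 0.85, 1.4])

train, test = slice(0, 6), slice(6, 8)
log_h = np.log(horizon)

# Exponential model: log(horizon) is linear in t.
lin = np.polyfit(t[train], log_h[train], 1)
pred_exp = np.polyval(lin, t[test])

# Simple superexponential variant: log(horizon) is quadratic in t.
quad = np.polyfit(t[train], log_h[train], 2)
pred_super = np.polyval(quad, t[test])

def holdout_error(pred_log):
    # Mean absolute error on the held-out points, in log space.
    return float(np.abs(pred_log - log_h[test]).mean())

print("exponential hold-out error (log scale):", round(holdout_error(pred_exp), 3))
print("superexponential hold-out error (log scale):", round(holdout_error(pred_super), 3))
```

Whichever model does better on held-out data is the one whose forecasts deserve more weight; that’s the kind of post-publication comparison I’d want to see before trusting the updated model.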
I think the problem here is that novel approaches are substantially more likely to be failures due to being untested and unproven. This isn’t a big deal in areas where you can try lots of stuff out and sift through them with results, but in something like an election you only get feedback like once a year or so. Worse, the feedback is extremely murky, so you don’t know if it was your intervention or something else that resulted in the outcome you care about.
One other issue I thought of since my other comment: you list several valid critiques that the AI made that you’d already identified, but which were not in the provided source materials. You state that this gives additional credence to the helpfulness of the models:
three we were already planning to look into but weren’t in the source materials we provided (which gives us some additional confidence in AI’s ability to generate meaningful critiques of our work in the future—especially those we’ve looked at in less depth).
However, just because the critique is not in the provided source materials, that doesn’t mean it’s not in the wider training data of the LLM. So for example, if GiveWell talked about the identified issue of “optimal chlorine doses” in a blog comment or something, and that blog got scraped into the LLM’s training data, then the critique is not a sign of LLM usefulness: it may just be parroting your own findings back to you.
Overall this seems like a sensible, and appropriately skeptical, way of using LLMs in this sort of work.
In regards to improving the actual AI output, it looks like there is insufficient sourcing of claims in what it puts out, which is going to slow you down when you actually try to check the output. I’m looking at the red team output here on water turbidity. This was highlighted as a real contribution by the AI, but the output has zero sourcing for its claims, which presumably made it much harder to check for validity. If you were to get this critique from a real, human red-teamer, they would make it significantly easier to check that the critique was valid and sourced.
One question I have to ask is whether you are measuring how much time and effort is being expended on managing the output of these LLMs and sifting out the actually useful recommendations. When assessing whether the techniques are a success, you have to consider the counterfactual case where that time was instead spent on human research, for example looking more closely at the literature.
I would not describe the fine-tuning argument and the Fermi paradox as strong evidence in favour of the simulation hypothesis. I would instead say that they are open questions for which a lot of different explanations have been proposed, with the simulation hypothesis offering only one of many possible resolutions.
As to the “importance” argument, we shouldn’t count speculative future events as evidence of the importance of the present. I would say the mid-20th century was more important than today, because that’s the closest we ever got to nuclear annihilation (plus, like, WW2).
I’d like to see more outreach to intellectual experts outside of the typical EA community. I think there are lots of people with knowledge and expertise that could be relevant to EA causes, but who barely know that it exists, or have disagreements with fundamental aspects of the movement. Finding ways to engage with these people could be very valuable to get fresh perspectives and it could help grow the community.
I don’t know how exactly to do this, but maybe something like soliciting guest posts from professors or industry experts, or AMA style things or dialogues.
Before I can or should try to write up that take, I need to fact-check one of my take-central beliefs about how the last couple of decades have gone down. My belief is that the Open Philanthropy Project, EA generally, and Oxford EA particularly, had bad AI timelines and bad ASI ruin conditional probabilities; and that these invalidly arrived-at beliefs were in control of funding, and were explicitly publicly promoted at the expense of saner beliefs.
We don’t know whether AGI timelines or ASI ruin conditional probabilities are “bad”, because neither event has happened yet. If you want to ask what Open Phil’s probabilities are and whether they disagree with your own, you should just ask that directly. My impression is that there is a wide range of views on both questions among EA org leadership.
I have an issue with this argument, although I don’t have much expertise in this field.
You talk about the legality of a patient directly buying radiology results from an AI company, but this isn’t a very plausible path to radiologists being replaced. People will still have to go to the hospital to get the actual radiology scans done.
The actual concern would be that hospitals get the radiology scans done by non-radiologists, and outsource the radiology results to an AI radiology company. I can’t really tell from your post whether this is illegal or not (if it is, what is the business model of these companies?). This process seems more like how automation will actually go in most fields, so it’s relevant if it’s not working for radiology.
And another point: one reason that this stuff may be illegal is that it doesn’t work well enough to be made legal. I think if this is partly the reason, that can absolutely be used as a point against the likelihood of AI automation.
Some of these technological developments were themselves a result of social coordination. For example, solar panels are extremely cheap now, but they used to be very expensive. Getting them to where they are now involved decades of government funded research and subsidies to get the industry up and running, generally motivated by environmental concerns.
It seems like there are many cases where technology is used to solve a problem, but we wouldn’t have actually made the switch without regulation and coordinated action. Would you really attribute the banning of CFCs primarily to the existence of technological alternatives? It seems like you need both an alternative technology and social and political action.
they can just describe to ChatGPT or Claude what they want to know and ask the bot to search the EA Forum and other EA-related websites for info.
I feel like you’ve written a dozen posts at this point explaining why this isn’t a good idea. LLMs are still very unreliable; the best way to find out what people in EA believe is to ask them.
With regards to the ranking of charities, I think it would be totally fine if there were 15 different rankings out there. It would allow people to get a feel for what people with different worldviews value and where they agree or disagree. I think this would be preferable to having just one “official” ranking, as there’s no way to take massive worldview differences into account in a single ranking.
Explaining a subtle but important error in the LEAP survey of AI experts
I think you might be engaging in a bit of Motte-and-Baileying here. Throughout this comment, you’re stating MIRI’s position as things like “it will be hard to make ASI safe”, that AI will “win”, and that it will be hard for an AI to be perfectly aligned with “human flourishing”. Those statements seem pretty reasonable.
But the actual stance of MIRI, which you just released a book about, is that there is an extremely high chance that building powerful AI will result in everybody on planet Earth being killed. That’s a much narrower and more specific claim. You can imagine a lot of scenarios where AI is unsafe, but not in a way that kills everyone. You can imagine cases where AI “wins” but decides to cut a deal with us. You can imagine cases where an AI doesn’t care about human flourishing because it doesn’t care about anything; it ends up acting like a tool that we can direct as we please.
I’m aware that you have counterarguments for all of these cases (that I will probably disagree with). But these counterarguments will have to be rooted in the actual nuts and bolts details of how actual, physical AI works. And if you are trying to reason about future machines, you want to be able to get a good prediction about their actual characteristics.
I think in this context, it’s totally reasonable for people to look at your (in my opinion poor) track record of prediction and adjust their credence in your effectiveness as an institution.
This is a very important question to be asking.
This analysis seems to be entirely based on METR’s time horizon research. I think that research is valuable, but it raises the concern that any findings may be a result of particular quirks of METR’s approach; you describe some of these here.
Are you aware of any alternative groups that have explored this question? It feels to me like it’s not a question you explicitly need time horizons to answer.