Technoprogressive, biocosmist, rationalist, defensive accelerationist, longtermist
Matrice Jacobine
ASI existential risk: reconsidering alignment as a goal
How prediction markets can create harmful outcomes: a case study
This seems to complement @nostalgebraist’s complaint that much of the work on AI timelines (Bio Anchors, AI 2027) relies on a few load-bearing assumptions (e.g. the permanence of Moore’s law, the possibility of a software intelligence explosion) and then does a lot of work crunching statistics and Fermi estimations to “predict” an AGI date, when really the end result is overdetermined by those starting assumptions and not much affected by changing the secondary estimates (a toy sketch after the list below illustrates the point). It is thus largely a waste of time to focus on improving those estimates when there is a lot more research to be done on the actual load-bearing assumptions:
Is Moore’s law going to continue indefinitely?
Is software intelligence explosion plausible? (If yes, does it require concentration of compute?)
Is technical alignment easy?
...
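As a purely illustrative sketch (the parameter ranges, the 3.3-doublings-per-OOM conversion, and the 0.3 “explosion” multiplier below are all made up by me, not taken from Bio Anchors or AI 2027), here is a toy Monte Carlo showing how flipping one binary load-bearing assumption moves the median forecast far more than varying the secondary Fermi estimates:

```python
# Toy Monte Carlo: how much does a mock "AGI year" forecast move when one binary
# load-bearing assumption flips, versus when the secondary Fermi estimates vary?
# All numbers here are invented for illustration only.
import random
import statistics

def mock_agi_year(software_explosion: bool, rng: random.Random) -> float:
    # Secondary Fermi-style estimates, each drawn from a made-up range.
    compute_doubling_years = rng.uniform(2.0, 2.5)  # hypothetical hardware trend
    algorithmic_speedup = rng.uniform(1.5, 2.5)     # hypothetical effective-compute multiplier
    required_ooms = rng.uniform(5.0, 7.0)           # hypothetical orders of magnitude still needed
    # ~3.3 doublings per order of magnitude of compute.
    years_needed = required_ooms * 3.3 * compute_doubling_years / algorithmic_speedup
    if software_explosion:
        # The binary load-bearing assumption: a software intelligence explosion
        # compresses the remaining timeline (multiplier is made up).
        years_needed *= 0.3
    return 2025 + years_needed

rng = random.Random(0)
for explosion in (True, False):
    years = sorted(mock_agi_year(explosion, rng) for _ in range(10_000))
    print(f"software explosion={explosion}: "
          f"median {statistics.median(years):.0f}, "
          f"10th-90th percentile {years[1000]:.0f}-{years[9000]:.0f}")
```

In runs like this the whole distribution jumps by more than a decade when the binary assumption flips, while varying the secondary estimates mostly just widens the error bars around whichever regime you are in — which is the sense in which the headline date is overdetermined by the assumptions rather than by the Fermi arithmetic.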
What are the actual cruxes for the most controversial AI governance questions, like:
How much should we worry about regulatory capture?
Is it more important to reduce the rate of capabilities growth or for the US to beat China?
Should base models be open-sourced?
How much can friction when interacting with the real world (e.g. time needed to build factories and perform experiments (poke @titotal), regulatory red tape, labor unions, etc.) prevent AGI?
How continuous are “short-term” AI ethics efforts (FAccT, technological unemployment, military uses) with “long-term” AI safety?
How important is it to enhance collaboration between US, European and Chinese safety organizations?
Should EAs work with, for, or against frontier AI labs?
...
I’d like to thank Sam Altman, Dario Amodei, Demis Hassabis, Yann LeCun, Elon Musk, and several others who declined to be named for giving me notes on each of the sixteen drafts of this post I shared with them over the past three months. Your feedback helped me polish a rough stone of thought into a diamond of incisive criticism.
??? Was this meant for April Fools’ Day? I’m confused.
“Long” timelines to advanced AI have gotten crazy short
“This might be the first large-scale application of AI technology to geopolitics... 4o, o3 high, Gemini 2.5 pro, Claude 3.7, Grok all give the same answer to the question on how to impose tariffs easily.”
It doesn’t matter what you think they should have done; the fact is that Murati and Sutskever defected to Altman’s side after initially backing his firing, almost certainly because the consensus discourse quickly became focused on EA and AI safety rather than on the object-level accusations of inappropriate behavior.
Big Banks Quietly Prepare for Catastrophic Warming
The “highly inappropriate behavior” in question was nearly entirely about violating safety protocols, and by the time Murati and Sutskever defected to Altman’s side the conflict was clearly considered by both sides to be a referendum on EA and AI safety, to the point of the board seeking to nominate rationalist Emmett Shear as Altman’s replacement.
Show me a 1966 study showing that 70% of a representative sample of the general population mistook ELIZA for a human after 5 minutes of conversation.
Large Language Models Pass the Turing Test
I know this is an April Fools’ joke, but EAs and AI safety people should do more thinking about how to value-align human organizations while still keeping them instrumentally effective (see e.g. @Scott Alexander’s A Paradox of Ecclesiology, the social and intellectual movements tag).
Plenty of AI safety people have tried to do work in AI, with a, let’s say, mixed track record:
Be too relaxed in organization and orthodoxy and too bottom-up in control, and you wind up starting the AI race in the first place because the CEO you picked turned out to be a pathological liar and plenty of your new hires turned out to be more committed to him and to acceleration than to safety.
Be too strict in organization and orthodoxy and too top-down in control, and the sole piece of AI safety work you manage to publish is a seven-page Word document with mistaken mathematical signs, and the only thing you’re known for is being linked to eight violent deaths.
… probably there should be a golden mean between the two. (EleutherAI seems to be a rare success story in this area.)
I think you’re interpreting as ascendancy what is mostly just Silicon Valley realigning with the Republican Party (which is more of a return to the norm, both historically and for US industrial lobbies in general). None of the Democrats you cite are exactly rising stars right now.
If you mean Meta and Mistral I agree. I trust EleutherAI and probably DeepSeek to not release such models though, and they’re more centrally who I meant.
This isn’t really the best example to use, considering AI image generation is very much the one area where all the most popular models are open-weights and not controlled by big tech companies. Any attempt at regulating AI image generation would therefore necessarily mean concentrating power and antagonizing the free and open source software community (something which I agree with OP is very ill-advised), and insofar as AI skeptics are incapable of realizing that, they aren’t reliable.
Yeah, this feels particularly weird because, coming from that kind of left-libertarian-ish perspective, I basically agree with most of it, but also every time he tries to talk about object-level politics it feels like going into the bizarro universe and I would flip the polarity of the signs on all of it. That’s an impression I have of @richard_ngo’s work in general, him being one of the few safetyists on the political right not to have capitulated to accelerationism-because-of-China (as most recently even Elon did). Still, I’ll try to see if I have enough things to say to collect bounties.
I wasn’t even contrasting “moral alignment” with “aligning to the creator’s specific intent [i.e. his individual coherent extrapolated volition]”, but with just “aligning with what the creator explicitly specified at all in the first place” (“inner alignment”?), which is implicitly a solved problem in the paperclip maximizer thought experiment if the paperclip company can specify “make as many paperclips as possible”, and is very much not a solved problem in LLMs.
For the record, as someone who was involved in AI alignment spaces well before it became mainstream, my impression was that, before the LLM boom, “moral alignment” was what most people understood AI alignment to mean, and what we now call “technical alignment” would have been considered capabilities work. (Tellingly, the original “paperclip maximizer” thought experiment by Nick Bostrom assumes a world where what we now call “technical alignment” [edit: or “inner alignment”?] is essentially solved and a paperclip company can ~successfully give explicit natural-language goals to its AI to maximize.)
In part this may be explained by updating on the prospect of LLMs becoming the route to AGI (with the lack of a real utility function making technical alignment much harder than we thought, while natural language understanding, including of value-laden concepts, seems much more central to machine intelligence than we thought), but the incentive problem of AI alignment work being increasingly done under the influence of first OpenAI and then OPP-backed Anthropic is surely a part of it.
Digital rights organizations like the Electronic Frontier Foundation might be of particular interest, as they not only combat anti-democratic abuses by both state and corporate powers but are also particularly focused on protecting spaces of communication from surveillance and censorship, which seems especially important for making society resilient to authoritarianism in the long term, including in the least convenient possible world where democratic backsliding throughout the West turns out to be a durable trend (in which case the more traditional organizations you cite will probably be useless).