pre-doc at Data Innovation & AI Lab
previously worked in options HFT and tried building a social media startup
founder of Northwestern EA club
Charlie_Guthmann
What someone's #1 focus should be is a really complicated question that involves values, interests, etc. For the movement's part, there is no official list.
That being said, it's reasonable to argue against democracy preservation as a good use of EA's (or specific people's) time, but neglectedness alone would only be part of that story.
Moratoriums Only Freeze Half the Stack: Why AI Capability Growth Won’t Stop at Compute
Yes, I think you mostly captured it, and quite well. But I think there is something a little more to it, which is that the EA meme is actually more epistemically humble than you think. There is EA the meme and EA the group. The EA meme has leaked into much of mainstream policy and economics. It's in the water. The EA group has not.
Let's say (referring to your other comment here) you do get a rich funder to fund work on applying alternate moral systems, in a ratio such that we (we being the current people and groups who you think compose EA, and who is that?), in tandem with this new funding, are riding the perfect part of the curve, where the marginal efficiency of exploration and exploitation (of our moral values) is equivalent.
Taking a specific example, let's say this funder funds an EA of biodiversity. Based on some (evolving) metric of biodiversity, this new group finds the best interventions for preserving biodiversity. Let's say their current best cause areas, after all of this debate, are saving the coral reefs and preserving indigenous languages and culture.
In what sense are they any longer part of EA? Would you expect this subgroup to then post to the EA Forum and go to EAG? More likely, they just become their own thing, or the people get absorbed into the existing biodiversity or climate movements.
So then are we still properly exploring/exploiting? or do we now need a new group? Again, who is we?
"We" is some effort-status-capital-talent weighted aggregation of all the people who care to engage in the spaces and networks of other people who would self-describe as EA. It's a very ephemeral thing driven by subliminal status games and hidden incentives. I'm definitely not sure this is futile. I still try to push towards your vision, and others have too.
However, the question isn't whether it can be done but whether it is the best path. I now lean in the direction that it is better to just start a new movement. I have tried to flesh out parts of generative visions.
Spot on in your analysis, but I don't know if it's fixable. I have suggested many times on this forum that we need to bake moral anti-realism into the core of the movement (which, as you state, probably does nothing). Ironically, I think one of the core (but maybe not so novel) lessons of uppercase EA is that decentralization breeds fanaticism in a social movement if it financially exists in a larger, extremely unequal society (even if the members are insanely thoughtful and Bayesian). Some form of centralization is required to conform evolutionary value drift into something closer to ideal reflection.
There are many paths, but unfortunately all of them require state capacity and culture. We would need some sort of political system to enforce the financial regulations that stop the gravity of the wealth-weighted dominant aesthetics from consuming the meta idea of EA (lowercase ea). And probably a bunch of other things. But this is hard; there are three main camps of resistance.
(1) the pure
Those who believe counting is not politics but math.
(2) the pragmatic
Those who believe decentralization is good for the movement
(3) the de jure
Those who believe decentralization is good for their career, usually because it continues the default status quo of who currently has power
Together this coalition is sizable. I'm not sure exactly how sizable, and maybe it's a vocal minority, but I'd reckon at least 30%. Let's assume the rest of the movement is at least weakly in favor of centralization. But I think that 30% is more like 50-70 percent in the hubs of Oxford, DC, and SF (just speculating here). These parts of the movement have not just money but better organization as well. The remaining 70% are spread throughout the world, and it's not clear how they could currently coordinate to force some sort of constitution.
Your functional path 5s are good ideas, but again, who exactly is doing or paying for them? Maybe you can convince someone rich right now, or maybe you can go build these projects, but nothing is legally or politically enforced, and the Egregore will eat it up all the same. Anything short of a real, politically binding set of laws and a delineation between members and non-members seems like window dressing to me. But increasingly I think that even if this could get passed, the EA infra may be best left as is, with new young people just trying to start a more functionally agnostic version of the movement. That's at least some of the essence of the post-rats, though they never meant for that to be a big-tent idea.
I feel mixed about AI-writing detection for a few reasons. I have very few issues with someone putting the bullet points of their argument into AI, reading/editing/discussing the response a few times, and letting the AI write it up. I also think there is value in just putting your messy thoughts out as you have them and not having everything polished, but it depends on the situation.
Also, separately, I'm worried AI-writing-detector proliferation will just speed up "immunity". I don't think there is something deep and fundamental that stops AI from writing, e.g., exactly what I have written to this point. You can already download all your writing, ask AI to summarize it, make a text file that precisely describes your style, and then ask the AI to write something in your voice. I've done this, and yes, the results still have a bit of that vanilla LLM feel, but if there is actual market demand for solutions, it doesn't seem like an insurmountable problem. I think people should say when they used AI and to what degree, and there should be an expectation that, just because polished writing is cheaper than it used to be, you will not pollute the forum with things you have not thought an appropriate amount about.
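To make the "clone your own voice" workflow concrete, here is a minimal sketch. Everything in it is hypothetical (the function names and prompt wording are mine, not any vendor's API); only the prompt construction is shown, and you would send the resulting strings to whatever LLM you use:

```python
# Hypothetical sketch of the style-cloning pipeline described above.
# Step 1: distill a style guide from your archive. Step 2: draft in that voice.

def build_style_guide_prompt(samples: list[str]) -> str:
    """Ask a model to distill a reusable style file from past writing."""
    joined = "\n---\n".join(samples)
    return (
        "Below are writing samples from a single author:\n"
        f"{joined}\n\n"
        "Describe their voice precisely (sentence length, hedging habits, "
        "vocabulary, punctuation quirks) as a reusable style guide."
    )

def build_draft_prompt(style_guide: str, bullets: list[str]) -> str:
    """Combine the saved style file with new bullet points to draft a comment."""
    points = "\n".join(f"- {b}" for b in bullets)
    return (
        f"Write in exactly this voice:\n{style_guide}\n\n"
        f"Turn these notes into a forum comment:\n{points}"
    )
```

The point is only that the whole loop is a couple of prompt templates plus your own writing archive, which is why detector "immunity" seems cheap to acquire if there is demand for it.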
[Question] Is benchmarking AI capabilities positive EV?
FWIW, I went to the best (or second best, lol) high school in Chicago, Northside, and tbh the kids at these top city high schools are of comparable talent to the kids at Northwestern, with a higher tail as well. Moreover, everyone has way more time and can actually chew on the ideas of EA. There was a Jewish org that sent an adult once a week with food, and I pretty much went to all of those even though I would barely self-identify as Jewish, because of the free food and somewhere to sit and chat about random stuff while I waited for basketball practice.
So yes, I think it would be highly successful. But I think you would need actual adult staff to come at least every other week (as Brian mentioned), and as far as I can tell EA is currently struggling pretty hard with organizing capacity, and it seems to be getting worse (in part because, as I have said many times, we don't celebrate organizers enough and we focus the movement too much on intellectualism rather than coordination and organizing). So I kind of doubt there is a ton of capacity for this. But if there is, it's a good idea. I'm happy to help you understand how you could implement this at CPS selective-enrollment schools if you want to help do it yourself.
Thank you for doing this, love to see some data.
I don't have high familiarity with METR, but I think it is probably not great data for this type of analysis. A few issues would need clarification (anyone who understands METR better, bear with me and correct my mistakes, please):
1. How does METR handle context windows? Are we doing a rolling window? Compaction? Something else?
How much of this inverse quadratic relationship is just caused by longer tasks having a larger used context window for the back half of the run? How much is caused by the lack of a default information-management system that persists?
2. What exact harness(es) is METR using?
Harness/environment engineering and information management might control more of the cost of long-running SWE projects than IQ does (past a point).
3. Does METR allow repo forking? Routing?
In the future, no 180-IQ AI is building the ORM and buttons for a CRUD app. It is either forking a boilerplate or routing the task to a cheaper model.
etc.
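To make question (1) concrete, here is a toy sketch of the two context policies being contrasted. This is not METR's actual harness (I don't know what it does, which is the point); it just illustrates rolling-window truncation versus a crude compaction step:

```python
# Two toy context-management policies for a long-running agent.
# Neither is METR's actual setup; they only illustrate the distinction.

def rolling_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep only the most recent messages that fit in the token budget."""
    kept, used = [], 0
    for msg in reversed(messages):
        cost = len(msg.split())  # crude whitespace token count
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

def compact(messages: list[str], max_tokens: int) -> list[str]:
    """Replace overflow with a summary stub instead of silently dropping it."""
    window = rolling_window(messages, max_tokens)
    dropped = len(messages) - len(window)
    if dropped:
        return [f"[summary of {dropped} earlier messages]"] + window
    return window
```

Which of these (or something smarter, like persistent notes) the harness uses could plausibly dominate the cost curve for the back half of long runs.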
It is said that the current iteration of models suffers from retrograde amnesia. Whether or not this will get bitter-lesson-pilled is a separate question, but for this class of memento models, version control, information management, context management, and the meta process of improving and routing through the best versions of these combos for a specific task is not some side quest but in fact the main route to making long tasks cheaper. Even as we enter the next paradigm of models that don't have such profound short-term memory loss, a huge part of cost reduction will come from the orchestrator meta-planning how much to explore the space of options and build out the software factory versus actually starting the work.

I'm not denying the core question OP is raising: costs could plausibly be rising and could matter a lot. I'm just not convinced this specific curve cleanly isolates "AI economics" from "how expensive a particular scaffold/set of arbitrary constraints makes long-context work."
Yeah, doesn't the ARC leaderboard show somewhat opposing trends? https://arcprize.org/leaderboard
Might want to check this out (only indirectly related but maybe useful).
Personally, I don't mind o-risk and think it has some utility, but s-risk somewhat seems like it still works here. Isn't an o-risk just a smaller-scale s-risk?
What do people think of the idea of pushing for a constitutional convention/amendment? The coalition would be ending presidential immunity + reducing pardon powers + banning stock trading for elected officials. Probably politically impossible, but if there were ever a time, it might be now.
Dario Amodei on AI risk and Anthropic’s approach (“The Adolescence of Technology”)
tl;dr: I wrote some responses to sections and don't think I have an overall point. I think this line of argumentation deserves to be taken seriously, but this post is maybe trying to do too much at once. The main argument is simply cluelessness + short-term positive EV.
In virtually every other area of human decision-making, people generally accept without much argument that the very-long-term consequences of our actions are extremely difficult to predict.
I'm a little confused about what your argumentative technique is here. Is the fact that most humans do something the core component? Wouldn't this immediately disqualify much of what EAs work on? Or is this just a persuasive technique, and you mean something like "most humans think this for reason x; I also think this for reason x, though the fact that most humans think it matters little to me"?
For me, "most humans do x" is not an especially convincing argument for something. I don't want to get bogged down on cluelessness, because there are many lengthy discussions elsewhere, but I'll say that cluelessness depends on the question. If you told me what the rainforest looked like and then asked me to guess the animals, I wouldn't have a chance. If you asked me to guess whether they ate food and drank water, I think I would do decently. Or a more on-the-nose example: if you took me back 5 million years and asked me to guess what would happen to the chimps if humans came to exist, I wouldn't be able to predict many specifics, but I might be able to predict (1) humans would become the top dog, with less certainty (2) the chimp population would go down, and with even less certainty (3) chimps would go extinct. That's why the horse model gets so much play; people have some level of belief that there are certain outcomes that might be less chaotic if modeled correctly.
To wrap up, I think your first four paragraphs could be shortened to your unique views on cluelessness (specifically wrt AI?) plus discount rates and whatever other unique philosophical axioms you might hold.
Understood in this way, AI does not actually pose a risk of astronomical catastrophe in Bostrom’s sense.
To be clear, neither does the asteroid. Aliens might exist, and our survival similarly presents a risk of replacement for all the alien civs that won't have time to biologically evolve as humans (or AI from Earth) speed through the lightcone. Also, even if there are no aliens, we have no idea whether, conditional on humans being grabby, utility is net positive or negative. There isn't even agreement on this forum, or in the world, on whether there is such a thing as a negative life. I don't think I'm arguing against you here, but it feels like you are being a little loose (I don't want to be too pedantic, as I can totally understand if you are writing for a more general audience).
Now, you might still reasonably be very concerned about such a replacement catastrophe. I myself share that concern and take the possibility seriously. But it is crucial to keep the structure of the original argument clearly in mind. … Even if you accept that killing eight billion people would be an extraordinarily terrible outcome, it does not automatically follow that this harm carries the same moral weight as a catastrophe that permanently eliminates the possibility of 10^23 future lives.
Well, I have my own "values". Just because I die doesn't mean these disappear. I'd prefer that those 10^23 lives aren't horrifically tortured, for instance.
Though I say this with extremely weak confidence, I feel like in the case where a "single agent/hivemind" misaligned AI immediately wipes us all out, it probably is not going to convert resources into utility as efficiently as me (by my current values), and thus this might be viewed as an s-risk. I'm guessing you might say that we can't possibly predict that, but then can we even predict whether those 10^23 lives will be positive or negative? If not, I guess I'm not sure why you brought any of this up anyway. Bostrom's whole argument is predicated on the assumption that Earth-descended life is +EV, which is predicated on not being clueless, or on having a very kumbaya pronatal moral philosophy.
So I guess, even better for you: from my POV you don't even need to counter-argue this.
Virtually every proposed mechanism by which AI systems might cause human extinction relies on the assumption that these AI systems would be extraordinarily capable, productive, or technologically sophisticated.
I might not be especially up to date here. Can't it, like, cause nuclear fallout, etc.? Totalitarian lock-in? The Matrix? Extreme wealth and power disparity? Is there agreement that the only scenarios in which our potential is permanently curtailed are the Terminator flavors?
The reason is that a decade of delayed progress would mean that nearly a billion people will die from diseases and age-related decline who might otherwise have been saved by the rapid medical advances that AI could enable. Those billion people would have gone on to live much longer, healthier, and more prosperous lives.
You might need to flesh this out a bit more for me, because I don't think it's as true as you say. Is the claim here that AI will (1) invent new medicine, (2) replace doctors, or (3) improve US healthcare policy?
(1) Drug development pipelines are excruciatingly long, and mostly not because of a lack of hypotheses. For instance, GLP-1s have been in the pipelines for half a century (https://pmc.ncbi.nlm.nih.gov/articles/PMC10786682/), though debatably with better AI some of the nausea stuff could have been figured out quicker. IL-23's connection to IBD/Crohn's was basically known by ~2000, as it was one of the first/most significant single-nucleotide mutations picked up with GWAS phenotype/genotype studies. Yet Skyrizi only hit the market a few years ago. Even assuming AI could instantly invent the drugs, IIRC it's a minimum of about 7 years to get approval. That's the absolute minimum. And even superintelligent AI will likely need physical labs, iteration, and room to make mistakes.
Assuming sufficient AGI in 2030 for this threshold, we are looking at the early 2040s before we start to see significant impact on the drugs we use, although it's possible AI will usher in a new era of repurposed drug cocktails via extremely good lit review (IMO the current tools might already be enough to see huge benefits here!).
(2) Doctors, while overpaid, still only make up something like 10-15% of healthcare costs in the US. I do think AI will end up being better than them, although whether people will quickly accept this, idk. So you can get some nice savings there, but again, that assumes you break the massive lobbying power they have. And beyond the costs, tons of the most important health knowledge is already widely known among the public. Stuff like: don't smoke cigarettes, don't drink alcohol, don't be fat, don't be lonely. People still fail to do this stuff. It's not an information problem. Further, doctors often know when they are overprescribing useless stuff; it's often just an incentives problem. There is no good reason to think AI will break this trend unless you are envisioning a completely decentralized or single-payer system that uses all AI doctors; both are at least partially political issues, not intelligence issues. And if we are talking solid basic primary care for the developing world, I just question how smart the AI needs to be. I'd assume a 130-IQ LLM with perfect vision and full knowledge of the medical literature would be more than sufficient, and that seems like it will be the next major Gemini release?
(3) I'll leave this for now.
Kinda got sidetracked, and I'll leave this comment here for now because it's so long, but I guess the takeaway from this section is: you can't claim cluelessness on the harms and then assume the benefits are guaranteed.
Two thoughts here, just thinking about persuasiveness. I'm not quite sure what you mean by "normal people", and also whether you still want your arguments to be actual arguments or just persuasion-maxed.
Show, don't tell, for 1-3
For anyone who hasn't intimately used frontier models but is willing to with an open mind, I'd guess you should just push them to use the models and actually engage mentally with them and their thought traces; even better if you can convince them to use something agentic like CC.
Ask and/or tell stories for 4
What can history tell us about what happens when a significantly more tech-savvy/powerful nation finds another one?
No "right" answer here, though the general arc of history is that significantly more powerful nations capture/kill/etc.
What would it be like to be a native during various European conquests in the New World (especially ignoring the effects of smallpox/disease to the extent you can)?
Incan perspective? Mayan?
I especially like Orellana's first expedition down the Amazon. As far as I can tell, Orellana was not especially bloodthirsty and had some interest in and respect for the natives, though he is certainly misaligned with them.
Even if Orellana is “less bloodthirsty,” you still don’t want to be a native on that river. You hear fragmented rumors—trade, disease, violence—with no shared narrative; you don’t know what these outsiders want or what their weapons do; you don’t know whether letting them land changes the local equilibrium by enabling alliances with your enemies; and you don’t know whether the boat carries Orellana or someone worse.
Do you trade? Attack? Flee? Coordinate? Any move could be fatal, and the entire situation destabilizes before anyone has to decide "we should exterminate them."
And for all of these situations you can actually see what happened (approximately), and usually it doesn't end well.
Why is AI different?
Not rhetorical, and it gives them space to think in a smaller, more structured way that doesn't force an answer.
Just finding out about this & the crux website. So cool. Would love to see something like this for charity ranking (if it isn't already somewhere on the site).
Don't you need a philosophy-axioms layer between outputs and outcomes? The existential-catastrophe definitions seem to be assuming a lot of things.
I would also need to think harder about why and in what context I'm using this, but "governance" being a subcomponent, when it's arguably more important and can control literally everything else at the top level, seems wrong.
Thanks for the post — there is definitely a certain fuzziness at times about value claims in the movement, and I have been critical of similar things. Also, ChatGPT edited this, but (nearly) all thoughts are my own; hope that's ok!
I see a few threads here that are easy to blur:
1) Metaethics (realism vs anti-realism) is mostly orthogonal to Ideal Reflection.
You can be a realist or anti-realist and still endorse (or reject) a norm like "defer to what an idealized version of you would believe, holding evidence fixed." Ideal Reflection doesn't have to claim there's a stance-independent EV "out there"; it can be a procedural claim about which internal standpoint is authoritative (idealized deliberation vs current snap judgment), and about how to talk when you're trying to approximate that standpoint. I'm not saying you claimed the opposite exactly, but the language was a bit confusing to me at times.

2) Ideal Reflection is a metanormative framework; EA is basically a practical operationalization of it.
Ideal Reflection by itself is extremely broad. But on its own it doesn't tell you what you value, and it doesn't even guarantee that you can map possible world-histories to an ordinal ranking. It might seem less hand-wavy, but its lack of assumptions makes it hard to see what nontrivial claims can follow. Once you add enough axioms/structure to make action-guiding comparisons possible (some consequentialist-ish evaluative ranking, plus willingness to act under uncertainty), then you can start building "upward" from reflection to action.
It also seems to me (and is part of what makes EA distinctive) that the EA ecosystem was built by unusually self-reflective people — sometimes to a fault — who tried hard to notice when they were rationalizing, to systematize their uncertainty, and to actually let arguments change their minds.

On that picture, EA is a specific operationalization/instance of Ideal Reflection for agents who (a) accept some ranking over world-states/world-histories, and (b) want scalable, uncertainty-aware guidance about what to do next.
3) But this mainly helps with the “upward” direction; it doesn’t make the “downward” direction easier.
I think of philosophy as stacked layers: at the bottom are the rules of the game; at the top is "what should I do next." EA (and the surrounding thought infrastructure) clarifies many paths upward once you've committed to enough structure to compare outcomes. But it's not obvious why building effective machinery for action gives us privileged access to bedrock foundations. People have been trying to "go down" for a long time. So in practice a lot of EAs seem to do something like: "axiomatize enough to move, then keep climbing," with occasional hops between layers when the cracks become salient.

4) At the community level, there's a coordination story that explains the quasi-objective EV rhetoric and the sensitivity to hidden axioms.
Even among "utilitarians," the shape of the value function can differ a lot — and the best next action can be extremely sensitive to those details (population ethics, welfare weights across species, s-risk vs x-risk prioritization, etc.). Full transparency about deep disagreements can threaten cohesion, so the community ends up facilitating a kind of moral trade: we coordinate around shared methods and mid-level abstractions, and we get the benefits of specialization and shared infrastructure, even without deep convergence.

It's true that, institutionally, founder effects + decentralization + concentrated resources (in a world with billionaires) create path dependence: once people find a lane and drive — building an org, a research agenda, a funding pipeline — they implicitly assume a set of rules and commit resources accordingly. As the work becomes more specific, certain foundational assumptions become increasingly salient, and it's easy for implicit axioms to harden and complexify over time. To some extent you can say that is what happened, although on the object level it feels like we have picked pretty good stuff to work on, in my view. And charitably, when 80k writes fuzzy definitions of the good, it isn't necessarily that the employees and the org don't have more specific values; it's that they think it's better to leave things at that level of abstraction to build the best coalition right now, and that they are trying to help you build up from what you have to making a decision.
I don't see this strain of argument as particularly action-relevant. I feel like you are getting way too caught up in the abstractions of what "AGI" is and such. This is obviously a big deal, this is obviously going to happen "soon" and/or is already "happening", and it's obviously time to take this very seriously and act like responsible adults.
Ok, so you think "AGI" is likely 5+ years away. Are you not worried about Anthropic having a fiduciary responsibility to its shareholders to maximize profits? Reading between the lines, I guess you see very little value in slowing down or regulating AI? While leaving room for the chance that our whole disagreement revolves around our object-level timeline differences, I think you are probably missing the forest for the trees in your quest to prove the incorrectness of people with shorter timelines.
I am not a doom maximalist, in the sense that I think this technology is already profoundly world-bending and scary today. I am worried about my cousin becoming a short-form-addicted goonbot with an AI best friend right now, whether or not robot bees are about to gouge my eyes out.
I think there is a reasonably long list of sensible regulations around this stuff (both x-risk-related and more minor) that would probably result in a large drawdown in these companies' valuations, and really the stock market at large. For example, but not limited to: AI companionship, romance, and porn should probably be on pause right now while the government performs large-scale A/B testing, the same thing we should have done with social media and cellphone use, especially in children, that our government horribly failed to do because of its inability to utilize RCTs and the absolutely horrifying average age of our president and both houses of Congress.
It's quite easy; I actually already did it with Printful + Shopify. I stalled out because (1) I realized it's much more confusing to deal with all the copyright stuff and stepping on toes (I don't want to be competing with EA itself or EA orgs, and I didn't feel like coordinating with a bunch of people), and (2) you kind of get raked using an easy, fully automated stack. Not a big deal, but with shipping, hoodies end up being like $35-40 and t-shirts almost $20. Given the size of EA, I felt like we should probably just buy a heat press or embroidery machine, since we probably want to produce hundreds or more.
Anyway feel free to reach out and we can chat!
Here is the example site I spun up; again, I'm not actually trying to sell those products, I was just testing whether I could do it: https://tp403r-fy.myshopify.com/
Thank you for writing this. To be honest, I'm pretty shocked that the main discussions around the Anthropic IPO have been about "patient philanthropy" concerns and not the massive, earth-shattering conflicts of interest (both for us as non-Anthropic members of the EA community and for Anthropic itself, which will now have a "fiduciary responsibility"). I think this shortform does a pretty good job summarizing my concern. The missing mood is big. I also just have a sense that way too many of us are living in group houses and attending the same parties together, and that AI employees are included in this; I think if you actually hear those conversations at parties, they are less like "man, I am so scared" and more like "holy shit, that new proto-memory paper is sick". Conflicts of interest, nepotism, etc. are not taken seriously enough by the community, and this just isn't a new problem or something I have confidence in us fixing.
Without trying to heavily engage in a timelines debate, I'll just say it's pretty obvious we are in go time. I don't think anyone should be able to confidently say that we are more than a single 10x or breakthrough away from machines being smarter than us. I'm not personally huge on beating the horn for Pause AI; I think there are probably better ways to regulate than that. That being said, I genuinely think it might be time for people to start disclosing their investments. I'm paranoid about everyone's motives (including my own).

You are talking about the movement-scale issues, with the awareness that crashing Anthropic stock could crash EA wealth. That's charitable, but let's be less charitable: way too many people here have yolo'd significant parts of their net worth on AI stocks, low-delta S&P/AI calls, etc., and are still holding the bag. Assuming many of you are anything like me, you feel in your brain that you want the world to go well, but I personally feel happier when my brokerage account goes up 3% than when I hear news that AI timelines are way longer than we thought because of xyz.
Again kind of just repeating you here but I think it’s important and under discussed.
I don’t really understand this perspective. Let me try to make sure I’m understanding you.
(1) Anthropic wrote a company policy/governance document that claimed something
(2) This document was the foundation of much of the community's and the company's perspective on how to think about and interact with AI safety, including major donations and career choices. There are large irreversible path dependencies here.
(3) The document always felt quite dubious to you, to the point where it felt like it wouldn't hold the whole time, whether purposely or due to a lack of clarity on Anthropic's part (I agree completely!)
(4) While this wasn't all 100% predictable right when the RSP was written, it has surely become increasingly obvious to Anthropic leadership for months at this point. Nothing that has happened in the last 6 months is all that surprising; in fact it's basically right on trend, and Dario has stated this himself many times. Yet Anthropic continued to wait, taking in significantly more funding and increasingly roping in huge swaths of this community, and only when they were literally about to violate their own document (or already had) did they change it.
(5) This makes you feel better than if they had kept lying/deceiving/whatever more charitable word could be used here.
Is this approximately your perspective? Obviously I'm throwing my own biasing perspective in here, and apologies if I'm misinterpreting.
I mean, sure, in a trivial sense I feel better about them doing (5). Taking a step back, though, it barely matters and is beside the point. Them admitting (5) is just a natural segue for us to discuss (1)-(4). Nothing they say about their own commitments really matters anymore. Incentives matter.
FWIW, though, I am still highly confused about whether Anthropic is net positive or negative, and, despite all of this, quite open to thinking we should still be throwing our weight completely behind them.