I just remembered another sub-category that seems important to me: AI-enabled very accurate lie detection. This could be useful for many things, but most of all for helping make credible commitments in high-stakes US-China ASI negotiations.
Thanks Caleb, very useful. @ConnorA I'm interested in your thoughts on how to balance comms on catastrophic/existential risks and things like deepfakes. (I don't know about the particular past efforts Caleb mentioned, and I think I am more open to comms on deepfakes being useful for developing a broader coalition, even though deepfakes are a tiny fraction of what I care about wrt AI.)
Have you applied to LTFF? Seems like the sort of thing they would/should fund. @Linch @calebp if you have actually already evaluated this project I would be interested in your thoughts, as would others, I imagine! (Of course, if you decided not to fund it, I'm not saying the rest of us should defer to you, but it would be interesting to know and take into account.)
Unclear: as they note early on, many people have even shorter timelines than Ege, so it is not representative in that sense. But probably many of the debates are at least about axes people genuinely disagree on.
o1-pro!
Here is a long AI summary of the podcast.
If these people weren't really helping the companies, it seems surprising that salaries are so high?
I think I directionally agree!
One example of timelines feeling very decision-relevant: for people looking to specialise in partisan influence, the larger your credence in TAI/ASI by Jan 2029, the more you might want to specialise in Republicans. Whereas on longer timelines, Democrats have (on priors) a ~50% chance of controlling the presidency from 2029 onwards, so specialising in Dem political comms could make more sense.
Of course criticism only partially overlaps with advice, but this post reminded me a bit of this take on giving and receiving criticism.
I overall agree we should prefer USG to be better AI-integrated. I think this isn't a particularly controversial or surprising conclusion though, so the main question is how high a priority it is, and I am somewhat skeptical it is on the ITN pareto frontier. E.g. I would assume plenty of people care about government efficiency and state capacity generally, and a lot of these interventions are about making USG more capable in general rather than being targeted at longtermist priorities.
So this felt like neither the sort of piece targeted at mainstream US policy folks, nor that convincing on why this should be an EA/longtermist focus area. Still, I hadn't thought much about this before, so doing this level of medium-depth investigation feels potentially valuable, but I'm unconvinced that e.g. OP should spin up a grantmaker focused on this (not that you were necessarily recommending that).
Also, a few reasons govts may have a better time adopting AI come to mind:
Access to large amounts of internal private data
Large institutions can better afford one-time upfront costs to train or finetune specialised models, compared to small businesses
But I agree the opposing reasons you give are probably stronger.
we should do what we normally do when juggling different priorities: evaluate the merits and costs of specific interventions, looking for "win-win" opportunities
If only this were how USG juggled its priorities!
Yes, this seems right; it is hard to know which effect will dominate. I'm guessing you could assemble pretty useful training data from past R&D breakthroughs, which might help, but that will only get you so far.
Clearly only IBBIS should be allowed to advertise on the job board from now on; impeccable marketing skills @Tessa A 🔸 :)
This seems to be out of context?
Yeah, I think I agree with all this. I suppose since "we" have the AI policy/strategy training data anyway, that seems relatively low-effort and high-value to do; but yes, if we could somehow get access to the private notes of a bunch of international negotiators, that also seems very valuable! Perhaps actually asking top forecasters to record their working and meetings to use as training data later would be valuable, and I assume many people already do this by default (tagging @NunoSempere). Although of course having better forecasting AIs seems more dual-use than some of the other AI tools.
Yes, I suppose I am trying to divide tasks/projects into two buckets based on whether they require high context, value-alignment, strategic thinking, and EA-ness. And my claim was/is that UI design is comparatively easy to outsource to someone without much of the relevant context and values, and therefore the comparative advantage of the higher-context people is to do things that are harder to outsource to lower-context people. But I know ~nothing about UI design; maybe being higher-context is actually super useful there.
Nice post! I agree moral errors aren't only a worry for moral realists. But they do seem especially concerning for realists, as the moral truth may be very hard to discover, even for superintelligences. For antirealists, the first 100 years of a long reflection may get you most of the way towards where your views will converge after a billion years of reflecting on your values. But the first 100 years of a long reflection are less guaranteed to get you close to the realist moral truth. So a 100-year reflection might be, say, 90% likely to avoid massive moral errors for antirealists, but maybe only 40% likely to do so for realists.
--
Often when there are long lists like this I find it useful for my conceptual understanding to try to create some structure to fit each item into; here is my attempt.
A moral error is making a moral decision that is quite suboptimal. This can happen if:
The agent has correct moral views, but makes a failure of judgement/rationality/empirics/decision theory and so chooses badly by their own lights.
The agent is adequately rational, but has incorrect views about ethics, namely the mapping from {possible universe trajectories} to {impartial value}. This could take the form of (see the sketch after this list):
A mistake in picking out who the moral patients are: {universe trajectory} → {moral patients}. (animals, digital beings)
A mistake in assigning lifetime wellbeing scores to each moral patient: {moral patients} → {list of lifetime wellbeings}. (theories of wellbeing, happiness vs suffering)
A mistake in aggregating correct wellbeing scores over the correct list of moral patients into the overall impartial value of the universe: {list of lifetime wellbeings + possibly other relevant facts} → {impartial value}. (population ethics, diversity, interestingness)
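Here is the same decomposition as a toy code sketch; the function names and types are placeholder labels of my own for the three mappings above, not anything from the post:

```python
# Toy sketch of the three-stage decomposition above; names and types are just placeholder labels.

UniverseTrajectory = object   # a full possible history of the universe
MoralPatient = object         # whoever turns out to morally count

def find_moral_patients(trajectory: UniverseTrajectory) -> list[MoralPatient]:
    """Stage 1: who counts (animals, digital beings, ...)."""
    ...

def lifetime_wellbeings(patients: list[MoralPatient]) -> list[float]:
    """Stage 2: how well each patient's life goes (theories of wellbeing, happiness vs suffering)."""
    ...

def impartial_value(wellbeings: list[float]) -> float:
    """Stage 3: aggregation (population ethics, diversity, interestingness)."""
    ...

def value_of(trajectory: UniverseTrajectory) -> float:
    # A mistake at any stage of this composition is the second kind of moral error above.
    return impartial_value(lifetime_wellbeings(find_moral_patients(trajectory)))
```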
--
Some minor points:
I think the fact that people wouldn't take bets involving near-certain death and a 1-in-a-billion chance of a long amazing life is more evidence that people are risk-averse than that lifetime wellbeing is bounded above.
As currently written, choosing Variety over Homogeneity would only be a small moral error, not a massive one, as epsilon is small.
Great set of posts (including the related "how far" and "how sudden" ones). I only skimmed the parts I had read drafts of, but still have a few comments, mostly minor:
1. Accelerating progress framing
We define "accelerating AI progress" as "each increment of capability advancement (e.g. GPT-3 → GPT-4) happens more quickly than the last".
I am a bit skeptical of this definition, both because it is underspecified and because I'm not sure it is pointing at the most important thing.
Underspecified: how many GPT jumps need to be in the "each quicker than the last" regime? This seems more than just a semantic quibble, as clearly the one-time speedup leads to at least one GPT jump being faster, and the theoretical limits mean this eventually stops, but I'm not sure where within that range you want to call progress "accelerating".
Framing: Basically, we are trying to condense a whole graph into a few key numbers, so this might be quite lossy, and we need to focus on the variables that are most strategically important, which I think are (see the rough sketch below):
Timeline: date that transition period starts
Suddenness: time in transition period
Plateau height: in effective compute, defining the plateau as the point when the rate of progress drops back below 2025 levels.
Plateau date: how long it takes to get there.
I'm not sure there is an important further question of whether the line is curving up or down between the transition period and the plateau (or more precisely, when it switches from curving up, as in the transition period, to curving down, as in the plateau). I suppose "accelerating" could include plateauing quite quickly, and "decelerating" could include still going very fast and reaching a very high plateau quickly, which to most people wouldn't intuitively feel like deceleration.
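To make this concrete, here is a rough sketch, using my own framing rather than the post's definitions, of how the four numbers could be read off an effective-compute trajectory, with the transition start/end dates taken as given rather than derived:

```python
# Rough sketch: condense an effective-compute trajectory into the four summary numbers above.
# `traj` maps year -> log10(effective compute); growth rates are in OOMs/year.
# The transition start/end years are assumed to be supplied, not derived here.

def summarise(traj: dict[float, float], baseline_growth_2025: float,
              transition_start: float, transition_end: float) -> dict:
    years = sorted(traj)
    # growth rate over each interval, keyed by the interval's end year
    growth = {y1: (traj[y1] - traj[y0]) / (y1 - y0)
              for y0, y1 in zip(years, years[1:])}
    # plateau: first post-transition year where growth falls back below the 2025 rate
    plateau_year = next((y for y in sorted(growth)
                         if y > transition_end and growth[y] < baseline_growth_2025), None)
    return {
        "timeline": transition_start,                     # date the transition period starts
        "suddenness": transition_end - transition_start,  # time spent in the transition period
        "plateau_date": plateau_year,
        "plateau_height": traj.get(plateau_year),         # log10 effective compute at the plateau
    }
```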
2. Max rate of change
Theoretical limits for the speed of progress are 100X as fast as recent progress.
It would be good to flag in the main text that the justification for this is in Appendix 2 (initially I thought it was a bare assertion). Also, it is interesting that in @kokotajlod's scenario the "wildly superintelligent" AI maxes out at a 1-million-fold AI R&D speedup; I commented to them on a draft that this seemed implausibly high to me. I have no particular take on whether 100x is too low or too high as the theoretical max, but it would be interesting to work out why there is this Forethought vs AI Futures difference.
3. Error in GPT OOMs calculations
Algorithmic improvements compound multiplicatively rather than additively, so I think the formula in column G should be 3^years rather than 3*years?
This also clears up the current mismatch between columns G and H. The most straightforward fix would be for column H to be log10(G), same as column F. And since log(a^b) = b*log(a), once you correct column G you get column H = log10(3^years) = years * log10(3) ≈ 0.48 * years, which is close to the 0.4 * years you currently have; I assume there was just a rounding error somewhere.
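As a quick sanity check of the arithmetic (a minimal sketch; the 3x/year rate and the column labels are just my reading of the spreadsheet, and the specific numbers are illustrative):

```python
import math

years = 5   # illustrative number of years of algorithmic progress (placeholder)
rate = 3    # illustrative 3x effective-compute gain from algorithms per year (placeholder)

col_G = rate ** years      # corrected column G: gains compound, so 3^years rather than 3*years
col_H = math.log10(col_G)  # column H as log10(G), matching column F's convention

print(col_G, col_H, years * math.log10(3))  # 243, ~2.39, ~2.39 (i.e. ~0.48 * years)
```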
This wonât end up changing the main results though.
4. Physical limits
Regarding the effective physical limits of each feedback loop, perhaps it is worth noting that your estimate for the chip production feedback loop is very well grounded and high-confidence, as we know more or less exactly the energy output of the sun, whereas the other two are super speculative. Which is fine; they are just quite different types of estimates, so we should remember to place far less weight on them.
5. End of the transition period
Currently, this is set at when AIs are almost as productive (9/10) as humans, but it would make more sense to me to end it when AIs are markedly superior to humans, e.g. 10x.
Maybe I am misunderstanding elasticities though; I only have a lay non-economist's grasp of them.
Overall it might be more intuitive to define the transition period in terms of how useful one additional human researcher vs one additional AI researcher is, from the human being 10x better to the AI being 10x better.
Defining what "one AI researcher" is could be tricky; maybe we could use the pace of human thought in tokens per second as a way to standardise (toy sketch below).
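As a toy illustration of that standardisation idea (every number below is a made-up placeholder, not an estimate):

```python
# Toy standardisation of "one AI researcher" via tokens per second; all numbers are placeholders.

human_thought_rate = 10.0   # assumed tokens/second at which a human researcher "thinks"
total_ai_inference = 1.0e6  # assumed tokens/second of AI inference devoted to research
quality_ratio = 0.1         # assumed value of one AI-generated token relative to one human token

ai_researcher_equivalents = total_ai_inference * quality_ratio / human_thought_rate
print(ai_researcher_equivalents)  # ~10,000 human-researcher equivalents in this toy case
```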
(Finally, Fn2 is missing a link.)
Thanks for this; I hadn't thought much about the topic and agree it seems more neglected than it should be. But I am probably overall less bullish than you (as operationalised by, e.g., how many people in the existential risk field should be making this a significant focus: I am perhaps closer to 5% than your 30% at present).
I liked your flowchart on "Inputs in the AI application pipeline", so using that framing:
Learning algorithms: I agree this is not very tractable for us[1] to work on.
Training data: This seems like a key thing for us to contribute, particularly at the post-training stage. By supposition, a large fraction of the most relevant work on AGI alignment, control, governance, and strategy has been done by "us". I could well imagine that it would be very useful to get project notes, meeting notes, early drafts, etc. as well as the final reports to train a specialised AI system to become an automated alignment/governance etc. researcher.
But my guess is that just compiling this training data doesn't take that much time. All it takes is, when the time comes, convincing a lot of the relevant people and orgs to share old Google Docs of notes/drafts/plans etc. paired with the final product.
There will be a lot of infosec considerations here, so maybe each org will end up training their own AI based on their own internal data. I imagine this is what will happen for a lot of for-profit companies.
Making sure we don't delete old draft reports, meeting notes, and the like seems good here, but given that storing Google Docs is so cheap and culling files is time-expensive, I think by default almost everyone just keeps most of their (at least textual) digital corpus anyway. Maybe there is some small intervention to make this work better though?
Compute: It certainly seems great for more compute to be spent on automated safety work versus automated capabilities work. But this is mainly a matter of how much money each party has to pay for compute. So lobbying for governments to spend lots on safety compute, or regulations to get companies to spend more on safety compute, seems good, but this is a bit separate from/upstream of what you have in mind, I think; it is more just "get key people to care more about safety".
Post-training enhancements: We will be very useful for providing RLHF to tell a budding automated AI safety researcher how good each of its outputs is. Research taste is key here. This feels somewhat continuous with just "managing a fleet of AI research assistants".
UI and complementary technologies: I don't think we have a comparative advantage here, and we can just outsource this to human or AI contractors to build nice apps for us, or use generic apps on the market and just feed in our custom training data.
In terms of which applications to focus on, my guess is epistemic tools and coordination-enabling tools will mostly be built by default (though of course, as you note, additional effort can still speed them up somewhat). E.g. politicians, business leaders, and academics would all presumably love to have better predictions of which policies will be popular, which facts are true, which papers will replicate, etc. And negotiation tools might be quite valuable for e.g. negotiating corporate mergers and deals.
So my take is that probably a majority of the game here is in "automated AI safety/governance/strategy", because there will be less corporate incentive there, and it is also our comparative advantage to work on.
Overall, I agree differential AI tool development could be very important, but think the focus is mainly on providing high-quality training data and RLHF for automated AI safety research, which is somewhat narrower than what you describe.
I'm not sure how much we actually disagree though; I would be interested in your thoughts!
[1] Throughout, I use "us" to refer broadly to EA/longtermist/existential security type folks.
So if we take as given that I am at 53% and Alice is at 45%, that gives me some reason to do longtermist outreach, and gives Alice some reason to try to stop me, perhaps by making moral trades with me that get us more of what we both value. In this case, cluelessness doesn't bite, as Alice and I are still taking action towards our longtermist ends.
However, I think what you are claiming, or at least the version of your position that makes most sense to me, is that both Alice and I would be making a failure of reasoning if we assign these specific credences, and that we should both be "suspending judgement". And if I grant that, then yes, it seems cluelessness bites, as neither Alice nor I knows at all what to do now.
So it seems to come down to whether we should be precise Bayesians.
Re judgement calls, yes, I think that makes sense, though I'm not sure it is such a useful category. I would think there is just some spectrum of arguments/pieces of evidence, from "very well empirically grounded and justified" through "we have some moderate reason to think so" to "we have roughly no idea", and towards the far right of this spectrum is what we are labelling judgement calls. But surely there isn't a clear cut-off point.
Seems somewhat epistemically toxic to give in to a populist backlash against AI art if I don't buy the arguments for it being bad myself.