Bogdan Ionut Cirstea

Karma: 106

Bogdan Ionut Cirstea’s Quick takes

Bogdan Ionut CirsteaMar 7, 2025, 10:33 AM

2 points

1 comment EA link

Bogdan Ionut Cirstea Mar 7, 2025, 10:33 AM
10 points
4 ∶ 1
on: Bogdan Ionut Cirstea’s Quick takes
There have been numerous scandals within the EA community about how working for top AGI labs might be harmful. So, when are we going to have this conversation: contributing in any way to the current US admin getting (especially exclusive) access to AGI might be (very) harmful?
[cross-posted from X and LessWrong]

Bogdan Ionut Cirstea Apr 6, 2024, 11:13 AM
3 points
1 ∶ 2
on: Open Philanthropy: Our Progress in 2023 and Plans for 2024
From the linked post:
As a result of our internal process, we decided to keep that new higher bar, while also aiming to roughly double our GCR spending over the next few years — if we can find sufficiently cost-effective opportunities.
At first glance, this seems potentially ‘wildly conservative’ to me, if I think of what this implies for the AI risk mitigation portion of the funding and how this intersects with (shortening) timelines [estimates].
My impression from looking briefly at recent grants is that probably ⇐ 150M$ was spent by Open Philanthropy on AI risk mitigation during the past year. A doubling of AI risk spending would imply ⇐ 300M$ / year.
AFAICT (including based on non-public conversations / information), at this point, median forecasts for something like TAI / AGI are very often < 10 years, especially from people who have thought the most about this question. And a very respectable share of those people seem to have < 5 year medians.
Given e.g. https://www.bloomberg.com/billionaires/profiles/dustin-a-moskovitz/, I assume, in principle, Open Philanthropy could spend > 20B$ in total. So 150M$ [/ year] is less than 1% of the total portfolio and even 300M$ [/ year] would be < 2%.
X-risk estimates from powerful AI vs. from other sources often have AI take more than half of the total x-risk (e.g. estimates from ‘The Precipice’ have AI take ~10% of ~17% for x-risk during the next ~100 years).
Considering all the above, the current AI risk mitigation spending plans seem to me far too conservative.
I also personally find it pretty unlikely that there aren’t decent opportunities to spend > 300M$ / year (and especially > 150M$ / year), given e.g. the growth in the public discourse about AI risks; and that some plans could potentially be [very] scalable in how much funding they could take in, e.g. field-building, non-mentored independent research, or automated AI safety R&D.
Am I missing something (obvious) here?
(P.S.: my perspective here might be influenced / biased in a few ways here, given my AI risk mitigation focus, and how that intersects / has intersected with Open Philanthropy funding and career prospects.)

Bogdan Ionut Cirstea Jul 23, 2023, 5:33 PM
1 point
0 ∶ 0
in reply to: Bogdan Ionut Cirstea’s comment on: Could someone help me understand why it’s so difficult to solve the alignment problem?
I’ll also note that the role of the Constitution in Constitutional AI (https://www.anthropic.com/index/claudes-constitution) seems quite related to your 3rd paragraph.

Bogdan Ionut Cirstea Jul 23, 2023, 8:30 AM
3 points
1 ∶ 0
on: Could someone help me understand why it’s so difficult to solve the alignment problem?
I think you’re on to something and some related thoughts are a significant part of my research agenda. Here are some references you might find useful (heavily biased towards my own thinking on the subject), numbered by paragraph in your post:
1. There’s a lot of cumulated evidence of significant overlap between LM and human linguistic representations, scaling laws of this phenomenon seem favorable and LM embeddings have also been used as a model of shared linguistic space for transmitting thoughts during communication. I interpret this as suggesting outer alignment will likely be solved by default for LMs.
2. I think I disagree quite strongly that “We don’t know how to get an AI system’s goals to robustly ‘point at’ objects like ‘the American people’ … [or even] simpler physical systems.”, e.g. I suspect many alignment-relevant concepts (like ‘Helpful, Harmless, Honest’) are abstract and groundable in language, see e.g. Language is more abstract than you think, or, why aren’t languages more iconic?. Also, the previous point (brain-LM comparisons), as well as LM performance, suggest the linguistic grounding is probably already happening to a significant degree.
3. Robustness here seems hard, see e.g. these references on shortcuts in in-context learning (ICL) / prompting: https://arxiv.org/abs/2303.03846 https://arxiv.org/abs/2305.17256 https://arxiv.org/abs/2305.13299 https://arxiv.org/abs/2305.14950 https://arxiv.org/abs/2305.19148. An easier / more robust target might be something like ‘be helpful’. Though I agree in general ICL as Bayesian inference (see e.g. http://ai.stanford.edu/blog/understanding-incontext/ and follow the citation trail, there are a lot of recent related works) suggests that the longer the prompt, the more likely it would be to ‘locate the task’.

Bogdan Ionut Cirstea Jul 14, 2023, 9:59 AM
4 points
0 ∶ 0
on: What new psychology research could best promote AI safety & alignment research?
There seems to be a nascent field in academia of using psychology tools/methods to understand LLMs, e.g. https://www.pnas.org/doi/10.1073/pnas.2218523120; it might be interesting to think about the intersection of this with alignment e.g. what experiments to perform, etc.
Maybe more on the neuroscience side, I’d be very excited to see (more) people think about how to build a neuroconnectionist research programme for alignment (I’ve also briefly mentioned this in the linkpost).

Bogdan Ionut Cirstea Feb 2, 2023, 11:32 AM
5 points
0 ∶ 0
in reply to: MichaelDickens’s comment on: We’re no longer “pausing most new longtermist funding commitments”
Maybe, though e.g. combined with

it would still result in a high likelihood of very short timelines to superintelligence (there can be inconsistencies between Metaculus forecasts, e.g. with

as others have pointed out before). I’m not claiming we should only rely on these Metaculus forecasts or that we should only plan for [very] short timelines, but I’m getting the impression the community as a whole and OpenPhil in particular haven’t really updated their spending plans with respect to these considerations (or at least this hasn’t been made public, to the best of my awareness), even after updating to shorter timelines.

Bogdan Ionut Cirstea Jan 31, 2023, 11:18 AM
38 points
17 ∶ 9
on: We’re no longer “pausing most new longtermist funding commitments”
Can you comment a bit more on how the specific number of years (20 and 50) were chosen? Aren’t those intervals [very] conservative, especially given that AGI/TAI timeline estimates have shortened for many? E.g., if one took seriously the predictions from

wouldn’t it be reasonable to also have scenarios under which you might want to spend at least the AI risk portfolio in something like 5-10 years instead? Maybe this is covered somewhat by ‘Of course, we can adjust our spending rate over time’, but I’d still be curious to hear more of your thoughts, especially since I’m not aware of OpenPhil updates on spending plans based on shortened AI timelines, even after e.g. Ajeya has discussed her shortened timelines.

Bogdan Ionut Cirstea Sep 29, 2022, 10:57 AM
3 points
0 ∶ 0
on: EA & LW Forums Weekly Summary (19 − 25 Sep 22′)
Thanks, this series of summaries is great! Minor correction: DeepMind released Sparrow (not OpenAI).

Bogdan Ionut Cirstea Apr 26, 2022, 6:26 PM
1 point
0 ∶ 0
on: [$20K In Prizes] AI Safety Arguments Competition
’One metaphor for my headspace is that it feels as though the world is a set of people on a plane blasting down the runway:
And every time I read commentary on what’s going on in the world, people are discussing how to arrange your seatbelt as comfortably as possible given that wearing one is part of life, or saying how the best moments in life are sitting with your family and watching the white lines whooshing by, or arguing about whose fault it is that there’s a background roar making it hard to hear each other.
I don’t know where we’re actually heading, or what we can do about it. But I feel pretty solid in saying that we as a civilization are not ready for what’s coming, and we need to start by taking it more seriously.′ (Holden Karnofsky)

Bogdan Ionut Cirstea Apr 26, 2022, 6:07 PM
1 point
0 ∶ 0
on: [$20K In Prizes] AI Safety Arguments Competition
‘If you know the aliens are landing in thirty years, it’s still a big deal now.’ (Stuart Russell)

Bogdan Ionut Cirstea Apr 26, 2022, 5:59 PM
1 point
0 ∶ 0
on: [$20K In Prizes] AI Safety Arguments Competition
‘Before the prospect of an intelligence explosion, we humans are like small children playing with a bomb. Such is the mismatch between the power of our plaything and the immaturity of our conduct. Superintelligence is a challenge for which we are not ready now and will not be ready for a long time. We have little idea when the detonation will occur, though if we hold the device to our ear we can hear a faint ticking sound. For a child with an undetonated bomb in its hands, a sensible thing to do would be to put it down gently, quickly back out of the room, and contact the nearest adult. Yet what we have here is not one child but many, each with access to an independent trigger mechanism. The chances that we will all find the sense to put down the dangerous stuff seem almost negligible. Some little idiot is bound to press the ignite button just to see what happens.’ (Nick Bostrom)

Bogdan Ionut Cirstea Apr 26, 2022, 5:57 PM
1 point
0 ∶ 0
on: [$20K In Prizes] AI Safety Arguments Competition
‘Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make provided that the machine is docile enough to tell us how to keep it under control.′ (I. J. Good)

Bogdan Ionut Cirstea Apr 26, 2022, 5:56 PM
2 points
0 ∶ 0
on: [$20K In Prizes] AI Safety Arguments Competition
‘The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.’ (Eliezer Yudkowsky)

Bogdan Ionut Cirstea Apr 26, 2022, 5:56 PM
0 points
0 ∶ 0
on: [$20K In Prizes] AI Safety Arguments Competition
‘You can’t fetch the coffee if you’re dead’ (Stuart Russell)

Bogdan Ionut Cirstea Mar 19, 2022, 6:29 PM
7 points
0 ∶ 0
on: Career Advice: Philosophy + Programming → AI Safety
Consider applying for https://www.eacambridge.org/agi-safety-fundamentals

Bogdan Ionut Cirstea Dec 13, 2021, 9:46 AM
4 points
0 ∶ 0
on: Nines of safety: Terence Tao’s proposed unit of measurement of risk
If I remember correctly (from ‘The Precipice’) ‘Unaligned AI ~1 in 50 1.7’ should actually be ‘Unaligned AI ~1 in 10 1’.

If Superpositions can Suffer

EricBlairOct 21, 2021, 8:37 PM

11 points

9 comments6 min readEA link

Bogdan Ionut Cirstea Sep 22, 2021, 1:00 PM
12 points
0 ∶ 0
on: Why AI alignment could be hard with modern deep learning
From https://www.cold-takes.com/supplement-to-why-ai-alignment-could-be-hard/ : ‘A model about as powerful as a human brain seems like it would be ~100-10,000 times larger than the largest neural networks trained today, and I think could be trained using an amount of data and computation that—while probably prohibitive as of August 2021 -- would come within reach after 15-30 years of hardware and algorithmic improvements.’ Is it safe to assume that this is an updated, shorter timeline compared to https://www.alignmentforum.org/posts/KrJfoZzpSDpnrv9va/draft-report-on-ai-timelines ?

Bogdan Ionut Cirstea Sep 19, 2021, 8:10 PM
6 points
0 ∶ 0
on: What kind of event, targeted to undergraduate CS majors, would be most effective at getting people to work on AI safety?
Encouraging them to apply to the next round of the AGI Safety Fundamentals program https://www.eacambridge.org/agi-safety-fundamentals might be another idea. The curriculum there can also provide inspiration for reading group materials.

Bogdan Ionut Cirstea

Bog­dan Ionut Cirstea’s Quick takes

If Su­per­po­si­tions can Suffer

Bogdan Ionut Cirstea’s Quick takes

If Superpositions can Suffer