Currently pursuing a PhD at the “Mathematics for Our Future Climate” CDT at Reading University.
Previously MSc in applied mathematics/theoretical ML.
Not really active here—racism, Rationality and weirdness in the movement are so bad they made me give up on it.
How can you “solve every possible jailbreak”? And is it worth crippling large-scale research into safeguarding against future AI because of fears about what the current models might be capable of?
(My own answer is “maybe”. It depends on how bad you think current models are for society, which in my opinion is pretty bad, versus how likely you think it is that an existentially threatening AI will actually be born out of the current efforts.)