Related: Advantages of Cutting Your Salary
In all seriousness, I think this is a good point
First, predicting the values of our successors – what John Danaher (2021) calls axiological futurism – in worlds where these are meaningfully different from ours doesn’t seem intractable at all. Significant progress has already been made in this research area and there seems to be room for much more (see the next section and the Appendix).
Could you point more specifically to the progress you think has been made? Since this research area seems to have existed only since 2021, we can't yet have made successful predictions about future values, so I'm curious what has been achieved.
What is the purpose of publicly deploying Claude? It seems like this will only have the effect of increasing arms race dynamics. If the reason is just to fund further safety research, then I think this is worth saying explicitly.
I also don’t like this post and I’ve deleted most of it. But I do feel like this is quite important and someone needs to say it.
Where in Cambridge will this take place (accommodation / venue)?
Is compensation for both students and mentors?
Will you provide/subsidize access to GPUs?
Disheartening to hear a pretty weak answer to this critical question. Analysis of his answer:
I’m really not sure what this means and surprised Rob didn’t follow up on this. I think he must mean that they won’t be open sourcing the weights, which is certainly good. However, it’s unclear how much this matters if the model is available to call from an API. The argument may be that other actors can’t fine-tune the model to remove guardrails, which they have put in place to make the model completely safe. I was impressed to hear his claim about jailbreaks later on:
Although strangely he also said:
This is trivial to disprove, so I'm not sure what he meant by it. Regardless, I think that providing API access to a model distributes a lot of the "power" of the model to everyone in the world.
There hasn't ever been any very solid rebuttal of the intelligence explosion argument. It mostly gets dismissed on the basis of sounding like sci-fi. You can make a good argument that dangerous capabilities will emerge before we reach this point, and we may have a "slow take-off" in that sense. However, it seems to me that we should expect recursive self-improvement to happen eventually, because there is no fundamental reason why it isn't possible and it would clearly be useful for achieving any task. So the question is whether it will start before or after TAI. It's pretty clear that no one knows the answer to this question, so it's absurd to be gambling the future of humanity on this point.
The AI race currently consists of a small handful of companies. A CEO who was actually trying to minimize the risk of extinction would at least attempt to coordinate a deceleration between these 4 or 5 actors before dismissing this as a hopeless tragedy of the commons.