I guess a few quick responses to each, although I haven’t read through your links yet.
I think agenty systems in general can still be very limited in how competent they are, due to the same data/training bottlenecks, even if you integrate a non-agential AGI into the system.
I did see Ajeya’s post and read Rohin’s summary. I think there might not be any one most reasonable prior for compute necessary for AGI (or for whether hitting some level of compute is enough, even given enough data or sufficiently complex training environments), since this will require strong and basically unjustified assumptions about whether current approaches (or the next approaches we will come up with) can scale to AGI. Still, this doesn’t mean AGI timelines aren’t short; it might just mean you should do a sensitivity analysis over different priors when you’re thinking of supporting or doing certain work. And, of course, they did do such a sensitivity analysis for the timeline question.
In response to this specifically, “As for whether we’d shut it off after we catch it doing dangerous things—well, it wouldn’t do them if it thought we’d notice and shut it off. This effectively limits what it can do to further its goals, but not enough, I think.”: in what other kinds of ways do you expect it would go very badly? Is it mostly unknown unknowns?
Well, I look forward to talking more sometime! No rush, let me know if and when you are interested.
On point no. 3 in particular, here are some relevant parables (a bit lengthy, but also fun to read!) https://www.lesswrong.com/posts/5wMcKNAwB6X4mp9og/that-alien-message
https://www.lesswrong.com/posts/bTW87r8BrN3ySrHda/starwink-by-alicorn
https://www.gregegan.net/MISC/CRYSTAL/Crystal.html (I especially recommend this last one; it’s less relevant to our discussion, but it’s a better story and raises some important ethical issues.)