Xander123

Karma: 437

Xander123 6 Dec 2022 7:34 UTC
1 point
in reply to: Zoe Williams’s comment on: Update on Harvard AI Safety Team and MIT AI Alignment
Thanks for this! Want to note that this was co-authored by 7 other people (the names weren’t transferred when it was crossposted from LW).

Xander123 22 Jul 2022 10:44 UTC
19 points
in reply to: hb574’s comment on: Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
I’m pretty unconvinced that your claim (“suggests a significant number of fundamental breakthroughs remain to achieve PASTA”) is strong enough to justify the odds being “approximately 0,” especially when the evidence is mostly just an expectation that tasks will stay hard as we scale (something which seems hard to predict, and easy to get wrong). Though it does seem that innovation in certain domains may involve long episode lengths and inaccurate human evaluation, it also seems like innovation in certain fields (e.g., math) could easily not have this problem (i.e., in cases where verifying is much easier than solving).

Xander123 8 Apr 2022 2:02 UTC
2 points
in reply to: Marcel D’s comment on: ‘Hot takes’ from EAGx Boston
Worth noting that (1) the AST is for people already planning to go into alignment after graduating (and isn’t an intro program), and (2) I usually have backups prepared in case people have already read the thing (I don’t think showing up 30 minutes in would be great!).