Maybe see it as a spectrum. For example:
V. strong need for empirical evidence – Probably no x-risk work meets this bar.
Medium need for empirical evidence – I expect the main x-risk things that could meet this bar are policy change or technical safety research, where that work can be shown to be immediately useful in non-x-risk situations (e.g. improving prediction ability, vaccine technologies, etc.), since there is some feedback loop.
Weak need for empirical evidence – I expect most x-risk work that currently happens meets this bar, except for some speculative research without clear goals or justification (perhaps some of FHI's work) or things taken on trust (perhaps choosing to fund AI research where all such research is kept private, as at MIRI).
No need for empirical evidence – All x-risk work would meet this bar.
(Above I characterised it as a single bar for quality of evidence, where once something passes the bar you are good to go, but obviously in practice it is not that simple, as you will weigh up quality of evidence against other factors: scale, cost-effectiveness, etc.)
The higher you think the bar is, the more likely it is that longtermist and neartermist priorities will converge. At the very top they will almost certainly converge, as you are stuck doing mostly things that can be justified with RCTs or similar levels of evidence. At the medium level, convergence seems more likely than at the weak level.
I think the argument is that there are very good reasons to think the bar ought to be very, very high, so convergence shouldn't be that unlikely.
I’m not sure anything would be fully justified with RCTs or similar levels of evidence, since most effects can’t be measured in practice (especially far future effects), so we’re left using weaker evidence for most effects or just ignoring them.
Yes good point. In practice that bar is too high to get much done.
The RCTs or other high-rigor evidence that would be most exciting for long-term impact probably won't draw on the same evidence base or metrics that would be best for short-term impact.