So if we take as given that I am at 53% and Alice is at 45%, that gives me some reason to do longtermist outreach, and gives Alice some reason to try to stop me, perhaps by making moral trades with me that get us more of what we both value. In this case, cluelessness doesn't bite, as Alice and I are still taking action towards our longtermist ends.
However, I think what you are claiming, or at least the version of your position that makes most sense to me, is that both Alice and I would be making a reasoning error if we assigned these specific credences, and that we should both be 'suspending judgement'. And if I grant that, then yes, it seems cluelessness bites, as neither Alice nor I know at all what to do now.
So it seems to come down to whether we should be precise Bayesians.
Re judgement calls, yes I think that makes sense, though I'm not sure it is such a useful category. I would think there is just some spectrum of arguments/pieces of evidence from 'very well empirically grounded and justified' through 'we have some moderate reason to think so' to 'we have roughly no idea', and I think towards the far right of this spectrum is what we are labeling judgement calls. But surely there isn't a clear cut-off point.
Thanks for this, I hadn't thought much about the topic and agree it seems more neglected than it should be. But I am probably overall less bullish than you (as operationalised by e.g. how many people in the existential risk field should be making this a significant focus: I am perhaps closer to 5% than your 30% at present).
I liked your flowchart on 'Inputs in the AI application pipeline', so using that framing:
Learning algorithms: I agree this is not very tractable for us[1] to work on.
Training data: This seems like a key thing for us to contribute, particularly at the post-training stage. By supposition, a large fraction of the most relevant work on AGI alignment, control, governance, and strategy has been done by 'us'. I could well imagine that it would be very useful to get project notes, meeting notes, early drafts, etc., as well as the final report, to train a specialised AI system to become an automated alignment/governance etc. researcher.
But my guess is that just compiling this training data doesn't take that much time. All it takes is, when the time comes, convincing a lot of the relevant people and orgs to share old Google Docs of notes/drafts/plans etc., paired with the final product (a rough sketch of what this paired data could look like is below, after this list).
There will be a lot of infosec considerations here, so maybe each org will end up training their own AI based on their own internal data. I imagine this is what will happen for a lot of for-profit companies.
Making sure we don't delete old draft reports, meeting notes, and the like seems good here, but given that storing Google Docs is so cheap and culling files is time-expensive, I think by default almost everyone just keeps most of their (at least textual) digital corpus anyway. Maybe there is some small intervention to make this work better though?
Compute: It certainly seems great for more compute to be spent on automated safety work versus automated capabilities work. But this is mainly a matter of how much money each party has to pay for compute. So lobbying for governments to spend lots on safety compute, or regulations to get companies to spend more on safety compute, seems good, but this is a bit separate from/upstream of what you have in mind, I think; it is more just 'get key people to care more about safety'.
Post-training enhancements: We will be very useful for providing the human feedback for RLHF, telling a budding automated AI safety researcher how good each of its outputs is (see the preference-data sketch below). Research taste is key here. This feels somewhat continuous with just 'managing a fleet of AI research assistants'.
UI and complementary technologies: I don't think we have a comparative advantage here, and we can just outsource this to human or AI contractors to build nice apps for us, or use generic apps on the market and just feed in our custom training data.
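To make the training-data and RLHF points above a bit more concrete, here is a minimal sketch (purely illustrative, in Python) of the two kinds of data I have in mind: draft-to-final pairs for supervised fine-tuning, and researcher preference judgements for RLHF-style feedback. The file layout, field names, and JSONL format are all assumptions for illustration, not a description of any real pipeline.

```python
# Purely illustrative sketch, not a real pipeline: all paths, filenames, and the
# JSONL schema below are assumptions made up for this example.
import json
from pathlib import Path

DRAFTS_DIR = Path("archive/drafts")   # hypothetical folder of exported draft docs/notes
FINALS_DIR = Path("archive/finals")   # hypothetical folder of the corresponding final reports


def build_supervised_pairs(out_path: str = "sft_pairs.jsonl") -> None:
    """Pair each draft with its final report as a (prompt, completion) example."""
    with open(out_path, "w", encoding="utf-8") as out:
        for draft_file in sorted(DRAFTS_DIR.glob("*.txt")):
            final_file = FINALS_DIR / draft_file.name  # assumes matching filenames
            if not final_file.exists():
                continue
            record = {
                "prompt": "Revise this draft into a finished report:\n"
                          + draft_file.read_text(encoding="utf-8"),
                "completion": final_file.read_text(encoding="utf-8"),
            }
            out.write(json.dumps(record) + "\n")


def record_preference(prompt: str, output_a: str, output_b: str,
                      prefer_a: bool, out_path: str = "preferences.jsonl") -> None:
    """Store one researcher judgement comparing two model outputs (RLHF-style preference data)."""
    record = {
        "prompt": prompt,
        "chosen": output_a if prefer_a else output_b,
        "rejected": output_b if prefer_a else output_a,
    }
    with open(out_path, "a", encoding="utf-8") as out:
        out.write(json.dumps(record) + "\n")
```

The point being that the scarce inputs are orgs' willingness to share their archives and researchers' time spent making the preference judgements; the data-wrangling itself is cheap.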
In terms of which applications to focus on, my guess is epistemic tools and coordination-enabling tools will mostly be built by default (though of course, as you note, additional effort can still speed them up some). E.g. politicians, business leaders, and academics would all presumably love to have better predictions for which policies will be popular, what facts are true, which papers will replicate, etc. And negotiation tools might be quite valuable for e.g. negotiating corporate mergers and deals.
So my take is that probably a majority of the game here is in 'automated AI safety/governance/strategy', because there will be less corporate incentive to build it, and it is also our comparative advantage to work on.
Overall, I agree differential AI tool development could be very important, but I think the main value lies in providing high-quality training data and RLHF feedback for automated AI safety research, which is somewhat narrower than what you describe.
I'm not sure how much we actually disagree though; I would be interested in your thoughts!
[1] Throughout, I use 'us' to refer broadly to EA/longtermist/existential security type folks.