Mechanism Design for AI Safety—Agenda Creation Retreat
Mechanism Design for AI Safety (MDAIS) has been running a reading group since last summer to discuss how principles of mechanism design can be applied to AI Safety. Sign-up for the reading group can be found here, and past readings can be found here.
Based on the quality of the discussions, we are optimistic about the possibility of mechanism design tools helping with AI safety, and are setting out to create an agenda that outlines promising directions. Our plan is to jumpstart this with a retreat that will facilitate in-person brainstorming and idea development. The application form is here (see below for details), and you can participate without having joined the MDAIS reading group.
Retreat Details
Where
The retreat will take place in Miami, Florida. Travel funding is available for attendees. A remote attendance option may be added if multiple qualified applicants indicate they would be unable to attend in person.
When
Retreat events will begin in the early afternoon on Friday, March 17th and continue to the late afternoon on Sunday, March 19th. Lodging will be provided from Thursday to Sunday night, and social events may take place on those evenings.
Who
Most participants in the reading group are PhD students or post-docs. We expect a similar group makeup for the retreat, but welcome applications from people with research experience who fall outside that demographic. Capacity at the retreat will be approximately ten people.
What
In the weeks leading up to the retreat, participants will be encouraged to brainstorm ideas and share background readings. The primary activity at the retreat will be alternating between developing ideas, individually or in small groups, and presenting these ideas to the broader group for feedback. An initial rough writeup of the agenda will be drafted at the retreat itself, then expanded and edited over the following weeks.
Other activities will include icebreakers and conversations around AI more broadly. Senior researchers have been invited to call in and discuss their personal research and provide advice on creating an agenda. A number of fun social activities are also being planned.
Application Logistics
The application form can be found here. It should be quick to complete, taking less than five minutes on average. Applications will close on Sunday, February 19th at 11:59pm PST, and acceptances will be sent out by February 24th. If you need a decision earlier than that, please indicate so in your application and we will try to get back to you within two days.
Killer. Psyched about the reading group. Miami overlaps with a surgery for me unfortunately.
Worried about a theory of change for anything in this direction, always feels a little “if the users would just …” to me. Worried that no matter how right we can become about the perfect incentive machine of a meticulously designed market, assuming we solved everything about calibration concerns and how we know we’re right, we wouldn’t be able to do the product engineering to back it up, wouldn’t get adoption, etc.
Seems like this is a case where timelines research is deeply actionable. Borrowing a little from Soares’ objection to his model of Critch, I think the “crunch time” for mechdzn strategies leads deployment by an order of decades.