🔶
Esben Kran
Results from the AI x Democracy Research Sprint
Very interesting! We had a submission for the evals research sprint in August last year on the same topic. Check it out here: Turing Mirror: Evaluating the ability of LLMs to recognize LLM-generated text (apartresearch.com)
Demonstrate and evaluate risks from AI to society at the AI x Democracy research hackathon
Join the AI Evaluation Tasks Bounty Hackathon
You are completely right. My main point is that the field of AI safety is under-utilizing commercial markets while commercial AI indeed prioritizes reliability and security to a healthy level.
AI safety needs to scale, and here’s how you can do it
Thank you so much for the talk, Paul! It was exciting to see the vignettes besides the very practical first case. It will be interesting to see the entry of Straumli on the evaluations scene since I think you have a solid case for success.
CoI statement: Straumli donated the prize money for the Governance Sprint, though nothing goes to me or Apart, just the AI safety community.
I work as co-director of Apart Research, specifically with research management, AI safety research consulting, and field-building. I’m entrepreneurially focused.
Thank you for hosting this! I’ll repost a question on Asya’s retrospective post regarding response times for the fund.
our median response time from January 2022 to April 2023 was 29 days, but our current mean (across all time) is 54 days (although the mean is very unstable)
I would love to hear more about the numbers and information here. For instance, how did the median and mean change over time? What does the global distribution look like? The disparity between the mean and median suggests there might be significant outliers; how are these outliers addressed? I assume many applications become desk rejects; do you have the median and mean for the acceptance response times?
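To illustrate why the gap between the quoted median (29 days) and mean (54 days) points to outliers, here is a minimal sketch in Python. The response times below are hypothetical values chosen only so that the median and mean match the quoted figures; they are not the fund's data.

```python
from statistics import mean, median

# Hypothetical response times in days (NOT real fund data); picked so that
# the median is 29 and the mean is 54, matching the figures quoted above.
response_days = [12, 18, 21, 25, 29, 29, 33, 40, 47, 286]

overall_median = median(response_days)
overall_mean = mean(response_days)

# Drop the single slowest application to see how far one outlier moves the mean.
trimmed = sorted(response_days)[:-1]
trimmed_median = median(trimmed)
trimmed_mean = mean(trimmed)

print(f"all applications:  median={overall_median}, mean={overall_mean}")
print(f"without slowest:   median={trimmed_median}, mean={trimmed_mean:.1f}")
```

In this made-up sample, a single very slow application nearly doubles the mean while leaving the median unchanged, which is why the distribution and the handling of outliers matter more than either summary number alone.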
Agency Foundations Challenge: September 8th-24th, $10k Prizes
I was incredibly impressed by the tables of numbers in their impact evaluation. After conversing with the team, I’ve witnessed their high ability to produce results, and their evaluation research methods certainly attest to this. This appears to be one of those rare opportunities where donations could have a significant counterfactual impact.
Edit: I am not in any way affiliated with FEM and randomly met one of the co-founders on a flight where we had a conversation about their work.
Thank you for sharing your reflections and for the work you’ve done on the EA Funds, Asya! I appreciate the role the Funds have played over the past years.
our median response time from January 2022 to April 2023 was 29 days, but our current mean (across all time) is 54 days (although the mean is very unstable)
A few questions arise from your mention of the Funds’ response times. I would love to hear more about the numbers and information here. For instance, how did the median and mean change over time? What does the global distribution look like? The disparity between the mean and median suggests there might be significant outliers; how are these outliers addressed? I assume many applications become desk rejects; do you have the median and mean for the acceptance response times?
(1 Sep 2023 7:31 UTC; 22 points) comment on Long-Term Future Fund Ask Us Anything (September 2023)
FLI’s focus on lethal autonomous weapons systems (LAWS) generally seems like a good and obvious framing for a concrete extinction scenario. A world war today would without a doubt use semi-autonomous drones, with the possibility of near-extinction risk from nuclear weapons.
A similar war in 2050 seems very likely to use fully autonomous weapons built under a development race, leading to bad deployment practices and developmental secrecy (without international treaties). With these types of “slaughterbots”, there is a chance that dysfunction (e.g. misalignment) leads to full eradication. Besides this, cyberwarfare between agentic AIs might lead to broad-scale structural damage, and there is also the risk of nuclear war brought about through simple orders given to artificial superintelligences.
The main risks to come from the other scenarios mentioned in the replies here are related to the fact that we create something extremely powerful. The main problems arise from the same reasons that one mishap with a nuke or a car can be extremely damaging while one mishap (e.g. goal misalignment) with an even more powerful technology can lead to even more unbounded (to humanity) damage.
And then there are the differences between nuclear and AI technologies that make the probability of this happening significantly higher. See Yudkowsky’s list.
This is a unique, interesting and simple proposal that I have not yet seen presented in academic form. As you develop the article, you’ll of course need to change the framing of a few sections to introduce the idea, its viability, and the multi-purpose potential of the proposal.
Even though effective enforcement of the policy is unlikely, it seems like a valuable idea to publish. It could be combined with newer work on GPU monitoring firmware (Shavit, 2023) and your own proposals for required GPU server tracking.
To comment on kpurens’ comment, carbon taxation was a non-political issue before it became contentious, and if the lobbying hadn’t hit as hard, it seems like there would be a larger chance of a global carbon tax. At the same time, compute governance seems more enforceable because of the centralization of data centers.
Join the AI governance and interpretability hackathons!
Announcing the European Network for AI Safety (ENAIS)
The CE incubatees are an absolutely amazing bunch and exactly the types of people I would want on these world-bettering projects. Charity Entrepreneurship is no doubt one of the EA projects I am most excited about due to pure impact, research prowess and future potential.
When I compare the CE program to YC (see also OWID@YC), it feels even better due to the great co-founder matching process and the success rate along with the excellence within the focus areas (people don’t come with their own esoteric tech startups).
To other commenters who talk about the use of reach numbers instead of e.g. QALYs or WELLBYs: having seen some of their spreadsheets and programs, I am in deep awe of and have great respect for some of the superheroes who have saved hundreds of lives (many of them counterfactual, given the effectiveness and new vectors of impact), though I’ll let them summarize their numbers.
I am very excited about where the projects will be in one and even five years and commend everyone involved.
Answering on behalf of Apart Research!
We’re a non-profit research and community-building lab with a strategic target on high-volume frontier technical research. Apart is currently raising a round to run the lab throughout 2025 and 2026 but here I’ll describe what your marginal donation may enable.
In just two years, Apart Research has established itself as a unique and efficient part of the AI safety ecosystem. Our research output includes 13 peer-reviewed papers published since 2023 at top venues including NeurIPS, ICLR, ACL, and EMNLP, with six main conference papers and nine workshop acceptances. Our work has been cited by OpenAI’s Superalignment team, and our team members have contributed to significant publications like Anthropic’s “Sleeper Agents” paper.
With this track record, we’re able to capitalize on our position as an AI safety lab and mobilize our work to impactful frontiers of technical work in governance, research methodology, and AI control.
Besides our ability to accelerate a Lab fellow’s research career at an average direct cost of around $3k, enable research sprint participants for as little as $30, and enable growth at local groups at similar high price/impact ratios, your marginal donation can enable us to run further impactful projects:
Donate to Apart Research
Improved access to our program ($7k-$25k): A professional revamp of our website and documentation would make our programs and research outputs more accessible to talented researchers worldwide. Besides our establishment as a lab through our paper acceptances, a redesign will help us cater even more to institutional funding and technical professionals, which will help scale our impact through valuable counterfactual funding and talent discovery. At the higher end, we will also be able to make our internal resources publicly available. These resources are specifically designed to accelerate AI safety technical careers.
Higher conference attendance support ($20k): Currently, we only support one fellow per team to attend conferences. Additional funding would enable a second team member to attend, at approximately $2k per person.
Improving worldview diversity in AI safety ($10k-$20k): We’ve been working on all continents now and find a lot of value in our approach to enable international and underrepresented professional talent (besides our work at organizations such as 7 of the top 10 universities). With this funding, you would enable more targeted outreach from Apart’s side and existing lab members’ participation in conferences to discuss and represent AI safety to otherwise underrepresented professional groups.
Continuing impactful research projects ($15k-$30k): We will be able to extend timely and critical research projects. For instance, we’re looking to port our cyber-evaluations work to Inspect, making it a permanent part of UK AISI catastrophic risk evaluations. Our recent paper also finds novel methods to test whether LLMs game public benchmarks and we would like to expand the work to run the same test on other high-impact benchmarks while making the results more accessible. These projects have direct impacts on AI evaluation methodology but we see other opportunities like this for expanding projects at reasonable follow-up costs.
You’ll be supporting a growing organization with the Apart Lab fellowship already doubling from Q1′24 to Q3′24 (17 to 35 fellows) and our research sprints having moved thousands closer to AI safety.
Given current AGI development timelines, the need to scale and improve safety research is urgent. In our view, Apart seems like one of the better investments to reduce AI risk.
If this sounds interesting and you’d like to hear more (or have a specific marginal project you’d like to see happen), my inbox is open.