Critiques of non-existent AI safety labs: Yours
“Starting a company is like chewing glass. Eventually, you start to like the taste of your own blood.”
Building a new organization is extremely hard. It’s hard when you’ve done it before, even several times. It’s even harder the first time.
Some new organizations are very similar to existing organizations. The founders of the new org can go look at all the previous nearby examples, learn from them, copy their playbook, and avoid their mistakes. If your org is shaped like a Y Combinator company, you can spend dozens of hours absorbing high-quality, expert-crafted content which has been tested and tweaked and improved over hundreds of companies and more than a decade. You can do a 15-minute interview to go work next to a bunch of the best people who are also building your type of org, and learn by looking over their shoulders and troubleshooting together. You get to talk to a bunch of people who have actually succeeded at building an org like yours.
How likely is org-building success in this premier reference class, rich with prior examples to learn from, with a tried-and-true playbook, a tight community of founder peers, and the advice of many people who have tried to do your kind of thing and won?
5%.
https://pitchbook.com/news/articles/y-combinator-accelerator-success-rate-unicorns
An AI safety lab is not the same as a Y Combinator company.
It is. WAY. FUCKING. HARDER.
The Y Combinator crowd has a special category for orgs which are trying to build something that requires even a minor research breakthrough: HARD tech.
Yet the vast majority of these Hard Tech companies are actually building on top of an academic field which basically has the science figured out. Ginkgo Bioworks did not need to figure out the principles of molecular biology, nor the tools and protocols of genetic engineering. They took a decades-old, well-developed paradigm and worked within it to incrementally build something new.
How does this look for AI safety?
And how about timing? Y Combinator reference-class companies take a long time to build. Growing headcount slowly, running lean: absolutely essential if you are stretching out your last funding round over 7 years to iterate your way from a 24-hour livestream TV show of one guy’s life to a game-streaming company.
Remind me again, what are your timelines?
I could keep going on this for a while. People? Fewer. Funding? Monolithic. Advice from the winners? HA.
Apply these updates to our starting reference class success rate of
ONE. IN. TWENTY.
Now count the AI safety labs.
Multiply by ~3.
That is roughly the number of people who are not the subject of this post.
For all the rest of us, consider several criticisms and suggestions, which it was not feasible to run by the subjects of this post before publication:
0. Nobody knows what they are fucking doing when founding and running an AI safety lab, and everyone who says they do is lying to you.
1. Nobody has ever seen an organization which has succeeded at this goal.
2. Nobody has ever met the founder of such an organization, nor noted down their qualifications.
3. If the quote at the top of this post doesn’t evoke a visceral sense memory for you, consider whether you have an accurate mental picture of what it looks like and feels like, from the inside, to be succeeding at this kind of thing. Make sure you imagine having fully internalized that FAILURE IS YOUR FAULT and no one else’s, and that you are defining success correctly. (I believe it should be “everyone doesn’t die” rather than “be highly respected for your organization’s contributions” or “avoid horribly embarrassing mistakes”.)
4. If that last bit feels awful and stress-inducing, I expect that is because it is. Even for, and especially for, the handfuls of people who are not the subjects of this post. So much so that I’m guessing that whatever it is that allows people to say “yes” to that responsibility is the ~only real qualification for adding one to the number of AI safety labs we counted earlier.
5. You have permission. You do not need approval. You are allowed to do stupid things, have no relevant experience, be an embarrassing mess, and even ~*~fail to respond to criticism~*~.
6. Some of us know what it looks like to be chewing glass, and we have tasted our own blood. We know the difference between the continuous desperate dumpster fires and the real mistakes. We will be silently cheering you through the former and grieving with you on the latter. Sometimes we will write you a snarky post under a pseudonym when we really should be sleeping.
522 companies went through Y Combinator over the last year. Imagine that.
Thank you for reading this love letter to the demeaning occupation of desperately trying. It’s addressed to you, if you’d like.
I am confused about what your claims are, exactly (or what you’re trying to say).
One interpretation, which makes sense to me, is the following: we should appreciate people who try to do hard and important things at all, rather than only criticising them for doing those things imperfectly.
I really like and appreciate this point. Speaking for myself, I too often fall into the trap of criticising someone for doing something imperfectly while failing to 1. appreciate that they tried at all and that it was potentially really hard, and 2. criticise all the people who didn’t do anything and chose the safe route. There is a good post about this: Invisible impact loss (and why we can be too error-averse).
In addition, I think it could be a valid point to say that we should be more understanding if, e.g., the research agendas of AIS labs are or were off, as this is a problem that no one really knows how to solve and that is just very hard. I don’t really feel qualified to comment on that.
Your post could also be claiming something else:
For instance, you seem to claim that the reference class of people who can advise people working on AI safety is some group whose size is the number of AI safety labs multiplied by 3. (This is what I understand your point to be if I look at the passage that starts with “Some new organizations are very similar to existing organizations. The founders of the new org can go look at all the previous nearby examples, learn from them, copy their playbook, and avoid their mistakes.” and ends with “That is roughly the number of people who are not the subject of this post.”)
If this is what you want to say, I think the message is wrong in important ways. In brief:
I agree that when people work on hard and important things, we should appreciate them, but I disagree that we should avoid criticism of work like this. Criticism is important precisely when the work matters. Criticism is important when the problems are strange and people are probably making mistakes.
The strong version of “they’re doing something that no one else has done before … they don’t have anyone to learn from” seems to take a very narrow reference class for a broad set of ways to learn from people. You can learn from people who aren’t doing the exact thing that you’re doing.
1. A claim like: “We should not criticise / should have a very high bar for criticising AI safety labs / their founders (especially not if you yourself have not started an AIS lab).”
As stated above, I think it is important to appreciate people for trying at all, and it’s useful to notice that work not getting done is a loss. That being said, criticism is still useful. People are making mistakes that others can notice. Some organizations are less promising than others, and it’s useful to make those distinctions so that we know which to work in or donate to.
In a healthy EA/LT/AIS community, I want people to criticise other organisations, even if what they are doing is very hard and has never been done before. E.g. you could make the case that what OP, GiveWell, and ACE are doing has never been done before (although it is slightly unclear to me what exactly “doing something that has never been done before” means), and I don’t think anyone would say that those organisations should be beyond criticism.
This ties nicely into the second point I think is wrong:
2. A claim like: “they’re doing something that no one else has done before … they don’t have anyone to learn from”
A quote from your post:
A point I think you’re making:
“They are doing something that no one else has done before [build a successful AI safety lab], and therefore, if they make mistakes, that is way understandable because they don’t have anyone to learn from.”
It is true that the closer your organisation is to an already-existing org or cluster of orgs, the more you will be able to copy. But just because you’re working on something new that no one has worked on (or your work is different in other important aspects), that doesn’t mean you cannot learn from other organisations and their successes and failures. For things like having a healthy work culture, retaining talent, and setting up good governance structures, there are examples in the world that even AIS labs can learn from.
I don’t understand the research side of things well enough to comment on whether/how much AIS labs could learn from e.g. academic research or for-profit research labs working on problems different from AIS.
Hey, sorry, I’m in a rush and couldn’t read your whole comment. I wanted to jump in anyway to clarify that you’re totally right to be confused about what my claims are. I wasn’t trying to make claims, really; I was channelling an emotion I had late at night into a post that I felt compelled to hit submit on. Hence: “love letter to the demeaning occupation of desperately trying”.
I really value the norms of discourse here: their carefulness, modesty, and earnestness. From my skim of your comment, I’m guessing that after a closer read I’d think it was a great example of that, which I appreciate.
I don’t expect I’ll manage to rewrite this post in a way that makes everything I believe clear (and I’m not sure that would be very valuable for others if I did).
FWIW, I mostly read the core message of this post as: “you should start an AI safety lab. What are you waiting for? ;)”.
The post felt to me like it was debunking reasons people might feel they aren’t qualified to start an AI safety lab.
I don’t think this was the primary intention though. I feel like I came away with that impression because of the Twitter contexts in which I saw this post referenced.
Seems like academic research groups would be a better reference class than YC companies for most alignment labs.
If they’re trying to build an org that scales a lot and is funded by selling products, YC companies are a good reference class. But if they’re an org of researchers working somewhat independently or collaborating on hard technical problems, funded by grants, that sounds much more similar to an academic research group.
I’m unsure how to define success for an academic research group; any ideas? They seem to more often be exploratory and less goal-oriented.
As someone who did recently set up an AI safety lab, I’ve certainly had success rates on my mind. It’s challenging, but I think the reference class we’re in might be better than it seems at first.
I think a big part of what makes succeeding as a for-profit tech start-up challenging is that so many other talented individuals are chasing the same good ideas. For every Amazon there are thousands of failed e-commerce start-ups. Clearly, Amazon did something much better than the competition. But what if Amazon didn’t exist? What if there were a company that was a little more expensive and had longer shipping times? I’d wager that company would still be highly successful.
Far fewer people are working on AI safety. That’s a bad thing, but it does at least mean that there’s more low-hanging fruit to be picked. I agree with [Adam Binks](https://forum.effectivealtruism.org/posts/PJLx7CwB4mtaDgmFc/critiques-of-non-existent-ai-safety-labs-yours?commentId=eLarcd8no5iKqFaNQ) that academic labs might be a better reference class. But even there, AI safety has had far less attention paid to it than e.g. developing treatments for cancer or unifying quantum mechanics and general relativity.
So overall it’s far from clear to me that it’s harder to make progress on AI safety than to solve outstanding challenge problems in academia, or to build a $1bn+ company.
Thanks for writing this. It felt a bit like an AI safety version of Roosevelt’s “Man in the Arena” speech.
I’m honestly not sure whether this is an argument in support of AI labs or against them.
It’s roughly in support of AI labs, particularly scrappier ones.
~65% of Charity Entrepreneurship charities are at least moderately successful, with half of those being very successful. They’re probably a closer reference class, being donor-funded organisations run by EAs for impact.
One way in which AI safety labs are different from the reference class of Y Combinator startups is in their impact. Conditioned on the median Forum user’s assessment of X-risk from AI, the leader of a major AI safety lab probably has more impact than the median U.S. senator, Fortune 500 CEO, or chief executive of smaller regional or even national governments, etc. Those jobs are hard in their own ways, but we expect and even encourage an extremely high amount of criticism of the people who hold them.
I am not suggesting that is the proper reference class for leaders of AI labs that have raised at least $10MM … and I don’t think it is. But I think the proper scope of criticism is significantly higher than for (e.g.) the median CEO whose company went through Y Combinator.[1] If a startup CEO messes up and their company explodes, the pain is generally going to be concentrated among the company’s investors, lenders, and employees … a small number of people, each of whom consented to bearing that risk to a significant extent. If I’m not one of those people, my standing to complain about the startup CEO’s mistakes is significantly constrained.
In contrast, if an AI safety lab goes off the rails and becomes net-negative, that affects us all (and future generations). Even if the lab is merely ineffective, its existence would have drained fairly scarce resources (potential alignment researchers and EA funding) from others in the field.
I definitely agree that people need to be sensitive to how hard running an AI safety lab is, but I also want to affirm that the idea of criticism is legitimate.
To be clear, I don’t think Anneal’s post suggests that this is the reference class for deciding how much criticism of AI lab leaders is warranted. However, since I didn’t see a clear reference class, I thought it was worthwhile to discuss this one.
Fail early, fail often. Many little dooms are good. One big doom is not so good.