Teaching myself about AI safety and thinking about how to save the world.
Like to have a chat? Me too! Please reach out / book a meeting: https://calendly.com/simon-skade/30min
Of the alternative important skills you mentioned, I think many are strongly correlated, and the relevant ones roughly boil down to rationality (and perhaps also ambition).
Being rational itself is also correlated with being an EA and with being intelligent, and overall I think intelligence and rationality (and ambition) are traits that are really strong predictors of impact.
The impact curve is very heavy-tailed, and smarter people can have OOMs more impact than people with 15 IQ points less. So no, I don’t think EA is focusing too much on smart people; indeed, it would surprise me if it had reached a level where it wouldn’t be good to focus even more on intelligence. (Not that I claim I have sufficiently argued for this, but I can say that it is true in my world model.)
(Not sure if this has been suggested before, but) you should be able to sort comments by magic (the way posts are sorted on the frontpage) or some other, better way of combining the top+new properties for comments. Otherwise good new contributions are read far too rarely, so only very few people read and upvote them, while the first comments directly receive many upvotes and then get even more upvotes later. Still, upvotes do tell you something about which comments are good, and not everyone wants to read everything.
I would definitely use it myself, but I would strongly suggest also making it the default way comments are sorted.
(That wouldn’t totally remove bad dynamics, but it would be a start.)
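For illustration, here is a minimal sketch of what such a combined top+new (“magic”-style) ordering could look like; the half-life value and function name are made-up assumptions, not the Forum’s actual algorithm:

```python
from datetime import datetime, timezone

def decayed_score(karma: float, posted_at: datetime, half_life_hours: float = 48.0) -> float:
    """Hypothetical 'magic'-style comment ranking: karma discounted by age,
    so good new comments aren't buried under early high-karma ones."""
    age_hours = (datetime.now(timezone.utc) - posted_at).total_seconds() / 3600.0
    return karma * 0.5 ** (age_hours / half_life_hours)

# Comments would then be displayed sorted by decayed_score, highest first.
```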
Related to the post and very related to this comment is this post: https://www.lesswrong.com/posts/M8cEyKmpcbYzC2Lv5/exercise-taboo-should
The post claims that using words like "should", "good", or "bad" (or other words that carry moral judgment) can often lead to bad reasoning, because you fail to anticipate the actual consequences. (I recommend reading that post, or at least its last two sections; this sentence isn’t really a good summary.)
Actually, some replacements suggested in this post may not help in some cases:
Someone in the EA community should do a specific thing
[...]More people in the EA community should do a specific thing
[...]EA would be better if it had a certain property
[...]An issue is underemphasized by many people in the community, or by large EA institutions
[...]
The problem isn’t that those words are always bad, but that you need to say more specifically why something is good or bad, or you might miss something. Therefore such sentences should be followed by a “because” or “otherwise”, or preceded by a reason, like in this very sentence.
Of course, in some truly obvious cases it is ok to just use “good” or “bad” or synonyms without a more explicit reason, but those words should be warning signs, so you can see if you didn’t reason well.
(Have you noticed how often I used good, bad or should in this comment, and what the fundamental reasons were that I didn’t bother justifying and just accepted as good or bad?) (Also, replacing “good” with something like “useful” or other synonyms doesn’t help, they should still be warning signs.)
AGI will (likely) be quite different from current ML systems.
I’m afraid I disagree with this. For example, if this were true, interpretability from Chris Olah or the Anthropic team would be automatically doomed; Value Learning from CHAI would also be useless, our predictions about forecasting that we use to convince people of the importance of AI Safety equally so.
Wow, the “quite” wasn’t meant that strongly, though I agree that I should have expressed myself a bit more clearly/differently. And the work of Chris Olah etc. isn’t useless anyway; but yes, AGI won’t run on transformers, and a lot of what we found won’t be that useful, but we still get experience in how to figure out the principles, and some principles will likely transfer. And AGI forecasting is hard, but certainly not useless/impossible, though you do have high uncertainties.
Breakthroughs only happen when one understands the problem in detail, not when people float around vague ideas.
Breakthroughs happen when one understands the problem deeply. I think I agree with the “not when people float around vague ideas” part, though I’m not sure what you mean by that. If you mean “academic philosophy has a problem”, then I agree. If you mean “there is no way Einstein could derive special or general relativity mostly from thought experiments”, then I disagree, though you do indeed need to be skilled to use thought experiments well. I don’t see any bad kind of “floating around with vague ideas” in the AI safety community, but I’m happy to hear concrete examples from you where you think academia’s methodology is better!
(And I do btw. think that we need that Einstein-like reasoning, which is hard, but otherwise we basically have no chance of solving the problem in time.)
What academia does is to ask for well defined problems and concrete solutions. And that’s what we want if we want to progress.
I still don’t see why academia should be better at finding solutions. It can find solutions to easy problems; that’s why so many people in academia are goodharting all the time. Finding easy subproblems whose solutions allow us to solve AI safety is (very likely) much harder than solving those subproblems.
Notice also that Shannon and many other people coming up with breakthroughs did so in academic ways.
Yes, throughout history there have been some Einsteins in academia who could solve even hard problems, but they are very rare, and getting those brilliant, non-goodharting people to work on AI safety is uncontroversially good, I would say. But there might be better/easier/faster options for finding those people and getting them to work on AI safety than building up the academic field of AI safety.
Still, I’m not saying it’s a bad idea to promote AI safety in academia. I’m just saying it won’t nearly suffice to solve alignment, not by a long shot.
(I think the bottom of your comment isn’t as you intended it to be.)
I must say I strongly agree with Steven.
If you are saying academia has a good track record, then I must say (1) that’s wrong for fields like ML, where in recent years much (arguably most) of the relevant progress has been made outside of academia, and (2) it may have a good track record over the long history of science, and when you say it’s good at solving problems, sure, I think it might solve alignment in 100 years, but we need it in 10, and academia is slow. (E.g. read Yudkowsky’s sequence on science if you don’t think academia is slow.)
Do you have some reason why you think that a person can make more progress in academia than elsewhere? I agree that academia has people, and it’s good to get those people, but academia has badly shaped incentives, like (from my other comment): “Academia doesn’t have good incentives to make that kind of important progress: You are supposed to publish papers, so you (1) focus on what you can do with current ML systems, instead of focusing on more uncertain longer-term work, and (2) goodhart on some subproblems that don’t take that long to solve, instead of actually focusing on understanding the core difficulties and how one might address them.” So I expect a person can make more progress outside of academia. Much more, in fact.
Some important parts of the AI safety problem seem to me like they don’t fit well into academic work. There are of course exceptions, people in academia who can make useful progress here, but they are rare. I am not that confident in this, as my understanding of AI safety isn’t that deep, but I’m not just making this up. (EDIT: This mostly overlaps with the first two points I made, that academia is slow and that there are bad incentives, plus maybe some other minor considerations about why excellent people (e.g. John Wentworth) may rather choose not to work in academia. What I’m saying is that AI safety is a problem where those obstacles are big obstacles, whereas there might be other fields where those obstacles aren’t thaaat bad.)
There is the EA Forum feature suggestion thread for such things; an app may be a special case because it is a rather big feature, but I still think the suggestion fits better there.
We won’t solve AI safety by just throwing a bunch of (ML) researchers on it.
AGI will (likely) be quite different from current ML systems. Also, work on aligning current ML systems won’t be that useful, and generally what we need is not small advancements, but we rather need breakthroughs. (This is a great post for getting started on understanding why this is the case.)
We need a few Paul Christiano-level researchers who build a very deep understanding of the alignment problem and can then make huge advances, much more than we need many still-great-but-not-that-extraordinary researchers.
Academia doesn’t have good incentives to make that kind of important progress: You are supposed to publish papers, so you (1) focus on what you can do with current ML systems, instead of focusing on more uncertain longer-term work, and (2) goodhart on some subproblems that don’t take that long to solve, instead of actually focusing on understanding the core difficulties and how one might address them.
I think paradigms are partially useful and we should probably create some for some specific approaches to AI safety, but I think the default paradigms that would develop in academia are probably pretty bad, so that the research isn’t that useful.
Promoting AI safety in academia is probably still good, but for actually preventing existential risk, we need some other way of creating incentives to usefully contribute to AI safety. I don’t know yet how to best do it, but I think there are better options.
Getting people into AI safety without arguing about x-risk seems nice, but mostly because I think this strategy is useful for convincing people of x-risk later, so they then can work on important stuff.
Another advantage of an app may be that you could download posts, in case you go somewhere where you don’t have Internet access, but I think this is rare and not a sufficient reason to create an app either.
Why should there be one? The EA Forum website works great on mobile. So my guess is that there is no EA Forum app because it’s not needed / wouldn’t be that useful, except perhaps for app notifications, but that doesn’t seem that important.
that is likely to contain all the high quality ideas that weren’t funded yet.
No, not at all. I agree that this list is valuable, but I expect there to be many more high quality ideas / important projects that are not mentioned in this list. Those are just a few obvious ideas of what we could do next.
(Btw. you apparently just received a strong downvote while I wrote this. That wasn’t me; my other comment was strong-downvoted too.)
Yup, it would have been even funnier if the post content had just been ”.”, but perhaps that wouldn’t have helped that much with convincing people that short posts are ok. xD
I think another class of really important projects are research projects that try to evaluate what needs to be done. (Like priorities research, though a bit more applied: generating and evaluating ideas, and forecasting to see what seems best.)
The projects that are now on your project list are good options given what currently seem like good things to do. But in the game against x-risk, we want to be able to look more moves ahead, consider how our opponent may strike us down, and probably invest a lot of effort into improving our long-term position on the gameboard, because we really don’t want to lose that game.
Sadly, I don’t think there are that many people who can do that kind of research well, but finding those seems really important.
(I intend to write more about this soon.)
Nice, we now have some good project ideas, next we need people to execute them.
I wouldn’t expect that to happen automatically in many cases. Therefore, I am particularly excited about projects that act as accelerators for getting other projects started, like actively finding the right people (and convincing them to start or work on a specific project) or making promising people more capable.
In particular, I’d be excited about a great headhunting organization to get the right people (EAs and non-EAs) to work on the right projects. (Like you considered in the project idea “EA ops”, though I think it would also help a lot for e.g. finding great AI safety researchers.)
The way you phrased the projects “Talent search” and “innovative educational experiments” sounds too narrow to me. I don’t only want to find and help talented youths, but also e.g. get great professors to work on AI safety, and support all sorts of people through e.g. leadership and productivity training.
Ooops I missed that, thanks!
What is the minimum amount of money a project should require?
From reading your website, I somehow get the impression that you are mainly interested in relatively big projects, say requiring $30k or (often) more.
In particular, it does not seem to me like you are looking for applications like “Hey, could you give me $5,000 to fund the research project I plan to do over the next few months?”. But I may be mistaken, and I haven’t read anything explicit about it not being possible (maybe I just overlooked it).
(And yes, I know there’s the LTFF for such things, I’m just curious regardless.)
I think most of the variance in estimates may come from the high variance in estimates of how big x-risk is. (Ok, a lot of the variance here comes from different people using different methods to estimate the answer to the question, but even assuming everyone used the same method, I’d expect a lot of variance from this.)
Some people may say there is a 50% probability of x-risk this century, and some may say 2%, which causes the amount of money they would be willing to spend to be quite different.
But because in both cases x-risk reduction is still (by far) the most effective thing you can do, it may make sense to ask how much you would pay for a 0.01% relative reduction of the current x-risk (i.e. from 3% to 2.9997% x-risk).
I think this would produce more agreement, because it is probably easier to estimate, for example, how much some grant decreases AI risk relative to the overall AI risk than to estimate how high the overall AI risk is.
That way, however, the question might be slightly more confusing, and I do think we should also make progress on better estimating the overall probability of an existential catastrophe occurring in the next couple of centuries, but I think the question I suggest might still be the better way to estimate what we want to know.
I agree that it makes much more sense to estimate x-risk on a timescale of 100 years (as I said in the sidenote of my answer), but I think you should specify that in the question, because “How many EA 2021 $s would you trade off against a 0.01% chance of existential catastrophe?” together with your definition of x-risk, implies taking the whole future of humanity into account.
I think it may make sense to explicitly only talk about the risk of existential catastrophe in this or in the next couple of centuries.
I think reducing x-risk is by far the most cost-effective thing we can do, and in an adequate world all our efforts would be flowing into preventing x-risk.
The utility of a 0.01% x-risk reduction is many orders of magnitude greater than global GDP, and even if you don’t care at all about future people, you should still be willing to pay a lot more than is currently paid for a 0.01% x-risk reduction, as Korthon’s answer suggests.
But of course, we should not be willing to trade so much money for that x-risk reduction, because we can invest the money more efficiently to reduce x-risk even more.
So when we make the quite reasonable assumption that reducing x-risk is much more effective than doing anything else, the amount of money we should be willing to trade should only depend on how much x-risk we could otherwise reduce through spending that amount of money.
To find the answer to that, I think it is easier to consider the following question:
How much more likely is an x-risk event in the next 100 years if EA loses X dollars?
When you find the X that causes a difference in x-risk of 0.01%, the X is obviously the answer to the original question.
I only consider x-risk events in the next 100 years, because I think it is extremely hard to estimate how likely x-risk more than 100 years into the future is.
Consider (for simplicity) that EA currently has 50B$.
Now answer the following questions:
How much more likely is an x-risk event in the next 100 years if EA loses 50B$?
How much more likely is an x-risk event in the next 100 years if EA loses 0$?
How much more likely is an x-risk event in the next 100 years if EA loses 20B$?
How much more likely is an x-risk event in the next 100 years if EA loses 10B$?
How much more likely is an x-risk event in the next 100 years if EA loses 5B$?
How much more likely is an x-risk event in the next 100 years if EA loses 2B$?
Consider answering these questions for yourself before scrolling down and looking at my estimates, which may be quite wrong. It would be interesting if you also commented your own estimates.
The x-risk from EA losing 0$ to 2B$ should increase approximately linearly, so if x_0 is the x-risk if EA loses 0$ and x_2B is the x-risk if EA loses 2B$, you should be willing to pay roughly (0.01% / (x_2B − x_0)) · 2B$ for a 0.01% x-risk reduction.
(Long sidenote: I think that if EA loses money right now, it does not significantly affect the likelihood of x-risk more than 100 years from now. So if you want to get your answer for the “real” x-risk reduction, and you estimate a probability p that an x-risk event happens strictly after the next 100 years (conditional on surviving those 100 years), you should multiply your answer by 1/(1 − p) to get the amount of money you would be willing to spend for real x-risk reduction. However, I think it may even make more sense to talk about x-risk as the risk of an x-risk event that happens in the reasonably soon future (i.e. 100-5000 years), instead of thinking about the extremely long-term x-risk, because there may be a lot we cannot foresee yet and cannot really influence anyway, in my opinion.)
Ok, so here are my answers to the questions above (in that order):
17%, 10%, 12%, 10.8%, 10.35%, 10.13%
So I would pay about 154M$ for a 0.01% x-risk reduction.
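For concreteness, here is a minimal sketch of that arithmetic in Python, using my estimates above (the function name and structure are just illustrative):

```python
def dollars_per_001pct_xrisk_reduction(risk_if_lose_nothing: float,
                                       risk_if_lose_budget: float,
                                       budget: float,
                                       step: float = 0.0001) -> float:
    """Dollars per 0.01% (`step`) of x-risk reduction, assuming x-risk rises
    roughly linearly with lost budget over this range."""
    risk_delta = risk_if_lose_budget - risk_if_lose_nothing
    return budget * step / risk_delta

# My estimates above: losing 0$ -> 10% x-risk, losing 2B$ -> 10.13% x-risk
print(dollars_per_001pct_xrisk_reduction(0.10, 0.1013, 2e9))  # ~1.54e8, i.e. ~154M$
```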
Note that I do think that there are even more effective ways to reduce x-risk, and in fact I suspect most things longtermist EA is currently funding have a higher expected x-risk reduction than 0.01% per 154M$. I just don’t think that it is likely that the 50 billionth 2021 dollar EA spends has a much higher effectiveness than 0.01% per 154M$, so I think we should grant everything that has a higher expected effectiveness.
I hope we will be able to afford to spend many more future dollars to reduce x-risk by 0.01%.
I think it is important to keep in mind that we are not very funding constrained. It may be ok to have some false positives; false negatives may often be worse, so I wouldn’t be too cautious.
I think grantmakers are probably still too reluctant to fund things that have an unlikely chance of high impact, especially if they are uncertain because the applicants aren’t EAs.
For example, I told a very exceptional student (who has something like 1-in-a-million problem-solving ability) to apply for the Atlas Fellowship, although I don’t know him well, because from my limited knowledge it increases the chance that he will work on alignment from 10% to 20-25%, and the $50k is easily worth it.
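As a rough sanity check of that expected-value claim (the dollar value assigned to a counterfactual alignment researcher below is purely an illustrative assumption, not a number from this thread):

```python
# Illustrative assumption: an additional counterfactual alignment researcher is
# worth at least $1M of grant money in expectation.
value_per_counterfactual_researcher = 1_000_000  # assumed for illustration
p_without, p_with = 0.10, 0.20  # my rough estimates of him working on alignment
fellowship_cost = 50_000

expected_gain = (p_with - p_without) * value_per_counterfactual_researcher
print(expected_gain > fellowship_cost)  # True: $100k expected value vs. $50k cost
```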
Though of course having more false positives causes more people who only pretend to do something good to apply, which isn’t easy to handle with our current limited number of grantmakers. We definitely need to scale up grantmaking capacity anyway.
I think non-EAs should know that they can get funding if they do something good/useful. You shouldn’t need to pretend to be an EA to get funding, and defending against people who pretend to do good projects seems easier in many cases, e.g. you can often start with a little funding and promise more funding later if they show progress.
(I also expect that we / AI-risk reduction will get much more funding as the problem becomes more widely known/acknowledged. I’d guess >$100B in 2030, so I don’t think funding will ever become a bottleneck, though I’m not totally sure of course.)