Thanks for this! I think it would be helpful to plot the median changes in extinction probabilities against the number of words in the article/video. I’m noticing a correlation as I click through the links and would be curious how strong it is (so this effect can be disentangled from the style of the source).
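In case it's useful, here's a rough sketch of the kind of check I have in mind (purely illustrative: the file name and the columns `word_count` and `median_delta_p_extinction` are placeholders, not your actual data):

```python
# Illustrative sketch only: assumes a hypothetical CSV with one row per source,
# containing a word count and the median change in P(extinction) for that source.
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import spearmanr

df = pd.read_csv("survey_results.csv")  # placeholder file name

# Rank correlation between source length and the median shift in extinction probability
rho, p = spearmanr(df["word_count"], df["median_delta_p_extinction"])

plt.scatter(df["word_count"], df["median_delta_p_extinction"])
plt.xlabel("Words in article/video")
plt.ylabel("Median change in P(extinction)")
plt.title(f"Spearman rho = {rho:.2f} (p = {p:.2g})")
plt.show()
```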
Joshc
Does most of your impact come from what you do soon?
Yep, I have some ideas. Please DM me and give some info about yourself if you are interested in hearing them :)
Thanks for the referral. I agree that the distinction between serial time and parallel time is important and that serial time is more valuable. I’m not sure if it is astronomically more valuable though. There are two points we could have differing views on:
- the amount of expected serial time a successful (let’s say $10 billion) AI startup is likely to counterfactually burn. In the post I claimed that this seems unlikely to be more than a few weeks. Would you agree with this?
- the relative value of serial time to money (which is exchangeable with parallel time). If you agree with the first statement, would you trade $10 billion for 3 weeks of serial time at the current margin?
If you would not trade $10 billion for 3 weeks, that could be because:
- I’m more optimistic than you are about empirical research, i.e. I think the time spent iterating later, when we have the systems, is significantly more important than the time we have now, when we can only try to reason about them.
- you think money will be much less useful than I expect it to be
I would be interested in pinning down where I disagree with you (and others who probably disagree with the post for similar reasons).
If your claim is that ‘applying AI models for economically valuable tasks seems dangerous, i.e. the AIs themselves could be dangerous,’ then I agree. A scrappy applications company might be more likely to end the world than OpenAI/DeepMind… it seems like it would be good, then, if more of these companies were run by safety-conscious people.
A separate claim is the one about capabilities externalities. I basically agree that AI startups will have capabilities externalities, even if I don’t expect them to be very large. The question, then, is how much expected money we would be trading for expected time, and what the relative value of these two currencies is.
> It’s unclear to me that having EA people starting an AI startup is more tractable than convincing other people that the work is worth funding
Yeah, this is unclear to me too. But you can encourage lots of people to pursue earn-to-give paths (maybe a few will succeed). Not many are in a position to persuade people, and more people having that as an explicit goal seems dangerous.
Also, as an undergraduate student with short timelines, I think the startup path is a better fit for me.

> I don’t see how the flexibility of money makes any difference? Isn’t it frustratingly difficult to predict which uses of money will actually be useful for AI safety? In that case, you still have the same problem.
I have to make important career decisions right now. It’s hard to know what will be useful in the future, but it seems likely that money will be. I could have made that point clearer.
That’s a good point. Here’s another possibility:
Require that students go through a ‘research training program’ before they can participate in the research program. It would have to actually help prepare them for technical research, though. Relabeling AGISF as a research training program would be misleading, so you would want to add a lot more technical content (reading papers, coding assignments, etc.). It would probably be pretty easy to gauge how much the training program participants care about X-risk / safety and factor that in when deciding whether to accept them into the research program.
The social atmosphere can also probably go a long way in influencing people’s attitudes towards safety. Making AI risk an explicit focus of the club, talking about it a lot at socials, inviting AI safety researchers to dinners, etc. might do most of the work tbh.
Oo exciting. Yeah, the research program looks like it is closer to what I’m pitching.
Though I’d also be excited about putting research projects right at the start of the pipeline (if they aren’t already). It looks like AGISF is still at the top of your funnel and I’m not sure if discussion groups like these will be as good for attracting talent.
AI Safety groups should imitate career development clubs
Applications are now open for Intro to ML Safety Spring 2023
Prizes for ML Safety Benchmark Ideas
Late to the party here, but I was wondering why these organizations need aligned engineering talent. Anthropic seems like the kind of org that talented, non-aligned people would be interested in...
These are reasonable concerns, thanks for voicing them. As a result of unforeseen events, we became responsible for running this iteration only a couple of weeks ago. We thought that getting the program started quickly — and potentially running it at a smaller scale as a result — would be better than running no program at all or significantly cutting it down.
The materials (lectures, readings, homework assignments) are essentially ready to go and were already used for MLSS last summer. The course notes are supplementary and are an ongoing project.
We are putting a lot of hours into making sure this program gets started without a hitch and runs smoothly. We are sorry the deadlines are so aggressive and agree that it would have been better to launch earlier. If you have trouble getting your application in on time, please don’t hesitate to contact us about getting an extension. We also plan to run another iteration in the Spring and announce the program further in advance.
Yes! Thanks for asking.
Fixed, thanks!
Announcing an Empirical AI Safety Program
Yeah, I would be in favor of interaction in simulated environments (others might disagree), but I don’t think this influences the general argument very much, as I don’t think leaving some matter for computers will reduce the number of brains by more than an order of magnitude or so.
> Having a superintelligence aligned to normal human values seems like a big win to me!
Not super sure what this means, but the ‘normal human values’ outcome as I’ve defined it hardly contributes to the EV calculation at all compared to the utopia outcome. If you disagree with this, please look at the math and let me know if I made a mistake.
Yep, I didn’t initially understand you. That’s a great point!
This means the framework I presented in this post is wrong. I now agree with your statement: the EV of partly utilitarian AI is higher than that of fully utilitarian AI.
I think the framework in this post can be modified to incorporate this, and the conclusions are similar. The quantity that dominates the utility calculation is now the expected representation of utilitarianism in the AGI’s values.
The two handles become:
(1) The probability of misalignment.
(2) The expected representation of utilitarianism in the moral parliament conditional on alignment.
The conclusion of the post, then, should be something like “interventions that increase (2) might be underrated” instead of “interventions that increase the probability of fully utilitarian AGI are underrated.”
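To make that concrete, here is a rough sketch of the modified calculation (my notation, not from the post, and it assumes value scales roughly linearly with utilitarianism’s share of the parliament): writing $A$ for the event that the AGI is aligned, $r$ for the fraction of the moral parliament held by utilitarianism, and $V$ for the value of a fully utilitarian future,

$$\mathbb{E}[U] \approx P(A)\,\mathbb{E}[r \mid A]\,V,$$

since (as above) the misaligned and ‘normal human values’ outcomes contribute comparatively little. Handle (1) moves $P(A) = 1 - P(\text{misalignment})$ and handle (2) moves $\mathbb{E}[r \mid A]$.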
The competition was cancelled. I think the funding for it was cut, though @Oliver Z can say more. I was not involved in this decision.