“In the day I would be reminded of those men and women,
Brave, setting up signals across vast distances,
Considering a nameless way of living, of almost unimagined values.”
Thanks a lot : )
(Honestly just posting comments on posts linking to relevant stuff you can think of is both cheap and decent value.)
Fair points. On the third hand, the more AGI researchers there are, the more “targets” there are for important arguments to reach, and the more impact systematic AI governance interventions will have.
At this point, I seem to have lost track of my probabilities somewhere in the branches; let me try to go back and find them...
Good discussion, ty. ^^
You’d be well able to compute the risk on your own, however, if you seriously considered doing any big outreach efforts. I think people should still have a strong prior in favour of action for anything that looks promising to them. : )
This is confused, afaict? When comparing the impact of time-buying vs direct work, the probability of success for both activities is discounted by the number of people pushing capabilities in the same way. So it cancels out, and you don’t need to think about the number of people in opposition.
The unique thing about time-buying is that its marginal impact increases with the number of alignment workers,[1] whereas the marginal impact of direct work plausibly decreases with the number of people already working on it (due to fruit depletion and coordination issues).[2]
If there are 300 people doing direct alignment and you’re an average worker, you can expect to contribute 0.3% of the direct work that happens per unit time. On the other hand, if you spend your time on time-buying instead, you only need to expect to save 0.3% of a unit of time per unit of time you spend in order to break even.[3]
1. ^ Although the marginal impact of every additional unit of time scales with the number of workers, there are probably still diminishing returns to more people working on time-buying.
2. ^ Direct work probably scales according to some weird curve, idk, but I’m guessing we’re past the peak. Two people doing direct work collaboratively do more good per person than one person alone. But there are probably steep diminishing marginal returns from economies of scale/specialisation, coordination, and motivation in this case.
3. ^ Impact is the product of the number of workers $N$, their average rate of work $w$, and the time they have left to work $t$: $I = N \cdot w \cdot t$. And because multiplication is commutative, increasing one of these variables by a proportion $p$ is equivalent to increasing any of the others by the same proportion: $(1+p)N \cdot w \cdot t = N \cdot (1+p)w \cdot t = N \cdot w \cdot (1+p)t$.
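To make the break-even arithmetic concrete, here is a worked version of the comparison, using the notation reconstructed in footnote 3 (the symbols are reconstructions, not necessarily the original ones):

```latex
% Spending one unit of time on direct work contributes  w .
% Spending it on time-buying and thereby buying \delta units of time contributes  N w \delta .
\[
  N w \delta \;\ge\; w
  \quad\Longleftrightarrow\quad
  \delta \;\ge\; \tfrac{1}{N} \;=\; \tfrac{1}{300} \;\approx\; 0.33\% .
\]
```

So an average worker among 300 only needs each unit of time spent on time-buying to buy back roughly a third of a percent of a unit of time in order to match their direct contribution.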
You have a point!
The first sentence points out that I am doing an average amount of alignment work, and that amount is $w$. I realise this is a little silly, but it keeps the heuristic simple. I’ve updated the comment accordingly. Thanks.
I basically agree with this breakdown from the post:
How do you account for the fact that the impact of a particular contribution to object-level alignment research can compound over time?
Let’s say I have a technical alignment idea now that is both hard to learn and very usefwl, such that every recipient of it does alignment research a little more efficiently. But it takes time before that idea disseminates across the community.
At first, only a few people bother to learn it sufficiently to understand that it’s valuable. But every person that does so adds to the total strength of the signal that tells the rest of the community that they should prioritise learning this.
Not sure if this is the right framework, but let’s say that researchers will only bother learning it if the strength of the signal hits their person-specific threshold for prioritising it.
Researchers are normally distributed (or something) over threshold height, and the strength of the signal starts out below the peak of the distribution.
Then (under some assumptions about the strength of individual signals and the distribution of threshold height), every learner that adds to the signal will, at first, attract more than one learner that adds to the signal, until the signal passes the peak of the distribution and the idea reaches satiation/fixation in the community.
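A minimal simulation sketch of the cascade described above (the distribution, threshold values, and signal increments are illustrative assumptions, not anything specified in the comment):

```python
import numpy as np

rng = np.random.default_rng(0)

n_researchers = 300
# Person-specific thresholds for bothering to learn the idea,
# roughly normally distributed over "threshold height".
thresholds = rng.normal(loc=30.0, scale=10.0, size=n_researchers)

signal = 10.0                 # initial signal strength: the originator plus a few early adopters
learned = np.zeros(n_researchers, dtype=bool)

adopters_per_step = []
while True:
    # Everyone whose threshold the current signal clears (and who hasn't
    # already learned the idea) learns it now and adds to the signal.
    new_learners = (~learned) & (thresholds <= signal)
    n_new = int(new_learners.sum())
    if n_new == 0:
        break
    learned |= new_learners
    signal += n_new
    adopters_per_step.append(n_new)

print("adopters per step:", adopters_per_step)
print("total adopters:", int(learned.sum()), "of", n_researchers)
```

Whether each early learner attracts more than one further learner depends on where the signal starts relative to the peak of the threshold distribution, which is the knife-edge the comment points at.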
If something like the above model is correct, then the impact of alignment research plausibly goes down over time.
But the same is true of a lot of time-buying work (like outreach). I don’t know how to balance this, but I am now a little more skeptical of the relative value of buying time.
Importantly, this is not the same as “outreach”. Strong technical alignment ideas are most likely illegible to almost everyone outside the community, so the idea doesn’t increase the number of people working on alignment.
Do you mean you find it hard to avoid thinking about capabilities research or hard to avoid sharing it?
It seems reasonable to me that you’d actually want to try to advance the capabilities frontier, to yourself, privately, so you’re better able to understand the system you’re trying to align, and also you can better predict what’s likely to be dangerous.
This post is perhaps the most important thing I’ve read on the EA forum. (Update: Ok, I’m less optimistic now, but still seems very promising.)
The main argument that I updated on was this:
Multiplier effects: Delaying timelines by 1 year gives the entire alignment community an extra year to solve the problem.
In other words, if I am capable of doing an average amount of alignment work $w$ per unit time, and I have $t$ units of time available before the development of transformative AI, I will have contributed $w \cdot t$ work. But if I expect to delay transformative AI by $d$ units of time if I focus on it, everyone will have that additional time to do alignment work, which means my impact is $d \cdot N \cdot w$, where $N$ is the number of people doing work. Naively then, if $d \cdot N > t$, I should be focusing on buying time.[1]
This analysis further favours time-buying if the total amount of work per unit time accelerates, which is plausibly the case if e.g. the alignment community grows over time.
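Written out explicitly (using the symbols from the reconstruction above; the original notation may have differed):

```latex
\[
  I_{\mathrm{direct}} = w\,t ,
  \qquad
  I_{\mathrm{buy}} = N\,w\,d ,
\]
\[
  I_{\mathrm{buy}} > I_{\mathrm{direct}}
  \quad\Longleftrightarrow\quad
  N\,w\,d > w\,t
  \quad\Longleftrightarrow\quad
  d \cdot N > t .
\]
```

With made-up numbers, say $N = 1000$ alignment workers and $t = 20$ years to transformative AI, the threshold $t/N$ works out to roughly a week of delay.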
1. ^ This assumes time-buying and direct alignment work are independent, whereas I expect doing either will help with the other to some extent.
Would you be able to give tangible examples where alignment research has advanced capabilities? I’ve no doubt it’s happened due to alignment-focused researchers being chatty about their capabilities-related findings, but idk examples.
Naively, the main argument (imo) can be summed up as:
If I am capable of doing an average amount of alignment work $w$ per unit time, and I have $t$ units of time available before the development of transformative AI, I will have contributed $w \cdot t$ work. But if I expect to delay transformative AI by $d$ units of time if I focus on it, everyone will have that additional time to do alignment work, which means my impact is $d \cdot N \cdot w$, where $N$ is the number of people doing work. If $d \cdot N > t$, I should be focusing on buying time.[1]
This analysis further favours time-buying if the total amount of work per unit time accelerates, which is plausibly the case if e.g. the alignment community grows over time.
1. ^ This assumes time-buying and direct alignment work are independent, whereas I expect doing either will help with the other to some extent.
Multiplier effects: Delaying timelines by 1 year gives the entire alignment community an extra year to solve the problem.
This is the most and fastest I’ve updated on a single sentence as far back as I can remember. Probably the most important thing I’ve ever read on the EA forum. I am deeply gratefwl for learning this, and it’s definitely worth Taking Seriously. Hoping to look into it in January unless stuff gets in the way.
(Update: I’m substantially less optimistic about time-buying than I was when I wrote this comment, but I still think it’s high priority to look into.)
I have one objection to claim 3a, however: Buying-time interventions are plausibly more heavy-tailed than alignment research in some cases because 1) the bottleneck for buying time is social influence and 2) social influence follows a power law due to preferential attachment. Luckily, the traits that make for top alignment researchers have limited (but not insignificant) overlap with the traits that make for top social influencers. So I think top alignment researchers should still not switch in most cases on the margin.
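For intuition on point 2, here’s a minimal preferential-attachment sketch (not from the post; the numbers are arbitrary) showing how that process concentrates influence into a heavy tail:

```python
import random

random.seed(0)

# Preferential attachment: each newcomer gives a unit of attention to an
# existing node with probability proportional to the attention it already has.
influence = [1, 1]
for _ in range(5000):
    i = random.choices(range(len(influence)), weights=influence)[0]
    influence[i] += 1
    influence.append(1)  # the newcomer starts with one unit of their own

influence.sort(reverse=True)
top_share = sum(influence[: len(influence) // 10]) / sum(influence)
print(f"top 10% of nodes hold {top_share:.0%} of total influence")
```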
Mh, agreed. The general arguments in the post are probably overwhelmed in most cases by considerations specific to each case.
I kinda disagree with yesterday-me on how important these arguments are. I’m not entirely sure why. I think writing out this post helped me see how limited they are, and decision-relevant evidence related to specific cases will likely overwhelm them. But anyway:
Clarification
Overall I’d guess most of the disagreement can be rounded off to you thinking that AI safety is known to be the top priority, and so the benefits of forecasting in terms of prioritization are pretty small. Is that fair?
I don’t try to argue the object-level. I’m instead suggesting reasons why direct work could be a higher priority under a greater range of uncertainty than people might think.
If this is true, it doesn’t necessarily mean that people should deprioritise forecasting. But it does mean that if your estimates are already within the range where direct work is higher priority, and expected evidence seems unlikely to shift estimates out of that range, then forecasting is marginally wastefwl.
The Future Fund’s estimates and resilience (or that of a large part of the community) might not be within that range, however. In which case they should probably prioritise forecasting.
I’m only saying “if you think this, then that”. The arguments could still be valid (and therefore potentially usefwl), even if the premises don’t hold in specific cases.
I’m not saying “forecasting is wastefwl”, I’m saying “here are some reasons that may help you analyse”. My opinions shouldn’t matter to the value of the post, since I explicitly say that people shouldn’t defer to me.
The arguments are entirely filtered for anti-forecasting, because I expect people to already be aware of the pro-forecasting arguments I currently have on offer, and I only wish to provide tools that they may not already have.
Role-based socioepistemology, and “forecasters” vs “explorers”
I’m supposed to try to figure out what a good research community looks like, and that will involve different people filling different roles. I believe there are tangible methodological differences between optimal forecasting and optimal exploring, and I want to refine my model of what those differences are.
When I talk about “forecasters”, it’s usually because I want to contrast that with what I think good methodologies for “explorers” are. Truth is, I have no idea how to do good forecasting, so it usually ends up being rather strawman-ish.
When I say “explorer” I think of people like V.S. Ramachandran, Feynman, Kary Mullis, and people who aren’t afraid of being wrong a bunch in order to be extremely right on occasion.
Forecasters, by contrast, need to produce work that people can safely defer to and use for prioritising between consequential decisions, so the negative impact of being wrong is much greater.
Exploring helps forecasting more than the other way around
The way I usually update my estimates on the importance of doing X (e.g. animal advocacy or AI alignment) is by spending most of my time actually doing X and thereby learning how worthwhile it is.
If X hits diminishing returns, or I uncover evidence that reduces my confidence in X, then I’ll spend more resources trying to look for alternative paths.
This way, I still get evidence related to prioritisation and forecasting, but I also make progress on object-level projects. The forecasting-related flow of information from project-execution is often sufficient that I don’t need to spend much time explicitly forecasting.
(I realise the terms are insufficiently well-defined, but hopefwly I communicate my intention.)
It seems plausible that if something like this algorithm is widely adopted in the community, we not only make progress on important projects faster, but we also uncover more evidence related to prioritisation and forecasting.
This is good stuff!
I really like your way of framing abstractions as “parametrizations” of the choice function. Another way to think of this is that you want your ontology of things in the world to consist of abstractions with loose coupling.
For example:
Let’s say you’re considering eating something, and you have both “eating an apple” and “eating a blueberry muffin” as options.
Also assume that you don’t have a class for “food” that includes a reference to “satiation” such that “if satiated, then food is low expected utility”. Instead, that rule is encoded into every class of food separately.
Then you’d have to run both “eating an apple” and “eating a blueberry muffin” through the choice function separately in order to figure out that they are low EV. If instead you had a reasonable abstraction for “food”, you could run the choice function once and not have to bother evaluating the subclasses.
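As a toy sketch of that difference (the class names and utility numbers are made up for illustration, not taken from anywhere):

```python
from dataclasses import dataclass, field

@dataclass
class Option:
    name: str
    utility: float  # expected utility, assuming its category is relevant at all

@dataclass
class Category:
    name: str
    options: list = field(default_factory=list)

    def relevant(self, state: dict) -> bool:
        # The shared rule lives on the abstraction, encoded once:
        # "if satiated, then food is low expected utility".
        if self.name == "food":
            return not state.get("satiated", False)
        return True

def choose(categories, state):
    candidates = []
    for cat in categories:
        if not cat.relevant(state):      # one check prunes every member of the category
            continue
        candidates.extend(cat.options)   # only now do we evaluate individual options
    return max(candidates, key=lambda o: o.utility, default=None)

food = Category("food", [Option("eating an apple", 1.0), Option("eating a blueberry muffin", 2.0)])
other = Category("activity", [Option("going for a walk", 1.5)])

print(choose([food, other], {"satiated": True}))   # food pruned with a single check
print(choose([food, other], {"satiated": False}))  # now the apple and the muffin get evaluated
```

The design-debt point below then falls out for free: if the satiation rule turns out to be wrong, it only needs changing in one place.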
Not only does loose coupling help with efficient computation, it also helps with increasing modularity and thereby reducing design debt.
If base-level abstractions are loosely coupled, then even if you build your model of the world on top of them, each one still has a limited number of dependencies on other abstractions.
Thus, if one of the base-level abstractions has a flaw, you can switch it out without having to refactor large parts of your entire model of the world.
A loosely coupled ontology also allows for further specialisation of each abstraction, without having to pay costs of compromise for when abstractions have to serve many different functions.
John von Neumann was a hedgefox.
“The spectacular thing about Johnny [von Neumann] was not his power as a mathematician, which was great, or his insight and his clarity, but his rapidity; he was very, very fast. And like the modern computer, which no longer bothers to retrieve the logarithm of 11 from its memory (but, instead, computes the logarithm of 11 each time it is needed), Johnny didn’t bother to remember things. He computed them. You asked him a question, and if he didn’t know the answer, he thought for three seconds and would produce an answer.”
-- Paul R. Halmos
Just letting you know of The Letten Prize in case you know anyone under 45 who’s done relevant work in Global Health & Development and might wish to spend a few minutes sending in an application. As I understand it, the prize recognises past achievements, so there’s no work to hand in before the deadline.
Prize money: 2.5 MNOK (~USD 235,000).
Deadline: February 6th, 2023.
Applicants last year: 50.
I think it’s worth emphasising that there were only 50 applicants last round. That’s an expected 50,000 NOK even for a random draw, and I think some readers of this forum are well above chance.
To be clear, are you mostly evaluating based on past performance (e.g. stuff already published), such that there’s not much candidates have time to achieve before February to increase their chances? Or are you weighting recent work more?
Sorry if this is only tangentially relevant, but I honestly think more courses, discussion groups, and especially virtual programs could benefit from using the EA Gather Town for their sessions. This doesn’t suit everyone, of course, but I think there are a lot of people for whom it would be optimal on the margin. I would be happy to help with this in any way I can.[1] Get in touch if you’re interested. : )
Yellow hosted some unofficial intro course cohorts here; one of the participants became a regular coworker, and several others have returned to the space every now and then. (Yellow actually invited the students and hosted the courses on their own initiative, and they made a guide! Needless to say, Yellow is pretty awesome.)
One-off events that people travel to are really great for inspiration, learning seriously, and strong connections. But there are significant obstacles to keeping up those connections after people return home to their daily routines. The environments (locale, incentives, activities) where they made the connections are often very different from their habitual environments where they’d have to find a way to maintain the connections. If they live far apart, they might not be the kind of people who have much bandwidth for communicating online, so the connection fades despite wanting to keep in touch.
For fostering long-term high-communication connections between EAs, I suspect local or online activities are underexplored. Events that are more specifically optimised for kickstarting a perpetual social activity (e.g. coworking, or regular meetups in a place they can always return to) for those who want it seem more likely to enable people to keep in touch, and EA Gather is great for that. Probably locally hosted activities work too, but I don’t know much about them.
I or any of the other stewards could give quick intro tours to newcomers on e.g. how to connect with others via the space, community norms, and the benefits of coworking. We could also build out or customise the space for what people want to use it for, but we have plenty of space, so we might already have what you need for what you want to do.