It looks like Sam Harris interviewed Will MacAskill this year. He also interviewed Will in 2016. How might we tell whether the previous interview created a similar number of new EA-survey-takers, or whether this year’s was particularly successful? The data from that year (https://forum.effectivealtruism.org/posts/Cyuq6Yyp5bcpPfRuN/ea-survey-2017-series-how-do-people-get-into-ea) doesn’t seem to include a “podcast” option.
Quick take: this sounds like a pretty good bet, mostly for the indirect effects. You could do it with a ‘contest’ framing instead of an ‘I pay you to produce book reviews’ framing; I don’t know whether that’s meaningfully better.
Yeah, I agree this is unclear. But, staying away from the word ‘intention’ entirely, I think we can & should still ask: what is the best explanation for why this model is the one that minimizes the loss function during training? Does that explanation involve this argument about changing user preferences, or not?
One concrete experiment that could feed into this: if it were the case that feeding users extreme political content did not cause their views to become more predictable, would training select a model that didn’t feed people as much extreme political content? I’d guess training would select the same model anyway, because extreme political content gets clicks in the short term too. (But I might be wrong.)
I was surprised to see that this Gallup poll found no difference between college graduates and college nongraduates (in the US).
Younger people and more liberal people are much more likely to identify as not-straight, and EAs are generally young and liberal. I wonder how far this goes toward explaining the difference, which does need a lot of explaining since it’s so big. Some stats on this (in the US).
Thanks for this work!
I’m wondering about “crazy teenager builds misaligned APS system in a basement” scenarios and to what extent you see the considerations in this report as bearing on those.
To be a bit more precise: I’m thinking about worlds where “alignment is easy” for society at large (i.e. your claim 3 is not true), but building powerful AI is feasible even for people who are not interested in taking the slightest precautions, even the precautions that ordinary self-interest would recommend. I’m thinking mostly of individuals or small groups rather than organizations.
I think these scenarios are distinct from misuse scenarios (which you mention below your report is not intended to cover), though the line is blurry. If someone who wanted to see enormous damage to the world built an AI with the intent of causing such damage, and succeeded, I’d call that “misuse.” But here I’m more interested in “crazy” than “omnicidal,” and in that case I don’t think it’s clear whether to call it “misuse” or not.
Maybe you see this as a pretty separate type of worry from what the report is intended to cover.
Well, I guess he did say you could ask him anything.
From reading the summary in this post, it doesn’t look like the YouTube video discussed bears on the question of whether the algorithm is radicalizing people ‘intentionally,’ which I take to be the interesting part of Russell’s claim.
I just don’t think we’ve seen anything that favors the hypothesis “algorithm ‘intentionally’ radicalizes people in order to get more clicks from them in the long run” over the hypothesis “algorithm shows people what they will click on the most (which is often extreme political content, and this causes them to become more radical, in a self-reinforcing cycle).”
I think that experiment wouldn’t prove anything about the algorithm’s “intentions,” which seem to be the interesting part of the claim. One experiment that maybe would (I have no idea if this is practical) is giving the algorithm a choice between two pieces of content: (a) one with a high likelihood of being clicked on, and (b) one with a lower likelihood of being clicked on, but which makes the people who do click on it more polarized. Not sure if a natural example of such a piece of content exists.
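To make the logic of that experiment a bit more concrete, here’s a toy sketch (in Python, with entirely made-up numbers and a deliberately oversimplified user model; none of this is meant to describe how any real recommender works). The point is that a purely myopic click-maximizer should pick content (a), while an optimizer maximizing clicks over a long enough horizon can come to prefer content (b), because the polarization it induces pays off in later clicks:

```python
# Toy model of the proposed experiment (all numbers are made up).
# Content A: high immediate click probability, no effect on the user.
# Content B: lower immediate click probability, but a click "polarizes"
#            the user, raising their click probability on all later items.

P_CLICK_A = 0.50           # immediate click probability of content A
P_CLICK_B = 0.30           # immediate click probability of content B
POLARIZATION_BOOST = 0.15  # rise in future click prob. after a click on B


def expected_clicks(first_choice: str, horizon: int) -> float:
    """Expected total clicks if we show `first_choice` once, then A forever."""
    if first_choice == "A":
        return horizon * P_CLICK_A
    # Show B first: with probability P_CLICK_B the user clicks and becomes
    # more clickable for the remaining steps; otherwise nothing changes.
    boosted = min(1.0, P_CLICK_A + POLARIZATION_BOOST)
    rest = (horizon - 1) * (P_CLICK_B * boosted + (1 - P_CLICK_B) * P_CLICK_A)
    return P_CLICK_B + rest


for horizon in (1, 5, 20):
    a = expected_clicks("A", horizon)
    b = expected_clicks("B", horizon)
    print(f"horizon={horizon:2d}  E[clicks|A first]={a:.2f}  "
          f"E[clicks|B first]={b:.2f}  -> prefers {'A' if a >= b else 'B'}")
```

With these made-up numbers the preference flips from (a) to (b) only at long horizons, which is exactly the kind of observable difference the experiment would be looking for: which piece of content the deployed system chooses tells you something about its effective optimization horizon, without needing to attribute literal intentions to anything.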
Good question. I’m not sure why you’d privilege Russell’s explanation over the explanation “people click on extreme political content, so the click-maximizing algorithm feeds them extreme political content.”
Agreed. The slight initial edge that drives eventual enormous success in a winner-takes-most market can also be provided by something other than talent; that is, by something other than people trying to do things and succeeding at what they tried to do. For example, the success of Fifty Shades of Grey seems best explained by luck.
The “EA as relief” framing resonated with me (though my background is different) and I appreciate your naming it!
“There is a genius for impoverishment always at work in the world. And it has its way, as if its proceedings were not only necessary but even sensible. Its rationale, its battle cry, is Competition.”
— Marilynne Robinson
Strong upvote for Month of May.
To the extent that reducing demand for chicken prevents or delays the slaughtering of existing chickens, I don’t see why there is an asymmetry. I place positive value on chickens living their chicken lives (when those lives are net-positive, whatever that means). Go beyond that and you get into population ethics.
But more importantly, I think this post uses the term “good action” strictly to mean “action which has positive expected value,” while the common usage of “good” is broader and can include actions which are merely less negative than an alternative.
I don’t think the focus here should be only on suffering. Sometimes, I seek out art/media that depicts human flourishing, out of a desire to increase my altruistic motivation by reminding myself just what it is that we’re working to protect + create.
Obviously a ton of art/media contains “people being happy,” but when I’m looking for this, I look specifically for depictions of people who are very different from each other and from me, depictions that show those people as unique and weird and not at all how you thought they would be. Good examples are the TV show High Maintenance and the documentary In Jackson Heights. It’s a certain aesthetic that increases my altruistic motivation because it reminds me, by showing me more of it than I normally see, of what a vast expanse human experience really is.
(For animals, it’s more socially acceptable to just watch them intently for long periods of time.)
I suppose an example would be that increasing economic growth in a country doesn’t matter if the country later gets blown up or something.
Like, how would I know whether the world is more absorber-y or more sensitive to small changes?
I’m not sure; that’s a pretty interesting question.
Here’s a tentative idea: from the evolution of brains, we can conclude that whatever sensitivity the world has to small changes, it can’t show up *too* quickly. You could imagine a totally chaotic world, where the whole state at time t+(1 second) is radically different depending on minute variations in the state at time t. Building models of such a world that were useful on 1-second timescales would be impossible. But brains are devices for modelling the world that are useful on 1-second timescales. Brains evolved; hence they conferred some evolutionary advantage. Hence we don’t live in that totally chaotic world; the world must be less chaotic than that.
This argument seems to get weaker the longer your timescales are, since our brains presumably faced less evolutionary pressure to be good at prediction on timescales of a year, and still less on timescales of 100 years. But I’m not sure; I’d like to think about this more.
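Just to illustrate what the “totally chaotic world” extreme would look like (a standard textbook example, not a claim about the actual world): in the logistic map with r = 4, two states that start out differing by 1e-10 become completely uncorrelated within a few dozen steps, so any predictor working from imperfect measurements is useless even a short distance ahead:

```python
# Illustrative only: sensitive dependence on initial conditions in the
# logistic map x -> r*x*(1-x) with r = 4, a standard chaotic system.
# Two nearly identical starting states diverge until they are unrelated,
# which is the situation a predictive brain in a "totally chaotic world"
# would face.

def logistic(x: float, r: float = 4.0) -> float:
    return r * x * (1.0 - x)

x, y = 0.2, 0.2 + 1e-10  # two almost-identical initial conditions
for t in range(51):
    if t % 10 == 0:
        print(f"t={t:2d}  x={x:.6f}  y={y:.6f}  |x - y|={abs(x - y):.2e}")
    x, y = logistic(x), logistic(y)
```

The brain-evolution argument above is essentially the observation that, on roughly 1-second timescales, the actual world does not look like this.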
Hey, glad this was helpful! : )
To apply this to conception events: imagine we changed conception events so that girls were much more likely to be conceived than boys (say because that had some good near-term effects, e.g. women tended to be happier at the time). My intuition here is that there could be long-term effects of indeterminate sign (e.g. from increased or decreased population growth) which might dominate the near-term effects. Does that match your intuition?
Yes, that matches my intuition. This action creates a sweeping change in a really complex system; I would be surprised if there were no unexpected effects.
But I don’t see why we should believe all actions are like this. I’m raising the “long-term effects don’t persist” objection, arguing that it seems true of *some* actions.