Quotes about the long reflection

The Global Priorities Institute’s research agenda says:

The idea of the long reflection is that of a long period—perhaps tens of thousands of years—during which human civilisation, perhaps with the aid of improved cognitive ability, dedicates itself to working out what is ultimately of value (INFORMAL: MacAskill 2018; Lewis 2018). It may be argued that such a period would be warranted before deciding whether to undertake an irreversible decision of immense importance, such as whether to attempt spreading to the stars. Do we find ourselves, or are we likely to find ourselves, in a situation where a ‘long reflection’ would in fact be warranted? If so, how should it be implemented?

I think this is a fascinating idea. And apparently Toby Ord’s book The Precipice (out today!) and Will MacAskill’s upcoming book may include discussion of the idea, which I’m looking forward to.

But as best I can quickly tell, there seemed to be very few publicly accessible sources on the long reflection (at least before Ord’s book). So I thought I’d make a quite unambitious post that just collects all relevant quotes I’ve found after looking through all the Google hits for ‘the “long reflection” macaskill’ and through all posts on the EA Forum and LessWrong that came up when I searched “long reflection”. At the end, I also list some other related work and concepts.

Please comment to let me know if you’re aware of any other sources which I haven’t mentioned here.

80,000 Hours interview with MacAskill

Quote from 80,000 Hours’ summary

Throughout history we’ve consistently believed, as common sense, truly horrifying things by today’s standards. According to University of Oxford Professor Will MacAskill, it’s extremely likely that we’re in the same boat today. If we accept that we’re probably making major moral errors, how should we proceed?

If our morality is tied to common sense intuitions, we’re probably just preserving these biases and moral errors. Instead we need to develop a moral view that criticises common sense intuitions, and gives us a chance to move beyond them. And if humanity is going to spread to the stars it could be worth dedicating hundreds or thousands of years to moral reflection, lest we spread our errors far and wide.

Will is an Associate Professor in Philosophy at Oxford University, author of Doing Good Better, and one of the co-founders of the effective altruism community. In this interview we discuss a wide range of topics:

  • How would we go about a ‘long reflection’ to fix our moral errors?

  • [others]

Quotes from the interview itself

Will MacAskill: If you really appreciate moral uncertainty, and especially if you look back through the history of human progress, we have just believed so many morally abominable things and been, in fact, very confident in them. [...]

Even for people who really dedicated their lives to trying to work out the moral truths. Aristotle, for example, was incredibly morally committed, incredibly smart, way ahead of his time on many issues, but just thought that slavery was a pre-condition for some people having good things in life. Therefore, it was justified on those grounds. A view that we’d now think of as completely abominable.

That makes us think that, wow, we probably have mistakes similar to that. Really deep mistakes that future generations will look back and think, “This is just a moral travesty that people believed it.” That means, I think, we should place a lot of weight on moral option value and gaining moral information. That means just doing further work in terms of figuring out what’s morally the case. Doing research in moral philosophy, and so on. Studying it for yourself.

Secondly, into the future, ensuring that we keep our options open. I think this provides one additional argument for ensuring that the human race doesn’t go extinct for the next few centuries. It also provides an argument for the sort of instrumental state that we should be trying to get to as a society, which I call the long reflection. We can talk about that more.

Robert Wiblin: Humanity should thrive and grow, and then just turn over entire planets to academic philosophy. Is that the view? I think I’m charitable there.

Will MacAskill: Yeah, obviously the conclusion of a moral philosopher saying, “Moral philosophy is incredibly important” might seem very self-serving, but I think it is straightforwardly the implication you get if you at least endorse the premises of taking moral uncertainty very seriously, and so on. If you think we can at least make some progress on moral philosophy. If you reject that view you have to kind of reject one of the underlying premises.

[...]

Robert Wiblin: Before, you mentioned that if humanity doesn’t go extinct in the future, there might be a lot of time and a lot of people and very educated people who might be able to do a lot more research on this topic and figure out what’s valuable. That was a long reflection. What do you think that would actually look like in practice, ideally?

Will MacAskill: Yeah. The key idea is just, different people have different sets of values. They might have very different views for what does an optimal future look like. What we really want ideally is a convergent goal between different sorts of values so that we can all say, “Look, this is the thing that we’re all getting behind that we’re trying to ensure that humanity…” Kind of like this is the purpose of civilization. The issue, if you think about purpose of civilization, is just so much disagreement. Maybe there’s something we can aim for that all sorts of different value systems will agree is good. Then, that means we can really get coordination in aiming for that.

I think there is an answer. I call it the long reflection, which is you get to a state where existential risks or extinction risks have been reduced to basically zero. It’s also a position of far greater technological power than we have now, such that we have basically vast intelligence compared to what we have now, amazing empirical understanding of the world, and secondly tens of thousands of years to not really do anything with respect to moving to the stars or really trying to actually build civilization in one particular way, but instead just to engage in this research project of what actually is of value. What actually is the meaning of life? And have, maybe it’s 10 billion people, debating and working on these issues for 10,000 years because the importance is just so great. Humanity, or post-humanity, may be around for billions of years. In which case spending a mere 10,000 years is actually absolutely nothing.

In just the same way as if you think as an individual, how much time should you reflect on your own values before choosing your career and committing to one particular path.

Robert Wiblin: Probably at least a few minutes. At least .1% of the whole time.

Will MacAskill: At least a few minutes. Exactly. When you’re thinking about the vastness of the potential future of civilization, the equivalent of just a few minutes is tens of thousands of years.

Then, there’s questions about how exactly do you structure that. I think it would be great if there was more work done really fleshing that out. Perhaps that’s something you’ll have time to do in the near future. One thing you want to do is have as little locked in as possible. So, you want to be very open both on… You don’t want to commit to one particular moral methodology. You just want to commit to things that seem extremely good for basically whatever moral view you might think ends up as correct or what moral epistemology might be correct.

Just people having a higher IQ but everything else being equal, that just seems strictly good. People having greater empirical understanding just seems strictly good. People having a better ability to empathize. That all seems extremely good. People having more time. Having cooperation seems extremely good. Then I think, yeah, like you say, many different people can get behind this one vision for what we want humanity to actually do. That’s potentially exciting because we can coordinate.

It might be that one of the conclusions we come to takes moral uncertainty into account. We might say, actually, there’s some fundamental things that we just can’t ultimately resolve and so we want to do a compromise between them. Maybe that means that for civilization, part of civilization’s devoted to common sense, thick values of pursuit of art, and flourishing, and so on, whereas large parts of the rest of civilization are devoted to other values like pure bliss, blissful state. You can imagine compromise scenarios there. It’s just large amounts of civilization… The universe is a big place.

Quotes from an AI Alignment Podcast interview with MacAskill

Will MacAskill: In terms of answering this alignment problem, the deep one of just where ought societies to be going [working out what’s actually right and what’s actually wrong and what ought we to be doing], I think the key thing is to punt it. The key thing is to get us to a position where we can think about and reflect on this question, and really for a very long time, so I call this the long reflection. Perhaps it’s a period of a million years or something. We’ve got a lot of time on our hands. It’s really not the kind of scarce commodity, so there are various stages to get into that state.

The first is to reduce extinction risks down basically to zero, put us in a position of kind of existential security. The second then is to start developing a society where we can reflect as much as possible and keep as many options open as possible.

Something that wouldn’t be keeping a lot of options open would be, say we’ve solved what I call the control problem, we’ve got these kind of lapdog AIs that are running the economy for us, and we just say, “Well, these are so smart, what we’re gonna do is just tell it, ‘Figure out what’s right and then do that.’” That would really not be keeping our options open. Even though I’m sympathetic to moral realism and so on, I think that would be quite a reckless thing to do.

Instead, what we want to have is something kind of … We’ve gotten to this position of real security. Maybe also along the way, we’ve fixed the various particularly bad problems of the present, poverty and so on, and now what we want to do is just keep our options open as much as possible and then kind of gradually work on improving our moral understanding where if that’s supplemented by AI system …

I think there’s tons of work that I’d love to see developing how this would actually work, but I think the best approach would be to get the artificially intelligent agents to be just doing moral philosophy, giving us arguments, perhaps creating new moral experiences that it thinks can be informative and so on, but letting the actual decision making or judgments about what is right and wrong be left up to us. Or at least have some kind of graduated thing where we gradually transition the decision making more and more from human agents to artificial agents, and maybe that’s over a very long time period.

What I kind of think of as the control problem in that second level alignment problem, those are issues you face when you’re just addressing the question of, “Okay. Well, we’re now gonna have an AI-run economy,” but you’re not yet needing to address the question of what’s actually right or wrong. And then my main thing there is just we should get ourselves into a position where we can take as long as we need to answer that question and have as many options open as possible.

Lucas: I guess here given moral uncertainty and other issues, we would also want to factor in issues with astronomical waste into how long we should wait?

Will: Yeah. That’s definitely informing my view, where it’s at least plausible that morality has an aggregative component, and if so, then the sheer vastness of the future may, because we’ve got half a billion to a billion years left on Earth, a hundred trillion years before the stars burn out, and then … I always forget these numbers, but I think like a hundred billion stars in the Milky Way, ten trillion galaxies.

With just vast resources at our disposal, the future could be astronomically good. It could also be astronomically bad. What we want to ensure is that we get to the good outcome, and given the time scales involved, even what seems like an incredibly long delay, like a million years, is actually just very little time indeed.

Lucas: In half a second I want to jump into whether or not this is actually likely to happen given race dynamics and that human beings are kind of crazy. The sort of timeline here is that we’re solving the technical control problem up into and on our way to sort of AGI and what might be superintelligence, and then we are also sort of idealizing everyone’s values and lives in a way such that they have more information and they can think more and have more free time and become idealized versions of themselves, given constraints within issues of values canceling each other out and things that we might end up just deeming to be impermissible.

After that is where this period of long reflection takes place, and sort of the dynamics and mechanics of that are seemingly open questions. It seems that first comes computer science and global governance and coordination and strategy issues, and then comes a long time of philosophy.

Will: Yeah, then comes the million years of philosophy, so I guess not very surprising a philosopher would suggest this. Then the dynamics of the setup is an interesting question, and a super important one.

One thing you could do is just say, “Well, we’ve got ten billion people alive today, let’s say. We’re gonna divide the universe into ten billionths, so maybe that’s a thousand galaxies each or something.” And then you can trade after that point. I think that would get a pretty good outcome. There’s questions of whether you can enforce it or not into the future. There’s some arguments that you can. But maybe that’s not the optimal process, because especially if you think that “Wow! Maybe there’s actually some answer, something that is correct,” well, maybe a lot of people miss that.
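[As a rough arithmetic check on the “thousand galaxies each” figure, taking at face value the ten billion people mentioned here and the roughly ten trillion galaxies MacAskill cites earlier in the interview:]

$$\frac{10^{13}\ \text{galaxies}}{10^{10}\ \text{people}} = 10^{3}\ \text{galaxies per person,}$$

[which is consistent with the figure in the quote.]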

I actually think if we did that and if there is some correct moral view, then I would hope that incredibly well informed people who have this vast amount of time, and perhaps intellectually augmented people and so on who have this vast amount of time to reflect, would converge on that answer, and if they didn’t, then that would make me more suspicious of the idea that maybe there is a real fact of the matter. But it’s still the early days; we’d really want to think a lot about what goes into the setup of that kind of long reflection.

[The discussion from that point to “If it’s the case that there is a right answer.” is also very relevant.]

[See also Rohin Shah’s brief summary of and commentary about this interview.]

Cause prioritization for downside-focused value systems by Lukas Gloor

Quote from the article

I’m using the term downside-focused to refer to value systems that in practice (given what we know about the world) primarily recommend working on interventions that make bad things less likely. [...]

By contrast, other moral views place great importance on the potential upsides of very good futures [...] I will call these views upside-focused.

[...]

Some people have argued that even (very) small credences in upside-focused views [which roughly means moral views which place great importance on the potential upsides of very good futures], such as 1-20% for instance, would in itself already speak in favor of making extinction risk reduction a top priority because making sure there will still be decision-makers in the future provides high option value. I think this gives by far too much weight to the argument from option value. Option value does play a role, but not nearly as strong a role as it is sometimes made out to be. To elaborate, let’s look at the argument in more detail: The naive argument from option value says, roughly, that our descendants will be in a much better position to decide than we are, and if suffering-focused ethics or some other downside-focused view is indeed the outcome of their moral deliberations, they can then decide to not colonize space, or only do so in an extremely careful and controlled way. If this picture is correct, there is almost nothing to lose and a lot to gain from making sure that our descendants get to decide how to proceed.

I think this argument to a large extent misses the point, but seeing that even some well-informed effective altruists seem to believe that it is very strong led me to realize that I should write a post explaining the landscape of cause prioritization for downside-focused value systems. The problem with the naive argument from option value is that the decision algorithm that is implicitly being recommended in the argument, namely focusing on extinction risk reduction and leaving moral philosophy (and s-risk reduction in case the outcome is a downside-focused morality) to future generations, makes sure that people follow the implications of downside-focused morality in precisely the one instance where it is least needed, and never otherwise. If the future is going to be controlled by philosophically sophisticated altruists who are also modest and willing to change course given new insights, then most bad futures will already have been averted in that scenario. An outcome where we get long and careful reflection without downsides is far from the only possible outcome. In fact, it does not even seem to me to be the most likely outcome (although others may disagree). No one is most worried about a scenario where epistemically careful thinkers with their heart in the right place control the future; the discussion is instead about whether the probability that things will accidentally go off the rails warrants extra-careful attention. (And it is not as though it looks like we are particularly on the rails currently either.) Reducing non-AI extinction risk does not preserve much option value for downside-focused value systems because most of the expected future suffering probably comes not from scenarios where people deliberately implement a solution they think is best after years of careful reflection, but instead from cases where things unexpectedly pass a point of no return and compassionate forces do not get to have control over the future. Downside risks by action likely loom larger than downside risks by omission, and we are plausibly in a better position to reduce the most pressing downside risks now than later. (In part because “later” may be too late.)

This suggests that if one is uncertain between upside- and downside-focused views, as opposed to being uncertain between all kinds of things except downside-focused views, the argument from option value is much weaker than it is often made out to be. Having said that, non-naively, option value still does upshift the importance of reducing extinction risks quite a bit – just not by an overwhelming degree. In particular, arguments for the importance of option value that do carry force are for instance:

  • There is still some downside risk to reduce after long reflection

  • Our descendants will know more about the world, and crucial considerations in e.g. infinite ethics or anthropics could change the way we think about downside risks (in that we might for instance realize that downside risks by omission loom larger than we thought)

  • One’s adoption of (e.g.) upside-focused views after long reflection may correlate favorably with the expected amount of value or disvalue in the future (meaning: conditional on many people eventually adopting upside-focused views, the future is more valuable according to upside-focused views than it appears during an earlier state of uncertainty)

The discussion about the benefits from option value is interesting and important, and a lot more could be said on both sides. I think it is safe to say that the non-naive case for option value is not strong enough to make extinction risk reduction a top priority given only small credences in upside-focused views, but it does start to become a highly relevant consideration once the credences become reasonably large. Having said that, one can also make a case that improving the quality of the future (more happiness/value and less suffering/disvalue) conditional on humanity not going extinct is probably going to be at least as important for upside-focused views and is more robust under population ethical uncertainty – which speaks particularly in favor of highly prioritizing existential risk reduction through AI policy and AI alignment.

My commentary

Much of the rest of that article is also somewhat relevant to the concept of the long reflection.

From memory, I think somewhat similar points are made in the interesting post The expected value of extinction risk reduction is positive, though that post doesn’t use the term “long reflection”.

Other places where the term was used in a relevant way

These are sources that explicitly refer to the concept of the long reflection, but which essentially just repeat parts of what the above quotes already say:

These are sources which may say something new about the concept, but which I haven’t read properly, so I don’t want to risk misleadingly pulling quotes from them out of context:

Some other somewhat relevant concepts

  • Bostrom’s concept of technological maturity: “the attainment of capabilities affording a level of economic productivity and control over nature close to the maximum that could feasibly be achieved.”

  • “Stably good futures”: “those where society has achieved enough wisdom and coordination to guarantee the future against existential risks and other dystopian outcomes, perhaps with the aid of Friendly AI (FAI).”

    • The post contrasts this against “Stably bad futures (‘bad outcomes’)[, which] are those where existential catastrophe has occurred.”

  • Option value


I hope you’ve found this post useful. Hopefully Toby Ord’s book and/or Will MacAskill’s book will provide a more comprehensive, detailed discussion of the concept, in which case this post can serve just as a record of how the concept was discussed in its early days. I’d also be interested to see EA Forum users writing up their own fleshed-out versions of, critiques of, or thoughts on the long reflection, either as comments here or as their own posts.

And as I said earlier, please comment to let me know if you’re aware of any other relevant sources which I haven’t mentioned here.