Short-Term AI Alignment as a Priority Cause

In this post, I will argue that short-term AI alignment should be viewed as today’s greatest priority cause, whether or not you are concerned about long-term AGI risks.

To make this case, I will first stress that AIs are automating the collection, storage, analysis and dissemination of information, and that they now do much of this far better than humans. Yet many of the priority cause areas in EA depend strongly on collecting, storing, analyzing and disseminating quality information. As of today, an aligned large-scale AI would thus be a formidable ally for EA.

In this post, I will particularly focus on the cases of public health, animal suffering, critical thinking and existential risks, since these are leading priority causes in EA.

The Power of Information

I have already discussed this on LessWrong. But I fear that AI discussions are too often focused on futuristic robots, even within the EA community. In this post, by contrast, I propose to stress the growing role of algorithms and information processing in our societies.

It is indeed noteworthy that the greatest disruptions in human history have arguably been the inventions of new information technologies. Language enabled coordination, writing enabled long-term information storage, the printing press enabled scalable information dissemination, and computing machines enabled ultra-fast information processing. We now also have all sorts of sensors and cameras to scale up data collection, and worldwide fiber optics for highly reliable information communication.

Information technologies powered new kinds of economies, organizations, scientific discoveries, industrial revolutions, agricultural practices and product customization. They also turned our societies into information societies. These days, essentially all jobs are information processing jobs, from the CEO of the largest company down to the babysitter. Scientists, journalists, managers, software developers, lawyers, doctors, workers, teachers, drivers, regulators, and even effective altruists (or me writing this post) all spend their days doing mostly information processing: they collect, store, analyze and communicate information.

When Yuval Noah Harari came to EPFL, his talk focused largely on information. “Those who control the flow of data in the world control the future, not only of humanity, but also perhaps of life itself”, he said. This is because, according to Harari, “humans are hackable”. As psychology keeps showing, the information we are exposed to radically biases our beliefs, preferences and habits, with both short-term and long-term effects.

Now ask yourself. Today, who most controls the flow of information? Which entity holds, more than any other, what Harari calls “the future of life”?

I would argue that algorithms have, by far, taken control of the flow of information. Well, not all algorithms. Arguably, a handful of algorithms control the flow of information more than all humans combined; and arguably, the YouTube algorithm controls information more than any other algorithm, with 1 billion hours of watch time per day across 2 billion users, 70% of which results from recommendations.

And as algorithms become better and better at complex information processing, economic incentives seem bound to hand them more and more control of the information that powers our information societies. It seems critical that they be aligned with what we truly want them to do.

How short-term alignment can help all EA causes

In what follows, I will focus in particular on the global impact that aligning large-scale algorithms, like the YouTube algorithm, could have on some of the main EA causes.

Impact on public health

Much of healthcare is an information processing challenge. In particular, early diagnosis is often critical to effective treatment. Enabling anomaly detection with non-intrusive sensors, like a mere picture taken with a phone, could greatly improve public health, especially if accompanied by adequate medical recommendations (which may be as simple as “you should see a doctor”). While exciting, and while there are major AI safety challenges in this regard, I will not dwell on them, since alignment is arguably not the bottleneck here.

On the other hand, much of public health has to do with daily habits, which are strongly influenced by recommender systems like the YouTube algorithm. Unfortunately, as long as they are unaligned, these recommender systems might encourage poor habits, like eating fast food, taking the car everywhere or binge-watching videos for hours without exercising.

More aligned recommender systems might instead encourage good habits, for instance better hygiene, quality food and regular exercise. By adequately customizing video recommendations, it might even be possible to motivate users to cook healthy food or to practice sports they are more likely to enjoy.

A more tractable beneficial recommendation could be the promotion of evidence-based medicine with large effect sizes, like vaccination. The World Health Organization reported 140,000 deaths from measles in 2018, a disease for which a vaccine exists. Unfortunately, anti-vaccination propaganda seems to have slowed down the systematic vaccination of children. Even if only 10% of measles deaths could have been avoided by exposure to better information, this still represents more than ten thousand lives a year that could be saved by more aligned recommendation algorithms, for measles alone.
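A quick back-of-the-envelope check of this claim, using the WHO figure above (the 10% avoidable fraction is this post’s illustrative assumption, not a measured quantity):

```python
# Back-of-the-envelope estimate: lives saved per year if better-aligned
# recommendations had prevented 10% of measles deaths.
measles_deaths_2018 = 140_000  # WHO-reported measles deaths in 2018
avoidable_fraction = 0.10      # hypothetical share avoidable via better information

lives_saved = round(measles_deaths_2018 * avoidable_fraction)
print(lives_saved)  # → 14000
```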

As a more distinctively EA example, we can consider the case of the Malaria Consortium (or other GiveWell top charities). Much of philanthropy could become a lot more effective if donors were better informed. An aligned recommender could stress this fact and recommend effective charities, as opposed to appealing but ineffective ones. Thousands, if not hundreds of thousands, of lives could probably be saved by exposing potential donors to better-quality information.

To conclude this section, I would like to stress the growing challenges of mental health. This will arguably be the ultimate frontier of healthcare, and a major cause for utilitarians. Unfortunately, fighting addiction, loneliness, depression and suicide seems nearly intractable through conventional channels. But data from social media may provide radically new means to diagnose these mental health conditions, as a Facebook study suggests. Interestingly, by aligning recommender algorithms, social media could also provide means to treat such conditions, for instance by recommending effective therapeutic content. Indeed, studies have shown that mere exposure to the principles of cognitive behavioral therapy improved patients’ conditions. Alternatively, algorithms could simply recommend content that encourages viewers in need to see a psychiatrist.

Impact on animal suffering

Another important cause in EA is animal suffering. Here again, information seems critical. Most people appear simply unaware of the horror and scale of industrial farming. They also seem to neglect the impact of their daily consumption on the incentive structure that motivates industrial farming.

But this is not all. Our food consumption habits arguably depend strongly on our social and informational environments. By fostering communities that, for instance, enjoy trying different meat substitutes, it becomes more likely that a larger population can be convinced to at least try such substitutes, which could significantly reduce our impact on animal suffering and on the environment.

(I once pointed this out to Ed Winters, a vegan YouTuber and activist, who acknowledged that the number of views of his videos seems mostly controlled by the YouTube algorithm. Our discussion was recorded, and I guess it will be on his YouTube channel soon...)

It may also be possible to nudge biologists and aspiring biologists towards research on, say, meat substitutes. This seems critical to accelerating the development of such substitutes, and to lowering their price, which could then have a strong impact on animal suffering.

Finally, one of the great challenges for cultivated meat may be its social acceptance. There may be a lot of skepticism merely due to a misunderstanding, either of the nature of cultivated meat, or of the “naturalness” of conventional meat.

Impact on critical thinking

This leads us to what may be one of the most impactful consequences of aligned recommender systems. It might become possible to promote critical thinking much more effectively, at least within intellectual communities. Improving the way a large population thinks may be one of the most effective ways to do a lot of good in a robust manner.

As a convinced Bayesian (with an upcoming book on the topic!), I feel that the scientific community (and others) would gain a lot by pondering their epistemology at much greater length, that is, how they came to believe what they believe, and what they ought to do to acquire more reliable beliefs. Unfortunately, most scientists seem to neglect the importance of thinking in bets. While they usually acknowledge that they are poor at probability theory, they mostly seem fine with their inability to think probabilistically. When it comes to preparing ourselves for an uncertain future, this shortcoming seems very concerning. Arguably, this is why AI researchers are not sufficiently concerned by AGI risks.

An aligned algorithm could promote content that stresses the importance of thinking probabilistically, the fundamental principles for doing so, and useful exercises to train our intuitions about probability, like the Bayes-up application.
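To make this concrete, here is a minimal sketch of the kind of Bayes-rule exercise such training drills; the sensitivity, false-positive rate and prevalence figures are my own illustrative choices, not taken from any particular app:

```python
def bayes_update(prior, sensitivity, false_positive_rate):
    """Posterior P(H | positive test) via Bayes' rule for a binary hypothesis."""
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# A test with 90% sensitivity and a 5% false-positive rate, applied to a
# condition with 1% prevalence: the posterior is only about 15%, far lower
# than most untrained intuitions suggest.
posterior = bayes_update(prior=0.01, sensitivity=0.90, false_positive_rate=0.05)
print(f"{posterior:.3f}")  # → 0.154
```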

Perhaps more importantly still, an aligned algorithm could be critical to promoting intellectual honesty. Studies suggest that what is lacking in people’s reasoning is often not information itself, but the ability to process information in an effective, unbiased manner. For instance, more informed Republicans are also more likely to deny climate change. One hypothesis is that this is because they also gain the ability to better lie to themselves.

In this video (and probably her upcoming book), Julia Galef argues that the most effective way to combat our habit of lying to ourselves is to design incentives that reward intellectual honesty: changing our own minds, providing clear arguments, calling out our own bullshit, and so on. While many such rewards may be designed internally (by and for our own brains), because we are social creatures, most will likely need to come from our environments. Today, much of this environment, and of the social rewards we receive, comes from social media; and unfortunately, most people usually receive greater rewards (likes and retweets) for being offensive, sarcastic and caricatural.

An aligned algorithm could align our own rewards with what motivates intellectual honesty, by favoring connections with communities that value intellectual honesty, modesty and a growth mindset. Thereby, the aligned algorithm may be effective in aligning ourselves with what we truly desire; not with our bullshit.

Impact on existential risks

What may be most counter-intuitive is that short-term alignment may be extremely useful for long-term AGI alignment (and against other existential risks). In fact, to be honest, this is why I care so much about short-term alignment: I see it as the most effective way to increase the probability of achieving long-term AGI alignment.

An aligned recommender algorithm could typically promote video content on long-term concerns. This would be critical for nudging people towards longer-term perspectives, and for combating our familiarity bias. It also seems crucial to defend the respectability of long-term perspectives.

Perhaps more importantly, the great advantage of focusing on short-term alignment is that it makes it a lot easier to convince scientists and philosophers, but also engineers, managers and politicians, to invest time and money in alignment. Yet all of these forms of expertise (and still others) seem critical for robust alignment. We will likely need a formidable interdisciplinary collaboration of thousands, if not hundreds of thousands, of scholars and professionals to significantly increase the probability of long-term AGI alignment. So let’s start recruiting them, one after the other, using arguments that they will find more compelling.

But this is not all. Since short-term alignment is arguably not completely different from long-term alignment, this research may be excellent practice for outlining the cruxes and pitfalls we will encounter in long-term alignment. In fact, some of the research on short-term alignment (see for instance this page on social choice) might be giving more reliable insights into long-term alignment than long-term alignment research itself, which is arguably sometimes too speculative.

Indeed, it seems quite plausible that long-term alignment will require aligning the algorithms of big (private or governmental) organizations, even though most people in these organizations neglect the negative side effects of their algorithms.


I have sometimes faced a sort of contempt for short-term agendas within EA. I hope to have convinced you in this post that this contempt may have been highly counter-productive, because it might have led to the neglect of short-term AI alignment research. Yet short-term AI alignment research seems critical to numerous EA causes, perhaps even including long-term AGI alignment.

To conclude, I would like to stress that this post is the result of years of reflection by a few of us, mostly based in Lausanne, Switzerland. Our reflections culminated in the publication of a robustly beneficial AI book in French called Le Fabuleux Chantier, whose English translation is pending (feel free to contact us directly to see the current draft). But we have also explored other information dissemination formats, like the Robustly Beneficial Podcast (YouTube, iTunes, RSS) and the Robustly Beneficial Wiki.

In fact, after successfully initiating a research group at EPFL (with papers at ICML, NeurIPS,...), we are in the process of starting an AI safety company, called Calicarpa, to exploit our published results and software (see for example this). We have also convinced a researcher in Morocco to tackle these questions, who is now building a team and looking for 3 postdocs to do so.