# riceissa

• How did you decide on “blog posts, cross-posted to EA Forum” as the main output format for your organization? How deliberate was this choice, and what were the reasons going into it? There are many other output formats that could have been chosen instead (e.g. papers, wiki pages, interactive/tool website, blog + standalone web pages, online book, timelines).

• wikieahuborg_w-20180412-history.xml contains the dump, which can be imported to a MediaWiki instance.
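For anyone attempting the import, here is a minimal sketch of the standard route using MediaWiki's bundled maintenance scripts. The paths are examples only; adjust them for your own install.

```shell
# Assumes a working MediaWiki install whose root contains LocalSettings.php,
# with the dump file from the archive.org item placed in that directory.
cd /var/www/mediawiki

# Import page revisions from the XML dump.
php maintenance/importDump.php --conf LocalSettings.php wikieahuborg_w-20180412-history.xml

# Rebuild recent changes and link tables so the imported pages show up properly.
php maintenance/rebuildrecentchanges.php
php maintenance/rebuildall.php
```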

• Re: the old wiki on the EA Hub: I’m afraid the old wiki data got corrupted; it wasn’t backed up properly, and it was deemed too difficult to restore at the time :(. So it looks like the information in that wiki is now lost to the winds.

I think a dump of the wiki is available at https://archive.org/details/wiki-wikieahuborg_w.

• Do you have any thoughts on Qualia Research Institute?

• Over the years, you have published several pieces on ways you’ve changed your mind (e.g. about EA, another about EA, weird ideas, hedonic utilitarianism, and a bunch of other ideas). While I’ve enjoyed reading the posts and the selection of ideas, I’ve also found most of the posts frustrating (the hedonic utilitarianism one is an exception) because they mostly only give the direction of the update, without also giving the reasoning and additional evidence that caused the update* (e.g. in the EA post you write “I am erring on the side of writing this faster and including more of my conclusions, at the cost of not very clearly explaining why I’ve shifted positions”). Is there a reason you keep writing in this style (e.g. you don’t have time, or you don’t want to “give away the answers” to the reader), and if so, what is the reason?

*Why do I find this frustrating? My basic reasoning is something like this: I think this style of writing forces the reader to do a weird kind of Aumann reasoning, where they have to guess what evidence/arguments Buck might have had at the start, and what evidence/arguments he subsequently saw, in order to try to reconstruct the update. When I encounter this kind of writing, I mostly just take it as social information about who believes what, without bothering to go through the Aumann reasoning (because it seems impossible, or would take way too much effort). See also this comment by Wei Dai.

• Do you think non-altruistic interventions for AI alignment (i.e. AI safety “prepping”) make sense? If so, do you have suggestions for concrete actions to take, and if not, why do you think they don’t make sense?

(Note: I previously asked a similar question addressed at someone else, but I am curious for Buck’s thoughts on this.)

• How do you see success/an “existential win” playing out in short-timeline scenarios (e.g. less than 10 years until AGI) where alignment is non-trivial/turns out to not solve itself “by default”? For example, in these scenarios do you see MIRI building an AGI, or assisting/advising another group to do so, or something else?

• [Meta] During the AMA, are you planning to distinguish (e.g. by giving short replies) between the case where you can’t answer a question due to MIRI’s non-disclosure policy vs the case where you won’t answer a question simply because there isn’t enough time/it’s too much effort to answer?

• The 2017 MIRI fundraiser post says “We plan to say more in the future about the criteria for strategically adequate projects in 7a” and also “A number of the points above require further explanation and motivation, and we’ll be providing more details on our view of the strategic landscape in the near future”. As far as I can tell, MIRI hasn’t published any further explanation of this strategic plan. Is MIRI still planning to say more about its strategic plan in the near future, and if so, is there a concrete timeframe (e.g. “in a few months”, “in a year”, “in two years”) for publishing such an explanation?

(Note: I asked this question a while ago on LessWrong.)

• I asked a question on LessWrong recently that I’m curious for your thoughts on. If you don’t want to read the full text on LessWrong, the short version is: do you think it has become harder recently (say, 2013 vs 2019) to become a mathematician at MIRI? Why or why not?

• In November 2018 you said “we want to hire as many people as engineers as possible; this would be dozens if we could, but it’s hard to hire, so we’ll more likely end up hiring more like ten over the next year”. As far as I can tell, MIRI has hired 2 engineers (Edward Kmett and James Payor) since you wrote that comment. Can you comment on the discrepancy? Did hiring turn out to be much more difficult than expected? Are there not enough good engineers looking to be hired? Are there a bunch of engineers who aren’t on the team page/haven’t been announced yet?

• On the SSC roadtrip post, you say “After our trip, I’ll write up a post-mortem for other people who might be interested in doing things like this in the future”. Are you still planning to write this, and if so, when do you expect to publish it?

• Back in July, you held an in-person Q&A at REACH and said “There are a bunch of things about AI alignment which I think are pretty important but which aren’t written up online very well. One thing I hope to do at this Q&A is try saying these things to people and see whether people think they make sense.” Could you say more about what these important things are, and what was discussed at the Q&A?

• I read the paper (skipping almost all the math) and Philip Trammell’s blog post. I’m not sure I understood the paper, and in any case I’m pretty confused about the topic of how growth influences x-risk, so I want to ask you a bunch of questions:

1. Why do the time axes in many of the graphs span hundreds of years? In discussions about AI x-risk, I mostly see something like 20-100 years as the relevant timescale in which to act (i.e. by the end of that period, we will either go extinct or else build an aligned AGI and reach a technological singularity). Looking at Figure 7, if we only look ahead 100 years, it seems like the risk of extinction actually goes up in the accelerated growth scenario.

2. What do you think of Wei Dai’s argument that safe AGI is harder to build than unsafe AGI and we are currently putting less effort into the former, so slower growth gives us more time to do something about AI x-risk (i.e. slower growth is better)?

3. What do you think of Eliezer Yudkowsky’s argument that work for building an unsafe AGI parallelizes better than work for building a safe AGI, and that unsafe AGI benefits more in expectation from having more computing power than safe AGI, both of which imply that slower growth is better from an AI x-risk viewpoint?

4. What do you think of Nick Bostrom’s urn analogy for technological developments? It seems like in the analogy, faster growth just means pulling out the balls at a faster rate, without affecting the probability of pulling out a black ball. In other words, we hit the same amount of risk but everything just happens sooner (i.e. growth is neutral).

5. Looking at Figure 7, my “story” for why faster growth lowers the probability of extinction is this: the richer people are, the less they value marginal consumption, so the more they value safety (relative to consumption). Faster growth gets us sooner to the point where people are rich and value safety. So faster growth effectively gives society less time in which to mess things up (however, I’m confused about why this happens; see the next point). Does this sound right? If not, I’m wondering if you could give a similar intuitive story.

6. I am confused about why the height of the hazard rate in Figure 7 does not increase in the accelerated growth case. I think equation (7) might be the cause of this, but I’m not sure. My own intuition says accelerated growth not only condenses the curve along the time axis, but also stretches it along the vertical axis (so that the area under the curve is mostly unaffected).

As an extreme case, suppose growth halted for 1000 years. It seems like in your model, the graph of the hazard rate would be constant at some fixed level, accumulating extinction probability during that time. But my intuition says the hazard rate would first drop to near zero and then stay constant, because no new dangerous technologies are being invented. At the opposite extreme, suppose we suddenly get a huge boost in growth and effectively reach “the end of growth” (near period 1800 in Figure 7) in an instant. Your model seems to say that the graph would compress so much that we almost certainly never go extinct, but my intuition says we would still experience a lot of extinction risk. Is my interpretation of your model correct, and if so, could you explain why the height of the hazard rate graph does not increase?

This reminds me of the question of whether it is better to walk or run in the rain (keeping distance traveled constant). We can imagine a modification where the raindrops are motionless in the air.
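To make the intuition in points 5 and 6 concrete, here is a toy numeric sketch (my own construction, not the paper’s actual model): survival over a horizon is the exponential of minus the integrated hazard rate, so purely compressing the hazard curve along the time axis lowers total extinction probability, while compression plus a proportional vertical stretch leaves it unchanged.

```python
import math

def p_extinct(hazard_rates, dt=1.0):
    """Extinction probability for a piecewise-constant hazard rate path."""
    cumulative_hazard = sum(h * dt for h in hazard_rates)
    return 1 - math.exp(-cumulative_hazard)

baseline   = [0.002] * 1000  # 0.2%/year hazard for 1000 years
compressed = [0.002] * 500   # accelerated growth as pure time-compression
stretched  = [0.004] * 500   # my intuition: heights scale up, area preserved

print(round(p_extinct(baseline), 3))    # 0.865
print(round(p_extinct(compressed), 3))  # 0.632 (risk falls, as in the model)
print(round(p_extinct(stretched), 3))   # 0.865 (risk unchanged, as my intuition says)
```

On this toy picture, whether accelerated growth reduces cumulative risk comes down entirely to whether compressing the timeline also scales the hazard rate up, which is exactly the walking-vs-running-in-the-rain question: with motionless raindrops, moving faster shortens the time exposed but proportionally raises the rate at which you sweep up drops.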

• Several background variables give rise to worldviews/outlooks about how to make the transition to a world with AGIs go well. Answering this question requires assigning values to the background variables or placing weights on the various worldviews, and then thinking about how likely “Disneyland with no children” scenarios are under each worldview, by e.g. looking at how they solve philosophical problems (particularly deliberation) and how likely obvious vs non-obvious failures are.

That is to say, I think answering questions like this is pretty difficult, and I don’t think there are any deep public analyses about it. I expect most EAs who don’t specialize in AI alignment to do something on the order of “under MIRI’s views the main difficulty is getting any sort of alignment, so this kind of failure mode isn’t the main concern, at least until we’ve solved alignment; under Paul’s views we will sort of have control over AI systems, at least in the beginning, so this kind of failure seems like one of the many things to be worried about; overall I’m not sure how much weight I place on each view, and don’t know what to think, so I’ll just wait for the AI alignment field to produce more insights”.

• The inconsistency is itself a little concerning.

I am one of the contributors to the Donations List Website (DLW), the site you link to. DLW is not affiliated with the EA Hotel in any way (although Vipul, the maintainer of DLW, made a donation to the EA Hotel). Some reasons for the discrepancy in this case:

• As stated in bold letters at the top of the page, “Current data is preliminary and has not been completely vetted and normalized”. I don’t think this is the main reason in this case.

• Pulling data into DLW is not automatic, so there is a lag between when donations are made and when they appear on DLW.

• DLW only tracks public donations.

• The reason may be somewhat simple: most AI alignment researchers do not participate (post or comment) on LW/AF, or participate only a little.

I’m wondering how many such people there are. Specifically, how many people (i) don’t participate on LW/AF, (ii) don’t already get paid for AI alignment work, and (iii) do seriously want to spend a significant amount of time working on AI alignment, or already do so in their free time? (So I want to exclude researchers at organizations, random people who contact 80,000 Hours for advice on how to get involved, people who attend a MIRI workshop or AI safety camp but then happily go back to doing non-alignment work, etc.) My own feeling before reading your comment was that there are maybe 10-20 such people, but it sounds like there may be many more than that. Do you have a specific number in mind?

if you follow just LW, your understanding of the field of AI safety is likely somewhat distorted

I’m aware of this, and I’ve seen Wei Dai’s post and the comments there. Personally, I don’t see an easy way to get access to more private discussions, due to a variety of factors (not being invited to workshops, some workshops being too expensive for it to be worth traveling to, not being eligible to apply for certain programs, and so on).

• A trend I’ve noticed in the AI safety independent research grants for the past two rounds (April and August) is that most of the grantees have little to no online presence as far as I know (they could be using pseudonyms I am unaware of); I believe Alex Turner and David Manheim are the only exceptions. However, when I think about “who am I most excited to give individual research grants to, if I had that kind of money?”, the names I come up with are people who leave interesting comments and posts on LessWrong about AI safety. (This isn’t surprising, because I mostly interact with the AI safety community publicly online, so I don’t have much access to private info.) To give an idea of the kind of people I am thinking of, I would name John Wentworth, Steve Byrnes, Ofer G., Morgan Sinclaire, and Evan Hubinger as examples.

This has me wondering what’s going on. Some possibilities I can think of:

1. the people who contribute on LW aren’t applying for grants

2. the private people are higher quality than the online people

3. the private people have more credentials than the online people (e.g. Hertz Fellowship, math contest experience)

4. fund managers are more receptive offline than online, and it’s easier to network offline

5. fund managers don’t follow online discussions closely

I would appreciate it if the fund managers could weigh in on this, so I have a better sense of why my own thinking seems to diverge so much from the actual grant recommendations.