Also in cell 1F of the results spreadsheet, “salty food” should be “sugary drinks”.
DanielFilan
Who are the 16 people you surveyed to determine the disvalue of reducing people’s freedom to buy sugary drinks?
[EDITED TO ADD: this comment buries the lede a bit: in a great-grandchild I do some napkin math that, if correct, indicates that the survey results mean that the reduced value of freedom entirely negates the health benefits]
we don’t use superintelligent singletons and probably won’t, I hope. We instead create context limited model instances of a larger model and tell it only about our task and the model doesn’t retain information.
FYI, current cutting-edge large language models are trained on a massive amount of text on the internet (in the case of GPT-4, likely approximately all the text OpenAI could get their hands on). So they certainly have tons of information about stuff other than the task at hand.
I asked Alex “no chance you can comment on whether you think assistance games are mostly irrelevant to modern deep learning?”
His response was “i think it’s mostly irrelevant, yeah, with moderate confidence”. He then told me he’d lost his EA forum credentials and said I should feel free to cross-post his message here.
(For what it’s worth, as people may have guessed, I disagree with him—I think you can totally do CIRL-type stuff with modern deep learning, to the extent you can do anything with modern deep learning.)
The core argument of Nick Bostrom’s bestselling book Superintelligence has also aged quite poorly: In brief, the book mostly assumed we will manually program a set of values into an AGI, and argued that since human values are complex, our value specification will likely be wrong, and will cause a catastrophe when optimized by a superintelligence. But most researchers now recognize that this argument is not applicable to modern ML systems which learn values, along with everything else, from vast amounts of human-generated data.
For what it’s worth, the book does discuss value learning as a way of an AI acquiring values—you can see chapter 13 as being basically about this.
I would describe the core argument of the book as the following (going off of my notes of chapter 8, “Is the default outcome doom?”):
It is possible to build AI that’s much smarter than humans.
This process could loop in on itself, leading to takeoff that could be slow or fast.
A superintelligence could gain a decisive strategic advantage and form a singleton.
Due to the orthogonality thesis, this superintelligence would not necessarily be aligned with human interests.
Due to instrumental convergence, an unaligned superintelligence would likely take over the world.
Because of the possibility of a treacherous turn, we cannot reliably check the safety of an AI on a training set.
There are things to complain about in this argument (a lot of “could”s that don’t necessarily cash out to high probabilities), but I don’t think it (or the book) assumes that we will manually program a set of values into an AGI.
Stuart Russell’s “assistance game” research agenda, started in 2016, is now widely seen as mostly irrelevant to modern deep learning— see former student Rohin Shah’s review here, as well as Alex Turner’s comments here.
The second link just takes me to Alex Turner’s shortform page on LW, where ctrl+f-ing “assistance” doesn’t get me any results. I do find this comment when searching for “CIRL”, which criticizes the CIRL/assistance games research program, but does not claim that it is irrelevant to modern deep learning. For what it’s worth, I think it’s plausible that Alex Turner thinks that assistance games is mostly irrelevant to modern deep learning (and plausible that he doesn’t think that); I merely object that the link provided isn’t good evidence for that claim.
The first link is to Rohin Shah’s reviews of Human Compatible and some assistance games / CIRL research papers. ctrl+f-ing “deep” gets me two irrelevant results, plus one description of a paper “which is inspired by [the CIRL] paper and does a similar thing with deep RL”. It would be hard to write such a paper if CIRL (aka assistance games) were mostly irrelevant to modern deep learning. The closest thing I can find is in the summary of Human Compatible, which says “You might worry that the proposed solution [of making AI via CIRL / assistance games] is quite challenging: after all, it requires a shift in the entire way we do AI.” This doesn’t make assistance games irrelevant to modern deep learning—in 2016, it would have been true to say that moving the main thrust of AI research to language modelling so as to produce helpful chatbots required a shift in the entire way we did AI, but research into deeply learned large language models was not irrelevant to deep learning as of 2016; in fact, it sprang out of 2016-era deep learning.
Sorry, maybe this is addressed elsewhere, but what relationship have you had with Nonlinear?
I suppose it’s relevant if you want to get a sense of the chances of ending up in a situation reminiscent of the one depicted in this post if you work for Nonlinear.
FWIW my intuition is that even if it’s permissible to illegally transport life-saving medicines, you shouldn’t pressure your employee to do so. Anyway I’ve set up a Twitter poll, so we’ll see what others think.
I believe there is a reasonable risk should EAs… [d]ate coworkers, especially when there is a power differential and especially when there is a direct report relationship
I think you’re right that there’s some risk in these situations. But also: work is one of the main places where one is able to meet people, including potential romantic partners. Norms against dating co-workers therefore seem quite costly in lost romance, which I think is a big deal! I think it’s probably worth having norms against the cases you single out as especially risky, but otherwise, I’d rather our norms be laissez-faire.
For example some countries ban homosexuality, but your typical American would not consider it blameworthy to be gay.
I would object to my employer asking me to be homosexual.
Are you factoring in that CEA pays a few hundred bucks per attendee? I’d have a high-ish bar to pay that much for someone to go to a conference myself. Altho I don’t have a good sense of what the marginal attendee/rejectee looks like.
Definitely more plausible, but as a rule, “whenever you engage in some risky activity, you should do it to the standards of the top organizations who do it” doesn’t seem a priori plausible.
Would this post meet the standards of investigative journalism that’s typically published in mainstream news outlets such as the New York Times, the Washington Post, or the Economist?
I’m not exactly sure what this means, not being aware of what those standards are. It does strike me that IIUC those venues typically attempt to cover issues of national or international importance (or in the case of the NYT and WaPo, issues of importance to New York City or Washington, DC), and that’s probably the wrong bar for importance for whether someone should publish something on the EA forum or LessWrong.
Anyway, hope these responses satisfy your curiosity!
Does the piece conform to accepted journalist standards in terms of truth, balance, open-mindedness, context-sensitivity, newsworthiness, credibility of sources, and avoidance of libel? (Or is it a biased article that presupposed its negative conclusions, aka a ‘hit piece’, ‘takedown’, or ‘hatchet job’).
As a consumer of journalism, it strikes me that different venues have different such standards, so I’m not really sure what your first question is supposed to mean. Regarding your parenthetical, I think presupposing negative (or positive!) conclusions is to be avoided, and I endorse negatively judging pieces that do that.
Does the piece offer a coherent narrative that’s clearly organized according to a timeline of events, interactions, claims, counter-claims, and outcomes?
I think organization is a virtue, but not a must for a piece to be accurate or worth reading.
Does the piece show ‘scope-sensitivity’ in accurately judging the relative badness of different actions by different people and organizations, in terms of which things are actually trivial, which may have been unethical but not illegal, and which would be prosecutable in a court of law?
This strikes me as a good standard.
Did the author give the key targets of their negative coverage sufficient time and opportunity to respond to their allegations, and were their responses fully incorporated into the resulting piece, such that the overall content and tone of the coverage was fair and balanced?
Given the prominence of the comments sections in the venues where this piece has been published, I’d say allowing the targets to comment satisfies the value expressed by this. At any rate, I do think it’s good to incorporate responses from the targets of the coverage (as was done here), and I think that the overall tone of the coverage should be fair. I don’t know what “balance” is supposed to convey beyond fairness: I think that responses from the targets would ideally be reported where relevant and accurate, but otherwise I don’t think that e.g. half the piece should have to be praising the targets.
Were the anonymous sources credible? Did they have any personal or professional incentives to make false allegations? Are they mentally healthy, stable, and responsible?
I think the first two questions make sense as good criteria (altho criteria that are hard to judge externally). As for the last question, I think somebody could be depressed and routinely show up late to events while still being a good anonymous source, altho for some kinds of mental unhealth, instability, and irresponsibility, I see how they could be disqualifying.
Does the author have significant experience judging the relative merits of contradictory claims by different sources with different degrees of credibility and conflicts of interest?
I think most of us have been in situations where different people have told us different things about some topic, and those different people have had different degrees of credibility and conflict of interest? At any rate, I’m more interested in whether the piece is right than whether the author has had experience.
Does the author have any personal relationship to any of their key sources? Any personal or professional conflicts of interest? Any personal agenda? Was their payment of money to anonymous sources appropriate and ethical?
These seem like reasonable questions to ask. I whole-heartedly agree that such amateur journalists should only make payments that are appropriate and ethical—in fact, this strikes me as tautological.
OK I’m really confused—you calculate ~0.001 DALYs (which I guess is ~9 disability-adjusted life-hours) lost per person to eliminating people’s freedom to consume sugary drinks, adjust that down because you’re not eliminating it but just restricting it, make a second adjustment which I don’t understand but which I’ll assume is OK, multiply by the population of the average country to get a total number of DALYs, then:
But you estimate that taxing sugary drinks won’t eliminate the DMT2 disease burden, but instead reduce it by 0.02%. So shouldn’t this factor instead be 0.001% / 0.02% = 5%?
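For concreteness, here is a minimal sketch of the arithmetic I have in mind, using only the figures quoted in this thread (the spreadsheet itself may combine them differently):

```python
# Napkin math for the comment above. Figures are the ones quoted in this
# thread; treat them as placeholders, not the spreadsheet's actual cells.

# ~0.001 DALYs lost per person is roughly 9 disability-adjusted life-hours:
dalys_per_person = 0.001
hours_per_year = 365.25 * 24
print(dalys_per_person * hours_per_year)  # ~8.8 hours

# If the tax reduces the DMT2 burden by 0.02% rather than eliminating it,
# the factor I'm suggesting is the ratio of the two percentages:
freedom_factor = 0.001 / 100   # 0.001%
burden_reduction = 0.02 / 100  # 0.02%
print(freedom_factor / burden_reduction)  # 0.05, i.e. 5%
```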