I guess I am way late to the party, but...
What part of the MIRI research agenda do you think is the most accessible to people with the least background?
How could AI alignment research be made more accessible?
This is really interesting stuff, and thanks for the references.
It’d be nice to clarify what “finite intergenerational equity over [0,1]^N” means (specifically, the “over [0,1]^N” bit).
Why isn’t the sequence 1, 1, 1, … a counterexample to Thm 4.8 (dictatorship of the present)? I’m imagining exponential discounting, e.g. with a discount factor of 1/2, so the welfare function of this stream should return 2 (but a different number if u_t is changed, for any t).
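Spelling out the arithmetic (a sketch, assuming the standard discounted-sum welfare function, which may or may not be in the class the theorem quantifies over):

$$W(u) = \sum_{t=0}^{\infty} \left(\tfrac{1}{2}\right)^{t} u_t, \qquad W(1,1,1,\dots) = \sum_{t=0}^{\infty} \left(\tfrac{1}{2}\right)^{t} = 2,$$

and changing any single $u_t$ shifts $W(u)$ by $(1/2)^t \, \Delta u_t \neq 0$, so this $W$ appears sensitive to every generation, not just a finite prefix.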
Right, I was going to mention the fact that AIS-concerned people are very interested in courting the ML community, and very averse to anything which might alienate them, but it’s already come up.
I’m not sure I agree with this strategy. I think we should maybe be more “good cop / bad cop” about it. I think the response so far from ML people is almost indefensible, and the AIS folks have won every debate so far, but of course there’s this phenomenon with debate where you think your side won ;).
If it ends up being necessary to slow down research, or, more generally, to carefully control AI technology in some way, then we might have genuine conflicts of interest with AI researchers which can’t be resolved solely by good-cop tactics. This might be the case if, e.g., using SOTA AIS techniques significantly impairs performance or research, which I think is likely.
It’s still a huge instrumental good to get more ML people into AIS and supportive of it, but I don’t like to see AIS people bending over backwards to do this.
In general, I think people are being too conservative about addressing the issue. I think we need some “radicals” who aren’t as worried about losing some credibility. Whether to aim for mainstream appeal or just be straightforward with people about the issue is a strategic question that should be considered case-by-case.
Of course, it is a big problem that talking about AIS makes a good chunk of people think you’re nuts. It’s been my impression that most of those people are researchers; the general public actually seems quite receptive to the idea (although maybe for the wrong reasons...)
Hey, I (David Krueger) remember we spoke about this a bit with Toby when I was at FHI this summer.
I think we should be aiming for something like CEV, but we might not get it, and we should definitely consider scenarios where we have to settle for less.
For instance, some value-aligned group might find that its best option (due to competitive pressures) is to create an AI which has a 50% probability of being CEV-like or “aligned via corrigibility”, but has a 50% probability of (effectively) prematurely settling on a utility function whose goodness depends heavily on the nature of qualia.
If (as I believe) such a scenario is likely, then the problem is time-sensitive.
Thanks for this!
A few comments:
RE: public policy / outreach:
“However, I now think this is a mistake.” What do you think is a mistake?
“Given this, I actually think policy outreach to the general population is probably negative in expectation.” I think this makes more sense if you see us as currently on or close to a winning path. I am more pessimistic about our current prospects for victory, so I favor a higher risk/reward. I tend to see paths to victory as involving a good chunk of the population having a decent level of understanding of AI and Xrisk.
“I think this is why a number of EA organisations seem to have seen sublinear returns to scale.” Which ones?
“There have been at least two major negative PR events, and a number of near misses.” again, I am very curious what you refer to!
MIRI seems like the most value-aligned and unconstrained of the orgs.
OpenAI also seems pretty unconstrained, but I have no idea what their perspective on Xrisk is, and all reports are that there is no master plan there.
I think I was too terse; let me explain my model a bit more.
I think there’s a decent chance (OTTMH, let’s say 10%) that without any deliberate effort we make an AI which wipes out humanity, but is anyhow more ethically valuable than us (although not more than something we deliberately design to be ethically valuable). This would happen if it were the default outcome (e.g. if it turns out that intelligence ~ ethical value). This may actually be the most likely path to victory.*
There’s also some chance that all we need to do to ensure that AI has (some) ethical value (e.g. due to having qualia) is X. In that case, we might increase our chance of doing X by understanding qualia a bit better.
Finally, my point was that I can easily imagine a scenario in which our alternatives are:
1. Build an AI with a 50% chance of being aligned and a 50% chance of just being an AI (with P(AI has property X) = 90% if we understand qualia better, 10% otherwise).
2. Allow our competitors to build an AI with ~0% chance of being ethically valuable.
So then we obviously prefer option 1, and if we understand qualia better, option 1 becomes better.
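To make the comparison concrete (a toy calculation, assuming property X is what gives the unaligned AI its ethical value, as in the point above): understanding qualia better gives

$$P(\text{some ethical value}) = 0.5 + 0.5 \times 0.9 = 0.95,$$

versus $0.5 + 0.5 \times 0.1 = 0.55$ without that understanding, and ~0 under option 2. So the qualia research raises the chance of an ethically valuable outcome by 40 percentage points in this scenario.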
*I notice as I type this that this may have some strange consequences RE high-level strategy; e.g. maybe it’s better to just make something intelligent ASAP and hope that it has ethical value, because this reduces its X-risk, and we might not be able to do much to change the distribution of the ethical value the AI we create produces anyhow. I tend to think that we should aim to be very confident that the AI we build is going to have lots of ethical value, but this may only make sense if we have a pretty good chance of succeeding.
Sure, but the examples you gave are more about tactics than content. What I mean is that a lot of people are downplaying their level of concern about Xrisk in order not to turn off people who don’t appreciate the issue. I think that can be a good tactic, but it also risks reducing the sense of urgency people have about AI Xrisk, and it can lead to incorrect strategic conclusions, which could even be disastrous when they inform crucial policy decisions.
TBC, I’m not saying we are lacking in radicals ATM; the level is probably about right. I just don’t think everyone should be moderating their stance in order to maximize their credibility with the (currently ignorant, but increasingly less so) ML research community.
EDIT: I forgot to link to the Google group: https://groups.google.com/forum/#!forum/david-kruegers-80k-people
Hi! David Krueger (from Montreal and 80k) here. The advice others have given so far is pretty good.
My #1 piece of advice is: start doing research ASAP!
Start acting like a grad student while you are still an undergrad; this is almost a requirement for getting into a top program afterwards. Find a supervisor, and ideally try to publish a paper at a good venue before you graduate.
Stats is probably a bit more relevant than CS, but some of both is good. I definitely recommend learning (some) programming. In particular, focus on machine learning (esp. Deep Learning and Reinforcement Learning). Do projects, build a portfolio, and solicit feedback.
If you haven’t already, please check out these groups I created for people wanting to get into AI Safety. There are a lot of resources to get you started in the Google Group, and I will be adding more in the near future. You can also contact me directly (see https://mila.umontreal.ca/en/person/david-scott-krueger/ for contact info) and we can chat.
“This is something the EA community has done well at, although we have tended to focus on talent that current EA organization might wish to hire. It may make sense for us to focus on developing intellectual talent as well.”
Definitely!! Are there any EA essay contests or similar? More generally, I’ve been wondering recently if there are many efforts to spread EA among people under the age of majority. The only example I know of is SPARC.
“But maybe that’s just because I am less satisfied with the current EA “business model”/”product” than most people.”
Care to elaborate (or link to something)?
People are motivated both by:
1. competition and status, and
2. cooperation and identifying with the successes of a group.
I think we should aim to harness both of these forms of motivation.
Do you have any info on how reliable self-reports are WRT counterfactuals about career changes and GWWC pledging?
I can imagine that people would not be very good at predicting that accurately.
I was overall a bit negative on Sarah’s post, because it demanded a bit too much attention (e.g. the title) and seemed somewhat polemical. It was definitely interesting, though, and I learned some things.
I find the most evocative bit to be the idea that EA treats outsiders as “marks”.
This strikes me as somewhat true, and sadly short-sighted WRT movement building.
I do believe in the ideas of EA, and I think they are compelling enough that they can become mainstream.
Overall, though, I think it’s just plain wrong to argue for an unexamined idea of honesty as some unquestionable ideal. I think doing so as a consequentialist, without a very strong justification, itself smacks of disingenuousness and seems motivated by the same phony and manipulative attitude towards PR that Sarah’s article attacks.
What would be more interesting to me would be a thoughtful survey of potential EA perspectives on honesty, but an honest treatment of the subject does seem to be risky from a PR standpoint. And it’s not clear that it would bring enough benefit to justify the cost. We probably will all just end up agreeing with common moral intuitions.
(cross-posted on Facebook):
I was thinking of applying… it’s a question I’m quite interested in. The deadline is the same as ICML’s, though!
I had an idea I’ll mention here: funding pools.
- You and friends whose values and judgment you trust, and who all have small-scale funding requests, join together.
- A potential donor evaluates one funding opportunity at random, and funds all or none of them on the basis of that evaluation.
- You have now increased the ratio of funding to evaluation effort available to a potential donor by a factor of #projects.
There is an incentive for you to NOT include people in your pool if you think their proposal is quite inferior to yours… however, you might be incentivized to include somewhat inferior proposals in order to reach a threshold where the combined funding opportunity is large enough to attract more potential donors.
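Here’s a minimal sketch of the mechanism (all names hypothetical; it assumes the donor reduces their evaluation to a scalar score and a funding bar, which the proposal above doesn’t specify):

```python
import random

def pool_decision(requests, evaluate, bar):
    """Donor samples ONE request uniformly at random, evaluates only that
    one, and funds the entire pool iff it clears the bar."""
    sampled = random.choice(requests)
    return list(requests) if evaluate(sampled) >= bar else []

# Toy usage: five $10k requests. A single evaluation now gates $50k of
# funding, so the donor's funding/evaluation ratio rises by a factor of
# #projects = 5.
requests = [{"name": f"project_{i}", "amount": 10_000} for i in range(5)]
evaluate = lambda r: 0.8  # stand-in for the donor's real evaluation score
funded = pool_decision(requests, evaluate, bar=0.5)
print(sum(r["amount"] for r in funded))  # 50000 if the sampled request passes
```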
I’m also very interested in hearing you elaborate a bit.
I guess you are arguing that AIS is a social rather than a technical problem. Personally, I think there are aspects of both, but that the social/coordination side is much more significant.
RE: “MIRI has focused in on an extremely specific kind of AI”, I disagree. I think MIRI has aimed to study AGI in as much generality as possible and has mostly succeeded in that (although I’m less optimistic than they are that results which apply to idealized agents will carry over and produce meaningful insights about real-world resource-limited agents). But I’m also curious what you think MIRI’s research is focusing on vs. ignoring.
I also would not equate technical AIS with MIRI’s research.
Is it necessary to be convinced? I think the argument for AIS as a priority is strong so long as the concerns have some validity to them, and cannot be dismissed out of hand.
Will: I think “meta-reasoning” might capture what you mean by “meta-decision theory”. Are you familiar with this research (e.g. Nick Hay recently did a thesis with Stuart Russell on this topic)?
I agree that bounded rationality is likely to loom large, but I don’t think this means MIRI is barking up the wrong tree… just that other trees also contain parts of the squirrel.
Nailed it.
Anyone have any suggestions for how to make progress in this area?