Maybe! I’m only aiming for a steady stream of 2-3 chapters per week. Be in touch if you’re interested: I’m re-reading the first quarter of PLF, since they published a new version in the time since I worked through it.
quinn
My upcoming CEEALAR stay
A small pile of thoughts on psychology of entrepreneurship
[Question] What would an EA do in the American revolution?
[Question] What would an EA do in the French revolution?
EA Philly’s Infodemics Event Part 1: Jeremy Blackburn
I’ve been increasingly hearing advice to the effect that “stories” are an effective way for an AI x-safety researcher to figure out what to work on: that sketching scenarios of how things could go well or poorly and doing backward induction to derive a research question is better than traditional methods of finding one. Do you agree with this? The uncertainty in such scenarios seems so massive that one couldn’t make a dent in it, but do you think it’s valuable for AI x-safety researchers to make significant (i.e. more than 30% of their time) investments in both 1. doing this directly, by telling stories and attempting backward induction, and 2. training so that their stories will be better/more reflective of reality (by studying forecasting, for instance)?
I’m thrilled about this post. During my first two or three years of studying math/CS and thinking about AGI, my primary concern was the rights and liberties of baby agents (though I wasn’t giving suffering nearly adequate thought). Over the years I became more of an orthodox x-risk reducer, and while the process has been full of nutritious exercises, I fully admit that becoming orthodox is a good way to win colleagues, not get shrugged off as a crank at parties, etc., and this may have played a small role: if not motivated reasoning, then at least humbly deferring to people who seem to be thinking more clearly than me.
I think this area is sufficiently undertheorized and neglected that the following is only hypothetical, but it could become important: how should one trade off between existential safety (for humans) and suffering risks (for all minds)?
Value is complex and fragile. There are numerous reasons to be more careful than kneejerk cosmopolitanism, and if one’s intuition is “for all minds, of course!” it’s important to think through what steps one would have to take to become someone who thinks safeguarding humanity is more important than ensuring good outcomes for creatures in other substrates. This was best written about, to my knowledge, in Eliezer Yudkowsky’s old Value Theory sequence and to some extent the Fun Theory sequence. While those aren’t 100% satisfying, I don’t think one go-to sequence is the answer, as a lot of this stuff should be left as an exercise for the reader.
Is anyone worried about x-risk and s-risk signaling a future of two opposed factions of EA? That is to say, what are the odds that there’s no way for humanity-preservers and suffering-reducers to get along? You can easily imagine disagreement about how to trade off research resources between human existential safety and artificial welfare, but what if we had to reason about deployment? Do we deploy an AI that’s 90% safe against some alien paperclipping outcome and delivers a 30% reduction in artificial suffering, or one that’s 75% safe against paperclipping but delivers a 70% reduction?
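One crude way to see how much hangs on the weighting (numbers purely illustrative, and $\lambda$ is a parameter I’m inventing for this sketch): score a deployment as the probability of safety plus $\lambda$ times the reduction in artificial suffering, where $\lambda$ is the relative weight the suffering-reducer’s concerns get. Then the first system wins exactly when

$$0.90 + 0.30\lambda > 0.75 + 0.70\lambda \iff \lambda < 0.375,$$

so the whole deployment decision turns on a relative weight that, as far as I know, nobody has a principled way of setting.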
If we’re lucky, there will be a galaxy-brained research agenda or program, some holes or gaps in the theory or implementation that allow and even encourage coalitioning between humanity-preservers and suffering-reducers. I don’t think we’ll be this lucky in the limiting case, where one humanity-preserver and one suffering-reducer are each at the penultimate stage of their goals. However, we shouldn’t be surprised if there is some overlap; the cooperative AI agenda comes to mind.
I find myself shocked at point #2, at the inadequacy of the state of the theory of these tradeoffs. Is it premature to worry about that before the AS movement has even published a detailed agenda/proposal for how to allocate research effort, grounded in today’s AI field? Much theorizing is needed to even get to that point, but it might be wise to think ahead.
I look forward to reading the preprint this week, thanks
Awesome! I probably won’t apply as I lack political background and couldn’t tell you the first thing about running a poll, but my eyes will be keenly open in case you post a broader data/analytics job as you grow. Good luck with the search!
EA Philly’s Infodemics Event Part 2: Aviv Ovadya
High Impact Careers in Formal Verification: Artificial Intelligence
Cliffnotes to Craft of Research parts I, II, and III
Hi Luke, could you describe a candidate that would inspire you to flex the bachelor’s requirement for Think Tank Jr. Fellow? I took time off from credentialed institutions to do Lambda School and work (I didn’t realize I wanted to be a researcher until I was already in industry), but I think my overall CS/ML experience is higher than that of a ton of the applicants you’re going to get (I worked on cooperative AI at AI Safety Camp 5 and I’m currently working on multi-multi delegation, hence my interest in AI governance). If possible, I’d like to hear how you’re thinking about the college requirement before I invest the time in writing a cumulative 1400 words.
Ah, just saw techpolicyfellowship@openphilanthropy.org at the bottom of the page. Sorry, will direct my question to there!
We’re writing to let you know that the group you tried to contact (techpolicyfellowship) may not exist, or you may not have permission to post messages to the group. A few more details on why you weren’t able to post:
* You might have spelled or formatted the group name incorrectly.
* The owner of the group may have removed this group.
* You may need to join the group before receiving permission to post.
* This group may not be open to posting.
If you have questions related to this or any other Google Group, visit the Help Center at https://support.google.com/a/openphilanthropy.org/bin/topic.py?topic=25838.
Thanks,
openphilanthropy.org admins
(cc’d to the provided email address)
In Think Tank Junior Fellow, OP writes
Recently obtained a bachelor’s or master’s degree (including Spring 2022 graduates)
How are you thinking about this requirement? Is there some flex in it (like when a startup says they want a college graduate), or are there bureaucratic forces at partner organizations locking it in stone (like when a hospital IT department says they want a college graduate)? Could you perhaps describe properties of a hypothetical candidate that would inspire you to flex this requirement?
What’s the latest on moral circle expansion and political circle expansion?
Were slaves excluded from the moral circle in ancient Greece or the US antebellum South, and how does this relate to their exclusion from the political circle?
If AIs could suffer, is recognizing that capacity a slippery slope toward giving AIs the right to vote?
Can moral patients be political subjects, or must political subjects be moral agents? If there were some tipping point or avalanche of moral concern for chickens, that wouldn’t imply arguments for political representation of chickens, right?
Consider pre-suffrage women, or contemporary children: they seem fully admitted into the moral circle, but only barely admitted to the political circle.
A critique of MCE is that history is not one march from worse to better (smaller to larger); there are in fact false starts, moments of retrograde, etc. Is PCE the same, but even more so?
If I must make a really bad first approximation, I would say a rubber band is attached to the moral circle, and on the other end of the rubber band is the political circle, so when the moral circle expands it drags the political circle along with it on a delay, modulo some metaphorical tension and inertia. This rubber band model seems informative in the slave case, but uselessly wrong in the chickens case, and it points to what I think are very real possibilities in the AI case.
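To be even more literal about it (purely a toy sketch; the tension constant and time units are made up), the rubber band is just a first-order lag: each step, the political circle closes some fraction of its gap to the moral circle.

```python
# Toy sketch of the rubber-band metaphor (illustrative only, not a real model).
# The political circle closes a fraction `tension` of its gap to the moral circle
# each time step, so expansions of the moral circle propagate with a delay.

def political_circle_trajectory(moral_circle, tension=0.2):
    """moral_circle: sizes of the moral circle over time (arbitrary units)."""
    political = moral_circle[0]
    trajectory = []
    for m in moral_circle:
        political += tension * (m - political)  # rubber band pulls toward the moral circle
        trajectory.append(round(political, 3))
    return trajectory

# A sudden moral-circle expansion at t = 5: the political circle catches up only gradually.
moral = [1.0] * 5 + [2.0] * 15
print(political_circle_trajectory(moral))
```

On this picture, the chickens case is just tension ≈ 0: the moral circle moved and the political circle didn’t budge.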
CW death
I’m imagining myself having a 6+ figure net worth at some point in a few years, and I don’t know anything about how wills work.
Do EAs have hit-by-a-bus contingency plans for their net worths?
Is there something easy we can do to reduce the friction of the following process: ask five EAs with trustworthy beliefs and values to form a grantmaking panel in the event of my death. This panel could meet for thirty minutes and make a weight allocation decision in the Giving What We Can app, or it could accept applications and run it that way, or it could make an investment decision that interprets my net worth as seed money for an ongoing fund; it would be up to them.
I’m assuming this is completely possible in principle: I solicit those five EAs, who have no responsibilities or obligations as long as I’m alive; if they agree, I get a lawyer to write up a will that describes everything.
If one EA has done this, the “template contract” would be available for other EAs to repeat. Would it be worth lowering the friction of making this happen?
Related idea: I could hardcode a weight assignment for the Giving What We Can app into my will; surely a non-EA will-writing lawyer could wrap their head around that quickly. But is there a way to avoid soliciting the lawyer every time I want to update my weights as my beliefs and values change while I’m alive?
On the face of it, the second idea sounds lower friction and almost as valuable as the first for most individuals.
Thanks for the comment. I wasn’t aware of your and Rohin’s discussion on Arden’s post. Did you flesh out the inductive alignment idea on LW or the Alignment Forum? It seems really promising to me.
Today I want to jot down notes more substantive than “wait until I post ‘Going Long on FV’ in a few months”.
FV in AI Safety in particular
As Rohin’s comment suggests, both aiming proofs about properties of models toward today’s type theories and aiming tomorrow’s type theories toward ML face two classes of obstacles: 1. is it possible? 2. can it be made competitive?
I’ve gathered that there’s a lot of pessimism about 1, in spite of MIRI’s investment in type theory and in spite of the word “provably” in CHAI’s charter. My personal expected path to impact as it concerns 1 is “wait until theorists smarter than me figure it out”; I want to position myself to worry about 2.
I think there’s a distinction between theories and products, and I think programmers need to be prepared to commercialize results. There’s a fundamental question: should we expect that a theory’s competitiveness can be improved by one or more orders of magnitude through engineering effort, or will engineering effort only provide improvements of less than an order of magnitude? I think a lot depends on how you feel about this.
Asya:
Asya may not have been speaking about AI safety here, but my basic thinking is that if less primitive proof assistants end up drastically more competitive, and at the same time there are opportunities to convert results in verified ML into tooling, expertise in this area could gain a lot of leverage.
FV in other paths to impact
Rohin:
It’s not clear to me that grinding FV directly is as wise as, say, grinding CompTIA certifications. From the expectation that FV pays dividends in advanced cybersec, we cannot conclude that FV is relevant to the early stages of a cybersec path.
Related: Information security careers for GCR reduction. I think the software safety standards in a wide variety of fields have a lot of leverage over outcomes.