Background in philosophy, international development, statistics. Doing a technical AI PhD at Bristol.
Financial conflict of interest: technically the British government through the funding council.
I’m actually pretty happy for this warning to spread; it’s not a big problem now(?), but will be if growth continues. Vigilance is the way to make the critique untrue.
OTOH you don’t necessarily want to foreground it as the first theme of EA, or even the main thing to worry about.
Looks like a great year Jaime!
Strongly agree that freedom to take side projects is a huge upside to PhDs. What other job lets you drop everything to work full-time for a month, on something with no connection to your job description?
I think this is your best post this year, because it says something rarely said, despite these failure modes seeming omnipresent. (I fall into em all the time!)
Yep, skip Phlebas at first, but do come back to it later: despite being silly and railroading, it is the clearest depiction of the series’ main theme, which is people’s need for Taylorian strong evaluation, the dissatisfaction of unlimited pleasure and freedom, and liberalism as an unstoppable, unanswerable assimilator.
I wrote a longtermist critique of the Culture here.
Surface Detail is about desperately trying to prevent an s-risk. Excession is the best on most axes.
Not a bio guy, but in general: talk to more people! List people you think are doing good work and ask em directly.
Also generically: try to do some real work in as many of them as you can. I don’t know how common undergrad research assistants are in your fields, or in Australian unis, but it should be doable (if you’re handling your courseload ok).
PS: Love the username.
Big old US >> UK pay gap imo. Partial explanation for that: 32 days holiday in the UK vs 10 days US.
(My base pay was 85% of total; 100% seems pretty normal in UK tech.) Other big factor: this was in a sorta sleepy industry that tacitly trades off money for working the contracted 37.5 h week, unlike say startups. Per hour it was decent, particularly given 10% study time.
If we say hustling places have a 50 h week (which is what one fancy startup actually told me they expected), then 41 looks fine.
Agree with the spirit—there is too much herding, and I would love for Schubert’s distinctions to be core concepts. However, I think the problem you describe appears in the gap between the core orgs and the community, and might be pretty hard to fix as a result.
What material implies that EA is only about ~4 things?
semi-official intro talks and Fellowship syllabi
the landing page has 3 main causes and mentions 6 more
the revealed preferences of what people say they’re working on, the distribution of object-level post tags
What emphasises cause divergence and personal fit?
80k have their top 7 of course, but the full list of recommended ones has 23
Personal fit is the second thing they raise, after importance
New causes, independent thinking, outreach, cause X, and ‘question > ideology’ are major themes at every EAG and (by eye) in about a fifth of the top-voted Forum posts.
So maybe there’s limited room for improvements to communication, since it’s already pretty clear?
Intro material has to mention some examples, and only a couple in any depth. How should we pick examples? Impact has to come first. It could be better to not always use the same 4 examples, but instead pick the top 3 by your own lights and then draw randomly from the top 20 (rough sketch below).
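A minimal sketch of that picking rule, just to make the proposal concrete (the function name and slot count are hypothetical):

```typescript
// Always present your top 3 causes, then fill the remaining example
// slots by drawing at random from a wider top-20 list.
function pickExamples(top20: string[], myTop3: string[], extraSlots = 2): string[] {
  const pool = top20.filter((c) => !myTop3.includes(c));
  const drawn: string[] = [];
  while (drawn.length < extraSlots && pool.length > 0) {
    const i = Math.floor(Math.random() * pool.length);
    drawn.push(pool.splice(i, 1)[0]); // draw without replacement
  }
  return [...myTop3, ...drawn];
}
```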
Also, I’ve always thought of cause neutrality as conditional—“if you’re able to pivot, and if you want to do the most good, what should you do?” and this is emphasised in plenty of places. (i.e. Personal fit and meeting people where they are by default.) But if people are taking it as an unconditional imperative then that needs attention.
Brian Christian is incredibly good at tying the short-term concerns everyone already knows about to the long-term concerns. He’s done tons of talks and podcasts—not sure which is best, but if 3 hours of heavy content isn’t a problem, the 80k one is good.
There’s already a completely mainstream x-risk: nuclear weapons (and, popularly, climate change). It could be good to compare AI to these accepted handles. The second species argument can be made pretty intuitive too.
Bonus: here’s what I told my mum.
AIs are getting better quite fast, and we will probably eventually get a really powerful one, much faster and better at solving problems than people. It seems really important to make sure that they share our values; otherwise, they might do crazy things that we won’t be able to fix. We don’t know how hard it is to give them our actual values, and to check that they got them right, but it seems very hard. So it’s important to start now, even though we don’t know when it will happen, or how dangerous it will be.
[I don’t know you, so please feel free to completely ignore any of the following.]
I personally know three EAs who simply aren’t constituted to put up with the fake work and weak authoritarianism of college. I expect any of them to do great things. Two other brilliant ones are Chris Olah and Kelsey Piper. (I highly recommend Piper’s writing on the topic for deep practical insights and as a way of shifting the balance of responsibility partially off yourself and onto the ruinous rigid bureaucracy you are in. She had many of the same problems as you, and things changed enormously once she found a working environment that actually suited her. Actually just read the whole blog, she is one of the greats.)
80k have some notes on effective alternatives to a degree. kbog also wrote a little guide.
In the UK a good number of professions have a non-college “apprenticeship” track, including software development and government! I don’t know about the US.
This is not to say that you should not do college, just that there are first-class precedents and alternatives.
More immediately: I highly recommend coworking as a solution to ugh. Here’s the best kind, Brauner-style, or here are nice group rooms on Focusmate or Complice.
You’re a good writer and extremely self-aware. This is a really good start.
If you’d like to speak to some other EAs in this situation (including one in the US), DM me.
Not recent-recent, but I also really like Carey’s 2017 work on CIRL. Picks a small, well-defined problem and hammers it flush into the ground. “When exactly does this toy system go bad?”
If we take “tangible” to mean executable:
A primitive prototype and a framework for safety via debate (2018-9). Bit quiet since.
Carey’s 2019 proof of concept / extension of quantilizers.
Stiennon et al (2020) is an extremely encouraging example of a large negative “alignment tax” (making it safer also made it work better).
But as Kurt Lewin once said, “there’s nothing so practical as a good theory”. In particular, theory scales automatically, and conceptual work can stop us from wasting effort on the wrong things.
CAIS (2019) pivots away from the classic agentic model, maybe for the better.
The search for mesa-optimisers (2019) is a step forward from previous muddled thoughts on optimisation, and it makes predictions we should be able to test soon.
The Armstrong/Shah discussion of value learning changed my research direction for the better.
Also Everitt et al (2019) is both: a theoretical advance with good software.
I think you’re right, see my reply to Ivan.
I think I generalised too quickly in my comment; I saw “virality” and “any later version” and assumed the worst. But of course we can take into account AGPL backfiring when we design this licence!
One nice side effect of even a toothless AI Safety Licence: it puts a reminder about safety at the top of every repo. Sure, no one reads licences (and people often ignore health and safety rules when they get in the way, even at their own risk). But maybe it makes things a bit more tangible, the way LICENSE.md gives law a foothold in the minds of devs.
Seems I did this in exactly 3 posts before getting annoyed.
That’s cool! I wonder if they suffer from the same ambiguity as epistemic adjectives in English though* (which would suggest that we should skip straight to numerical assignments: probabilities or belief functions).
Anecdotally, it’s quite tiring to put credence levels on everything. When I started my blog I began by putting a probability on all major claims (and even wrote a script to hide this behind a popup to minimise aesthetic damage). But I soon stopped.
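Something like this minimal sketch, though the markup and class names here are hypothetical:

```typescript
// Hypothetical markup: <span class="claim" data-credence="0.8">...</span>.
// On click, toggle a small "[80%]" tag after the claim.
document.querySelectorAll<HTMLElement>(".claim").forEach((claim) => {
  const p = Number(claim.dataset.credence);
  if (Number.isNaN(p)) return; // skip unannotated spans
  const tag = document.createElement("span");
  tag.textContent = ` [${Math.round(p * 100)}%]`;
  tag.style.display = "none"; // hidden by default, to minimise aesthetic damage
  claim.appendChild(tag);
  claim.style.cursor = "pointer";
  claim.addEventListener("click", () => {
    tag.style.display = tag.style.display === "none" ? "inline" : "none";
  });
});
```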
For important things (like Forum posts?) it’s probably worth the effort, but even a document-level confidence statement is a norm with only spotty adoption on here.
This is a neat idea, and unlike many safety policy ideas it has scaling built in.
However, I think the evidence from the original GPL suggests that this wouldn’t work. Large companies are extremely careful to simply not use GPL software, and this includes making their own closed-source implementations.* Things like the Skype case are the exception, and they make other companies even more careful not to use GPL things. All of this has caused GPL licensing to fall massively in the last decade.** I can’t find stats, but I predict that GPL projects will have much less usage and dev activity than permissively licensed equivalents.
It’s difficult to imagine software so good and so difficult to replicate that Google would invite our virus into their proprietary repo. Sure, AI might be different from [Yet Another Cool AGPL Parser], but then who has a bigger data moat and more AI engineering talent than big tech, to just implement it for themselves?
Aschenbrenner’s model strikes me as a synthesis of the two intellectual programmes, and it doesn’t get enough attention.
Robin Hanson is the best critic imo. He has many arguments, or one very developed one, but big pieces are:
Innovation in general is not very “lumpy” (discontinuous). So we should assume that AI innovation will also not be. So no one AI lab will pull far ahead of the others at AGI time. So there won’t be a ‘singleton’, a hugely dangerous world-controlling system.
Long timelines [100 years+] + fire alarms
Opportunity cost of spending / shouting now: “we are far from human level AGI now, we’ll get more warnings as we get closer, and by saving $ you get 2x as much to spend each 15 years you wait” (quick check of this figure after the list); and “having so many people publicly worrying about AI risk before it is an acute problem will mean it is taken less seriously when it is, because the public will have learned to think of such concerns as erroneous fear mongering.”
The automation of labour isn’t accelerating (therefore current AI is not being deployed to notable effect, therefore current AI progress is not yet world-changing in one sense)
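Quick check of the “2x each 15 years” figure, assuming resources compound at a constant real rate $r$:

$$(1+r)^{15} = 2 \;\Rightarrow\; r = 2^{1/15} - 1 \approx 4.7\%\ \text{per year},$$

which is in the ballpark of ordinary long-run real investment returns, so the figure isn’t assuming anything exotic.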
He might not be what you had in mind: Hanson argues that we should wait to work on AGI risk, rather than that safety work is forever unnecessary or ineffective. The latter claim seems extreme to me and I’d be surprised to find a really good argument for it.
You might consider the lack of consensus about basic questions, mechanisms, solutions amongst safety researchers to be a bad sign.
Nostalgebraist (2019) sees AGI alignment as equivalent to solving large parts of philosophy: a noble but quixotic quest.
Melanie Mitchell also argues for long timelines. Her view is closer to the received view in the field (but this isn’t necessarily a compliment).
Spoilers for Unsong:
Jalaketu identifies the worst thing in the world—hell—and sacrifices everything, including his own virtue and impartiality, to destroy it. It is the strongest depiction of the second-order consistency, second-order glory of consequentialism I know. (But also a terrible tradeoff.)