Making progress on Task Y

Disclaimer: I applied to FTX with a solution to this issue, but I hope this isn’t taken as a shady way of promoting my project. Now that decision e-mails are being sent out, I think it’s a great idea to post about your project here whether you got funded or not, because: a) some other fund might be interested; b) your project might have flaws or misalignments that extra eyeballs could help root out; and c) feedback from other EAs can help you improve if you decide to continue the project anyway.

Epistemic status: Speculative and highly entrepreneurial; it might help to adopt a temporary hyperoptimistic “what if this succeeds wildly?” frame[1] before turning on one’s usual filters, but really the reason why I’m posting this is to get feedback and criticism. I am especially interested in knowing if there are possible harms that I didn’t think of, and if this is something that people would actually use.

Introduction

I remember that, right before the pandemic hit, there was a palpable buzz about finding a potential Task Y[2], i.e. a task such that:

  • Task Y is something that can be performed usefully by people who are not currently able to choose their career path entirely based on EA concerns.

  • Task Y is clearly effective, and doesn’t become much less effective as more people do it.

  • The positive effects of Task Y are obvious to the person doing the task.

The problem is this: we have too many people but not enough problems to work on. And with the recent massive injection of available funding into the movement[3], not only will we see more growth in how many people join local meetups, attend workshops and conferences, or otherwise self-identify as EAs; we will also see people who are not as aligned with EA and are attracted to the space only because of the money and prestige.

Let me itemise the points I’m making here so my position is less likely to be misconstrued:

  • Erin Braid’s comment notes that there are two broad conceptions of what people think the EA community is for (and, by extension, what growing it means): there is a sense in which it is better to get deontologically aligned EAs than mercenary types who are doing good work but are otherwise only in EA because it pays much better than other forms of charity work. I think this dichotomy is too static: it is possible to slowly align the latter group as they spend more time with the community, and it is also possible to genuinely love doing EA-related work while eschewing the aesthetic of the grossly underpaid grad student ignoring their own suffering for higher pursuits.

  • The recent attention we’re giving to entrepreneurial projects is kind of interesting from a far-mode, outside-observer point of view. When I talk to non-EAs, they think the median EA is highly risk-averse (see this comment for an internal perspective), which is completely antithetical to how entrepreneurship is done (and I would argue that, contrary to the comment above, this is true whether you are doing a VC-style startup or a low market-risk, high execution-risk mom-and-pop shop). It could be the case that the highly analytical people EA attracts most strongly aren’t suited for the demands of founding institutions where a) the degree of uncertainty is high and ever-changing, and b) part of what makes the uncertainty high is that your own actions can affect it, but again it’s difficult to say more without relying on folk psychology or falling prey to just-so stories. What we have, then, is not so much a ‘talent constraint’ as holes in the kinds of skillsets available in our talent pool.

  • Continuing from the previous point, it would be difficult to rely on existing institutions to produce the kinds of skillsets we need. For example, I remember MIRI looking for a “walking encyclopedia of all the previous AI techniques that have been tried”, which is something you can’t learn from doing, say, a PhD at a top-10 institution, simply because their incentives (i.e., producing papers about new AI topics) are different. Right now, my impression is that the consensus solution to this is to offer workshops, but: a) you can’t really run a workshop about something you yourself cannot do, and b) since workshops don’t usually publish their materials, and since some skills are difficult to verbalise, it’s hard to scale this approach beyond a dozen or so people every couple of months.

  • It’s also not solvable just by throwing smart people at the problem. If the solution were that simple, we wouldn’t have talent gaps in the first place.

  • (Coming from the opposite direction) Eternal September is a phenomenon primarily observed in online groups (which EA, for the most part, definitely is): the erosion of internal norms under an influx of new, yet-to-be-mentored members. For us, this would take the form of EA just becoming A, if even that, and given the density of smart, conscientious people here there is an unbounded downside if such a misalignment were to occur. I have seen some chatter about this here and on Twitter, but nothing that explicitly tries to address it, which is worrying if you anticipate superlinear growth in membership in the near future. I think the right answer is not to keep people out, but to let them in slowly. The norms we have promoting deep analysis by default (which outsiders sometimes round off to “excessive wordiness”) seem to be keeping the tide back so far, but eventually we’d need a more scalable system that cannot be gamed.

Elaborations on Task(s) Y

So what can be done?

For starters, I think we should not think of a single Task Y but rather a plethora of them. I would be very surprised if we have exhausted the space of repeatable effective tasks with Earn to Give and Work at an EA organisation, and for highly uncertain research areas like longtermism, where the consensus isn’t even close to settling yet, we should consider ourselves stuck in a local minimum and be much, much more open to exploring alternatives quite different from what we currently have.

This means we need a systematic way of exploring the problem space while trying our best to minimise downside risk. I don’t think more available funding gets us there on its own; what does is trying exponentially more, much smaller projects. And if we’re talking about small projects, why not go all the way down to individual tasks? This is important for two reasons:

  • smaller tasks mean cheaper experiments; there is an exponential drop-off in the value of information for most things and indeed most of the work is locating the problem in hypothesis-space in the first place

  • a smaller scope generally means smaller downsides; the asymmetry towards upside happens when good tasks bubble up and get repeated much more often than bad ones

At the very least, concrete tasks are much easier to verify than abstract, high-level ones like “participate in the EA forums” or “attend local meetups”.

Now, again, I would like to repeat my disclaimer above that I submitted a proposal about this to FTX Future Fund and so there is definitely a potential conflict of interest here. But I figured the panelists would have some way of distancing themselves from people who are trying to affect their judgment beyond the application form. Let me know if that assumption is wrong.

The solution

Anyway, we believe we have worked out a system of tasks that solves the following problems simultaneously:

  • How can we pass on hard-to-verbalise, but still highly important skills to lots of people very quickly?

  • How can we incentivise our existing talent pool to develop the highly specific skills we need in our organisation?

  • How can we find and vet potentially highly impactful junior EAs?

  • How can we lower the cost of trying new approaches to certain cause areas by an order of magnitude?

The answer is, you arrange the tasks in a tree[4] and then only let people move up once they’ve done the simpler tasks.

Let me explain. The core interaction loop (which is also available on our still-under-construction project website) can be broken down into four steps (sketched in code right after the list):

  1. Pick a task

  2. Submit your attempt via text/image/video

  3. Wait for task members to approve you

  4. Once approved, you may now talk to your new peers
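
To make the loop concrete, here is a minimal sketch in Python. The data model (a `Task` with members and a `Submission` with a review status) is hypothetical and only meant to illustrate the four steps, not to describe the actual implementation:

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Status(Enum):
    PENDING = auto()    # waiting for existing task members to review the attempt
    APPROVED = auto()   # reviewers accepted it; the submitter becomes a member
    REJECTED = auto()   # it didn't meet the bar; the submitter may try again


@dataclass
class Submission:
    submitter: str
    evidence: str                   # link to the text/image/video attempt
    status: Status = Status.PENDING


@dataclass
class Task:
    name: str
    members: set = field(default_factory=set)   # people who have completed the task

    def submit(self, submitter: str, evidence: str) -> Submission:
        # Step 2: record an attempt; it stays PENDING until humans review it.
        return Submission(submitter, evidence)

    def review(self, submission: Submission, approve: bool) -> None:
        # Step 3: an existing member checks whether the attempt is genuine.
        submission.status = Status.APPROVED if approve else Status.REJECTED
        if approve:
            # Step 4: the submitter can now talk to their new peers.
            self.members.add(submission.submitter)


# Usage: a lightning-talk task, one attempt, one approval.
talk = Task("Give a lightning talk at your local EA meetup")
attempt = talk.submit("alice", "https://example.org/clip.mp4")
talk.review(attempt, approve=True)
assert "alice" in talk.members
```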

Suppose your task is to “give a lightning talk in your local EA meetup”. This is a good task not only because it’s repeatable but also because it directly improves your skill at public speaking, which is something that presumably everyone who wants to run an EA organisation should be decent at.

Now, it’s Saturday, another meetup day. You do your talk and submit a clip of it to the system. And then you wait. Why is it not instantaneous? Is it not a cardinal rule of UX design that interactions should be as fast as possible? No, because there are humans in the loop who need to check whether your attempt is the genuine article[5]. If it isn’t, or if the members of a task think you didn’t meet their standard, you can always try again.

But suppose you get approved. What happens then? Well, you are now a member of the task! That means you can talk to other members, get tips on how to do the task better, or even tips on how to do more advanced tasks. In this way, we steadily accumulate extremely specific and extremely focused tricks of the trade that you would not otherwise learn in books or even courses.

And if you wish to take your skillset further, you can either look for new tasks you’ve just unlocked (in other words, tasks that define your current one as a prerequisite) or you can create your own.
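
To show what “unlocking” means structurally, here is a rough sketch of the underlying task graph (a DAG rather than a strict tree, per footnote 4). The names and the flat membership sets are assumptions for illustration only:

```python
from dataclasses import dataclass, field


@dataclass
class TaskNode:
    name: str
    prerequisites: list = field(default_factory=list)   # tasks you must already be a member of
    members: set = field(default_factory=set)            # people approved for this task


def unlocked_tasks(all_tasks, person):
    # A task is unlocked for someone once they are a member of every prerequisite
    # but not yet a member of the task itself.
    return [
        t for t in all_tasks
        if person not in t.members
        and all(person in p.members for p in t.prerequisites)
    ]


# Usage: giving a lightning talk unlocks running a full workshop.
talk = TaskNode("Give a lightning talk at your local EA meetup")
workshop = TaskNode("Run a one-day workshop", prerequisites=[talk])
talk.members.add("alice")
print([t.name for t in unlocked_tasks([talk, workshop], "alice")])
# -> ['Run a one-day workshop']
```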

Possible complications

There are several complications with this scheme. First of all, who makes the tasks? The cheeky answer is: everyone. But in practice, part of making this project succeed is having really compelling tasks available for people to do, and who better to know which tasks are valuable to the community than the most committed EAs working in orgs? Assembling a library of good tasks will require us to Do Things That Don’t Scale, and we hope that enough of that will allow us to, say, have “Found an EA megaproject” as a repeatable, scalable Task Y candidate.

Another thing is that, at first glance, it’s not really clear why people would be incentivised to check other people’s work at all. But speaking anecdotally, having founded or otherwise run multiple non-EA orgs in the last decade, I can say there is a certain sense of accomplishment in being the arbiter of quality in a scene, and regardless of the replication crisis there are still certain laws governing human behaviour that can be relied upon to predict people’s actions. In a way, we are trying to align a reputation system with an achievement system, two solutions that two different industries (social media and video games) have come up with and kept using under threat of market irrelevance if they’re wrong. So if this part doesn’t work out I would be violently surprised.

Lastly, why would this affect the funding landscape at all? Notice that all tasks are gated by your ability to do them at least once[6]. So what happens when you wish to create a task for something that’s still a bit beyond your ability? Inducement prizes. You specify the task, pay an entrance fee[7] to join a pool of people who also wish to attempt the challenge, and then whoever submits the best attempt within a certain period of time (as judged by judiciously selected third parties) wins the prize pool and gets to be the first arbiter for that task. At the very least, this public activity of regular competitions among the most skilled users of the platform is a great way to promote the system not only to EAs but also to EA-adjacents who would otherwise not hear of what’s going on, because they can’t see the more advanced tasks without having gone through the entire gauntlet.
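
Sketched in code, the prize mechanism might look something like the following. The class, method names, and numbers are made up for illustration, and the choice of “best attempt” is left entirely to the human judges:

```python
from dataclasses import dataclass, field


@dataclass
class PrizeChallenge:
    task_name: str
    entrance_fee: float
    pool: float = 0.0
    entries: dict = field(default_factory=dict)   # entrant -> submitted attempt

    def enter(self, entrant, attempt):
        # Each entrant pays the fee into the shared pool and submits an attempt.
        self.pool += self.entrance_fee
        self.entries[entrant] = attempt

    def sponsor(self, amount):
        # Outside observers can top up the pool or cover someone's fee (footnote 7).
        self.pool += amount

    def settle(self, scores):
        # Third-party judges score the entries; the top scorer takes the pool
        # and becomes the first arbiter of the newly created task.
        winner = max(scores, key=scores.get)
        payout, self.pool = self.pool, 0.0
        return winner, payout


# Usage: two entrants, one outside sponsor, judges pick a winner.
challenge = PrizeChallenge("Found an EA megaproject (pilot)", entrance_fee=50.0)
challenge.enter("alice", "pilot-proposal-A")
challenge.enter("bob", "pilot-proposal-B")
challenge.sponsor(400.0)
winner, payout = challenge.settle({"alice": 0.8, "bob": 0.6})
print(winner, payout)   # -> alice 500.0
```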

What do you think? Would you use such a system? I know there are a lot of moving parts here, but I tried my best to convey why I think this would be a potentially impactful project to work on. Also, I know I said above that I was looking for possible harms, but I figured I should wait for others’ opinions before revealing my own list so as not to prematurely constrain the hypothesis-space.


  1. ↩︎

    (insert LessWrong link here…once you find it)

  2. ↩︎
  3. ↩︎
  4. ↩︎

    Actually, a directed acyclic graph, but the analogy to RPG-like skill trees is so much easier to understand.

  5. ↩︎

    Part of what makes such a system work is the fact that it isn’t automated. If we stuck to machine-verifiable tasks, we wouldn’t be able to cover a wide variety of skills. And at the very least, would it not feel much better to know that actual EAs checked your work and found it satisfactory?

  6. ↩︎

    Which would also cure people of analysis paralysis when it comes to exploring new skillsets, if the difficulty gaps between tasks are made narrow enough.

  7. ↩︎

    Of course, nothing is stopping you, an outside observer, from sponsoring other people’s entrance fees or adding to the prize pool for that matter, which is a good way to guide and accelerate the careers of promising EA juniors.