Mark Xu

Karma: 900

Mark Xu 16 Apr 2024 21:55 UTC
1 point
0 ∶ 0
in reply to: Pablo’s comment on: ‘Dropping out’ isn’t a Plan
yes, thanks!

ARC is hiring theoretical researchers

Jacob_Hilton 12 Jun 2023 19:11 UTC

78 points

0 comments · 4 min read · EA link

(www.lesswrong.com)

How to do theoretical research, a personal perspective

Mark Xu 19 Aug 2022 19:43 UTC

132 points

7 comments · 15 min read · EA link

Mark Xu 14 Aug 2022 3:24 UTC
105 points
4 ∶ 1
on: Existential risk pessimism and the time of perils
I think this model is kind of misleading, and that the original astronomical waste argument is still strong. It seems to me that a ton of the work in this model is being done by the assumption of constant risk, even in post-peril worlds. I think this is pretty strange. Here are some brief comments:
- If you’re talking about the probability of a universal quantifier, such as “for all humans x, x will die”, then it seems really weird to say that this remains constant, even when the thing you’re quantifying over grows larger.
  - For instance, it seems clear that if there were only 100 humans, the probability of x-risk would be much higher than if there were 10^6 humans. So it seems like if there are 10^20 humans, it should be harder to cause extinction than 10^10 humans.
- Assuming constant risk has the implication that human extinction is guaranteed to happen at some point in the future, which puts sharp bounds on the goodness of existential risk reduction.
- It’s not that hard to get exponentially decreasing probability on universal quantifiers if you assume independence in survival amongst some “unit” of humanity. In computing applications, it’s not that hard to drive down the probability of error exponentially in the resources allocated, because each unit of resource can ~halve the probability of error. Naively, each human doesn’t want to die, so there are as many rolls for surviving/solving x-risk as there are humans.
- It seems like the probability of x-risk ought to be inversely proportional to the current estimated amount of value at stake. This seems to follow if you assume that civilization acts as a “value maximizer” and it’s not that hard to reduce x-risk. Haven’t worked it out, so wouldn’t be surprised if I was making some basic error here.
- Generally, it seems like most of the risk is going to come from worlds where the chance of extinction isn’t actually a universal quantifier, and there’s some correlation amongst seemingly independent rolls for survival. In particularly bad cases, humans go extinct if there exists someone that wants to destroy the universe, so we actually see an extremely rapidly increasing probability of extinction as we get more humans. These worlds would require extremely strong coordination and governance solutions.
  - These worlds are also slightly physically impossible because parts of humanity will rapidly become causally isolated from each other. I don’t know enough cosmology to have an intuition for which way the functional form will ultimately go.
- Generally, it seems like the naive view is that as humans get richer/smarter, they’ll allocate more and more resources towards not dying. At equilibrium, it seems reasonable to first-order-assume we’ll drive existential risk down until the marginal cost equals the marginal benefit, so the key question is how this equilibrium behaves. My guess is that it will depend heavily on the total amount of value available in the future, determined by physical constraints (and potentially more galaxy-brained considerations).
  - This view seems to allow you to recover more of the naive astronomical waste perspective.
- This makes me feel like the model makes kind of strong assumptions about the amount it will ultimately cost to drive down existential risk. E.g. you seem to imply that r_l = 0.0001 is small, but an independent chance that large each century suggests that the probability humanity survives for ~10^10 years is ~0. This feels quite absurd to me.
  - You write: “Note that for the Pessimist, this is a reduction of 200,000%”, but humans routinely reduce the probabilities of failures by more than 200,000% via engineering efforts and produce highly complex artifacts like computers, airplanes, rockets, satellites, etc. It feels like you should naively expect “breaking” human civilization to be harder than breaking an airplane, especially when civilization is actively trying to ensure that it doesn’t go extinct.
- Also, you seem to assume each century has some constant value v eventually, which seems reasonable to me, but the implication “Warming (slightly) on short-termist cause areas” relies on an assumption that the current century is close to value v, when it seems like even pretty naive bounds (e.g. percent of the sun’s energy) suggest that the current century is not even within a factor of 10^9 of the long-run value-per-century humanity could reach.
  - Assuming that value grows quadratically also seems quite weird, because of analysis like Eternity in Six Hours, which seems to imply that a resource-maximizing civilization will undergo a period of incredibly rapid expansion to achieve per-century rates of value much higher than the current century, and then have nowhere else to go. A better model from my perspective is logistic growth of value, with the upper bound given by some weak proxy like “suppose that value is linear in the amount of energy a civilization uses, then take the total amount of value in the year 2020”, with the ultimate unit being “value in 2020”. This would produce much higher numbers, and give a more intuitive sense of “astronomical waste.”
I like the process of proposing concrete models for things as a substrate for disagreement, and I appreciate that you wrote this. It feels much better to articulate objections like “I don’t think this particular parameter should be constant in your model” than to have abstract arguments. I also like how it’s now more clear that if you do believe that risk in post-peril worlds is constant, then the argument for longtermism is much weaker (although I think still quite strong because of my comments about v).
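Two of the numerical claims in the comment above can be checked with a short sketch. It assumes risk is independent across centuries and that safeguards fail independently; the function names are hypothetical and chosen only for illustration. The result is reported as a base-10 logarithm because the survival probability itself is too small to represent as a float.

```python
import math


def log10_survival_probability(risk_per_century: float, centuries: float) -> float:
    """log10 of the chance of surviving every century.

    Assumes the same independent extinction risk in each century.
    """
    return centuries * math.log1p(-risk_per_century) / math.log(10)


def probability_all_units_fail(failure_probability: float, units: int) -> float:
    """Chance that every one of `units` independent safeguards fails."""
    return failure_probability ** units


# Constant post-peril risk of 0.0001 per century over ~10^10 years (10^8 centuries).
log10_p = log10_survival_probability(1e-4, 1e10 / 100)

# Each extra independent unit halves the failure probability.
ten_units = probability_all_units_fail(0.5, 10)

print(f"log10(survival probability) = {log10_p:.1f}")
print(f"P(all 10 independent units fail) = {ten_units}")
```

Under these assumptions the survival probability is on the order of 10^-4343, which is the sense in which it is "~0", while ten independent halvings already reduce the failure probability by a factor of 1024.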
What links here?
- Winners of the EA Criticism and Red Teaming Contest by Lizka (1 Oct 2022 1:50 UTC; 226 points)
- Lizka's comment on Should we discount future people in proportion to the probability of them not existing? by Joseph (20 Dec 2022 1:48 UTC; 7 points)

Mark Xu 29 Apr 2022 11:40 UTC
45 points
0 ∶ 0
in reply to: mic’s comment on: Increasing Demandingness in EA
I expect 10 people donating 10% of their time to be less effective than 1 person using 100% of their time because you don’t get to reap the benefits of learning for the 10% people. Example: if people work for 40 years, then 10 people donating 10% of their time gives you 10 years with 0 experience, 10 with 1 year, 10 with 2 years, and 10 with 3 years; however, if someone is doing EA work full-time, you get 1 year with 0 exp, 1 with 1, 1 with 2, etc. I expect 1 year with 20 years of experience to plausibly be as good/useful as 10 with 3 years of experience. Caveats to the simple model:
- labor-years might be more valuable during the present
- if you’re volunteering for a thing that is similar to what you spend the other 90% of your time doing, then you still get better at the thing you’re volunteering for
I make a similar argument here.
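The arithmetic in the example above can be reproduced with a small sketch. It assumes experience is counted in whole full-time-equivalent years already worked, so someone giving 10% of a 40-year career accumulates 4 labor-years in total. The function name and parameters are hypothetical.

```python
from collections import Counter


def labor_years_by_experience(people: int, career_years: int, time_fraction: float) -> Counter:
    """Count labor-years contributed at each experience level.

    Experience is measured in whole full-time-equivalent years already worked
    (an assumption of this sketch, not something stated in the comment).
    """
    labor_years_each = round(career_years * time_fraction)
    counts = Counter()
    for _ in range(people):
        for experience in range(labor_years_each):
            counts[experience] += 1
    return counts


# Ten people each giving 10% of a 40-year career.
part_time = labor_years_by_experience(people=10, career_years=40, time_fraction=0.1)

# One person working full-time for 40 years.
full_time = labor_years_by_experience(people=1, career_years=40, time_fraction=1.0)

print("part-time:", dict(part_time))
print("full-time max experience:", max(full_time))
```

Both arrangements supply 40 labor-years, but the part-time group never passes 3 years of experience, while the full-time worker spends most of the career above that level.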

Mark Xu 28 Apr 2022 21:00 UTC
5 points
0 ∶ 0
in reply to: devansh’s comment on: ‘Dropping out’ isn’t a Plan
One key difference is that “continuing school” usually has a specific mental image attached, whereas “drop out of school” is much vaguer, making them difficult to compare.

‘Dropping out’ isn’t a Plan

Mark Xu 28 Apr 2022 20:28 UTC

56 points

9 comments · 1 min read · EA link

(markxu.com)

Mark Xu 28 Apr 2022 10:49 UTC
13 points
0 ∶ 0
on: My bargain with the EA machine

Many people in EA depart from me here: they see choices that do not maximize impacts as personal mistakes. Imagine a button that, if you press it, would cause you to always take the impact-maximizing action for the rest of your life, even if it entails great personal sacrifice. Many (most?) longtermist EAs I talk to say they would press this button – and I believe them. That’s not true of me; I’m partially aligned with EA values (since impact is an important consideration for me), but not fully aligned.

I think there are people (e.g. me) that value things besides impact and would also press the button because of golden-rule type reasoning. Many people optimize for impact to the point where it makes them less happy.

Mark Xu 19 Apr 2022 14:43 UTC
11 points
0 ∶ 0
on: How Many People Are In The Invisible Graveyard?
A title like “How many lives might have been saved given an earlier COVID-19 vaccine rollout?” would have given me much more information about what the post was about than the current title, which I find very vague.

Mark Xu 21 Jan 2022 5:57 UTC
7 points
0 ∶ 0
in reply to: MaximeCdS’s comment on: Things I recommend you buy and use.
Kindles are smaller, have backlights, and the Kindle store is a good user experience.

Mark Xu 7 Jan 2022 6:53 UTC
6 points
0 ∶ 0
in reply to: Josh Jacobson’s comment on: Consider trying the ELK contest (I am)
Note: I work for ARC.

I would consider someone a “pretty good fit” (whatever that means) for alignment research if they started out with a relatively technical background, e.g. an undergrad degree in math/cs, but had not really engaged with alignment before, and they were able to come up with a decent proposal after:
- ~10 hours of engaging with the ELK doc.
- ~10 hours of thinking about the document and resolving confusions they had, which might involve asking some questions to clarify the rules and the setup.
- ~10 hours of trying to come up with a proposal.
If someone starts from having thought about alignment a bunch, I would consider them a potentially “pretty good researcher” if they were able to come up with a decent proposal in 2-8 hours. I expect many existing (alignment) researchers to be able to come up with proposals in <1 hour.

Note that I’m saying “if (can come up with proposal in N hours), then (might be good alignment researcher)” and not saying the other implication also holds, e.g. it is not the case that “if (might be good alignment researcher), then (can come up with proposal in N hours)”.

Mark Xu 7 Jan 2022 6:43 UTC
3 points
0 ∶ 0
in reply to: Ajeya’s comment on: Consider trying the ELK contest (I am)
Can confirm we would be interested in hearing what you came up with.

ARC is hiring alignment theory researchers

Paul_Christiano 14 Dec 2021 20:17 UTC

89 points

4 comments · 1 min read · EA link

Your Time Might Be More Valuable Than You Think

Mark Xu 18 Oct 2021 0:55 UTC

55 points

1 comment · 6 min read · EA link

(markxu.com)

Mark Xu 15 Jun 2021 4:50 UTC
4 points
0 ∶ 0
in reply to: CarlShulman’s comment on: What is an example of recent, tangible progress in AI safety research?
nit: link on “reasons” was pasted twice. For others it’s https://www.lesswrong.com/posts/PZtsoaoSLpKjjbMqM/the-case-for-aligning-narrowly-superhuman-models

Also hadn’t seen that paper. Thanks!

Meta-EA Needs Models

Mark Xu 5 Apr 2021 21:59 UTC

43 points

7 comments · 4 min read · EA link

(markxu.com)

Mark Xu 1 Apr 2021 19:19 UTC
66 points
0 ∶ 0
on: Announcing “Naming What We Can”!
Ben Pace, Ben Kuhn, Ben Todd, Ben West, and Ben Garfinkel should all become the same person, to avoid confusion.

Strong Evidence is Common

Mark Xu 14 Mar 2021 1:19 UTC

50 points

7 comments · 1 min read · EA link

(markxu.com)

Mark Xu 26 Feb 2021 22:40 UTC
5 points
0 ∶ 0
on: Things I recommend you buy and use.
Thanks for writing this up. Just ordered a misto, elastic laces, and a waterpik. My own personal list of recommendations is on https://markxu.com/things, but it lacks justifications. Feel free to ask me about any of the items though.

Be Specific About Your Career

Mark Xu 24 Feb 2021 7:24 UTC

94 points

5 comments · 1 min read · EA link

(markxu.com)