I think there’s something interesting to this argument, although it may be relying on a frame where AI systems are natural agents, in particular at this step:
a strategically and philosophically competent AI should seemingly have its own moral uncertainty and pursue its own “option value maximization” rather than blindly serve human interests/values/intent
It’s not clear to me why the key functions couldn’t be more separated, or whether the conflict you’re pointing to persists across such separation. For instance, we might have a mix of:
Systems which competently pursue philosophy research (but do not have a sense of self that they are acting with regard to)
Systems which are strategic (including drawing on the fruits of the philosophy research), on behalf of human institutions or individuals
Systems which are instruction-following tools (which don’t aspire to philosophical competence), rather than independent agents
I mean “not clear to me” very literally here: I think some version of your conflict may well pose a challenge to such setups. But I’m responding with this alternate frame in the hope that it will be useful in advancing the conversation.
I’m not sure. I think there are versions of things here which are definitely not convergence (straightforward acausal trade between people who understand their own values is of this type), but I have some feeling that there might be extra reasons for convergence coming from people observing the host, and having that observation feed into their own reflective process.
(Indeed, I’m not totally sure there’s a clean line between convergence and trade.)