Reasonably often (maybe once or twice a month?) I see fairly highly upvoted posts that I think are basically wrong in something like “how they are reasoning”, which I’ll call epistemics. In particular, I think these are cases where it is pretty clear that the argument is wrong, and that this determination can be made using only knowledge that the author probably had (so it is more about reasoning correctly given a base of knowledge).
Sometimes I write a comment explaining why. If I reliably did this on all of the posts then you could still rely on karma as an indicator of epistemic soundness, but sadly I don’t, because it’s actually a fair amount of work and my time has high opportunity cost. So here is your PSA: for any particular high-karma post, knowing nothing else about the post besides that it is high karma, there is a non-trivial probability that I would find significant reasoning issues in that post. You can’t rely solely on karma as a strong signal of epistemics.
Clear, strong examples from the EA Forum:
Comment on How much current animal suffering does longtermism let us ignore?
Comment on Aligning Recommender Systems as Cause Area
Moderate examples from the EA Forum (either they were lower karma, or only one particular thing was off instead of most of the post, or something else):
Comment on Can money buy happiness? A review of new data
Comment on How to think about an uncertain future: lessons from other sectors & mistakes of longtermist EAs
Weak examples from the EA Forum that are still some evidence (for these ones it’s pretty likely someone would have made the points I made if I hadn’t; that’s not true for the others):
Comment on Against the “smarts fetish”
Comment on Critical Review of ‘The Precipice’: A Reassessment of the Risks of AI and Pandemics
Comment on Does 80,000 Hours focus too much on AI risk?
Clear, strong examples from LessWrong:
Comment on Matt Botvinick on the spontaneous emergence of learning algorithms
Comment on Tal Yarkoni: No, it’s not The Incentives—it’s you
Agreed. To some extent it’s OK for bad posts to get upvoted. But I think the fact that posting volume is so much higher now means we should be able to trade off some of that volume for greater post quality. This could be by having a review process for posts, or reinstating the minimum upvote requirement before a user is allowed to post. I also think there may be some achievable gains that don’t require trading off volume, such as improving the upvote strength algorithm.
Fwiw my view is that forum members shouldn’t upvote posts whose reasoning isn’t up to standard even if they agree with the conclusion.
I’d assume that forum members don’t notice that the reasoning is bad.
As evidence in favor of this view, at least sometimes after I post such a comment, the post’s karma starts to go down, suggesting that the comment informed voters about bad reasoning that they hadn’t previously noticed. (Possibly this happened in most of the examples above; I wasn’t carefully tracking this and don’t know of any way to check now.)
Probably yeah, at least in part. Sometimes they may notice it a bit but put insufficient weight on it relative to the fact that they agree with the conclusion. But some may also miss it altogether.
My comment was in response to the claim that “to some extent it’s OK for bad posts to get upvoted”.
Ah, I interpreted that claim as “it’s not a huge priority to prevent bad posts from being upvoted, regardless of how that happens”, rather than “it’s fine for forum members to upvote posts whose conclusions they agree with even if they see that the reasons are bad”.
Hot take: strong upvoting things without great reasoning that also have conclusions I disagree with could be good for improving epistemics. At least, I think this gives us an opportunity to demonstrate common thinking processes in EA and what reasoning transparency looks like to people newer to the community. [1]
My best guess is that it also makes it more likely that quality thinking that diverges from established ideas happens in EA community spaces like the EA Forum.
My reasoning is in a footnote in my comment here.
I’m aware that people on this thread might think my thinking processes and reasoning abilities aren’t stellar,* but I still think my point stands.
*My personal view is that this impression would be less because I’m bad at thinking clearly and more because our views are quite different.
A large inferential distance means it’s harder to diagnose epistemics accurately (but I’m not exactly an unbiased observer when it comes to judging my own ability to think clearly).
This leads me to another hot take: footnotes within footnotes are fun.
Maybe I want silent upvoting and downvoting to be disincentivized (or commenting with reasoning to be more incentivized). Commenting with reasoning is valuable but also hard work.
After 2 seconds of thought, I think I’d be massively in favour of a forum feature where any upvotes or downvotes count for more (e.g. double or triple the karma) once you’ve commented.[1]
Just having this incentive might make more people try and articulate what they think and why they think it. This extra incentive to stop and think might possibly make people change their votes even if they don’t end up submitting their comments.
Me commenting on my own comment shouldn’t mean the default upvote on my comment counts for more though: only the first reply should give extra voting power (I’m sure there are other ways to game it that I haven’t thought of yet but I feel like there could be something salvageable from the idea anyway).
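To make the proposal above concrete, here is a minimal sketch of one way the weighting could work. It is only one reading of the idea: the `Item` class, the `COMMENT_MULTIPLIER` value, and the function names are all hypothetical, and it interprets "only the first reply should give extra voting power" as "replying more than once does not stack, and replying to your own item never boosts your own vote". It is not how any actual forum computes karma.

```python
from dataclasses import dataclass, field

# Assumption: the proposal only says "double or triple"; 2 is picked arbitrarily.
COMMENT_MULTIPLIER = 2


@dataclass
class Item:
    """Hypothetical post or comment that can be voted on and replied to."""
    author: str
    votes: dict = field(default_factory=dict)   # voter name -> +1 or -1
    repliers: set = field(default_factory=set)  # names of users who replied

    def vote(self, voter, direction):
        self.votes[voter] = direction

    def reply(self, user):
        # A set, so a second or third reply by the same user adds nothing.
        self.repliers.add(user)


def vote_weight(item, voter):
    """Weight of one voter's vote: boosted only if they replied and are not the author."""
    if voter != item.author and voter in item.repliers:
        return COMMENT_MULTIPLIER
    return 1


def karma(item):
    """Total karma: each vote's direction times that voter's weight."""
    return sum(direction * vote_weight(item, voter)
               for voter, direction in item.votes.items())


post = Item(author="alice")
post.vote("alice", +1)   # default self-upvote
post.vote("bob", +1)     # silent upvote
post.vote("carol", -1)   # downvote, with reasoning given below
post.reply("carol")
post.reply("carol")      # repeat reply, no extra weight
post.reply("alice")      # author replying to own post, no extra weight
print(karma(post))       # prints 0: +1 (alice) +1 (bob) -2 (carol)
```

Whether a boost like this would actually encourage better comments, rather than low-effort ones written only to unlock the multiplier, is exactly the gaming concern raised in the footnote.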
Yes. But people are sticky, so you need to instill more vetting power in people who evaluate appropriately. The question is how to do that.
I know this is just a small detail and not what you wrote about, but: much of your comment on the recommender systems post hinged on news articles being uncorrelated with the truth. Do you have data to back that up?
I’m replying here because it’s a strong claim that’s relevant to many things beyond that specific post.
I have data in the sense that when I read news articles and check how correct they are, they are usually not very correct. (You can have more nuance than this, e.g. facts about what mundane stuff happened in the world tend to be correct.)
I don’t have data in the sense that I don’t have a convenient list of articles and ways they were wrong such that I could easily persuade someone else of this belief of mine. (Though here’s one example of an article that you at least have to read closely if you want to not be misled.)
Also, I could justify ignoring those two particular news articles without this general claim, at least to myself. I did briefly look at them before I wrote that comment; I didn’t particularly expect to believe them but if they were the rare good kind of news article I would have noticed.
For radicalization, I know specific people who have looked into it and come away unconvinced; Stefan Schubert links to some of this work in a different comment on that post.
The article about social media being addictive is basically just a bunch of quotes from people rather than particular studies / data. It generally seems pretty easy to find people saying things you want, so I don’t update much on “such-and-such person said X”. I’ve also experienced once, and heard many stories of, journalists adversarially quoting people to make it sound like their position was very different from what it actually was, so I usually don’t even update on “such-and-such person believes X”.
I’m wondering if it’d be good to have something special happen to posts where a comment has more karma than the OP. Like, decrease the font size of the OP and increase the font size of the comment, or display the comment first, or have a red warning light emoji next to the post’s title or …
Or maybe the commenter gets a $1,000 prize whenever that happens.
Good versions of “something special” would also incentivize the public service of pointing out significant flaws in posts by making comments that have a shot at exceeding the OP’s karma score.
Obviously “there exists a comment that has higher karma than the OP” is an imperfect proxy of what we’re after here, but anecdotally it seems to me this proxy works surprisingly well (though maybe it would stop due to Goodhart issues if we did any of the above) and it has the upside that it can be evaluated automatically.
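As a rough illustration of why this proxy "can be evaluated automatically", here is a minimal sketch. The `Post` and `Comment` classes and the karma numbers are made up for the example; a real forum would read these from its own data, and nothing here reflects an existing forum API.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author: str
    karma: int


@dataclass
class Post:
    title: str
    karma: int
    comments: list = field(default_factory=list)


def top_outscoring_comment(post):
    """Return the highest-karma comment that strictly exceeds the post's karma, or None."""
    better = [c for c in post.comments if c.karma > post.karma]
    return max(better, key=lambda c: c.karma, default=None)


def flagged_posts(posts):
    """Pair each post that has an outscoring comment with that comment."""
    flagged = []
    for post in posts:
        comment = top_outscoring_comment(post)
        if comment is not None:
            flagged.append((post, comment))
    return flagged


# Invented example data.
posts = [
    Post("Post A", 40, [Comment("x", 55), Comment("y", 12)]),
    Post("Post B", 30, [Comment("z", 30)]),  # a tie does not count
]
print([post.title for post, _ in flagged_posts(posts)])  # prints ['Post A']
```

The check itself is trivial; the open question from the comment above is what "something special" should happen to the flagged posts, and whether the proxy survives once people start optimizing for it.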
There have been a few posts recently about how there should be more EA failures, since we’re trying a bunch of high-risk, high-reward projects, and some of them should fail or we’re not being ambitious enough.
I think this is a misunderstanding of what high-EV bets look like. Most projects produce neither wild success nor abject failure; there’s usually a continuum of outcomes in between, and that’s what you hit. This doesn’t look like “failure”, it looks like moderate success.
For example, consider the MineRL BASALT competition that I organized. The low-probability, high-value outcome would have had hundreds or thousands of entries to the competition, several papers produced as a result, and the establishment of BASALT as a standard benchmark and competition in the field.
What actually happened was that we got ~11 submissions, of which maybe ~5 were serious, made decent progress on the problem, produced a couple of vaguely interesting papers, some people in the field have heard about the benchmark and occasionally use it, and we built enough excitement in the team that the competition will (very likely) run again this year.
Is this failure? It certainly isn’t what normally comes to mind from the normal meaning of “failure”. But it was:
Below my median expectation for what the competition would accomplish
Not something I would have put time into if someone had told me in advance exactly what it would accomplish so far, and the time cost needed to get it.
One hopes that roughly 50% of the things I do meet the first criterion, and probably 90% of the things I’d do would meet the second. But also maybe 90% of the work I do is something people would say was “successful” even ex post.
If you are actually seeing relatively large projects that are “failures” in the normal English sense of the word, where basically nothing was accomplished at all, I’d be a lot more worried that your project was not in fact high-EV even ex ante, and you should be updating a lot more on your failure. It is a good sign that we don’t see that many EA “failures” in this sense.
(One exception to this is earning-to-give entrepreneurship, where “we had to shut the company down and made ~no money after a year of effort” seems reasonably likely and it still would plausibly be high-EV ex ante.)
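The argument above can be illustrated with a toy calculation. Every number below is invented for illustration (the outcome labels, probabilities, values, and the cost are all assumptions, not estimates for BASALT or any real project); the point is only to show that a bet can be high-EV ex ante while most of its probability mass sits on "moderate success" that would not have been worth the cost ex post.

```python
from fractions import Fraction

# (label, probability in percent, value in arbitrary units) -- all invented
OUTCOMES = [
    ("basically nothing accomplished", 5, 0),
    ("moderate success", 85, 1),
    ("clear success", 8, 5),
    ("wild success", 2, 100),
]
COST = 2  # hypothetical cost of the project, in the same arbitrary units


def expected_value(outcomes):
    """Probability-weighted average value, computed exactly."""
    return sum(Fraction(p, 100) * v for _, p, v in outcomes)


def prob_below(outcomes, threshold):
    """Probability that the realized value is strictly below the threshold."""
    return sum(Fraction(p, 100) for _, p, v in outcomes if v < threshold)


def median_value(outcomes):
    """Smallest value at which cumulative probability reaches 50%."""
    cumulative = 0
    for _, p, v in sorted(outcomes, key=lambda o: o[2]):
        cumulative += p
        if cumulative >= 50:
            return v


print(expected_value(OUTCOMES))      # prints 13/4, i.e. 3.25 > COST
print(prob_below(OUTCOMES, COST))    # prints 9/10
print(median_value(OUTCOMES))        # prints 1
```

With these made-up numbers the expected value (3.25) exceeds the cost (2), so the bet is worth taking ex ante, yet 90% of the time the realized value is below the cost, the median outcome is "moderate success", and only 5% of the time is the result a "failure" in the ordinary sense.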
I sometimes see people arguing that others should work in area A, and declaring as a conflict of interest that they personally work in area A.
If they already were working in area A for unrelated reasons, and then they produced these arguments, it seems reasonable to be worried about motivated reasoning.
On the other hand, if because of these arguments they switched to working in area A, this is in some sense a signal of sincerity (“I’m putting my career where my mouth is”).
I don’t like the norm of declaring your career as a “conflict of interest”, because it implies that you are in the former rather than latter category, regardless of which one is actually true. (And the latter is especially common in EA.) However, I don’t really have a candidate alternative norm.
I share your feeling towards it… but I also often say that one’s “skin in the game” (your latter example) is someone else’s “conflict of interest.”
I don’t think that the listener / reader is usually in a good position to distinguish between your first and your second example; that’s enough to justify the practice of disclosing this as a potential “conflict of interest.” In addition, knowing that you already work for cause X, I might consider whether your case is affected by some kind of cognitive bias.
I’m not objecting to providing the information (I think that is good), I’m objecting to calling it a “conflict of interest”.
I’d be much more keen on something like this (source):