This post had 169 views on the EA Forum, 3K on Substack, 17K on Reddit, and 31K on Twitter.
Link appears to be broken.
This is great news; I’m so glad to hear that!!!
I wrote a field guide on writing styles. Not directly applicable to the EA Forum but I used some EA Forum-style writing (including/especially my own) as examples.
https://linch.substack.com/p/on-writing-styles
I hope the article can increase the quality of online intellectual writing in general and EAF writing in particular!
Now, of course, being vegan won’t kill you, right away or ever. But the same goes for eating a diet of purely McDonald’s or essentially just potatoes (like many peasants did). The human body is remarkably resilient and can survive on a wide variety of diets. However, we don’t thrive on all diets.
Vegans often show up as healthier in studies than other groups, but correlation is not causation. For example, Adventists famously are vegetarian and live longer than the average population. However, vegetarian is importantly different from vegan. Also, Adventists don’t drink or smoke either, which might explain the difference.
Wouldn’t it be great if we had a similar population that didn’t smoke or drink but did eat meat to compare?
We do! The Mormons. And they live longer than the Adventists.
The Seventh-Day Adventist studies primarily looked at differences *between* different Seventh-Day Adventists, not just a correlational case of Seventh-Day Adventists against other members of the public. This helps control for a number of issues with looking across religious groups, which would be a pretty silly way to determine causation from diet to health. I believe the results also stand after a large number of demographic adjustments [2].
Finally, Mormons are predominantly white. Only 3% of Mormons are black. 32% of Seventh-Day Adventists are black. In the US, black people have a substantially lower life expectancy than white people [3]. Thus, it’d be unreasonable to look at naive life expectancies across two different religious groups and assume that lifestyle makes the biggest difference, when there are clearly other things going on.
[1] https://pmc.ncbi.nlm.nih.gov/articles/PMC4191896/
[2] https://pmc.ncbi.nlm.nih.gov/articles/PMC4191896/table/T4/
[3] Interestingly enough, this is not true across the rest of the developed world. For example, black people in the UK have a higher life expectancy than white people. I’ve never dug into this discrepancy before, so I’m not sure what the reason is.
I have a lot of sympathy towards being frustrated at knee-jerk bias against AI usage. I was recently banned from r/philosophy on first offense because I linked a post that contained an AI-generated image and a (clearly-labelled) AI summary of someone else’s argument[1]. (I saw that the subreddit had rules against AI usage, but I foolishly assumed that they only applied to posts in the subreddit itself.) I think their choice to ban me was wrong, and deprived them of valuable philosophical arguments that I was able to make[2] in other subreddits like r/PhilosophyOfScience. So I totally get where you’re coming from with frustration.
And I agree that AI, like typewriters, computers, calculators, and other tools, can be epistemically beneficial in allowing people who otherwise don’t have the time to make arguments to develop them.
Nonetheless I think you’re wrong in some important ways.
Firstly, I think you’re wrong to believe that perception of AI ought only to cause us to be skeptical of whether to engage with some writing, and that it is “pure prejudice” to apply a higher bar to writing after reading it, conditional on whether it’s AI-written. I think this is an extremely non-obvious claim, and I currently think you’re wrong.
To illustrate this point, consider two other reasons I might apply greater scrutiny to some content I see:
An entire essay is written in Comic Sans
I learned that a paper’s written by Francesca Gino
If an essay is written in Comic Sans (a font often adopted by unserious people), we might initially suspect that the essay’s not very serious, but after reading it, we should withdraw any adverse inferences we make about the essay simply due to font. This is because we believe (by stipulation) that an essay’s font can tell us whether an essay is worth reading, but cannot provide additional data after reading the essay. In Pearlian terms, reading the essay “screens off” any information we gain from an essay’s font.
I think this is not true for learning that a paper is written by Francesca Gino. Since Francesca Gino is a known data fraudster, even after carefully reading a paper by her (or at least reading it with the same level of care I usually apply to psychology papers), I should remain more skeptical of her findings than I would be after reading the same paper written by a different academic. I think this is purely rational, rather than an ad hominem argument, or “pure prejudice” as you so eloquently put it.
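To put the distinction slightly more formally (a rough sketch in my own notation, where Q is the quality of the argument, C the essay’s content, F its font, and A its author):

```latex
% Comic Sans case: once you have read the content, the font adds nothing.
P(Q \mid C, F = \text{Comic Sans}) = P(Q \mid C)
% Gino case: the author's identity still carries information after reading.
P(Q \mid C, A = \text{Gino}) \neq P(Q \mid C)
```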
Now, is learning whether an essay is written (or cowritten) by AI a signal more akin to learning that an essay is written in Comic Sans, or closer to learning that it’s written by Francesca Gino? Reasonable people can disagree here, but at the very least the answer’s extremely non-obvious, and you haven’t actually substantiated why you believe it’s the former, when there are indeed good reasons to believe it’s the latter.
In brief:
AI hallucination: while AIs may intentionally lie less often than Harvard business professors, they still hallucinate at a higher rate than I’m comfortable with seeing on the EA Forum.
AI persuasiveness: for the same facts and levels of evidence, AIs might be more persuasive than most human writers. To the extent this additional persuasiveness is not correlated with truth, we should update negatively accordingly upon seeing arguments presented by AIs.
Authority and cognition: if I see an intelligent and well-meaning person present an argument with some probably-fillable holes that they allude to but do not directly address in the writing, I might be inclined to give them the benefit of the doubt and assume they’ve considered the issue and decided it wasn’t worth going into in a short speech or essay. However, this inference is much more likely to go wrong if an essay is written with AI assistance. I alluded to this point in my comment on your other top-level post, but I’ll mention it again here.
I think it’s very plausible, for example, that if you took the time to write out/type out your comment here yourself, you’d have been able to recognize my critique for yourself, and it wouldn’t have been necessary for me to dive into it.
I still defend this practice. I think the alternative of summarizing other people’s arguments in your own words has various tradeoffs but a big one is that you are injecting your own biases into the summary before you even start critiquing it.
Richard Chappell was also banned temporarily, and has a more eloquent defense. Unlike me, he’s an academic philosopher (TM).
I compiled a list of my favorite jokes, which some forum users might enjoy. https://linch.substack.com/p/intellectual-jokes
Yeah I think these two claims are essentially the same argument, framed in different ways.
I appreciate this article and find the core point compelling. However, I notice signs of heavy AI editing that somewhat diminish its impact for me.
Several supporting arguments come across as flimsy/obvious/grating/“fake” as a result. For example, the “Addressing the Predictable Objections” section reads more like someone who hasn’t actually considered the objections and just gave the simplest answers to surface-level questions, rather than someone who deeply brainstormed or crowdsourced the objections to the framework. Additionally, the article’s tendency towards binary framings makes it hard for me to think through the relevant tradeoffs.
The fundamental argument is strong. I also appreciate the emphasis on truth and the evident care to remove inaccuracies. I imagine significant editing effort went into avoiding hallucinations. Nonetheless, the breezy style makes it hard for me to read, and I’d appreciate seeing it developed with more depth and authentic engagement with potential counterarguments.
Thanks, appreciate the empirical note and graph on trendlines!
Preventing an AI takeover is a great way for countries to help their own people!
My honest, if somewhat flippant, response is that these trials should update us somewhat against marginal improvements in the welfare state in rich countries, and more towards investments in global health, animal welfare, and reductions in existential risk.
I’m sure this analysis will go over well with The Argument subscribers!
It’s funny (and I guess unsurprising) that Will’s Gemini instance and your Claude instance both reflected what I would have previously expected both of your ex ante views to be!
lmao when I commented 3 years ago I said
As is often the case with social science research, we should be skeptical of out-of-country and out-of-distribution generalizability.
and then I just did an out-of-country and out-of-distribution generalization with no caveats! I can be really silly sometimes lol.
Re the popular post on UBI by Kelsey going around, and related studies:
I think it helped less than I “thought” it would if I were just modeling this with words. But the observed effects (or lack thereof) in the trials appear consistent with standard theoretical models of welfare economics. So I’m skeptical of people using this as an update against cash transfers, in favor of a welfare state, or anything substantial like that.
If you previously modeled utility as linear or logarithmic in income (or somewhere in between), these studies should be an update against your worldview. But I don’t think those models were ever extremely plausible to begin with.
See further discussions here:
https://x.com/LinchZhang/status/1958705316276969894
and here:
https://forum.effectivealtruism.org/posts/AAZqD2pvydH7Jmaek/lorenzo-buonanno-s-shortform?commentId=rXLQJLTH7ejJHcJdt
Tbc, I have always been at least a little skeptical of UBI, ever since I first heard of the idea. But I also buy the “poor people are poor because they don’t have enough money” argument, at least in low-income countries. So I don’t really have a dog in this fight.
(It’s mildly frustrating that The Argument doesn’t open up comments to people who aren’t paid subscribers, since I think this is an important point that most readers of that Kelsey post (and possibly the writer/editors) are not yet getting)
Hmm, I’d guess the s.d.s to be lower in the US than in Kenya, for what it’s worth. Under-five child mortality rates are about 10x higher in Kenya than in the US, and I expect stuff like that to bleed through to other outcomes.
But even if we assume a smaller s.d. (check: is there a smaller s.d., empirically?), this might be a major problem. The top paper on OpenResearch says they can rule out health improvements greater than 0.023-0.028 standard deviations from giving $1000/month. I’m not sure how that compares to household incomes, but let’s assume for now that household income is $2000-$3000/month for the US recipients, so the transfer is 33-50% of household income.
From Googling around, the GiveDirectly studies show mental health effects around a quarter of a standard deviation from a doubling of income.
In other words, the study can rule out the effect sizes that theory would predict, but only if theory predicts effect sizes >0.1x the effect size in the GiveDirectly experiments.
Does theory predict effect sizes >0.1x that of the GiveDirectly experiments? Well, it depends on ň, the risk-aversion constant[1]! If risk aversion is between 0 (linear, aka insane) and 1 (logarithmic), we should predict changes >0.41x to >0.58x that of the GD experiments. So we can rule out linear, logarithmic, and super-logarithmic utility!
But if ň = 1.5, then theory will predict changes on the scale of 0.076x to 0.128x that of the GD experiments, i.e., exactly at the boundary of whether it’s possible to detect an effect at all!
If ň = 2, then theory will predict changes on the scale of 0.014x to 0.028x, or much smaller than the experiments are powered to detect.
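For transparency, here’s a rough sketch of the back-of-the-envelope behind ratios like these, assuming CRRA utility. The specific incomes are placeholder assumptions of mine (roughly $1,000/year for GiveDirectly recipients, $2,000-$3,000/month for US households); different assumptions will shift the ratios somewhat.

```python
from math import log

def crra_gain(income, transfer, eta):
    """Utility gain from a cash transfer under CRRA utility with coefficient eta (the risk-aversion constant above)."""
    if eta == 1:
        return log(income + transfer) - log(income)
    return ((income + transfer) ** (1 - eta) - income ** (1 - eta)) / (1 - eta)

# Placeholder assumptions (mine, not taken from the studies):
GD_INCOME = 83               # GiveDirectly recipient income, ~$1,000/year, in $/month
US_INCOMES = [2000, 3000]    # assumed US household incomes, $/month
TRANSFER = 1000              # OpenResearch transfer, $/month
GD_EFFECT_SD = 0.25          # ~0.25 s.d. mental-health effect per doubling of income (GD studies)

for eta in [1.0, 1.5, 2.0]:
    gd_gain = crra_gain(GD_INCOME, GD_INCOME, eta)  # GD transfer modeled as a doubling of income
    for us_income in US_INCOMES:
        ratio = crra_gain(us_income, TRANSFER, eta) / gd_gain
        print(f"eta={eta}, US income ${us_income}/mo: {ratio:.3f}x the GD effect "
              f"(~{ratio * GD_EFFECT_SD:.3f} s.d.)")
```

(Note that in the logarithmic case the ratio doesn’t depend on absolute income levels at all, which is why the 0.41x-0.58x range follows from the 33-50% transfer share alone.)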
For what it’s worth, before the studies I would’ve guessed the risk-aversion constant across countries to be somewhere between 1.2 and 2, so this study updates me some but not a lot.
@Kelsey Piper and others, did you or the study authors pre-register your beliefs on what risk-aversion constant you expected?
[1] Rendering the Greek constant economists use for risk aversion as ň, since it otherwise doesn’t render correctly on my laptop.
I expected low effects based on background assumptions like utility being sublogarithmic to income but I didn’t expect the effect to be actually zero (at the level of power that the very big studies could detect).
I’d be interested in replicating these trials in other developed countries. It could just be because the US is unusually wealthy.
Of course, like you say, this is further evidence we should increase foreign aid, since money could do far more good in very poor countries than very rich ones.
Published a review of Ted Chiang, my favorite science fiction short story writer.
Most relevant to EAs: he’s one of the few living SF writers who portray technology as potentially enhancing humanity rather than as dystopian. I really like how he imagines what’s possible and takes ideas seriously. But he completely misses societal-level responses to transformative tech. His worlds get universe-altering inventions and use them for personal therapy instead of solving coordination problems or running multiverse-wide RCTs.
In (attempted) blinded trials, my review is consistently ranked #1 by our AI overlords, so check out the one book review that all the LLMs are raving about!!!
I think that, as an individual, reading and mathematical modeling are more conducive to learning true things about the world than most other things on the list. Certainly I read much more often than I conduct RCTs! Even working scientists have reading the literature as a major component of their overall process.
I also believe this is true for civilization overall. If we imagine an alternative civilization that is incapable of RCTs but can learn things from direct observation, natural experiments, engineering, etc., I expect substantial progress is still possible. However, if all information could only be relayed via oral tradition, I think it’d be very hard to build up a substantial civilization. There’s a similar argument for math as well, though less so.
Likewise, I think which thought experiments I’m influenced by matters more than the question of whether thought experiments are (possibly) less trustworthy at helping me make decisions than a full-blown philosophical framework, or more trustworthy than folk wisdom.
Sure, the article discusses this in some detail. Context and discernment definitely matter. I could’ve definitely spent more effort on it, but I was worried it was already too long, and I’m also unsure whether I could provide anything novel that’s relevant to specific people’s situations anyway.
FWIW I think the infographic was fine and would suggest reinstating it (I don’t think the argument is clearer without it, and it’s certainly harder for people to suggest methods you might have missed if you don’t show methods you included!)
I think the infographic probably makes it more likely for people to downvote the post without reading it.
Your linkpost also strips most of the key parts from the article, which I suspect some of the downvoters missed.
Yeah, the linkpost is just an introduction + explanation of why the post is relevant to the EA Forum + a link. I strongly suspect, based on Substack analytics (which admittedly might be inaccurate), that most people who downvoted the post didn’t read or even skim it. I frankly find this extremely rude.[1]
[1] (Less than 1% of my Substack’s views came from the EA Forum, so if the downvotes came from readers, pretty much every single clicker would have had to downvote; I think it’s much more likely that people who didn’t read the post downvoted. I personally only downvote posts I’ve read, or at least skimmed carefully enough that I’m confident I’d downvote upon a closer read. I can’t imagine having the arrogance to do otherwise.)
Are the abundance ideas actually new to EA folks? They feel like rehashes of arguments we had ~a decade ago, often presented in less technical language and ignoring the major cruxes.
Not saying they’re bad ideas, just not new.