I’m one of the people who submitted a post right before the deadline of the criticism contest. FWIW I think number 6 is off base. In my case, the deadline felt like a Schelling point. My post was long and kind of technical, and I didn’t have any expectation of getting money—though having a fake deadline was very helpful and I would probably not have written it without the contest. I don’t think that any of the posts that got prizes were written with an expectation of making a profit. They all looked like an investment of multiple hours by talented people who could have made much more money (at least in expectation) by doing something else. In order for someone profit-motivated to take advantage of this they would have to be involved and knowledgeable enough to write a competitive post, and unable to make money in essentially any other way. This seems like an unlikely combination, but if it does exist then I’d assume that supporting such people financially is an additional benefit rather than a problem.
iporphyry
I like hypothesis generation, and I particularly like that in this post a few of the points are mutually exclusive (like numbers 7 and 10), which should happen in a hypothesis generation post. However, this list, as well as the topic, feels lazy to me, in the sense of needing much more specificity in order to generate more light than heat.
I think my main issue is the extremely vague use of "quality" here. It's ok to use vague terms when a concept is hard to define, but in this case it feels like there are more useful ways to narrow it down. For example you could say "the average post seems less informative/well-researched" or "the average poster seems less experienced/qualified", or "I learned more from the old forum than the new one" (I think especially a focus on your experience would make the issue more precise, and open up new options such as "posts became less fun once I learned all the basics and new people who are just learning them became less interesting to me"). I would like to see a hypothesis generation post that focuses much more on the specific ways that posts are "worse" (and generates hypotheses on what they are) rather than on reasons for this to be the case. I suspect that once a concrete question is asked, the potential reasons will become more concrete and testable.
Another issue is that I think a lot of the points are more properly “reasons that posts on a forum can be bad” rather than issues with current vs old posts and I have trouble believing that these issues were absent or better in the past. This would also be solved by trying to make the complaint specific.
I like this criticism, but I think there are two essentially disjoint parts here that are being criticized. The first is excess legibility, i.e., the issue of having explicit metrics and optimizing to the metrics at all. The second is that a few of the measurements that determine how many resources a group gets/how quickly it grows are correlated with things that are not inherently valuable at best and harmful at worst.
The first problem seems really hard to me: the legibility/autonomy trade-off is an age-old problem that happens in politics, business, and science, and seems to involve a genuine trade-off between organizational efficiency and the ability to capitalize on good but unorthodox ideas and individuals.
The second seems more accessible (though still hard), and reasonably separable from the first. Here I see a couple of things you flag (other than legibility/”corporateness” by itself) as parameters that positively contribute to growth but negatively contribute to the ability of EA to attract intellectually autonomous people. The first is “fire-and-brimstone” style arguments, where EA outreach tends to be all-or-nothing, “you either help save the sick children or you burn in Utilitarian Hell”, and the second is common-denominator level messaging that is optimized to build community (so things like slogans, manufactured community and sense of purpose; things that attract the people like Bob in your thought experiment), but not optimized to appeal to meta-level thinkers who understand the reasoning behind the slogans. Both are vaguely correlated with EA having commonalities with religious communities, and so I’m going to borrow the adjective “pious” to refer to ideas and individuals for which these factors are salient.
I like that you are pointing out that a lot of EA outreach is, in one way or another, an “appeal to piety”, and this is possibly bad. There might be a debate about whether this is actually bad and to what extent (e.g., the Catholic church is inefficient, but the sheer volume of charity it generates is nothing to sneer at), but I think I agree with the intuition that this is suboptimal, and that by Goodhart’s law, if pious people are more likely to react to outreach, eventually they will form a supermajority.
I don’t want to devalue the criticism that legibility is in itself a problem, and particularly ugh-y to certain types of people (e.g. to smart humanities majors). But I think that the problem of piety can be solved without giving up on legibility, and instead by using better metrics, that have more entanglement with the real world. This is something I believed before this post, so I might be shoe-horning it here: take this with a grain of salt.
But I want to point out that organizations that are constantly evaluated on some measurable parameter don’t necessarily tend to end up excessively pious. A sports team can’t survive by having the best team spirit; a software company will not see any profit if it only hires people who fervently believe in its advertising slogans. So maybe a solution to the problem of appeals to piety is to, as you say, reduce the importance of the metric of “HEA” generation in determining funding, clout, etc., but replace it with other hard-to-fake metrics that are less correlated with piety and more correlated with actually being effective at what you do.
I haven’t thought much about what the best metrics would be and am probably not qualified to make recommendations, but just for plausibility’s sake, here are a couple of examples of things that I think would be cool (if not necessarily realistic):
First, it would be neat (though potentially expensive) if there were a yearly competition between teams of EAs (maybe student groups, or maybe something on a larger level) to use a funding source to create an independent real-world project and have their impact in QALYs judged by an impartial third party.
Second, I think it would be nice to make “intramural” forms of existing competitions, such as the fiction contest, Scott Alexander’s book review contest, various super-forecasting contests, etc., and grade university groups on success (relative to past results). If something like this is implemented, I’d also like to see the focus of things like the fiction competition move away from “good messaging” (which smacks of piety) and towards “good fiction that happens to have an EA component, if you look hard enough”.
I think that if the funding culture becomes more explicitly focused on concentration of talent and on real-world effects and less on sheer numbers or uncritical mission alignment, then outreach will follow suit and some of the issues that you address will be addressed.
I think a lot of people miss the idea that “being an EA” is a different thing from being “EA adjacent”/“in the EA community”/“working for an EA organization”, etc. I am saying this as someone who is close to the EA community, who has an enormous amount of intellectual affinity, but does not identify as an EA. If the difference between the EA label and the EA community is already clear to you, then I apologize for beating a dead horse.
It seems from your description of yourself like you’re actually not an Effective Altruist in the sense of holding a significantly consequentialist worldview that one tries to square with one’s choices (once again, neither am I). From your post, the main way that I see in which your worldview deviates from EA is that, while lots of EAs are status-motivated, your worldview seems to include the idea that typical levels of status-based and selfish motivations aren’t a cognitive error that should be pushed against.
I think that’s great! You have a different philosophical outlook (from the very little I can see in this post, perhaps it’s a little close to the more pro-market and pro-self interest view of people like Zvi, who everyone I know in the community respects immensely). I think that if people call this “evil” or “being a bad person”, they are being narrow-minded and harmful to the EA cause. But I also don’t think that people like you (and me) who love the EA community and goals but have a personal philosophy that deviates significantly from the EA core should call ourselves EA’s, any more than a heterosexual person who has lots of gay friends and works for a gay rights organization should call themselves LGBT. There is a core meaning to being an “effective altruist”, and you and I don’t meet it.
No two people’s philosophies are fully aligned, and even the most modal EA working in the most canonical EA organization will end up doing some things that feel “corporate” or suboptimal, or that matter to other people but not to them. If you work for an EA org, you might experience some of that because of your philosophical differences, but as long as you’re intellectually honest with yourself and others, and able to still do the best you can (and not try to secretly take your project in a mission-unaligned direction) then I am sure everyone would have a great experience.
My guess is that most EA organizations would love to hire/fund someone with your outlook (and what some of the posts you got upset with are worried about are people who are genuinely unaligned/deceptive and want to abuse the funding and status of the organization for personal gain). However if you do come in to an EA org and do your best, but people decline to work with you because of your choices or beliefs, I think that would be a serious organizational problem and evidence of harmful cultishness/”evaporative cooling of group beliefs”.
I agree with you that EA outreach to non-Western cultures is an important and probably neglected area — thank you for pointing that out!
There are lots of reasons to make EA more geographically (and otherwise) diverse, and also some things to be careful about, given that different cultures tend to have different ethical standards and discussion norms. See this article about translation of EA into Mandarin. Something to observe is that outreach is very language and culture-specific. I generally think that international outreach is best done in a granular manner — not just “outreach to all non-Western cultures” or “outreach to all the underprivileged”. So I think it would be wonderful for someone to post about how to best approach outreach in Malawi, but that the content might be extremely different from writing about outreach in Nigeria.
So: if you’re interested in questions like this, I think it would be great if someone were to choose a more specific question and research it! (And I appreciate that your post points out a real gap.)
On a different note, I think that the discussion around your post would be more productive if you used other terms than “social justice.” Similarly, I think that the dearth of the phrase “social justice” on the EA Forum is not necessarily a sign of a lack of desire for equity and honesty. There are many things about the “social justice” movement that EAs have become wary of. For instance, my sense is that the conventional paradigm of the contemporary Western elite is largely based on false or unfalsifiable premises. I’d guess that this makes EAs suspicious when they hear “social justice” — just like they’re often wary about certain types of sociology research (things like “grit,” etc. which don’t replicate) or psychosexual dynamics and other bits of Freud’s now-debunked research.
At the same time (just like with Freudianism), a lot of the core observations that the modern social justice paradigm makes are extremely true and extremely useful. It is profoundly obvious, both from statistics and from the anecdotal evidence of any woman, that pretty much every mixed-gender workplace has an unacceptable amount of harassment. There is abundant evidence that e.g. non-white Americans experience some level of racism, or at least are treated differently, in many situations.
Given this, here are some things that I think it would be useful to do:
Make the experience of minorities within EA more comfortable and safe.
Continue seriously investigating translating EA concepts to other cultural paradigms (or conversely, translating useful ideas from other cultural paradigms into EA). (See also this article.)
Take some of the more concrete/actionable pieces of the social justice paradigm and analyze/harmonize them with the more consequentialist/science-based EA philosophy (with the understanding that an honest analysis sometimes finds cherished ideas to be false).
I think the last item is definitely worth engaging with more, especially with people who understand and value the social justice paradigm. Props if you can make progress on this!
Edit: I mostly retract this comment. I skimmed and didn’t read the post carefully (something one should never do before leaving a negative comment) and interpreted it as “Leverage wasn’t perfect, but it is worth trying to make Leverage 2.0 work or have similar projects with small changes”. On rereading, I see that Jeff’s emphasis is more on analyzing and quantifying the failure modes than on salvaging the idea.
That said, I just want to point out that (at least as far as I understand it) there is a significant collection of people within and around EA who think that Leverage is a uniquely awful organization: one which suffered a multilevel failure extremely reminiscent of a run-of-the-mill cult (not just for those who left it, but also for many people who are still in it), which soft-core threatens members to avoid negative publicity, and which exerts psychological control on members in ways that seem scary and evil. This is context that I think some people reading the sanitized publicity around Leverage will lack.
There are many directions from which people could approach Leverage 1.0, but the one that I’m most interested in is lessons for people considering attempting similar things in the future.
I think there’s a really clear lesson here: don’t.
I’ll elaborate: Leverage was a multilevel failure. A fundamentally dishonest and charismatic leader. A group of people very convinced that their particular chain of flimsy inferences led them to some higher truth that gave them advantages over everyone else. A frenzied sense of secrecy and importance. Ultimately, psychological harm and abuse.
It is very clearly a negative example, and if someone is genuinely trying to gain some positive insight into a project from “things they did right” (or noticeably imitates techniques from that project), that would make me significantly less likely to think of them as being on the right track.
There are examples of better “secret projects”—the Manhattan Project as well as other high-security government organizations, various secret revolutionary groups like the early US revolutionaries, the abolitionist movement and the Underground Railroad, even various pro-social masonic orders. Having as one’s go-to example of something to emulate an organization that significantly crossed the line into cult territory (or at least into Aleister Crowley-level grandiosity around a bad actor) would indicate to me a potentially enlarged sense of self-importance, an emphasis on deference and exclusivity (“being on our team”) instead of competence and accountability, and a lack of emphasis on appropriate levels of humility and self-regulation.
To be clear, I believe in decoupling and don’t think it’s wrong to learn from bad actors.
But with such a deeply rotten track record, and so many decent organizations that are better than it along all parameters, Leverage is perhaps the clearest example of a situation where people should just “say oops” and stop looking for ways to gain any value from it (other than as a cautionary tale) that I have heard of in the EA/LW community.
Disclaimer: this is an edited version of a much harsher review I wrote at first. I have no connection to the authors of the study or to their fields of expertise, but I am someone who enjoyed the paper critiqued here, and in fact I think it very nice and very conservative in terms of its numbers (the current post claims the opposite). I disagree with this post and think it is wrong in an obvious and fundamental way, and in the interest of not promoting wrong science it therefore should not be in the decade review. At the same time it is well-written and exhibits a good understanding of most parts of the relevant model, and a less extreme (and less wrong :) version of this post would pass muster with me. In particular I think that the criticism
However, since this parameter is capped at 1, while there is no lower limit to the long tail of very low estimates for fl, in practise this primarily has the effect of reducing the estimated probability of life emerging spontaneously, even though it represents an additional pathway by which this could occur.
is very valid, and a model taking this into account would have a correspondingly higher credence for “life is common” scenarios. However the authors of the paper being criticized are explicitly thinking about the likelihood of “life is not common” scenarios (which a very naive interpretation of the Drake equation would claim are all but impossible) and here this post is deeply flawed.
The essential beef of the author of the post (henceforth the OP) with the authors of the paper (henceforth, Sandberg et al) concerns their value fl, which is the “log standard deviation in the log uncertainty of abiogenesis” (abiogenesis is the event wherein random and non-replicating chemical processes create the first replicating life). A very rough explanation of this parameter (in the log uncertainty model which Sandberg et al use and the OP subscribes to) is the probability of the best currently known model for abiogenesis occurring on a given habitable planet. Note that this is very much not the probability of abiogenesis itself, since there can be many other methods which produce abiogenesis a lot more frequently than the best currently known model. The beautiful conceit of this paper (and the field it belongs to) is the idea that, absent a model for a potentially very large or very small number (in this case, the probability of abiogenesis, or, in the larger paper, the probability of the emergence of life on a given planet), our best rough estimate is that our uncertainty is more or less log-uniformly distributed between the largest and smallest “theoretically possible” values (so a number between 10^-30 and 10^-40 is roughly as likely as a value between 10^-40 and 10^-50, provided these numbers are within the “theoretically possible” range; the difference between “log uniform” and “log normal” is irrelevant to a first approximation). The exact definition of “theoretically possible” is complicated, but in the case of abiogenesis the largest theoretically possible value of fl (as of any other probability measure) is 1, while the smallest possible value is the probability of abiogenesis given the best currently known methods. The model is not perfect, but it is by far the best we have for predicting the lower tail of such distributions, i.e., in this case, the likelihood of the cosmos being mostly devoid of intelligent life.
(Note that the model doesn’t tell us this probability is close to 1! Just that it isn’t close to 0.)
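To make the shape of this argument concrete, here is a minimal Monte Carlo sketch (my own illustration, not from Sandberg et al; the bounds and the planet count are stand-in assumptions for the example) of how a log-uniform prior over fl puts substantial mass on “life is rare” scenarios:

```python
import math
import random

random.seed(0)

# Illustrative log-uniform prior over fl, the per-habitable-planet
# probability of abiogenesis: log10(fl) uniform between a pessimistic
# bound (stand-in: -30, from the best currently known mechanism) and 0
# (a probability is capped at 1).
LOW, HIGH = -30.0, 0.0
N_PLANETS = 1e22  # rough stand-in for habitable planets in the observable universe

samples = [random.uniform(LOW, HIGH) for _ in range(100_000)]

# Fraction of prior mass under which the expected number of abiogenesis
# events across all planets is below 1, i.e. "life is rare":
# log10(fl) + log10(N_PLANETS) < 0.
rare = sum(1 for log_fl in samples if log_fl + math.log10(N_PLANETS) < 0)
print(rare / len(samples))  # ~ (30 - 22) / 30, roughly a quarter of the prior mass
```

The point of the sketch is only the qualitative one from the paper: even with a cap of 1 at the top, a wide log-uniform prior leaves a non-negligible lower tail, so “life is not common” is far from impossible.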
Now the best theoretically feasible model for abiogenesis currently known is the so-called RNA world model, which is analyzed in supplement 1 of Sandberg et al. Essentially, the only sure-fire way we know of abiogenesis is spontaneously generating the genome of an archaebacterium, which has hundreds of thousands of base pairs, and would put the probability of abiogenesis at under 10^-100,000 (insanely small). However, we are fairly confident both that a much smaller self-replicating RNA sequence would be possible in certain conducive chemical environments (the putative RNA world), and that there is some redundancy in how to generate a near-minimal self-replicating RNA sequence (so you don’t have to get every base pair right). The issue is that we don’t know how small the smallest genome is and how much redundancy there is in choosing it. By the nature of log uncertainty, if we want to get the lowest value in the range of uncertainties (what the OP and Sandberg et al call log standard deviation), we should take the most pessimistic reasonable estimates. These are attempted in the previously mentioned supplement, though rather than actually taking pessimistic values, Sandberg et al rather liberally assume a very general model of self-replicating RNA formation, with their lower bound based on assumptions about protein folding (rather than a more restrictive model based on assuming low levels of redundancy, which I would have chosen, and which would have put the value of fl significantly lower even than the Sandberg et al paper: they explicitly say that they are trying to be conservative). Still, they estimate a value of fl equal to or lower than 10^-30 with the current best model. In order to argue for a 10^-2 result while staying within the log normal model, the OP would have to convince me that we have some drastic additional knowledge.
Either that we have a proof, beyond all reasonable doubt, that an RNA chain shorter than the average protein is capable of self-replicating, or that there is a lot of redundancy in how self-replicating RNA can form and a chemical “RNA soup” would naturally tend to self-replication under certain conditions. Both of these are plausible theories, but as such methods for abiogenesis are not currently known to exist, assuming they work for your lower bounds on log probability is precisely not how log uncertainty works. In this way the OP is, quite simply, wrong. Therefore, as incorrect science, I do not recommend this post for the decade review.
Fair enough. I admit that I skimmed the post quickly, for which I apologize, and part of this was certainly a knee-jerk reaction to even considering Leverage as a serious intellectual project rather than a total failure as such, which is not entirely fair. But I think maybe a version of this post I would significantly prefer would first explain your interest in Leverage specifically: that while they are a particularly egregious failure of the closed-research genre, it’s interesting to understand exactly how they failed and how the idea of a fast, less-than-fully transparent think tank can be salvaged. It does bother me that you don’t try to look for other examples of organizations that do some part of this more effectively, and I have trouble believing that they don’t exist. It reads a bit like an analysis of nation-building that focuses specifically on the mistakes and complexities of North Korea without trying to compare it to other less awful entities.
I enjoyed this post a lot!
I’m really curious about your mention of the “schism” pattern, because I both haven’t seen it and sort of believe a version of it. What were the schism posts? And why are they bad?
I don’t know if what you call “schismatics” want to burn the commons of EA cooperation (which would be bad), or if they just want to stop the tendency in EA (and really, everywhere) of people pushing for everyone to adopt convergent views (the focus of “if you believe X you should also believe Y”, which I see and dislike in EA, versus “I don’t think X is the most important thing, but if you believe X, here are some ways you can do it more effectively”, which I would like to see more).
Though I can see myself changing my mind on this, I currently like the idea of a more loose EA community with more moving parts that has a larger spectrum of vaguely positive-EV views. I’ve actually considered writing something about it inspired by this post by Eric Neyman https://ericneyman.wordpress.com/2021/06/05/social-behavior-curves-equilibria-and-radicalism/ which quantifies, among other things, the intuition that people are more likely to change their mind/behavior in a significant way if there is a larger spectrum of points of view rather than a more bimodal distribution.
I’m trying to optimise something like “expected positive impact on a brighter future conditional on being the person that I am with the skills available to/accessible for me”.
If this is true, then I think you would be an EA. But from what you wrote, it seems that your philosophical objective function (as opposed to your revealed objective function, which for most people gets corrupted by personal stuff) has a relatively large term for status/glory. I think the question determining your core philosophy is which term you consider primary. For example, if you view status and glory as a means to the end of helping people, and you are willing to reject seeking them if someone convinces you they are significantly reducing your EV, then that would reconcile the “A” part of EA.
A piece of advice I think younger people tend to need to hear is that you should be more willing to accept that “X is something I like and admire, and I am also not X” without then having to worry about your exact relationship to X, redefine X to include yourself, or look for a different label Y. You are allowed to be aligned with EA but not be an EA, and you might find this idea freeing (or I might be fighting the wrong fight here).
Thanks! But I see your point
Certainly not deliberately. I’ll try to read it more carefully and update my comment
Thanks! I didn’t fully understand what people meant by that and how it’s related to various forms of longtermism. Skimming the linked post was helpful to get a better picture.
I have some serious issues with the way the information here is presented which make me think that this is best shared as something other than an EA forum post. My main issues are:
This announcement is in large part a promotion for the Fistula Foundation, with advertising-esque language. It would be appropriate in an advertising banner of an EA-aligned site but not on the forum, where critical discussion or direct information-sharing is the norm.
It includes the phrase that Fistula Foundation is “widely regarded as one of the most effective charities in the world” (in addition to some other similar phrases). This is weaselly language which should not be used without immediate justification (e.g. ”...according to X rating”).
In this case this is also misleading/false. I went to the EA impact assessment page for the foundation and it is actually estimated to be below the cost-effectiveness range of EA programs (while it seems to be considered a reasonable charity).
In general, the language here, together with the fact that the OP is affiliated with this foundation, makes me update to take the Fistula Foundation much less seriously and to avoid donating there in the future. I would suggest that the OP remove this post or edit it in a way that is informative rather than pandering (e.g. something like “join me to go skinny-dipping for the Fistula Foundation on X day. While it has a mediocre impact assessment, I like the cause and think skinny-dipping would be a good way to support it while also becoming less insecure about our bodies”).
I think this is a great post! It addresses a lot of my discomfort with the EA point of view, while retaining the value of the approach. Commenting in the spirit of this post.
Some possible bugs:
* When I click on the “listen online” option it seems broken (using this on a Mac)
* When I click on the “AGI safety fundamentals” courses as podcasts, they take me to the “EA forum curated and popular” podcast. Not sure if this is intentional, or if they’re meant to point to a podcast containing just the course
But I agree with your meta-point that I implicitly assumed SSA together with my “assumption 5”, and SSA might not follow from the other assumptions.
I like that you admit that your examples are cherry-picked. But I’m actually curious what a non-cherry-picked track record would show. Can people point to Yudkowsky’s successes? What did he predict better than other people? What project did MIRI generate that either solved clearly interesting technical problems or got significant publicity in academic/AI circles outside of rationalism/EA? Maybe instead of a comment here this should be a short-form question on the forum.