Do you just mean this shortform or do you mean the full post once I finish it? Either way I’d say feel free to post it! I’d love to get feedback on the idea
Harrison D
I may have just missed this in the comments below, but FWIW: On top of all the other points that have been made in opposition to this stance, I would also assign very low credence to the implied claim “if we don’t [do things that oppose cancel culture], then we’ll be able to avoid getting canceled during the ‘cultural revolution’.” I would suspect that if this “cultural revolution” (which I already consider implausible) were nearly as bad as you suggest, EA as a movement would already get targeted regardless (especially if it’s the case that the whole movement will be held collectively guilty for a subset of the movement speaking out about something), and thus it would have an even smaller fractional expected value. To clarify further, here the witch analogy that you use is potentially misleading because with witch hunts the scope is at least ostensibly limited to the instances of “witches.” This could of course be expanded to include “witch sympathizers”, but it’s at least more plausible that by avoiding getting involved one can continue their abolition work. If however the witch hunt were to grow into an entire philosophy that says “anyone not primarily concerned with finding witches and burning them will be treated as witch sympathizers,” then you face the lose-lose situation (darned if you do, darned if you don’t).
EA (forum/community) and Kialo?
TL;DR: I’m curious why there is so little mention of Kialo as a potential tool for hashing out disagreements in the EA forum/community, whereas I think it would be at least worth experimenting with. I’m considering writing a post on this topic, but want to get initial thoughts (e.g., have people already considered it and decided it wouldn’t be effective, initial impressions/concerns, better alternatives to Kialo)
The forum and broader EA community have lots of competing ideas and even some direct disagreements. Will Bradshaw’s recent comment about discussing cancel culture on the EA forum is just the latest example of this that I’ve seen. I’ve often felt that the use of a platform like Kialo would be a much more efficient way of recording these disagreements, since it helps to separate out individual points of contention and allows for deep back-and-forth, among many other reasons. However, when I search for “Kialo” in the search bar on the forum, I only find a few minor comments mentioning it (as opposed to posts) and they are all at least 2 years old. I think I once saw a LessWrong post downplaying the platform, but I was wondering if people here have developed similar impressions.
More to the point, I was curious to see if anyone had any initial thoughts on whether it would be worthwhile to write an article introducing Kialo and highlighting how it could be used to help hash out disagreements here/in the community? If so, do you have any initial objections/concerns that I should address? Do you know of any other alternatives that would be better options (keeping in mind that one of the major benefits of Kialo is its accessibility)?
“The problem is that it’s evidence that the system at large has very little defenses against goodharting and runaway competition effects.” Although I acknowledge that there will always be some level of misalignment between truth-seeking and competition, I would push back on the idea that the system has little defense against drastic goodharting like that seen in both high school and collegiate policy debate: the experience of Stoa (the league in which I debated) and NCFCA is evidence of that. In my view and in the view of some others (see, e.g., https://www.ethosdebate.com/community-judges-1-necessity-community-judges/), one of the important front-line defenses against gamification of debate is the use of community judges who recoil at nonsense and speed. Of course, that introduces tradeoffs that debaters (myself included) sometimes huff about, such as biased decisions, but it still seems worth it. Additionally, I feel fairly confident that there are other important factors that explain the stark cultural differences between Stoa/NCFCA and most public-school/collegiate leagues (e.g., the debaters’ personalities/background, parental involvement, the Christian ethos, the observation of and opportunity for self-differentiation from public-school/collegiate practices).
To address your broader point about the truth-seeking vs. competition drive (goodharting): I and many others in my league have considered this question. (For a brief example article from someone I know, see https://www.ethosdebate.com/art-persuasion-vs-pursuit-truth/) I could be wrong/exaggerating, but I get the sense from you that debate should be really strict about promoting truth-seeking above other things, perhaps even to the extent that debate should almost never sacrifice truth-seeking for other goals. Perhaps that is not what you are saying, but regardless, I would push back and emphasize that debate has a wide variety of purposes, crucially including skills education in general (as opposed to topical education). (I actually recently finished a blog series which I started by outlining some of the major purposes of debate: see https://www.ethosdebate.com/purposes-of-debate-pt-1-the-goals-and-anti-goals-of-debate/ ). In short, I think that the experience of Stoa/NCFCA shows that with reasonable safeguards (e.g., including community judges in the judging pool) debate can be at least neutral if not more positive than negative in promoting truth-seeking, while at the same time being a great way to get youth excited about studying topics, scrutinizing their own views, and learning to persuade others. That last part applies to that NITOC final (regarding seatbelt policy), which focused on a case that was known for being somewhat pathos-heavy (as opposed to, for example, the case for cutting funding for air marshals, which I and many other debaters would likely have never come to see if it were not subject to the adversarial scrutiny of a competitive season of debate): debate shouldn’t be entirely/solely about truth-seeking; teaching persuasion skills is also really important, because if you have the truth but cannot persuade others, then your ability to act on it is sorely limited.
Also: “people repeatedly abusing terrible studies because you can basically never challenge the validity or methodology of a study”—my experience in Stoa was fairly different: I repeatedly had to defend the methodology of some of the studies I relied on, and was able to challenge the methodology of sources.
Just to clarify my position:
I think that the culture of British Parliamentary has made it significantly less game-y and more civil than most if not all other formats of collegiate debate, including both prepared formats (e.g., Policy Debate, Lincoln-Douglas, Public Forum) and limited preparation formats (e.g., other forms of parli such as American Parliamentary).
I think that the limited topic prep nature of parliamentary debate makes those formats significantly less game-y and more civil than most other formats of collegiate debate.
My main issue with BP is really just two individual characteristics of the format that represent stark differences from the format I did in high school (American Parli, in the Stoa league): the 4-teams-of-2 structure (instead of 2v2), combined with the lack of access to published sources on the internet during prep. Really, most if not all of the main issues I highlighted in my comment relate to the first thing, which I think is more fundamental. So ultimately, I’m not trying to compare BP to policy debate, nor am I trying to compare it to the actual (culturally-driven) practice of American Parli in collegiate leagues (which I’m not as familiar with); it’s really just me comparing it to what I think an ideal format would be when given a decent culture that isn’t so accepting of gamification.
It’s unfortunate to hear that you had such a negative experience from debate. As someone who has judged public-school high school policy and public forum debate, I will say that I am not that surprised. That being said, I do take issue with your characterization of all/BP debate with the video plus the statement “I do think British Parliamentary Debate style is a bit less broken than this, but like, not that much.”
I cannot speak to every BP league/competition in the world, but I have never seen nor heard of such drastic gamification of debate in BP—or anything even come close to it—in the four years that I did BP in college. In fact, I have often seen people hold up BP and collegiate policy debate as polar opposites, with BP being one of the least toxic/gamified formats (at least among the major formats) and policy debate being the most. (BP definitely has some problems with left-leaning judge bias, but it could be a lot worse and that’s not really that unique to BP.) Ultimately, I don’t want to be rude/abrasive, but I feel that the video really gives a deeply misleading picture of BP, even if it is only a mild-moderate exaggeration of collegiate policy debate. I think it’s unfortunate to think that many people who are unfamiliar with debate (let alone BP specifically) may come away with such a misleading picture of BP/debate in general based on this extreme example of a different format. I’m not sure how to say this in a non-confrontational way, but I personally think that some kind of revision/redaction (e.g., a disclaimer saying that the video is not of BP, acknowledgement that BP is different) may be in order.
I will just add the following video to illustrate that the gamifying of debate (e.g., speed and spread tactics) is not so inherent to the sport or even to specific formats themselves, but rather is heavily determined by league culture (e.g., what kinds of judges are used, how debaters are taught to debate): https://www.youtube.com/watch?v=TvhNvumnZ1U&t=23s (Although, do excuse the pathos-heavy story at the beginning, and remember these are just high school students)
Sounds good! Like I said, I do recognize that choosing BP probably has quite a few advantages on the (meta?) level in the sense that it seemingly has a more-global audience and topic scope, perhaps a better competitive culture, etc. (Update/clarification: I would say that all of the “flaws” with BP’s format are minor in comparison with the advantages from the BP league culture, which crucially does not have the ridiculous speed and spread from policy debate, as exhibited in the video from Habryka.) If I ever get around to finishing that article about its downsides I might share a link to it here...
The TUILS Framework for Improving Pro-Con Analysis
First of all, thanks for posting this. I think it’s interesting to see some analysis on this topic, which I actually thought about just yesterday when I looked at an old post. I don’t know if you already tried looking at this and/or whether it is even possible to do this, but I think an interesting metric would be something like “number of people who upvoted (or downvoted?) divided by number of unique people who have viewed the article”. I doubt that would perfectly fix the “old posts’ votes are underrepresented” problem (if, for example, there is any kind of chronological snobbery or “old news = boring news” bias).
Is it possible to see how many unique users have viewed an article?
Sometimes, questions are too difficult to answer directly. However, if you’re unable to answer a question, then a sign that you’ve understood the question is your ability to break it down into concrete subquestions that can be answered, each of which is easier to answer than the original top-level question. If you can’t do this, then you’re just thinking in circles.
I am actually working on a post that provides an adaptable framework for decision-making which tries to do this. That being said, I naturally make no guarantees that it will be a panacea (and in fact if there are any meta-EA-specific models being used, I would assume that the framework I’m presenting will be less well tailored to meta-EA specifically).
As someone who did debate in high school and throughout college, I am really excited to see this + I think it makes a lot of sense. As you noted, debate often involves evaluating choices in more-neutral ways, seeing both sides of arguments, etc. I’d love to hear more about how this project/idea develops.
The only thing I would note is my moderate dislike for the British Parliamentary (BP) format. Of course, I recognize that it may not be feasible to choose a different format and/or that there may be other justifications for using it (e.g., having more people per round, the league’s culture is not as wacky/out-of-touch as some other leagues’, a greater breadth of perspectives in each round).
Still, in my experience/analysis, BP’s 4-teams-of-2 format (instead of the traditional “one team vs. one team” format), wherein teams that are ostensibly supposed to be working together to support their side of the motion are actually partially pitted against each other to get a higher rank in a round, leads to numerous problems that undermine the educational value of the round: knifing* (where one of the “back half” teams undercuts something that the “opening” team on their own side said), abandoning (where one of the back half teams lets the other side strawman or otherwise unfairly attack their opening team’s arguments), the fact that closing government (back-half team for the motion) can really suffer if opening government sets up the round poorly (e.g., when opening government uses really bad definitions), the fact that closing teams are often incentivized to focus on “new” arguments rather than focusing on the “good” arguments (since those will usually already have been taken by the opening teams), etc.
(Honestly, this is just a few of the highlights: for a few months off-and-on I’ve been outlining a blog article on why I dislike certain aspects of BP. Who knows, maybe I’ll finish it sometime this month?)
*Although hard knifing is rarely an effective strategy (usually, judges aren’t blind to what’s going on and they’ll punish the knifer if it was bad/uncalled for), it’s maddening how effective soft abandonments are (e.g., only giving half responses then saying something like “we want to focus on new matter on back half”).
Actually, I think it’s worth being a bit more careful about treating low-likelihood outcomes as irrelevant simply because you aren’t able to attempt to get that outcome more often: your intuition might be right, but you would likely be wrong in then concluding “expected utility/value theory is bunk.” Rather than throw out EV, you should figure out whether your intuition is recognizing something that your EV model is ignoring, and if so, figure out what that is. I listed a few example points above; to give another illustration:
Suppose you have a case where you have the chance to push button X or button Y once: if you push button X, there is a 1/10,000 chance that you will save 10,000,000 people from certain death (but a 9,999/10,000 chance that they will all still die); if you push button Y there is a 100% chance that 1 person will be saved (but 9,999,999 people will die). There are definitely some selfish reasons to choose button Y (e.g., you won’t feel guilty like you would if you pressed button X and everyone still died), and there may also be some aspect of non-linearity in the impact of how many people are dying (refer back to (1) in my original answer). However, if we assume away those other details (e.g., you won’t feel guilty, the relationship between deaths and utility loss is roughly linear) and just assume the situation is “press button X for a 1/10,000 chance of 10,000,000 utils; press button Y for a 100% chance of 1 util”, then the answer is perhaps counterintuitive but still reasonable: without having a crystal ball that perfectly tells the future, the optimal strategy is to press button X, since its expected value is 1,000 utils versus 1 util for button Y.
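To make the arithmetic concrete, here is a minimal sketch of the comparison, assuming (as stipulated above) that utility is linear in lives saved. The function and variable names are my own and purely illustrative.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Sum probability * payoff over a list of (probability, payoff) pairs."""
    return sum(p * v for p, v in outcomes)


# Button X: 1/10,000 chance of 10,000,000 utils, otherwise nothing.
button_x = [(Fraction(1, 10_000), 10_000_000), (Fraction(9_999, 10_000), 0)]
# Button Y: a certain 1 util.
button_y = [(Fraction(1), 1)]

ev_x = expected_value(button_x)
ev_y = expected_value(button_y)
print(f"EV(X) = {ev_x} utils, EV(Y) = {ev_y} utils")
```

Exact fractions are used only to avoid floating-point noise. The comparison says nothing about guilt or non-linear utility, which are the details assumed away above.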
It seems like I was not able to access it (without paying) if you are referring to https://link.springer.com/article/10.1007/s00355-021-01321-2
Disappointed to see that someone decided this was a joke and changed the article title :/
I won’t try to answer your three numbered points since they are more than a bit outside my wheelhouse + other people have already started to address them, but I will mention a few things about your preface to that (e.g., Pascal’s mugging).
I was a bit surprised to not see a mention of the St. Petersburg Paradox, since that posed the most longstanding challenge to my understanding of expected value. The major takeaways I’ve had for dealing with both the St. Petersburg Paradox and Pascal’s mugging (more specifically, “why is it that this supposedly accurate decision theory rule seems to lead me to make a clearly bad decision?”) are somewhat interrelated and are as follows:
1. Non-linear valuation/utility: money should not be assumed to linearly translate to utility, meaning that as your numerical winnings reach massive numbers you typically will see massive drops in marginal utility. This by itself should mostly address the issue with the lottery choice you mentioned: the “expected payoff/winnings” (in currency terms) is almost meaningless because it totally fails to reflect the expected value, which is probably minuscule/negative since getting $100 trillion likely does not make you that much happier than getting $1 trillion (for numerical illustration, let’s suppose 1000 utils vs. 995u), which itself likely is only slightly better than winning $100 billion (say, 950u) … and so on, whereas it costs you 40 years if you don’t win (let’s suppose that’s like −100u).
2. Bounded bankrolling: with things like the St. Petersburg Paradox, my understanding is that the longer you play, the higher your average payoff tends to be. However, that might still be -$99 by the time you go bankrupt and literally starve to death, after which point you can no longer play.
3. Bounded payoff: in reality, you would expect payoffs to be limited to some reasonable, finite amount. If we suppose that they are for whatever reason not limited, then that essentially “breaks the barrier” for other outcomes, which are the next point:
4. Countervailing cases: This is really crucial for bringing things together, yet I feel like it is consistently underappreciated. Take for example classic Pascal’s mugging-type situations, like “A strange-looking man in a suit walks up to you and says that he will warp up to his spaceship and detonate a super-mega nuke that will eradicate all life on earth if and only if you do not give him $50 (which you have in your wallet), but he will give you $3^^^3 tomorrow if and only if you give him $50.” We could technically/formally suppose the chance he is being honest is nonzero (e.g., 0.0000000001%), but still abide by rational expectation theory if you suppose that there are indistinguishably likely cases that cause the opposite expected value: for example, the possibility that he is telling you the exact opposite of what he will do if you give him the money (for comparison, see the philosopher God response to Pascal’s wager), or the possibility that the “true” mega-punisher/rewarder is actually just a block down the street and if you give your money to this random lunatic you won’t have the $50 to give to the true one (for comparison, see the “other religions” response to the narrow/Christianity-specific Pascal’s wager). Ultimately, this is the concept of fighting (imaginary) fire with (imaginary) fire, which occasionally shows up in realms like competitive policy debate (where people make absurd arguments about how some random policy may lead to extinction), and is a major reason why I have a probability-trimming heuristic for these kinds of situations/hypotheticals.
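To illustrate points 1 and 3 numerically, here is a rough sketch under assumptions of my own: the standard version of the game in which the first heads on flip k pays $2^k (probability 1/2^k), a bank that cannot pay more than $2^max_doublings, and log base 2 as a stand-in for a concave utility function. None of these specifics come from the original question; the function names are hypothetical.

```python
from fractions import Fraction


def capped_expected_payoff(max_doublings):
    """Expected dollar payoff when the bank can pay at most 2**max_doublings."""
    cap = 2 ** max_doublings
    total = Fraction(0)
    # Games ending on flip k <= max_doublings pay 2**k with probability 2**-k,
    # so each of these terms contributes exactly 1 dollar of expected payoff.
    for k in range(1, max_doublings + 1):
        total += Fraction(1, 2 ** k) * (2 ** k)
    # All longer games (total probability 2**-max_doublings) pay only the cap.
    total += Fraction(1, 2 ** max_doublings) * cap
    return total


def expected_log2_utility(terms=200):
    """Approximate expected utility of the uncapped game when u(x) = log2(x)."""
    # u(2**k) = k, so the series is sum(k / 2**k), which converges to 2.
    return sum(k / 2 ** k for k in range(1, terms + 1))


print(capped_expected_payoff(30))   # bank limited to roughly $1 billion
print(expected_log2_utility())
```

Under these assumptions, a bank capped at about a billion dollars makes the game worth only about $31 in expectation, and with logarithmic utility even the uncapped game has a small, finite expected utility. A different utility function or payoff schedule would give different numbers, so treat this as an illustration rather than a resolution.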
I think there may be a bit of a disconnect between what I meant and how it was received, perhaps magnified by the fact that I was only giving my skim-derived impressions. First, I fully agree with jackmalde’s point that GDP isn’t a perfect measure, but partially reflecting a comment from your second paragraph, I presume that a lot of economists recognize that measures like GDP are not perfect (in fact, at least 2 if not all 3 of the econ professors I’ve had have explicitly said something along those lines).
Second, based on the first paragraph of the Cambridge article (“Nature is a “blind spot” in economics”) it seemed like the implication was that 1) economists have massively ignored this, and 2) adding consideration of “nature” would be model-shattering. When the claim is simply “nature is a factor” (among multiple others), I think that’s probably reasonable.
Third, I should clarify what I mean about my skepticism: I am not the slightest bit skeptical that economic models could be improved in general. However, by default I am skeptical towards any specific claim of widespread blindness among economists, because I think that most of these claims will be wrong—i.e., I have a low outside view/base rate for each specific claim, especially with regards to the questions I mentioned in my original answer/comment.
Building on that, I don’t want to over-articulate my thought process since it was largely just my initial, informal thoughts, but: There may be good evidence to back up Partha’s claim; it just seems like something that falls within a category of “Things that, if true, would be much more widely recognized [by economists] / would not have to be presented as some major ‘blind spot.’” I don’t claim that this heuristic is good for someone whose work/research relates to this (i.e., those kinds of people should do more research than initial impressions), but as someone who is not in economics I think it’s more effective to have that kind of skepticism as opposed to treating every economic idea of the day/hour as equally legitimate.
Lastly, I’ll admit that I may have been judging it a bit too hastily as a result of its similarity to some of the discourse I’ve seen from nature-as-an-inherent-value environmentalists. If he is trying to put forward a way to measure the (extrinsic) impact of ecosystems on human wellbeing in a way not measured by other standards of wellbeing (e.g., pollution’s effects on health indicators, timber’s and fish stocks’ ability to provide consumption value, insect pollination’s effects on agricultural productivity), that might be interesting; it’s just that a lot of the initial examples presented felt like they could have been examples of double counting (see previous parenthetical). This is an important point that helps tie together some of the previous issues.
Sometimes I can understand arch-primitivists’ argument for returning to the period of open doors supported by nothing but classic material physics and gravity, but every time I look at a modern door hinge I am reminded that it is one of the few things that decently rhymes with orange.
Very light, initial impression:
The EA community is at least somewhat intellectually diverse, and on this particular topic I think there are probably some people in the EA community who may be quite sympathetic to the idea. I’ll add the important caveat, though, that I merely skimmed the abstracts/introductions for those links, so I don’t know exactly what all he argues for. If he is simply saying “nature is an important factor in health, economic inputs/resources, leisure, etc.” then that does not sound so model-shattering. Still, I am a bit skeptical of any kind of “Here’s this one thing [especially something associated with lots of sentiment/political buzz, like “nature”] that economists have inexplicably left out of their models, and it changes everything”—e.g., skeptical of its significance in general, skeptical that economists have truly left it out of their models if it is significant, skeptical that there isn’t a valid reason to leave it out of their models if they have been doing that and it is significant, and so on.
Dang, now I am really interested in listening to that podcast.
If you haven’t wandered around the Nicky Case website, I’d recommend doing so. There are a lot of interesting educational games on there, covering a wide variety of concepts such as social contagion, prisoner’s dilemmas, segregation, etc.