There’s also this post plus comments.
Cool! Here are a few that might be worth including. Searching the Forum for “prize” or “incentivize” might turn up more interesting results. Also, if you look through Paul Christiano’s LW submissions, there are a few more like this.
I asked my father, who has spent the past 40 years at Xerox PARC and worked with Bob Taylor, what he thought of this post. He wrote:
That all seems reasonable to me. My guess is that the most important factors are great people and a great leader. One of my co-workers, who was involved with starting a research center in France, said “A people hire A people. B people hire C people”. So, the first few people that you hire are really important.

I think that the main job of the leader is to keep people happy and focused. Most of my managers have been really good leaders.

I also think that being co-located is very important. When I am out of touch with my co-workers, I tend to lose motivation.
BTW, one of the reasons that the best leaders usually have a technical background is that it is hard to identify the very best people without it. That is why non-technical companies have trouble hiring good programmers, and conversely why the best tech companies were founded by people with a technical background.
Another thing I remember him once mentioning to me is that PARC bought its researchers very expensive, cutting-edge equipment to do research with, on the assumption that Moore’s Law would eventually drive down the price of such equipment to the point where it was affordable to the mainstream.
He’s willing to answer questions.
This is exactly what p-values are designed for, so you are probably better off looking at p-values rather than effect size if that’s the scenario you’re trying to avoid.
Yes, this is a better idea.
From what I understand, effect size is one of the better ways to predict whether a study will replicate. For example, this paper found that 77% of replication effect sizes reported were within a 95% prediction interval based on the original effect size.
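To make the prediction-interval idea concrete, here's a minimal sketch of one common way to compute it, via Fisher's z-transformation for correlation-type effect sizes. The function name and the sample sizes are mine, and the cited paper's exact procedure may differ:

```python
import math

def replication_prediction_interval(r_orig, n_orig, n_rep, z_crit=1.96):
    """95% prediction interval for a replication's effect size (correlation r).
    Uncertainty from both the original and the replication sample is combined
    on the Fisher z scale, then transformed back to the r scale."""
    z = math.atanh(r_orig)                             # Fisher z-transform
    se = math.sqrt(1 / (n_orig - 3) + 1 / (n_rep - 3))  # combined standard error
    lo, hi = z - z_crit * se, z + z_crit * se
    return math.tanh(lo), math.tanh(hi)                # back-transform to r

# Hypothetical numbers: original study found r = 0.40 with n = 50,
# replication will use n = 120.
lo, hi = replication_prediction_interval(0.40, 50, 120)
```

If the replication's observed effect falls inside (lo, hi), it counts as consistent with the original under this criterion.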
As a spot check, you say that brain training has massive purported effects. I looked at the research page of Lumosity, a company which sells brain training software. I expect their estimates of the effectiveness of brain training to be among the most optimistic, but their highlighted effect size is only d = 0.255.
A caveat is that if an effect size seems implausibly large, it might have arisen due to methodological error. (The one brain training study I found with a large effect size has been subject to methodological criticism.) Here is a blog post by Daniel Lakens where he discusses a study which found that judges hand out much harsher sentences before lunch:
If hunger had an effect on our mental resources of this magnitude, our society would fall into minor chaos every day at 11:45. Or at the very least, our society would have organized itself around this incredibly strong effect of mental depletion… we would stop teaching in the time before lunch, doctors would not schedule surgery, and driving before lunch would be illegal.
However, I think psychedelic drugs arguably do pass this test. During the 60s, before they became illegal, a lot of people really were talking about how society would reorganize itself around them. And forget about performing surgery or driving while you are tripping.
The way I see it, if you want to argue that an effect isn’t real, there are two ways to do it. You can argue that the supposed effect arose through random chance/p-hacking/etc., or you can argue that it arose through methodological error.
The random chance argument is harder to make if the studies have large effect sizes. If the true effect is 0, it’s unlikely we’ll observe a large effect by chance. If researchers are trying to publish papers based on noise, you’d expect p-values to cluster just below the p < 0.05 threshold (see p-curve analysis)… they’re essentially going to publish the smallest effect size they can get away with.
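A toy simulation makes the point vivid: under a true null, a "researcher" who peeks at the data after every batch and publishes as soon as p < .05 produces a published literature whose p-values all sit below the threshold, at a false-positive rate well above the nominal 5%. This is an illustrative sketch, not modeled on any particular study:

```python
import math, random

def p_value(sample):
    # Two-sided one-sample z-test of mean 0, with known sd = 1.
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def optional_stopping(rng, max_batches=10, batch=10):
    """One simulated researcher: the true effect is zero, but they test
    after every batch and stop (and publish) as soon as p < .05."""
    data = []
    for _ in range(max_batches):
        data += [rng.gauss(0, 1) for _ in range(batch)]
        p = p_value(data)
        if p < 0.05:
            return p  # published
    return None       # file drawer

published = [p for p in (optional_stopping(random.Random(i)) for i in range(2000))
             if p is not None]
false_positive_rate = len(published) / 2000  # inflated well above 0.05
```

Every published p-value here is just barely significant, which is exactly the cluster a p-curve analysis looks for.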
The methodological error argument could be valid for a large effect size, but if this is the case, confirmatory research is not necessarily going to help, because confirmatory research could have the same issue. So at that point your time is best spent trying to pinpoint the actual methodological flaw.
This is the only comment this user has ever written, and their profile looks very spammy. I wonder if spammers have discovered that posting flamebait is a good way to get people to visit their website...
I don’t have data either way, but “knacks” for psychotherapy feel more plausible to me than “knacks” for producing the effects in Many Labs 2 (just skimming over the list of effects here). Like, the strongest version of this claim is that no one is more skilled than anyone else at anything, which seems obviously false.
Suppose we conduct a study of the Feynman problem-solving algorithm: “1. Write down the problem. 2. Think real hard. 3. Write down the solution.” An n=1 study of Richard Feynman finds the algorithm works great, but it fails to replicate on a larger sample. What is your conclusion: that the n=1 result was spurious, or that Feynman has useful things to teach us but the 3-step algorithm didn’t capture them?
I haven’t read enough studies on psychedelics to know how much room the typical procedure leaves for a skilled therapist to make a difference, though.
1.3) (Owed to Scott Alexander’s recent post). The psychedelic literature mainly comprises small studies generally conducted by ‘true believers’ in psychedelics and often (but not always) on self-selected and motivated participants. This seems well within the territory of scientific work vulnerable to replication crises.
I think small studies are also more vulnerable to publication bias.
On the flip side, it may be that the “true believers” actually are on to something, but they have a hard time formalizing their procedure into something that can be replicated on a massive scale. If larger studies fail to replicate the results from the small studies, this may be why.
Maybe someone could read and summarize the core points of this? I read the first chapter and didn’t get a lot out of it, and wasn’t able to parse passages such as
Technologies of the self anchor these reflective practises; data in this sense forms a bridge between the actual and the virtual, as the creation of the self spills over into the negotiated co-creation of worlds. Empathy and emotion are not in conflict, but complex mediation and configuration.
The meaning of those practises, the positions they occupy and the selves they created were problematic rather than the practises in and of themselves. It is in this sense that ‘ethics’ as a dimension of social life is is here distinguished from pre-theorised systematics; it shapes the connotations for how selves are formed, others are engaged and worlds are envisioned. Yet the perceived Heresy is a deeply personal thing, made through enrolment into this ‘moral assemblage’. As the ethical subject develops, its possibilities for further self-reflection and development are also changed. Whilst the others ‘clicked’ into this assemblage, for Sarah the strands of relational fabric become tangled… The ‘Heresy’ of the EAs for many isn’t in any single thing they do; no one practise is the cause of offense, but it the complex relational possibilities of specific lived encounters in which the self is so profoundly involved. At the moment of ‘ethical breakdown’, reflection fails and revelation is rejected.
I have a hunch that a big part of the issue here is institutional momentum around maximizing key performance indicators such as daily active users, time spent on platform, etc. Perhaps it will be important to persuade decisionmakers that although optimizing for these metrics helps the bottom line in the short run, in the long run optimizing them to the exclusion of all else hurts the brand, increases the probability of regulatory action or negative “black swan” type events, and risks having users abandon the product. (I understand that the longer a culture is exposed to alcohol, the more it develops “cultural antibodies” that allow it to mitigate alcohol’s harms… decisionmakers should worry that if users don’t endorse the time they spend with the product, this hurts the long-term viability of the platform; imagine the formation of a group like Alcoholics Anonymous but for social media, for instance.)

I think it’d be good if decisionmakers also started optimizing for key performance indicators like whether users think the product benefits their life personally, whether the product makes society healthier/better off, etc. Or even more specific stuff, like whether users who engage in disagreements tend to come to a consensus vs. walking away even angrier than when they started.
With regard to risks, here are some thoughts of mine related to scenarios in which users self-select in their use of these tools. I think maybe what I describe in this comment has already happened though.
Donald Knuth is a Stanford professor and world-renowned computer scientist. For years he offered cash prizes to anyone who could find an error in any of his books. The amount of money was only a few dollars, but there’s a lot of status associated with receiving a Knuth check. People would frame them instead of cashing them.
Why don’t more people do this? It’s like a bug bounty program, but for your beliefs. Offer some cash and public recognition to anyone who can correct a factual error you’ve made or convince you that you’re wrong about something. Donald Freakin’ Knuth has cut over two thousand reward checks, and we mortals probably make mistakes at a higher rate than he does.
Everyone could do this: organizations, textbooks, newspapers, individuals. If you care about having correct beliefs, create an incentive for others to help you out.
$2 via Paypal to the first person who convinces me this practice is harmful.
This is an interesting post by Ramez Naam. He argues that too much attention is given to transportation & energy emissions and not enough to agriculture & industry emissions. Naam thinks that renewable tech will continue to drop in cost, and he’s optimistic that part of the equation will solve itself. He says the highest-leverage action is the development of new tech to address agriculture & industry emissions.
Maybe we could have a classified ads thread every once in a while? (More thoughts here.)
It feels inefficient to second-guess a decision which has already been finalized. I think you could argue that something like a grant decisions thread should get posted before money gets disbursed, in case commenters surface important considerations overlooked by the grantmakers. There might also be value in auditing a while after money gets disbursed, to understand what the money actually did. Auditing right after money gets disbursed seems like the worst of both worlds.
So, for a respective cause area, an EA Fund functions as like an index fund that incentivizes the launch of nascent projects, organizations, and research in the EA community.
You mean it functions like a venture capital fund or angel investor?
Good to know!
This in particular strikes me as understandable but very unfortunate. I’d strongly prefer a fund where happening to live near or otherwise know a grantmaker is not a key part of getting a grant. Are there any plans or any way progress can be made on this issue?
I agree this creates unfortunate incentives for EAs to burn resources living in high cost-of-living areas (perhaps even while doing independent research which could in theory be done from anywhere!) However, if I were a grantmaker, I could see why this arrangement would be preferable: Evaluating grants feels like work and costs emotional energy. Talking to people at parties feels like play and creates emotional energy. For many grantmakers, I imagine getting to know people in a casual environment is effectively costless, and re-using that knowledge in the service of grantmaking allows more grants to be made.
I suspect there’s low-hanging fruit in having the grantmaking team be geographically distributed. To my knowledge, at least 3 of these 4 grantmakers live in the Bay Area, which means they probably have a lot of overlap in their social network. If the goal is to select the minimum number of supernetworkers to cover as much of the EA social network as possible, I think you’d want each person to be located in a different geographic EA hub. (Perhaps you’d want supernetworkers covering disparate online communities devoted to EA as well.)
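Incidentally, picking a small set of people whose combined networks reach as much of the community as possible is the classic maximum-coverage problem, which a simple greedy heuristic handles reasonably well. A toy sketch (all hub names and reach sets are hypothetical):

```python
def pick_supernetworkers(reach, k):
    """Greedy maximum coverage: repeatedly pick the candidate whose social
    reach adds the most not-yet-covered people. `reach` maps each candidate
    to the set of community members they know."""
    reach = dict(reach)  # work on a copy; we remove candidates as we pick them
    chosen, covered = [], set()
    for _ in range(k):
        best = max(reach, key=lambda p: len(reach[p] - covered), default=None)
        if best is None or not (reach[best] - covered):
            break  # no candidate adds anyone new
        chosen.append(best)
        covered |= reach.pop(best)
    return chosen, covered

# Hypothetical data: overlapping hubs reach overlapping people.
reach = {
    "bay_area": {"a", "b", "c", "d"},
    "london":   {"c", "e", "f"},
    "berlin":   {"f", "g"},
}
chosen, covered = pick_supernetworkers(reach, 2)
```

The greedy choice skips candidates whose contacts are already covered, which is the formal version of "put each supernetworker in a different geographic hub."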
This also provides an interesting reframing of all the recent EA Hotel discussion: Instead of “Fund the EA Hotel”, maybe the key intervention is “Locate grantmakers in low cost-of-living locations. Where grant money goes, EAs will follow, and everyone can save on living expenses.” (BTW, the EA Hotel is actually a pretty good place to be if you’re an aspiring EA supernetworker. I met many more EAs during the 6 months I spent there than my previous 6 months in the Bay Area. There are always people passing through for brief stays.)
Congratulations on the launch!
Can anyone think of good places to link EA Hub from now that it’s been revamped? I’m worried that people will forget about it in a few weeks once this post falls off the EA Forum homepage.
One strategy: Brainstorm use cases, then figure out where people are currently going for those use cases, then put links to the EA Hub in those places with an explanation of how EA Hub solves the use case. For example (rot13′d so you can think of your own before being primed by mine), one possible use case is crbcyr zrrgvat sryybj RNf juvyr geniryvat. Fb jr pbhyq qebc n yvax gb gur RN Uho va gur RN Pbhpufhesvat Snprobbx tebhc qrfpevcgvba naq fhttrfg gung crbcyr svaq ybpny tebhcf be fraq crefbany zrffntrf gb ybpny RNf vaivgvat gurz sbe pbssrr juvyr geniryvat. (Nffhzvat gung’f pbafvqrerq na npprcgnoyr hfr bs gur crefbany zrffntr srngher—V qba’g frr jul vg jbhyqa’g or gubhtu.)
We could just start calling it the Athena Hotel. That also disambiguates if additional hotels are opened in the future.
Do you have any thoughts on Tetlock’s work which recommends the use of probabilistic reasoning and breaking questions down to make accurate forecasts?