weeatquince is sharing a widely held view, namely that eradication is superior to containment in health and economic outcomes; see e.g. this analysis. The idea itself is plausible, since a successful lockdown allows complete reopening of the internal economy afterwards.
The sample size is small, however, especially when it comes to non-island countries. I only know of two non-island countries that seriously went for eradication coupled with border closures, namely Vietnam and Israel. Israel gave up at one point when cases started to rise (which is why it is listed among the containment countries in the analysis above), but Vietnam succeeded (although it had to restrict domestic travel heavily as well). Personally, I believe eradication is a suboptimal strategy for non-authoritarian, non-island countries.
I think their original point stands though, that EA/rationalists did not seem to entertain the idea of eradication enough, but probably neither did biorisk organizations last year.
Thank you for this great post. In the past I have looked for such platforms and concepts, but was unaware of the term ‘inducement prize’ and did not find much.
Two extensions to the concept you presented could make it even more interesting, especially for the EA community. Firstly, rather than just requests being supplied to such a platform, offers to conduct e.g. research could be posted first by qualified researchers in order to gauge interest. Secondly, there is no reason why there couldn’t be several parties/individuals who pay the bounty collectively. Essentially, this would be a “reverse kickstarter” use case, where payment is made after completion rather than in the beginning.
It seems that there are a lot of potential projects in the community with distributed interest and willingness to pay: literature reviews, evaluations of possible cause areas, research into personal Covid-19 risks etc.
For posterity, I was wrong here because I was unaware of the dispersion parameter k that is substantially higher for SARS than for Covid-19.
Truly excellent post!
My intuition is that research on NPIs with behavioural change as the endpoint might be more tractable, and therefore more impactful, than research where the endpoint is infection. If the endpoint is infection, any study that enrolls the general population will need a very large sample size, as the examples you listed illustrate. I am sure these problems can be overcome, but I assume that one reason we have not seen more of these studies is that they are infeasible without larger coordination.
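To make the sample size point concrete, here is a rough sketch using the standard normal-approximation formula for comparing two proportions. All input numbers are illustrative assumptions of mine (a 2% baseline infection risk reduced by a fifth, versus a behaviour adopted by 50% of controls and 60% of the intervention group); they are not taken from any study mentioned here.

```python
import math
from statistics import NormalDist


def n_per_arm(p_control: float, p_treated: float,
              alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate participants per arm for a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_control - p_treated
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical infection endpoint: 2% of controls infected, 1.6% with the NPI.
n_infection = n_per_arm(0.02, 0.016)

# Hypothetical behavioural endpoint: 50% of controls vs. 60% adopt the behaviour.
n_behaviour = n_per_arm(0.50, 0.60)

print(f"infection endpoint:   {n_infection} per arm")
print(f"behavioural endpoint: {n_behaviour} per arm")
```

Under these assumed numbers the infection endpoint needs well over ten thousand participants per arm, while the behavioural endpoint needs a few hundred. The exact figures depend entirely on the assumed baseline risks and effect sizes.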
While it is unfortunate and truly surprising that we have very little research on e.g. the impact of mask wearing and distancing, we do know that certain realistic behavioural changes would be completely sufficient to squash the pandemic in many regions.
The change does not have to be large: As the reproductive number R is magically hovering around ~1.1 to ~1.3 in most regions in the Western world, it would be sufficient for people to act just a little bit more carefully to get R below 1. That could mean reducing private meetings by e.g. one third (or moving them outside), widespread adoption of contact tracing apps, placing air filters in schools, or targeting public health messaging towards people who currently are not reached or persuaded. I have seen some research about vaccine hesitancy, but far less about these other areas. At the very least, a randomized study comparing different kinds of public health messaging seems really easy to do and fairly useful. This might look different for the next pandemic though.
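A back-of-the-envelope sketch of that arithmetic, assuming (as a simplification) that R scales proportionally with the amount of transmission-relevant contact. The share of transmission attributed to private meetings below is a hypothetical number for illustration, not an estimate.

```python
def required_reduction(r: float) -> float:
    """Fraction by which transmission must fall for R to drop to exactly 1."""
    return 1.0 - 1.0 / r


def r_after_cut(r: float, share: float, cut: float) -> float:
    """New R if a setting causing `share` of transmission is reduced by `cut`."""
    return r * (1.0 - share * cut)


for r in (1.1, 1.2, 1.3):
    print(f"R = {r:.1f}: transmission must fall by more than "
          f"{required_reduction(r):.1%}")

# Hypothetical: private meetings cause 60% of transmission and are cut by a third.
print(f"R after the cut: {r_after_cut(1.2, share=0.6, cut=1 / 3):.2f}")
```

Under this simple model, even R = 1.3 needs only about a 23% drop in transmission to reach 1, which is why fairly modest behavioural changes could be enough.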
More broadly: As you alluded to, fostering and increasing coordination between researchers looking to conduct a study might also be really useful. This probably applies even more to research on drug interventions, where way too much of it is underpowered and badly conducted, and thus pretty much useless before results have even been published. This paper argues that the solutions are already known (e.g. multicenter trials), but not implemented widely due to institutional inertia. Again, it is worth looking into how to facilitate such coordination; I believe that large cash grants by EA-aligned institutions, conditional on coordination between different trial sites, could work.
There’s an additional factor: marketing and public persuasion. It is one thing to say: Based on a theoretical model, air filters work, and a totally different thing to say: We saw that air filters cut transmission by X%. My hope would be that the certainty and the effect estimate could serve to overcome the collective inaction we saw in the pandemic (in that many people agree that e.g. air filters would probably help, but barely anybody installed them in schools).
[Epistemic status: This is mostly hobbyist research that I did to evaluate which tests to buy for myself]
The numbers listed by the manufacturers are not very useful, sadly. These are generally provided without a standard protocol or independent evaluation, and can be assumed to be a best-case scenario in a sample of symptomatic individuals. On the other hand, as you note, the sensitivity of antigen tests increases when infectiousness is high.
I am absolutely out of my depth trying to balance these two factors, but luckily an empirical study from the UK estimates, based on contact tracing data, that “The most and least sensitive LFDs [a type of rapid antigen test used in the UK] would detect 90.5% (95%CI 90.1-90.8%) and 83.7% (83.2-84.1%) of cases with PCR-positive contacts respectively.” So, if a person tests negative, you can assume the remaining risk of infection from them to be roughly 10-20% of that from an average Covid-19 contact.
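The 10-20% figure is simply the share of transmitting cases the test would miss. Here is a minimal sketch of that calculation, using the two point estimates quoted above and assuming that false positives can be ignored. The `baseline_risk` value is a made-up placeholder for whatever risk you would assign to the contact without a test.

```python
# Point estimates from the UK study quoted above.
SENSITIVITY_MOST = 0.905   # most sensitive LFD
SENSITIVITY_LEAST = 0.837  # least sensitive LFD


def residual_fraction(sensitivity: float) -> float:
    """Share of transmitting cases that a negative result fails to rule out."""
    return 1.0 - sensitivity


def risk_after_negative(baseline_risk: float, sensitivity: float) -> float:
    """Scale an untested contact's risk by the share of cases the test misses."""
    return baseline_risk * residual_fraction(sensitivity)


best = residual_fraction(SENSITIVITY_MOST)
worst = residual_fraction(SENSITIVITY_LEAST)
print(f"residual risk: {best:.1%} to {worst:.1%} of an untested contact")

# Hypothetical baseline: a 1% chance of catching Covid-19 from this meeting.
print(f"risk after a negative test: {risk_after_negative(0.01, SENSITIVITY_LEAST):.3%}")
```

The point estimates give roughly 9.5% to 16.3%, which I rounded to 10-20% to leave some margin.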
With regard to self vs. professional testing, there does not seem to be a very clear picture yet, but this German study suggests basically equivalent sensitivity.
You should also make sure to buy tests that were independently evaluated; you can find lists of such tests here or here. The listed numbers are hard to compare between different studies and tests, but the one you mentioned seems to have good results compared to other tests.
I am honestly not sure how long the test results are valid, but 2 hours seems safe. I cannot comment on the other numbers provided by microCovid.
No, my impression is that willingness to pay is a sufficient but not necessary condition to conclude that an industry standard benefits customers. A different sufficient condition would be an assessment of the effects of the standard by the regulators in terms of welfare. I assume that is the reason why the regulators in this case carried out an analysis of the welfare benefits, because why even do so if willingness-to-pay is the only factor?
More speculatively, I would guess that Dutch regulators also take into account welfare improvements to other humans, and would not strike down an industry standard for safe food (if the standard actually contributed to safety).
Thank you for this post. My stance is that when engaging with hot-button topics like these, we need to pay particular attention to the truthfulness and the full picture of the topic. I am afraid that your video simplifies the reasons for the dismissal of the two researchers quite a bit to “they were fired for being critical of the AI”, and would benefit from giving a fuller account. I do not want to endorse any particular side here, but it seems important to mention that:
* Google wanted the paper to mention that some techniques exist to mitigate the problems mentioned by Dr. Gebru. “Similarly, it [the paper] raised concerns about bias in language models, but didn’t take into account recent research to mitigate these issues”
* Dr. Gebru sent an email to colleagues telling them to stop working on one of their assigned tasks (diversity initiatives) because she did not believe those initiatives were sincere. “Stop writing your documents because it doesn’t make a difference”
* Google alleges that Dr. Mitchell shared company correspondence with outsiders.
Whether or not you think any of this justifies the dismissal, these points should be mentioned in a truthful discussion.
I think you might have an incorrect impression of the ruling. The agreement was not just struck down because consumers seemed to not be willing to pay for it, but also because the ACM (on top (!) of the missing willingness to pay) decided that the agreement did not benefit consumers by the nature of the improvements (clearly, most of the benefit goes to the chickens).
From the link: “In order to qualify for an exemption from the prohibition on cartels under the Dutch competition regime it is necessary that the benefits passed on to the consumers exceed the harm inflicted upon them under agreements.”
There is also a quite active EA Discord server, which serves the function of “endless group discussions” fairly well, so another Slack workspace might have negligible benefits.
[Epistemic status: Uncertain, and also not American, so this is a 3rd party perspective]
As for the likelihood of some form of collapse, to me the current trajectory of polarization in the US seems unsustainable. Nowadays, members of both parties are split about whether they consider members of the other party “a threat to their way of life”(!) and feelings towards the other party are rapidly declining.
I do not think that this is just a fluke, as many political scientists argue that this is driven by ideological sorting and the creation of a “mega-identity”, where race, education and political leanings now all align with each other. Political debate seems overwhelmingly likely to get more acrimonious when disagreement is not just about facts, but about your whole identity, and when you consider the other side to be your enemy.
It is only a slight overstatement to say that members of both parties live in two very different realities. There is almost no overlap in the trusted news organizations, and the unprecedentedly constant approval rating of Donald Trump indicates that neither side changed their mind much in response to new information coming in.
On the up side, “67% comprise ‘the Exhausted Majority’, whose members share a sense of fatigue with our polarized national conversation, a willingness to be flexible in their political viewpoints, and a lack of voice in the national conversation.” My worry is that this majority is increasingly drowned out by the radical voices in traditional and social media.
It is also pertinent that political collapse can happen very fast and without much warning, as the Arab Spring and the collapse of the Soviet Union showed, both of which came unexpectedly to observers. Decline can also take the form of persistent riots/unrest where no one party has the political capital/strength to reach an agreement with the rioters or to stop them. Consequently, if decline of the US seems likely and bad, I would worry about it possibly happening quickly (<10 years).
Hi Mark, thanks for writing this post. I only had a cursory reading of your linked paper and the 80k episode transcript, but my impression is that Tristan’s main worry (as I understand it) and your analysis are not incompatible:
Tristan and parts of broader society fear that through the recommendation algorithm, users discover radicalizing content. According to your paper, the algorithm does not favour and might even actively be biased against e.g. conspiracy content.
Again, I am not terribly familiar with the whole discussion, but so far I have not yet seen the point made clearly (enough), that both these claims can be true: The algorithm could show less “radicalizing” content than an unbiased algorithm would, but even these fewer recommendations could be enough to radicalize the viewers compared to a baseline where the algorithm would recommend no such content. Thus, YouTube could be accused of not “doing enough”.
Your own paper cites this paper arguing that there is a clear pattern of viewership migration from moderate “Intellectual Dark Web” channels to alt-right content based on an analysis of user comments. Despite the limitation of using only user comments that your paper mentions, I think that commenting users are still a valid subset of all users and their movement towards more radical content needs to be explained, and that the recommendation algorithm is certainly a plausible explanation. Since you have doubts about this hypothesis, may I ask if you think there are likelier ways these users have radicalized?
A way to test the role of the recommendation algorithm could be to redo the analysis of the user movement data for comments left after the change of the recommendation algorithm. If the movement is basically the same despite fewer recommendations for radical content, that is evidence that the recommendations never played a role, as you argue in this post. If however the movement towards alt-right or radical content is lessened, it is reasonable to conclude that recommendations have played a role in the past, and by extension could still play a (smaller) role now.
In general I agree, but the forum guidelines do state “Polish: We’d rather see an idea presented imperfectly than not see it at all.”, and this is a post explicitly billed as a “response” that was invited by Rob. So if this is all the time Mark wants to spend on it, I feel it is perfectly fine to have a post that is only for people who have listened to the podcast/are aware of the debate.
* Good can mean quality and morality: Again, I liked that. We do mean it in both ways (the advice is attempting to be both as high in quality as possible and as high as possible in moral impact, but we are working under uncertainty in both parameters).
For what it’s worth, I liked the name specifically because to me it seemed to advertise an intention of increasing a lot of readers’ impact individually by a moderate amount, unlike 80000’s approach where the goal is to increase fewer readers’ impact by a large amount.
I.e. unlike Michael I like the understatement in the name, but I agree with him that it does convey understatement.
Will, you are right that boycotting is not the right term for the phenomenon at hand. In addition to the reason you gave, a cancellation campaign mostly involves pressuring other organizations or people to boycott somebody. Plain old boycotting is one person’s decision to not attend a talk; cancelling is demanding that the talk be stopped from even happening.
However, I think there is some truth to the point that cancel culture is not the most productive term when used in discussions over whether it is actually a bad thing, precisely because as you say it suggests that people engaging in it are doing something wrong and thus begs the question. For a somewhat symmetrical situation, consider proponents of cancel culture starting a discussion over “Should Organization A be a platform for Person B’s harmful views?”.
Thanks for the write-up. Regarding the issue of loss of motivation when scientists work on research they are less intrinsically interested in:
I know of at least one large-scale historical experiment which did this. In the Soviet Union, science was reorganized to investigate areas specifically expected to increase social welfare (sadly sometimes the conclusions were predetermined by party cadres). This quote from an overview article seems relevant:
Under the Bolshevik rule, scientists lost much of their autonomy and independence but acquired more social prestige and de facto influence on politically important decision making. The Soviet regime valued science more highly and allocated it a proportionally larger share of the national income than did contemporary governments in economically better developed and more prosperous countries. It strongly opposed the ideology of pure science, promoting instead the ideal of science as potentially usable - even if not always immediately applicable - knowledge about the world.
https://www.jstor.org/stable/40207005?seq=8#metadata_info_tab_contents (page 122)
It might be worth looking into how and whether this actually worked to produce good research.
Thanks a lot for the in-depth post, and great analysis on the efficacy of N95 masks.
However, I think that because of the whole politicization of mask wearing most discussion has missed a crucial point (and I have been guilty of this as well): In situations where people are ready to wear masks (shops, public transport) infection risk is not high and surgical masks are enough. In situations where people generally do not wear masks (bars, restaurants, private meetings at home, all day at your workplace) risk is higher but willingness to wear masks lower. It is my understanding that this is where most of the infections happen, at least in Europe. KN95 masks have been more uncomfortable to wear than surgical ones in my experience, so my presumption is that N95 masks are not so comfortable that people will wear them all day (please correct me if I am wrong).
This does not mean that there are no situations where N95 masks might be beneficial for the general population, like barbershops or doctor visits. It just does not seem to me that there is a lot of potential to get R below 1 with mask wearing.
There might also be some value in designing face coverings that people would wear in more situations. For example, these Japanese researchers claim to have a face shield design that prevents airborne spread much more efficiently.
Are you sure that this is the standard way in competitions? It is absolutely correct that before the final submission, one would find the best model by fitting it on a train set and evaluating it on the test set. However, once you have found the best-performing model that way, there is no reason not to train the model with the best parameters on the train+test set, and submit that one. (Submissions are the predictions of the model on the validation set, not the parameters of the model.) After all, more data usually means better performance.
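To illustrate the workflow I mean, here is a minimal, self-contained sketch. It uses a toy 1-D k-nearest-neighbour regressor and synthetic data, both of which are my own stand-ins rather than anything from an actual competition; `unlabelled` plays the role of the competition's held-out validation inputs.

```python
import random


def knn_predict(data, x, k):
    """Average the targets of the k points in `data` closest to x."""
    nearest = sorted(data, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mse(fit_data, eval_data, k):
    """Mean squared error on eval_data of a k-NN model fitted on fit_data."""
    errors = [(knn_predict(fit_data, x, k) - y) ** 2 for x, y in eval_data]
    return sum(errors) / len(errors)


rng = random.Random(0)

# Synthetic labelled data (hypothetical): y = 2x plus Gaussian noise.
labelled = []
for _ in range(200):
    x = rng.uniform(0, 10)
    labelled.append((x, 2 * x + rng.gauss(0, 1)))
train, test = labelled[:150], labelled[150:]

# Inputs whose labels only the competition organisers know.
unlabelled = [rng.uniform(0, 10) for _ in range(20)]

# Step 1: choose the hyperparameter by fitting on train and scoring on test.
best_k = min((1, 3, 5, 9, 15), key=lambda k: mse(train, test, k))

# Step 2: refit on train + test with the chosen k (for k-NN, fitting is just
# storing the data) and predict the unlabelled inputs for submission.
submission = [knn_predict(labelled, x, best_k) for x in unlabelled]
print(best_k, submission[:3])
```

The trade-off is that after step 2 there is no held-out estimate left for the final model, so you are relying on the model selected in step 1 to behave similarly with more data.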
Jordan Peterson is probably indeed a good example. A more objective way to describe his demeanor than shamelessness is “not giving in”. One major reason why he seems to be popular is his perceived willingness to stick to controversial claims. In turn that popularity is some form of protection against attempts to get him to resign from his position at the University of Toronto.
However, I think that there are significant differences between Peterson’s and EA’s situations, so Peterson’s example is not my endorsement of a “shamelessness” strategy.
Almost all of the contract research is done for public projects, often in joint-ventures with companies. That way, most of the funding comes from public sources.
Could you please explain that further? Looking at this document, page 13, it says that almost 50% of the proceeds from contract research are from economic sources (“Wirtschaftserträge”), and only 41% of the contract research money comes from public sources (“EU” and “Bund/Länder”). If my reading is correct, then it would be misleading to say that “almost all” of the research is done for public projects. Or does the category “Wirtschaftserträge” also contain public projects somehow?