I love that the syllabus includes a section on generalizability. A related topic is how well lab experiments generalize to real-world field experiments, as in John List’s work on baseball cards.
Another related topic is replicability, which is about internal rather than external validity: if you run exactly the same study twice, how likely are you to get a similar result?
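To put a rough number on that question, here is a toy power calculation in Python (my own illustration; the effect size, noise, and sample size are all made up). Even when the effect is real, an original study and an exact replication both reach significance less often than people expect:

```python
# Toy simulation (my illustration, not from any paper below): how often do
# two identical, true-effect studies both come out statistically significant?
import numpy as np

rng = np.random.default_rng(0)
true_effect, sigma, n = 0.3, 1.0, 100   # assumed effect size, noise, per-arm n
n_sims = 10_000

def significant() -> bool:
    """Run one two-arm study; test H0: no difference at the 5% level."""
    treat = rng.normal(true_effect, sigma, n)
    control = rng.normal(0.0, sigma, n)
    diff = treat.mean() - control.mean()
    se = np.sqrt(treat.var(ddof=1) / n + control.var(ddof=1) / n)
    return abs(diff / se) > 1.96

both = sum(significant() and significant() for _ in range(n_sims))
print(f"P(original and replication both significant) ≈ {both / n_sims:.2f}")
```

With these assumed parameters each study has roughly 55% power, so both runs come out significant only about a third of the time, even though nothing went wrong.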
I believe replications are highly socially valuable but not well-rewarded in econ academia, since the job market tends to reward originality and technical skill, especially at the junior level. This makes replication work relatively neglected, tractable, and a good choice for EAs who are not bound by career concerns. It’s also valuable to understand why replications fail if you want to make sure your own results replicate.
Papers on replicability in economics:
LaLonde 1986 is a classic: it shows that techniques commonly used on observational data fail to reproduce an experimental result. This study was part of the impetus behind the “credibility revolution” in modern microeconomics.
Camerer et al. replicate 18 econ lab experiments and find that the average replicated effect size is ⅔ of the originally reported size. That’s better than what the Reproducibility Project: Psychology found for psychology experiments (see below).
“The Power of Bias in Economics Research” is highly cited, but I haven’t read it.
Andrews and Kasy, “Identification of and Correction for Publication Bias”, provides a theoretical framework for answering questions like “given a reported effect size, how big should I expect the actual effect size to be?” They apply their method to several empirical questions, like the effect of the minimum wage on employment.
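For intuition about the selection problem Andrews and Kasy correct for, here is a minimal simulation (mine, not their estimator; all numbers are made up). If journals publish only statistically significant estimates, the published record systematically overstates the true effect:

```python
# Toy illustration of publication selection (my simulation, not the
# Andrews–Kasy estimator): if only |t| > 1.96 results get published,
# the average published effect exceeds the true effect.
import numpy as np

rng = np.random.default_rng(1)
true_effect, se = 0.10, 0.08          # assumed true effect and standard error
estimates = rng.normal(true_effect, se, 100_000)      # many hypothetical studies
published = estimates[np.abs(estimates / se) > 1.96]  # only significant ones survive

print(f"true effect:             {true_effect:.3f}")
print(f"mean published estimate: {published.mean():.3f}")
```

Their framework essentially runs this logic in reverse: given the published estimates and the selection rule, infer what the underlying effects plausibly are.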
Two papers by Alwyn Young:
“Consistency without Inference: Instrumental Variables in Practical Application” shows that most well-published econ results using the quasi-experimental technique of “instrumental variables” (IV) are biased towards the result you would get without using IV and are much noisier than the reported standard errors suggest. Basically, the estimates are so bad you might as well not bother with IV (not that the alternative is good either). Big if true, but this is an unpublished working paper, so I take it with a grain of salt. A toy illustration of one mechanism behind this pattern follows after the next paper.
“Channeling Fisher: Randomization Tests and the Statistical Insignificance of Seemingly Significant Experimental Results” shows that well-published experimental econ studies tend to overstate statistical significance, for boring technical reasons. A minimal example of the randomization tests he uses is sketched below.
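To make the IV result concrete, here is a toy simulation of one classic mechanism that produces exactly this pattern, a weak instrument (this is my own made-up data-generating process, not Young’s analysis): the IV estimate gets pulled toward the biased OLS estimate and is far noisier.

```python
# Toy weak-instrument simulation (my own DGP, not Young's): with a weak
# instrument, IV is noisy and its median is biased toward OLS.
import numpy as np

rng = np.random.default_rng(2)
n, beta = 200, 1.0                       # sample size, true causal effect

def one_draw(pi):
    """One dataset: x is endogenous via u; z is an instrument of strength pi."""
    z = rng.normal(size=n)
    u = rng.normal(size=n)               # confounder hitting both x and y
    x = pi * z + u + rng.normal(size=n)
    y = beta * x + u + rng.normal(size=n)
    ols = (x @ y) / (x @ x)              # no-intercept OLS (everything mean zero)
    iv = (z @ y) / (z @ x)               # simple just-identified IV estimator
    return ols, iv

draws = np.array([one_draw(pi=0.1) for _ in range(5_000)])
print(f"true beta:  {beta}")
print(f"median OLS: {np.median(draws[:, 0]):.2f}")   # biased up by the confounder
print(f"median IV:  {np.median(draws[:, 1]):.2f}")   # pulled toward OLS
print(f"IV IQR:     {np.percentile(draws[:, 1], 75) - np.percentile(draws[:, 1], 25):.2f}")
```

With an instrument this weak, the median IV estimate typically lands between the truth and OLS, with a spread that dwarfs both.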
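And for “Channeling Fisher”, here is a minimal randomization (permutation) test of the kind Young advocates, run on made-up data. Instead of relying on asymptotic standard errors, it re-randomizes the treatment labels and asks how extreme the observed difference is under the null:

```python
# Minimal Fisher randomization test (a generic sketch, not Young's code):
# shuffle treatment labels and see how often the shuffled difference in
# means is at least as extreme as the observed one.
import numpy as np

rng = np.random.default_rng(3)
outcomes = rng.normal(size=50)            # made-up experimental outcomes
treated = rng.permutation(50) < 25        # made-up random assignment, 25 treated

observed = outcomes[treated].mean() - outcomes[~treated].mean()
null_diffs = np.array([
    outcomes[perm].mean() - outcomes[~perm].mean()
    for perm in (rng.permutation(treated) for _ in range(10_000))
])
p_value = np.mean(np.abs(null_diffs) >= abs(observed))
print(f"randomization-test p-value: {p_value:.3f}")
```

Since the outcomes here are pure noise, the test should usually (correctly) fail to reject; the point is just the mechanics.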
Papers on this outside economics:
Reproducibility Project: Psychology replicated 100 studies in psychology and got a lot of public attention. Their findings were taken to imply that reproducibility is poor, fueling the “replication crisis” in psychology. This has spurred lots of great articles and blog posts on replicability and academic research in general. A great rabbit hole to go down on a free afternoon.
Data Colada is an ongoing blog covering replication and research-integrity issues in psychology.
“Why Most Published Research Findings are False” (Ioannidis 2005) is the classic and has its own Wikipedia page.
Thanks for the paper suggestions! Most of my own research is on internal validity in the LaLonde style, so I definitely think it is important too. I’ll add a section on replicability to the syllabus.