Which is the relevant table? I did not find “provider”.
Ah, right, good question. My understanding is that provider variance is modeled as a random effect.[1] I was looking at Supplementary Table 5: if I understand correctly, provider effects explain R2_conditional - R2_marginal ≈ 50% of the variance, whereas fixed effects explain 17% of the variance.
[1] Paragraph 2.2.1: “Data provider was treated as a random effect to account for potential differences in sampling and analysis methods, and the selection of sampling sites”
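To spell out the arithmetic I have in mind, here is a minimal sketch assuming the usual variance-components definition of marginal and conditional R2 for a mixed model (fixed-effect variance over total, and fixed plus random over total). The function name and the variance components are my own illustrative choices, picked only to reproduce the 17% and ~50% figures; they are not values from the paper.

```python
def r2_components(var_fixed, var_random, var_residual):
    """Marginal and conditional R2 from mixed-model variance components.

    Assumes the standard definition: marginal R2 counts only the
    fixed-effect variance, conditional R2 counts fixed plus random.
    """
    total = var_fixed + var_random + var_residual
    r2_marginal = var_fixed / total
    r2_conditional = (var_fixed + var_random) / total
    return r2_marginal, r2_conditional


# Hypothetical variance components (NOT taken from the paper), scaled to sum to 1.
r2_m, r2_c = r2_components(var_fixed=0.17, var_random=0.50, var_residual=0.33)

# Share of variance attributable to the random effect (here: data provider).
provider_share = r2_c - r2_m

print(f"marginal R2:    {r2_m:.2f}")
print(f"conditional R2: {r2_c:.2f}")
print(f"provider share: {provider_share:.2f}")
```

If the model had more than one random effect, the difference would be the combined share of all of them, not the provider share alone.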
Hi Jonah, Jasmine and Miles,
Cool idea! Thanks for writing it up.
I wonder if the causal effect of adding true stuff to Wikipedia could be tested directly? For example:
Treatment: claims that are on Wikipedia
Control: claims that aren’t on Wikipedia
Design: Match by how likely the claim is to be true, and by how frequently the claim appears outside of Wikipedia.[1] Compute contrasts.
[1] A Wikipedia claim might be mirrored in various other parts of the training corpus, so we might want to control for that.
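Roughly, the design could be sketched like this. It is only an illustration under assumptions of mine: the `Claim` fields, the `outcome` measure (some score of how the model treats the claim), the distance metric, and the caliper are all hypothetical, and the toy numbers are made up. It does greedy nearest-neighbour matching without replacement and averages the paired differences.

```python
import math
from dataclasses import dataclass


@dataclass
class Claim:
    on_wikipedia: bool   # treatment indicator
    p_true: float        # estimated probability the claim is true
    offsite_freq: int    # how often the claim appears outside Wikipedia
    outcome: float       # hypothetical outcome, e.g. a model score for the claim


def distance(a, b):
    # Euclidean distance on (p_true, log frequency); log tames heavy-tailed counts.
    return math.hypot(
        a.p_true - b.p_true,
        math.log1p(a.offsite_freq) - math.log1p(b.offsite_freq),
    )


def matched_contrast(claims, caliper=0.5):
    """Greedy 1:1 matching without replacement; returns (mean difference, pairs)."""
    treated = [c for c in claims if c.on_wikipedia]
    pool = [c for c in claims if not c.on_wikipedia]
    diffs = []
    for t in treated:
        if not pool:
            break
        best = min(pool, key=lambda c: distance(t, c))
        if distance(t, best) <= caliper:  # drop treated claims with no close control
            pool.remove(best)
            diffs.append(t.outcome - best.outcome)
    if not diffs:
        return None, 0
    return sum(diffs) / len(diffs), len(diffs)


# Made-up toy data, for illustration only.
claims = [
    Claim(True, 0.90, 100, 0.80),
    Claim(False, 0.90, 110, 0.60),
    Claim(True, 0.50, 10, 0.50),
    Claim(False, 0.55, 9, 0.40),
    Claim(False, 0.10, 5000, 0.30),  # no comparable treated claim, stays unmatched
]

effect, n_pairs = matched_contrast(claims)
print(f"matched pairs: {n_pairs}, mean contrast: {effect:.3f}")
```

Matching only balances the two covariates we measure, so the contrast would still be confounded by anything else that differs between claims on and off Wikipedia.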