This brings to mind the assumption of normal distributions when using frequentist parametric statistical tests (t-test, ANOVA, etc.). If plots 1-3 represented random samples from three groups, an ANOVA would indicate there was no significant difference between the mean values of any group, which would usually be reported as there being no significant difference between the groups (even though there is clearly a difference between them). In practice, this can come up when comparing a treatment that has a population of non-responders and strong responders vs. a treatment where the whole population has an intermediate response. This can be easily overlooked in a paper if the data is just shown as mean and standard deviation, and although better statistical practices are starting to address this now, my experience is that even experienced biomedical researchers often don’t notice this problem. I suspect that many studies have averaged out subgroups in this way, and so failed to identify that a group is composed of multiple subgroups that respond differently.
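To make the responder/non-responder case concrete, here is a minimal sketch in plain Python. The numbers are invented for illustration and are not the data behind plots 1-3: one hypothetical treatment has non-responders near 0 and strong responders near 10, the other has everyone near 5. The one-way ANOVA F statistic is computed by hand from the usual between-group and within-group sums of squares; in real work a statistics library (for example SciPy's `scipy.stats.f_oneway`) would normally do this.

```python
from statistics import mean, stdev

# Hypothetical, hand-built data (an assumption for illustration only).
offsets = [-1.0, -0.5, 0.0, 0.5, 1.0]
mixed = [0 + o for o in offsets] + [10 + o for o in offsets]  # non-responders + strong responders
uniform = [5 + o for o in offsets] * 2                         # everyone responds moderately


def anova_f(*groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


f_stat = anova_f(mixed, uniform)
print("means:", mean(mixed), mean(uniform))
print("SDs:  ", round(stdev(mixed), 2), round(stdev(uniform), 2))
print("F:    ", f_stat)
```

Because the two group means coincide, the between-group sum of squares is zero and so is F. No test on means can separate these groups, even though their standard deviations differ several-fold and one group is plainly bimodal.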

The usual approach to non-normal distributions is to test the data from each group for normality (e.g. the Shapiro-Wilk test) and move to a non-parametric test if that fails for one or more groups (e.g. the Mann-Whitney, Kruskal-Wallis or Friedman tests). But even those are essentially comparing medians, so I think they would probably still indicate no significant difference between (the median values of) these plots. Testing for a difference between distributions is possible (e.g. the Kolmogorov–Smirnov test), but my experience is that this seems to be over-powered and will almost always report a significant difference between two moderately sized (~50+ samples) groups. The result also only says that there is a significant difference in distributions, not what that difference actually represents (e.g. differing means, standard deviations, kurtosis, skewness, long tails, complete non-normality, etc.).
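A rough sketch of that contrast, again in plain Python on invented data (a hypothetical bimodal group and a hypothetical unimodal group, not the plotted samples). It computes the medians, the Mann-Whitney U statistic and the two-sample Kolmogorov–Smirnov statistic D by hand. Converting U or D into p-values is left out; a library routine such as SciPy's `scipy.stats.mannwhitneyu` or `scipy.stats.ks_2samp` would handle that.

```python
from bisect import bisect_right
from statistics import median

# Hypothetical, hand-built data (an assumption for illustration only).
offsets = [-1.0, -0.5, 0.0, 0.5, 1.0]
mixed = [0 + o for o in offsets] + [10 + o for o in offsets]  # two clusters, near 0 and near 10
uniform = [5 + o for o in offsets] * 2                         # one cluster near 5


def mann_whitney_u(a, b):
    """U for group a: pairs where a's value beats b's, ties counted as half."""
    return sum((x > y) + 0.5 * (x == y) for x in a for y in b)


def ks_statistic(a, b):
    """Largest vertical gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)

    def ecdf(sorted_values, x):
        # Fraction of observations less than or equal to x.
        return bisect_right(sorted_values, x) / len(sorted_values)

    return max(abs(ecdf(a_sorted, x) - ecdf(b_sorted, x)) for x in a_sorted + b_sorted)


u = mann_whitney_u(mixed, uniform)
d = ks_statistic(mixed, uniform)
print("medians:", median(mixed), median(uniform))
print("U:", u, "expected under no difference:", len(mixed) * len(uniform) / 2)
print("KS D:", d)
```

On this toy data the medians match and U sits exactly at its no-difference value, so a rank-based test sees nothing. D is large, which flags that the distributions differ, but the single number gives no hint that the difference is bimodality rather than a shift or a change in spread.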

One of the topics I hope to return to here is the importance of histograms. They’re not a universal solvent. However, they are easily accessible without background knowledge, and as a summary of results they require fewer parametric assumptions.
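As a small illustration of how little machinery a histogram needs, here is a sketch of a text histogram in plain Python. The data are invented (a hypothetical bimodal group and a hypothetical unimodal group with the same mean), and the bin start, width and count are arbitrary choices made for this example.

```python
# Hypothetical, hand-built data (an assumption for illustration only).
offsets = [-1.0, -0.5, 0.0, 0.5, 1.0]
mixed = [0 + o for o in offsets] + [10 + o for o in offsets]  # two clusters, near 0 and near 10
uniform = [5 + o for o in offsets] * 2                         # one cluster near 5


def bin_counts(values, start, width, n_bins):
    """Count values in n_bins half-open bins [lo, lo + width), beginning at start."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:  # values outside the binned range are ignored
            counts[i] += 1
    return counts


def show(label, values, start=-1.25, width=2.5, n_bins=5):
    """Print one row of '#' marks per bin."""
    print(label)
    for i, c in enumerate(bin_counts(values, start, width, n_bins)):
        lo = start + i * width
        print(f"  [{lo:6.2f}, {lo + width:6.2f}) {'#' * c}")


show("mixed responders", mixed)
show("uniform responders", uniform)
```

The two printouts look nothing alike: one has marks only in the outer bins, the other only in the middle bin. That difference is invisible in a mean and standard deviation, and the only choice the reader has to trust is the binning.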

I very much agree about the reporting of means and standard deviations, and how much a paper can sweep under the rug by that method.
