It seems like this review contains a relative paucity of research supporting the null hypothesis that deliberation does not improve decision making (or, for that matter, the alternative hypothesis that it actually worsens decision making). Were you unable to find studies taking this position? If so, how worried are you about the file-drawer effect here?
One approach worth considering for topics like these is a systematic review rather than a narrative one. The element that would be particularly helpful for selection worries is a pre-defined search strategy, which guards against inadvertently gathering a skewed body of evidence. (If there are lots of quantitative results, there are statistical methods to assess publication bias, though that typically won't be relevant here. That said, there are critical appraisal tools you can use to score study quality, which offer an indirect indication of bias, and which are valuable generally for getting a sense of how trustworthy the consensus of the literature is.)
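In case the quantitative situation does arise, here is a minimal sketch of one such method, Egger's regression test for funnel-plot asymmetry, assuming you have per-study effect estimates and standard errors (the numbers below are invented purely for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical per-study effect estimates and standard errors
# (illustrative numbers only, not from any real review).
effects = np.array([0.42, 0.35, 0.51, 0.10, 0.48, 0.30, 0.55, 0.05])
std_errs = np.array([0.10, 0.12, 0.15, 0.05, 0.20, 0.08, 0.25, 0.04])

# Egger's test: regress the standardised effect (effect / SE) on
# precision (1 / SE). An intercept far from zero indicates
# funnel-plot asymmetry, one possible signal of publication bias.
res = stats.linregress(1.0 / std_errs, effects / std_errs)

# Two-sided t-test on the intercept (df = n - 2).
t_stat = res.intercept / res.intercept_stderr
p_value = 2 * stats.t.sf(abs(t_stat), len(effects) - 2)
print(f"intercept = {res.intercept:.3f}, p = {p_value:.3f}")
```

A near-zero intercept is consistent with a symmetric funnel; a large one suggests that small studies with unimpressive results are under-represented.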
These are a lot more work, but may be worth it all the same. With narrative reviews on topics where I expect there to be a lot of relevant (primary) literature, I always have the worry that one could tell a similarly persuasive story for many different conclusions, depending on whether the author happened upon (or was more favourably disposed to) one or another region of the literature.
'Quick and dirty' systematisation can also help: e.g. 'I used search term [x] in Google Scholar and took the first 20 results; of the relevant ones, 8 were favourable, and 1 was neutral'.
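To make the tallying step concrete, a sketch of what the bookkeeping might look like, with entirely hypothetical hand-coded verdicts for the first 20 hits:

```python
from collections import Counter

# Hand-coded verdicts for the first 20 Google Scholar hits for the
# search term (the classifications are hypothetical). 'irrelevant'
# hits are recorded but excluded from the headline tally.
verdicts = [
    "favourable", "favourable", "irrelevant", "neutral",
    "favourable", "irrelevant", "favourable", "favourable",
    "irrelevant", "favourable", "irrelevant", "irrelevant",
    "favourable", "irrelevant", "irrelevant", "favourable",
    "irrelevant", "irrelevant", "irrelevant", "irrelevant",
]

counts = Counter(v for v in verdicts if v != "irrelevant")
relevant = sum(counts.values())
print(f"{relevant} relevant of {len(verdicts)} hits: {dict(counts)}")
```

Even this level of bookkeeping lets a reader see how the headline claim was generated, rather than taking the sampling on trust.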
I’ve always wondered about the “first N Google results” strategy. Even in the absence of a file-drawer effect, isn’t this more likely to turn up papers making positive claims (on the assumption that e.g. rejections of the null are more likely to be cited than inconclusive results)?
I'm not sure how Google Scholar judges relevance (e.g. I can imagine eye-catching negative results also being boosted up the rankings), but I agree it is a source of distortion; I'd definitely offer it as 'better than nothing' rather than 'good'. (Perhaps one tweak would be to sample by a manageable date range rather than by relevance, although one could then worry about time trends.)
A better option (although it has a learning curve and is more onerous) is to query a relevant repository, export all the results, and take a random sample from these.
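As a sketch of that last step, assuming the query results have been exported to a CSV (the filename and column name below are placeholders):

```python
import csv
import random

# Load every record exported from the repository query
# ('pubmed_export.csv' and its 'Title' column are placeholders).
with open("pubmed_export.csv", newline="", encoding="utf-8") as f:
    records = list(csv.DictReader(f))

# Fixed seed so the sample can be regenerated and audited later.
random.seed(42)
sample = random.sample(records, k=min(20, len(records)))

for rec in sample:
    print(rec["Title"])
```

Because the sample is drawn from the full export rather than the top of a relevance ranking, whatever distortions the search engine's ordering introduces drop out.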