Summary
As the body of existing literature grows, meta-analyses (i.e. studies that take multiple studies on the same topic and aggregate their results) have become more important. They provide stronger evidence on a given problem than any individual study and can thus be relevant information sources for evidence-based decision making. In this post, I explore whether their potential for improving decision making could be enhanced by making them ‘interactive’. By that I mean (1) allowing the decision maker to choose which studies to include, (2) updating them in real time and (3) making it possible to incorporate prior beliefs as well as quality and bias ratings. I conclude by presenting a working prototype and case study of an interactive meta-study and outlining planned next steps for exploring the impact potential.

Link to Prototype

Outline
0. A primer on meta analyses (Feel free to skip this if you are familiar with the concept).

What I mean by ‘interactive meta studies’
Why I think interactive meta studies can help us make better decisions
A proof-of-concept and a case study
Next steps—Exploring paths to impact
Appendix

A primer on meta analyses

The number of existing scientific studies is growing, simply because (so far) humans have continued to exist and to do science. On top of that, the number of scientists is growing too, which accelerates the whole process. This means that for almost all relevant, scientifically tractable problems you will be able to find multiple published scientific studies.^[1] If we assume good intentions of the researchers behind those studies, most of them will have at least some informational value. However, most likely they will also all contain a lot of noise and bias.

To give an example, let’s say we want to know what the effect of a specific drug is for preventing migraine attacks. To do so, we gather a group of medical researchers and ask them to conduct a trial. Our researchers then recruit a number of participants with migraines and ask them how often they get migraines and how bad they are. Then they randomly select half of the participants to be given the medicine and half of the participants to be given a placebo. Both groups are instructed to swallow a pill as soon as they notice the slightest sign of an upcoming migraine. After two weeks, the researchers ask the participants again about the frequency and severity of their migraine attacks. Then the researchers look at their data and compare the two groups. From this, they derive an estimate of the effect of the medicine for preventing migraine.

So, this seems like a valid attempt to find out the average effect of this drug on migraine severity for anyone suffering from migraines. However, the estimate that our researchers calculated will not be the same as the true underlying effect (except in very rare cases, where the two might coincide more or less by chance).

First of all, we are working with a (potentially small) sample of participants who have most likely not been randomly selected from the entire population of interest (i.e. all people suffering from migraines). Further, all kinds of ‘problems’ might have occurred that would influence our estimate: for example, there might have been a heatwave that affected the baseline frequency of migraines, people might have forgotten to take the drug, or the researchers might have made a small error in their calculations. So, we should assign quite a high degree of uncertainty to our estimate.

The good news is that if we have five teams of researchers conducting such a study and then combine their data, the overall sample size will be bigger and the influence of measurement and analysis errors will become smaller. The resulting ‘hyper-estimate’ will thus most likely be closer to ground truth.

And this is exactly what meta-studies are doing. They take a bunch of studies that try to answer the same question and combine their estimates into one common and hopefully more accurate estimate. Well, in simplified terms at least.
Here is the Wikipedia definition if you want a more precise description:

A meta-analysis is a statistical analysis that combines the results of multiple scientific studies. Meta-analyses can be performed when there are multiple scientific studies addressing the same question, with each individual study reporting measurements that are expected to have some degree of error. The aim then is to use approaches from statistics to derive a pooled estimate closest to the unknown common truth based on how this error is perceived.
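
To make the pooling idea above a bit more concrete, here is a minimal sketch of the inverse-variance (fixed-effect) method, which is the common frequentist approach mentioned later in the Appendix. The five effect estimates and standard errors are made-up numbers for five hypothetical research teams, and the function name `pool_fixed_effect` is my own choice for illustration, not something taken from the prototype.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooling.

    Each study is weighted by 1 / SE^2, so more precise studies count more.
    Returns the pooled estimate and its standard error.
    """
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical data: change in monthly migraine attacks (drug minus placebo)
# reported by five imaginary teams, each with the same standard error.
estimates = [-1.0, -2.0, -1.5, -0.5, -2.5]
std_errors = [1.0, 1.0, 1.0, 1.0, 1.0]

pooled, pooled_se = pool_fixed_effect(estimates, std_errors)
ci_low = pooled - 1.96 * pooled_se
ci_high = pooled + 1.96 * pooled_se
print(f"pooled effect: {pooled:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Note that the pooled standard error is smaller than that of any single study, which is the formal version of the claim that the combined estimate is likely to be closer to the truth. A fixed-effect model assumes all studies estimate the same underlying effect; real meta-analyses often use random-effects models instead.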

What I mean by ‘interactive meta studies’

Now this blogpost is about ‘interactive meta studies’ (or more precisely about a concept that I decided to name like this), so we should first make sure that we are all on the same page regarding the meaning of these three words in conjunction.

Disclaimer: I am not sure if ‘interactive meta study’ is the most fitting name for what I am about to describe. So, if after reading this you have an idea for a better term—please let me know.

When a meta analysis is conducted, the authors usually start by searching for existing studies that tackle the question of interest and meet a set of inclusion criteria (these can be different in every case but typically those are certain requirements with regards to the study design or the way outcomes are reported). Then they take all the studies they find and calculate an aggregate measurement (and confidence interval).

An interactive meta-analysis takes this concept a step further and gives the reader or user of the meta-analysis a say in which studies are included in the calculation of the aggregated measurement. To give a concrete example:

Let’s assume (again if you read the first chapter) you want to know the impact of a specific drug on preventing migraine. Now there are three existing studies on this:

Study #1: Has 8 participants all of whom have had regular migraine throughout the last 3 years.
Study #2: Has 83 participants who had at least one migraine attack in the preceding 6 months. The study has not been peer-reviewed yet but has been published as a medical pre-print.
Study #3: Has 42 participants. All participants are at least 40 years old and have reported that they had at least three ‘very painful’ migraine attacks within the preceding 6 months.

Now, maybe you think that 8 participants is really not a lot (and correctly so), that a study with so few participants should not be trusted, and that you therefore don’t want to include it. Or you suffer from migraines yourself and are interested in the potential effect of this drug on you, but you are below the age of 40 and thus think that you had better not consider Study #3.

In a ‘static’ traditional meta-analysis you would now have to look up the numbers yourself and recalculate the estimate using only the studies you want to include. This requires quite a bit of effort and some knowledge about statistics. Wouldn’t it be much better if you could just provide a list of the studies you are interested in and be provided with an aggregated estimate from these?

This is exactly the core idea of ‘interactive meta analyses’.
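
As a rough sketch of what this could look like in code, the snippet below re-pools the estimate for whichever studies the user selects. The participant counts come from the example above; the effect sizes and standard errors are invented purely for illustration, and the `Study` class and `pooled_estimate` function are hypothetical names, not the prototype's actual implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Study:
    name: str
    participants: int
    effect: float      # hypothetical effect estimate
    std_error: float   # hypothetical standard error


# Participant counts are from the example; effects and SEs are made up.
STUDIES = [
    Study("Study #1", 8, -3.0, 2.0),
    Study("Study #2", 83, -1.0, 0.5),
    Study("Study #3", 42, -2.0, 1.0),
]


def pooled_estimate(studies, selected_names):
    """Inverse-variance pooled estimate over the user-selected studies only."""
    chosen = [s for s in studies if s.name in selected_names]
    if not chosen:
        raise ValueError("select at least one study")
    weights = [1.0 / s.std_error ** 2 for s in chosen]
    total_weight = sum(weights)
    estimate = sum(w * s.effect for w, s in zip(weights, chosen)) / total_weight
    return estimate, math.sqrt(1.0 / total_weight)


# A reader who distrusts the 8-participant study simply deselects it.
all_names = {s.name for s in STUDIES}
print(pooled_estimate(STUDIES, all_names))
print(pooled_estimate(STUDIES, all_names - {"Study #1"}))
```

The user only ever touches the set of selected names; the statistics stay hidden behind the function, which is the whole point of the interactive approach.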

Note: We have run a small exploratory survey to get a feeling for whether people actually want to weight studies differently when confronted with situations like the one above. So far, it seems they do. More on this in the Appendix.

Why I think interactive meta studies can help us make better decisions

Now that you have a rough idea of what I mean by ‘interactive meta analysis’, why should you care?

I think that interactive meta analyses would provide a bunch of benefits (as compared to classical meta analyses), especially when it comes to actual policy making or using them as the base for other types of important decisions:

As already outlined above: you, as the decision maker, get to have a say. Depending on the question at hand and your particular situation, there might be different inclusion criteria that you would like to specify (for social science questions in particular, something like the country a study was conducted in could be of interest).
It is easy to update. If a meta-analysis is published and a new landmark study on the topic is released the day after, that study won’t be included in the meta-analysis. If the meta-analysis is interactive (or maybe a better word in this context would be ‘adaptive’), it is easy to add another study to the data pipeline and keep the estimates up to date with current knowledge. Researchers also save time: they can add more studies without much additional effort, since there is no need to redo the calculations by hand for the added studies.
Prior knowledge and quality ratings can be explicitly included. Maybe you have some prior belief about the quantity that is being estimated in the studies, or there are existing quality ratings of the studies (e.g. risk-of-bias ratings, expert judgements). An interactive meta-analysis would allow you to factor those into your hyper-estimate, by applying Bayesian updating and/or by adjusting the weights that are assigned to individual studies.
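
One simple way the weight adjustment from the last point could work is to multiply each study's inverse-variance weight by a quality score between 0 and 1. This is only a sketch under that assumption: the scoring scheme, the function name `quality_weighted_pool` and all numbers are hypothetical, and there are other (arguably more principled) ways of handling study quality.

```python
import math


def quality_weighted_pool(effects, std_errors, quality):
    """Pool estimates with weights w_i = q_i / SE_i^2.

    `quality` holds user-supplied scores in [0, 1]; a score of 0 drops the
    study and a score of 1 keeps its full inverse-variance weight.
    """
    if not (len(effects) == len(std_errors) == len(quality)):
        raise ValueError("inputs must have the same length")
    weights = [q / se ** 2 for q, se in zip(quality, std_errors)]
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one study needs a positive weight")
    estimate = sum(w * y for w, y in zip(weights, effects)) / total_weight
    # Variance of a weighted mean of independent estimates:
    # sum(w_i^2 * SE_i^2) / (sum(w_i))^2
    variance = sum(w ** 2 * se ** 2 for w, se in zip(weights, std_errors))
    variance /= total_weight ** 2
    return estimate, math.sqrt(variance)


# Hypothetical effects, standard errors and quality ratings for three studies.
effects = [-3.0, -1.0, -2.0]
std_errors = [2.0, 0.5, 1.0]
quality = [0.5, 1.0, 0.5]
print(quality_weighted_pool(effects, std_errors, quality))
```

With all scores set to 1 this reduces to the ordinary inverse-variance estimate, so the quality ratings act as an optional layer on top of the default calculation.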

In addition, promoting interactive meta-analyses could also have a positive impact on science as a whole, for the following reasons:

Flaws in research can be spotted more easily. Researchers and reviewers can use the tool to benchmark new results against existing evidence. While a deviating result of course doesn’t necessarily imply that there is something wrong, it might very well lead to double checking of the data and procedure.
The need for full meta-analyses could be reduced and thereby research resources could be freed up. While this tool (at least as of now) does not meet the strict scientific standards required for a meta-analysis, for some use cases it might be good enough. Thus, it could reduce the need for full meta-analyses and thereby free up researchers’ time that could be spent on other types of research.
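
The benchmarking idea from the first point could be as simple as a z-score comparing a new result against the pooled estimate of the existing studies. The sketch below assumes the new study is not part of the pooled estimate (so the two are independent); the function name `deviation_z` and the numbers are hypothetical.

```python
import math


def deviation_z(new_effect, new_se, pooled_effect, pooled_se):
    """z-score of the difference between a new result and the pooled estimate.

    Assumes the new study is independent of the studies behind the pooled
    estimate, so the variances of the two estimates simply add.
    """
    return (new_effect - pooled_effect) / math.sqrt(new_se ** 2 + pooled_se ** 2)


# Hypothetical numbers: a new study reports a much larger effect than the pool.
z = deviation_z(new_effect=-4.0, new_se=0.6, pooled_effect=-1.0, pooled_se=0.8)
if abs(z) > 1.96:
    print(f"z = {z:.2f}: result deviates notably, worth double-checking")
else:
    print(f"z = {z:.2f}: result is in line with existing evidence")
```

A large absolute z-score does not mean the new study is wrong; it is only a prompt to take a second look at the data and procedure.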

Obviously there might also be reasons why interactive meta-studies could have zero or even negative impact on decision making. While I believe that the case for these risks is weaker than the case for the benefits, I have elaborated on them in the Appendix.

A proof-of-concept and a case study

To see whether the above explained concept can be translated from idea into reality, we set out to build a proof-of-concept. We chose to build it around the question “What is the effect of wearing masks on the transmission rate of COVID-19?”.

The reasons why we chose this particular use case were:

There were existing meta-analyses that we could draw upon.
One might reasonably expect that more studies on this topic will be published in the future.
It is a topic of current interest.
Similar questions might arise in the future in the light of different pandemics.
Study results are divergent.

You can access the working prototype here. I would very much invite you to play around with it (and reach out in case you notice any bugs), but it’s not strictly necessary to follow along with the remainder of this article. Also—being a prototype—this page is designed to have the very basic functionalities of an interactive meta-study but there could obviously be many more useful features (some ideas are listed in the Appendix).

It should be noted that this is not a scientific study. The list of included studies is taken from existing meta-analyses but it is far from comprehensive. The estimates should thus, like any estimates, not be taken at face value.

Next Steps—Exploring Paths to Impact

Now that we have the prototype ready, our next goal is to find out whether there is a big enough impact potential of this idea to warrant further efforts to push it.

For this, we first need to find out if there is anyone out there who would (1) be willing to use such a tool, (2) be enabled to make better decisions through the use of the tool and (3) whose decisions actually have a sizeable impact.

And this is one of the main reasons I am writing this post. While we have come up with a list of potential user groups and the related paths to impact (see Appendix), we don’t have much experience in potential domains nor in policy making, decision making or forecasting. Thus, my hope is that the prototype together with this post will allow us to find out whether there exist concrete, impactful use cases for interactive meta-analyses.

So, if you have a situation in mind where such a tool could be useful, or even a concrete project at hand where it could be interesting to incorporate it, please leave a comment, schedule a meeting with us or write an email.

Also, if you have any general feedback, criticism or feature ideas, please don’t hesitate to share them! (There is also a list of ideas for features in the Appendix.)

If you are interested in where this could lead in the long term (at least in my personal utopia), you can find more on the bigger-picture vision in the Appendix.

Big thanks to Marina and Cecilia for giving valuable feedback and to Future Academy (the organizers and all the participants) for motivating me to follow through with this idea and providing helpful advice, insights and connections!

Appendix

The bigger picture (or my utopian long-term vision for interactive meta-studies)

My hope would be that in the future, a large part of published studies will automatically be part of some kind of an interactive meta-analysis. I think that the operational efforts of running such a system nested in the existing scientific publication system would be fairly small compared to the potential upside.

Scientific journals would just have to mandate that every paper they publish is accompanied by a standardized, machine-readable file containing high-level information on the main result of the paper (e.g. effect size, sample size). Once the paper is published, these could be easily fed into the data pipeline of the interactive meta-study on the topic.
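
To illustrate what such a machine-readable file might look like, here is a sketch using JSON. No such standard exists as far as I know: the field names (`title`, `effect_size`, `standard_error`, `sample_size`), the example record and the `parse_study_record` function are all assumptions made up for this example.

```python
import json

# Hypothetical minimal schema: field name -> expected type.
REQUIRED_FIELDS = {
    "title": str,
    "effect_size": float,
    "standard_error": float,
    "sample_size": int,
}


def parse_study_record(text):
    """Parse and validate one study record before it enters the pipeline."""
    record = json.loads(text)
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        value = record[field]
        # JSON booleans are ints in Python, so exclude them explicitly.
        if isinstance(value, bool):
            raise ValueError(f"wrong type for field: {field}")
        allowed = (int, float) if expected is float else expected
        if not isinstance(value, allowed):
            raise ValueError(f"wrong type for field: {field}")
    if record["standard_error"] <= 0 or record["sample_size"] <= 0:
        raise ValueError("standard_error and sample_size must be positive")
    return record


# A made-up record, as a journal might publish it alongside a paper.
example = """
{
  "title": "Hypothetical migraine trial",
  "effect_size": -1.4,
  "standard_error": 0.6,
  "sample_size": 83
}
"""
print(parse_study_record(example))
```

A real standard would need more fields (outcome definition, study design, units of the effect size), but even a handful of validated fields would be enough to feed a pooled estimate automatically.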

Taking into consideration the current push towards open science and more journals requiring researchers to publish their underlying data, I believe that this is not a big change to demand. At the same time, I think the potential upside is huge. If we had one credible (open-access) source of interactive meta-studies, this could easily become the go-to information source for any questions where people care about the underlying scientific evidence. Thus it could not only help to combat misinformation but also have a large impact on the opinions of the general public as well as policy makers and think tanks.

Potential Downsides (or Why I think interactive meta-studies might have zero or even negative impacts on decision making)

While I am hopeful that the advantages I outlined in the main article will prevail, I do also see some ways interactive meta-studies could lead to worse decision making and/or misinformation:

Pick and choose until you get the desired outcome. If the user has a prior opinion on the question at hand, she might (consciously or unconsciously) choose the studies included in the calculation and/or their respective weighting in a way to fit her expectations. The tool could thus be used in an attempt to signal that there is scientific support for a specific agenda.
Users lacking the ability to make the right choices. The end user, whoever that may be, might not know which criteria are relevant in a specific situation for deciding whether to include studies and how to weight them. The resulting hyper-estimate could thus be even ‘more wrong’ than an overall, traditional meta-study hyper-estimate.
Loss of contextual information about the included studies. While the idea is to offer the user multiple criteria for filtering and weighting studies, some contextual information about individual studies will get lost. While there is always the opportunity to read up on the underlying studies in detail, having the tool might give the (wrong) impression that all relevant dimensions of a study are somewhat included and that there is no necessity to deeply engage with the topic and the included studies.
The wrong studies might be chosen. If less time is spent on the meta-analysis, the selection of studies might be done too carelessly, which could lead to the wrong studies being chosen and taken into account.

An exploratory survey

About the Survey

We conducted a very small survey (n=15) among people we know, asking what they consider relevant evidence with regard to masks as a way to prevent the spread of a virus. Obviously the results are not supposed to be interpreted as any kind of evidence; the main purpose was to gather additional insights and inspiration on what people think of as relevant evidence.

The survey asked people to imagine themselves as the president of a fictional state, ‘Nivaria’, who has to decide on a mandatory mask policy for the country after the first case of a previously unknown virus has been detected there.

Takeaways

Our main takeaways from the survey results were that

While multiple factors play a role in such a decision, many people (n=11) considered (scientific) information on the effectiveness of masks relevant.
When presented with three different (fictional) study results on mask effectiveness, the majority of respondents (n=13) did not weight all of them equally but used meta information about the studies to judge how relevant each study’s results were for their decision.
Factors that seemed to influence which studies people put most weight on were (1) sample size, (2) similarity of participants to the inhabitants of Nivaria, (3) likelihood of compliance of study participants, (4) control for potential confounders.

Survey Setup

Apart from questions about the demographics and professional experience of people, the survey basically had three questions:

Relevance of different factors for decision making about mask policies

You are the president of the fictional island state Nivaria. Lately, you have anxiously been following the news about the spread of a previously unknown virus around the world. And today it finally happened: the first case has been detected in Nivaria.

Now, some other countries have decided to make it mandatory to wear masks as the disease spreads through the air. You are considering to do the same.

What factors would influence your decision regarding a mandatory mask policy for Nivaria?

(Be as precise as possible)

Decision about a mandatory mask policy

Your assistant collected the following points of view from experts and published scientific resources:

- It would be worth having a mandatory mask policy if it makes people at least 50% less likely to get infected after interacting with an infected person.*

*This estimate already takes into account any economic losses from too many people getting the disease at the same time vs. reduced productivity, less income in the leisure sector, costs of providing face masks to the general public etc.

- A study conducted on the neighboring island where they asked 43 people that have been identified as close contacts of infected people about their mask wearing behavior. They found that mask wearing made it 40% less likely to get infected.

- A randomized controlled trial with a total of 426 participants where half the participants were sent a package of masks and recommended to wear those. They found that there was no significant effect of mask wearing. The study has been done in a country with a very different climate and culture to Nivaria.

- An observational study conducted on a very similar, but not identical virus where the researchers compared states that have had a mandatory mask policy to states who did not. They conclude that mask wearing reduces transmission risk by 70%.

What do you decide?

[ ] Put a mandatory mask policy in place

[ ] Don’t put a mandatory mask policy in place

Please elaborate a bit on your considerations and decision making process. Especially on why particular studies were more or less important to your decision.

Why did you come to the above conclusion?

Potential User Groups

User Group	Objectives	Path to Impact	Impact Potential
Politicians and Decision Makers	Base their opinion on scientific data without spending too much time on it. Being able to assess multiple scenarios by giving different weights to studies.	Better informed decisions.	High
Domain Experts, Think Tanks & Government Consultants	Being able to quickly synthesize relevant existing scientific information with their domain expertise. Faster access to relevant numbers for reports etc.	Better recommendations to public officials for policy implementation.	High
Researchers	Sanity check their new research results and compare them with others. Share their new data with the world. Discover research gaps & reduce time for literature review.	Potential reduction of publication of biased results. Better coordination of research efforts.	Medium
Forecasters	Make predictions, based on previous data	Improve forecasting quality.	Medium
Students	Aggregate data for their small research projects	Improve academic work quality.	Low
Just random people who want to know more on the topic	Base their opinion on topic XX on scientific data without spending too much time on it	People with an interest in the topic are better informed.	Low
Just random people who are bored	Look at nicely aggregated data and maybe see something interesting on the topic they like	Random people are better informed.	Low

Features We Would Like to Include at a Later Stage

Give users the ability to upload their own data
Offer a function where users can either add to the data presented on the page or upload a dataset themselves.
Add additional possibilities to filter studies
Further dimensions to filter could be e.g. the type of study (RCT, observational etc.), the year it was published in, the journal it was published in or a bias rating of the study.
Enable input of weights for specific studies
Right now it is only possible to select or deselect a study. However, in the future we would like to enable users to assign specific weights to studies (e.g. depending on how much they trust the results).
Add an option to calculate the hyper-estimate in a Bayesian way as opposed to frequentist models
Right now, the estimates are calculated in the most common frequentist way, based on the inverse-variance method. However, we have already started working on offering a Bayesian estimation framework as an alternative.
Enable users to input a prior for the Bayesian estimation
In conjunction with the Bayesian estimation, we want to enable users to input their own priors. This means one could choose a mean and a confidence interval for one’s prior belief, which is then used as the prior for the Bayesian effect estimation.
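
As a rough illustration of the last two features, the sketch below turns a user-supplied prior (a mean plus a 95% interval) into a normal distribution and combines it with a pooled estimate via a normal-normal conjugate update. This is one simple way of doing it, chosen by me for illustration, and not necessarily the Bayesian framework we will end up using; the function names and numbers are hypothetical.

```python
import math


def prior_sd_from_interval(lower, upper, z=1.96):
    """Standard deviation of a normal prior whose central 95% interval
    runs from `lower` to `upper`."""
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    return (upper - lower) / (2 * z)


def normal_posterior(prior_mean, prior_sd, pooled_effect, pooled_se):
    """Normal-normal conjugate update.

    Precisions (1 / variance) add up, and the posterior mean is the
    precision-weighted average of the prior mean and the pooled estimate.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / pooled_se ** 2
    posterior_variance = 1.0 / (prior_precision + data_precision)
    posterior_mean = posterior_variance * (
        prior_precision * prior_mean + data_precision * pooled_effect
    )
    return posterior_mean, math.sqrt(posterior_variance)


# Hypothetical user input: "I expect no effect, give or take about 2."
prior_sd = prior_sd_from_interval(-1.96, 1.96)
mean, sd = normal_posterior(
    prior_mean=0.0, prior_sd=prior_sd, pooled_effect=-2.0, pooled_se=1.0
)
print(f"posterior mean {mean:.2f}, posterior sd {sd:.2f}")
```

The more confident the prior (narrower interval), the more it pulls the result away from the pooled estimate; a very wide prior leaves the frequentist estimate essentially unchanged.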

^[1] As opposed to irrelevant problems (e.g. What T-shirt should I wear today?) or intractable ones (e.g. What things will happen on the 1st of May 2024?). Another exception is problems that have come up only very recently (e.g. How will GPT4 affect the labour market?).

Could interactive meta-studies of scientific research improve decision making?