An approximate solution is to exploit your best opportunity 90% of the time, then randomly select another opportunity to explore 10% of the time.
This is the epsilon-greedy strategy with epsilon = 0.1, which is probably a good rule of thumb when one’s prior for each of the causes has a thin-tailed distribution (e.g. Gaussian). The optimal value of epsilon increases with the variance in our prior for each of the causes. So if our confidence interval for a cause’s cost-effectiveness spans more than an order of magnitude (high variance), a higher value of epsilon could be better. The point is that the rule of thumb doesn’t really apply when you think some causes are much better than others and you have plenty of uncertainty.
That said, if you have realistic priors for the effectiveness of each cause, you can calculate an optimal solution using Gittins indices.
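To make the 90/10 rule concrete, here is a minimal sketch of epsilon-greedy in Python. Everything in it is an assumption for illustration: the function name `epsilon_greedy`, the made-up "true" effectiveness values, and the choice of Gaussian noise with a fixed standard deviation. It follows the description above literally, so the exploration step picks one of the options other than the current best.

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, steps=10_000, noise_sd=1.0, seed=0):
    """Simulate epsilon-greedy over several options (hypothetical setup).

    true_means: assumed true payoff of each option (unknown to the agent).
    Returns (estimates, counts): running mean payoff and number of tries per option.
    Requires at least two options.
    """
    rng = random.Random(seed)
    n = len(true_means)
    estimates = [0.0] * n   # running average payoff observed for each option
    counts = [0] * n        # how many times each option has been tried

    for _ in range(steps):
        best = max(range(n), key=lambda i: estimates[i])
        if rng.random() < epsilon:
            # Explore: pick one of the other options uniformly at random.
            arm = rng.choice([i for i in range(n) if i != best])
        else:
            # Exploit: go with the option that currently looks best.
            arm = best

        # Observe a noisy payoff (Gaussian noise is an assumption here).
        reward = rng.gauss(true_means[arm], noise_sd)
        counts[arm] += 1
        # Incremental update of the running mean.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = epsilon_greedy([1.0, 2.0, 5.0], epsilon=0.1)
    print("estimated payoffs:", [round(e, 2) for e in estimates])
    print("times each option was chosen:", counts)
```

Raising `noise_sd` or narrowing the gaps between the values in `true_means` is a quick way to see how a fixed epsilon of 0.1 behaves when the uncertainty is larger.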
Interesting!