What if you got a range of different EA orgs that promote the same behaviour (e.g., donation) to implement a series of interventions (e.g., a change in choice architecture) across their websites/campaigns (maybe on just a portion of traffic), and then used A/B testing to see what worked?
This is very much what we are focused on doing, and hope to do soon. Donations and pledges are the obvious common denominator across many of our partner orgs. Being able to test a comparable change (in choice architecture, messaging/framing, etc.) will be particularly helpful in allowing us to measure the robustness of any effect across contexts and platforms.
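For concreteness, a minimal sketch of what a single such comparison boils down to: a two-proportion z-test on donation conversion between a control page and a variant. All counts and rates below are hypothetical, purely for illustration.

```python
import numpy as np
from scipy.stats import norm

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates
    between variants A and B, using a pooled standard error."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return z, 2 * norm.sf(abs(z))  # z statistic, two-sided p-value

# Hypothetical: 1.0% vs 1.3% donation rate on 20,000 visitors per arm
z, p = two_proportion_ztest(200, 20_000, 260, 20_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Running the same pre-registered comparison across several partner sites, rather than pooling everything into one test, is what lets us ask whether an effect replicates across contexts rather than just whether it exists somewhere.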
There is some tradeoff between ‘run the most statistically powerful test of a single pairing, A vs. B’ and ‘run tests across a set of messages A through Z using an algorithm (e.g., Kasy’s ‘exploration sampling’) designed to yield the highest-value state of learning’ (cf. reinforcement learning). We are planning to do some of each, compromising between these approaches.
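As a rough illustration of the adaptive end of that tradeoff, here is a sketch of exploration sampling in the style of Kasy and Sautmann: each arm is assigned in proportion to p(1 − p), where p is the posterior probability that the arm is best, which damps the tendency of plain Thompson sampling to pile traffic onto the current leader. The conversion rates, wave sizes, and priors are assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def thompson_probs(successes, failures, n_draws=10_000):
    """Monte Carlo estimate of p_k = Pr(arm k has the highest conversion
    rate), under independent Beta(1+s, 1+f) posteriors per arm."""
    draws = rng.beta(1 + successes[:, None], 1 + failures[:, None],
                     size=(len(successes), n_draws))
    best = np.argmax(draws, axis=0)
    return np.bincount(best, minlength=len(successes)) / n_draws

def exploration_shares(p):
    """Exploration-sampling assignment shares: q_k proportional to
    p_k * (1 - p_k), normalized to sum to one."""
    q = p * (1 - p)
    return q / q.sum() if q.sum() > 0 else np.full_like(p, 1 / len(p))

# Simulate an adaptive experiment over 4 candidate messages
true_rates = np.array([0.010, 0.012, 0.015, 0.011])  # hypothetical
K = len(true_rates)
successes, failures = np.zeros(K), np.zeros(K)

for wave in range(10):  # 10 waves of 1,000 visitors each
    shares = exploration_shares(thompson_probs(successes, failures))
    n_assigned = rng.multinomial(1_000, shares)
    for k, n in enumerate(n_assigned):
        conversions = rng.binomial(n, true_rates[k])
        successes[k] += conversions
        failures[k] += n - conversions

p_final = thompson_probs(successes, failures)
print("posterior Pr(best):", np.round(p_final, 3))
print("implied best message:", int(np.argmax(p_final)))
```

The cost of this design is that no single pairwise comparison gets the sample it would under a fixed 50/50 split; the benefit is that traffic concentrates on the most informative arms as evidence accumulates, which is why we plan to do some of each.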