We are planning to do a survey of a representative selection of students at NTNU, our university in Trondheim, Norway. There are about 23 000 students across a few campuses. We want to measure the students’:
… basic knowledge of global development, aid and health (like Hans Rosling’s usual questions)
… current willingness and habits of giving (How much? To what? Why?)
… estimates of what they will give in the future, that is after graduating
And of course background information.
We think we may use this survey for multiple ends. Our initial motivation was to find a way to measure our own impact at the university, and we still think it could let us track that impact over time.
Another use of the results would be the media opportunity when we present disaggregated results, e.g. how altruistic the engineering students are compared to the medical students. We think the student press would love this kind of result and give us a lot of free media coverage.
Lastly, we think these results could be interesting to other institutions in Norway, primarily in the aid sector. Our university is the largest technological university in Norway, with many of the most attractive fields of study, so many businesses and institutions take an interest in its students.
If this is a success, we want to expand to the universities in Oslo and Bergen. This would also give us a better control group, more solid results, perhaps national media coverage, and a better chance of reaching people.
I would love to get some answers to the following questions:
Do you have any experience from similar projects? Are there any specific questions or other topics we should consider including in the survey? Maybe you have other ideas of how we could leverage the results?
Per Bernadette, getting good data from these sorts of projects requires significant expertise (if your university is as bad as mine, you can get student media attention for attention-grabbing but methodologically suspect survey data, but I doubt you would get much more). I’m reluctant to offer advice beyond ‘find an expert’, but I will add a collection of problems that amateur-run surveys fall into, both as pitfalls to avoid and as further evidence that expertise is imperative.
1: Plan more, trial less
A lot of emphasis in EA is on trialling things instead of spending a lot of time planning them: lean startups, no plan survives first contact, VoI, etc. But lean trial design hasn’t taken off in the way lean start-ups have. Your data can be poisoned to the point of being useless in innumerable ways, and (usually) little can be done about this post hoc: many problems revealed in analysis could only have been fixed in the original design.
1a: Especially plan analysis
Gathering data and then analysing it is always suspect: one can wonder whether the investigators have massaged the analysis to fit their own preconceptions or prejudices. The usual means of avoiding this is to specify in advance the analysis you will perform: the analysis might be ill-conceived, but at least it won’t be data-dredging. It is hard to anticipate what sort of hypotheses the data will inspire you to inspect, so seek expert help.
2: Care about sampling
With ‘true’ random sampling, the random error in your estimates falls (roughly as 1/√n) as your sample size increases. The problem with bias/directional error is that its magnitude doesn’t change with your sample size.
Perfect probabilistic sampling is probably a Platonic ideal: especially with voluntary surveys, the factors that make someone take the survey will probably shift the sample away from the population of interest along axes that aren’t perfectly orthogonal to your responses. It remains an ideal worth striving for: significant sampling bias makes your results all-but-uninterpretable (modulo very advanced ML techniques, and not always even then). It is worth thinking long and hard about the population you are actually interested in, the sampling frame you will use to try and capture them, etc.
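To make the contrast concrete, here is a toy simulation sketch (the donation figures and the self-selection mechanism are entirely invented for illustration): with a simple random sample the estimated mean homes in on the true population mean as n grows, whereas a sample that over-recruits keen givers stays off-target however large it gets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: yearly donations (NOK) for ~23,000 students.
population = rng.gamma(shape=2.0, scale=500.0, size=23_000)
true_mean = population.mean()

# Crude self-selection model: bigger givers are more likely to respond.
weights = population / population.sum()

for n in (100, 1_000, 10_000):
    random_sample = rng.choice(population, size=n, replace=False)
    biased_sample = rng.choice(population, size=n, replace=False, p=weights)
    print(f"n={n:>6}  true={true_mean:7.1f}  "
          f"random={random_sample.mean():7.1f}  "
          f"self-selected={biased_sample.mean():7.1f}")
```

The random-sample column wobbles less and less as n increases; the self-selected column never converges on the truth, which is the bias problem in miniature.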
3: Questions can be surprisingly hard to ask right
Even with a perfect sample, respondents still might not provide good data, depending on the questions you ask. There are a few subtle pitfalls besides the more obvious ones of forgetting to include the questions you wanted to ask or lapses of wording: allowing people to select multiple options for an item and then wondering how to aggregate it, having a ‘choose one’ item with too many options for the average person to read, or subdividing it inappropriately (“Is your favourite food Spaghetti, Tortellini, Tagliatelle, Fusilli, or Pizza?”).
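To illustrate the multi-select pitfall, here is a minimal pandas sketch (the column name and cause categories are invented): ‘select all that apply’ answers usually arrive as one delimited string per respondent and have to be exploded into indicator columns before you can count or compare anything.

```python
import pandas as pd

# Hypothetical raw export: one comma-separated string per respondent.
df = pd.DataFrame({
    "gives_to": ["global health, animals",
                 "global health",
                 "climate, animals",
                 "climate"],
})

# One 0/1 indicator column per cause, then count respondents per option.
indicators = df["gives_to"].str.get_dummies(sep=", ")
print(indicators.sum())
```

Deciding up front how you will summarise such items (counts per option, most-selected option, and so on) saves a lot of head-scratching after the data come in.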
Again, people who make a living designing surveys do things to limit these problems: item pools, pilots where they trial different questions and see which yield the most data, and so on.
3a: Too many columns in the database
There’s a tendency towards a ‘kitchen sink’ approach to asking questions: if in doubt, add it in, as it can only give you more data, right? The problem is that false positives become increasingly hard to avoid if you just fish for interesting correlations, as the number of possible comparisons grows rapidly with the number of items. There are ways of overcoming this (dimension reduction, family-wise or false-discovery-rate error control), but they aren’t straightforward.
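As a rough illustration of how quickly this bites, here is a small simulation sketch (pure noise, nothing is really related): with 20 unrelated survey columns there are already 190 pairwise correlations, so testing each at p < 0.05 will typically flag several ‘discoveries’ by chance alone; a Bonferroni correction is one blunt but standard safeguard.

```python
import numpy as np
from itertools import combinations
from scipy.stats import pearsonr

rng = np.random.default_rng(1)

n_respondents, n_items = 500, 20
data = rng.normal(size=(n_respondents, n_items))  # pure noise: no real relationships

# p-value for every pairwise correlation (20 choose 2 = 190 tests).
pvals = [pearsonr(data[:, i], data[:, j])[1]
         for i, j in combinations(range(n_items), 2)]

naive_hits = sum(p < 0.05 for p in pvals)            # expect roughly 190 * 0.05 ≈ 9-10 by chance
bonferroni_hits = sum(p < 0.05 / len(pvals) for p in pvals)
print(f"{len(pvals)} tests: {naive_hits} 'significant' uncorrected, "
      f"{bonferroni_hits} after Bonferroni")
```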
There are probably many more I’ve forgotten. But tl;dr: it is tricky to do this right!
I think a key challenge with this is how you intend to select your sample so that it is truly representative. Recruiting interested students will select for a certain type of participant; so will offering payment. Could you get the University on board with distributing your survey via email to random student numbers, for example? Your results will only be powerful (and useful) if you can ensure random selection of participants.
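If the university will cooperate, the selection step itself is straightforward. A minimal sketch, assuming you can get a list of student numbers to use as a sampling frame (the identifier format here is made up):

```python
import random

# Hypothetical sampling frame: the registrar's list of student numbers.
student_numbers = [f"NTNU{n:06d}" for n in range(1, 23_001)]

random.seed(2015)                                   # fixed seed so the draw is reproducible
invitees = random.sample(student_numbers, k=2_000)  # simple random sample of 2,000 students
```

The hard parts are getting access to the frame and chasing non-responders, not the draw itself.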
Agreed! As well as a careful sampling plan, another thing to think about in advance is how your questions will be tested (to make sure they ask about the information you think they ask about). To be rigorous, you should also have a pre-specified analysis plan, which includes what comparisons you are going to make, what tests are appropriate for the data set, and how big your sample needs to be to detect the difference you are interested in.
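On the sample-size point, a back-of-the-envelope power calculation is worth doing before anything goes out. Here is a sketch using statsmodels, where the effect size and power targets are placeholders rather than recommendations:

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical question: how many respondents per group (say, engineering vs
# medical students) to detect a small-to-medium difference in reported giving?
n_per_group = TTestIndPower().solve_power(
    effect_size=0.3,  # standardised mean difference (Cohen's d) -- an assumed value
    alpha=0.05,       # two-sided significance level
    power=0.8,        # chance of detecting the effect if it really exists
)
print(f"About {n_per_group:.0f} respondents per group")  # roughly 175
```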
The planning and design of surveys is a whole area of study. I would suggest finding someone knowledgeable in it to help. It’s possible somebody studying a relevant subject might be able to get involved and help you as part of their coursework. (At Oxford University these things are taught in Health Sciences, and there are study design modules that have projects doing just that)
Sounds like an interesting project Jorgen! It sounds like you already have a good plan, so my main survey-running tip would be to keep it short, and break it out into multiple pages if it reaches sufficient length so people don’t have to complete the whole thing. We used LimeSurvey for the EA survey, which is a pretty nice piece of software—I’d be happy to answer any questions on that if you want to message me.