I like to ask questions and talk about ideas. www.tinarwhite.com
tina
We have a website now too. Please check out our collaborate page if you’d like to help: https://www.covid19risk.com/
CoEpi is a great team too! We’re helping each other out now with a lot of cross-communication between our slack channels.
Wow, what a great dataset! If you have some colleagues who might be interested, please link them here to the forum. I’ve also made a couple of public Facebook posts about it, looking for collaborators in COVID-19 discussion groups and on my personal page here and here, with more information about who can help.
This app idea is focused on privacy, but not for the reasons people may assume. It isn’t just an abstract ideal. It’s privacy-focused because it would need to be. The app would rely on people voluntarily sharing data. In many countries, people do not feel safe sharing certain kinds of information because of their government. And if it’s not safe to share your information, people won’t. Making this as safe as possible would be crucial, and that’s what the privacy focus is about.
3) You mention that Google traffic data is still useful, even when few people use it. I am not familiar with that part of the app, but if it involves some form of prediction, it is important to note that Google has had years to get this right. With a pandemic, you have at best months(!), and on top of that the situation changes constantly.
Yes. This is another reason why working with someone like Google Maps or some other mapping app could be crucial, because they have accumulated domain / tribal knowledge that no one else might have.
[Edit: I’ve received some private feedback that neither of us might be right here. The calculation for both (density of traffic and density of “infectiousness”) may be quite straightforward. The reason traffic updates got so much better might just have been more data.]
2) Regarding the computation of the risk score: If you only use confirmed cases with voluntary sign up, you might not get enough data; if you use suspected cases by symptoms, you will get a lot of false positives due to worried people with the flu. In the absence of data on how to properly account for that, this is a very difficult problem.
These are significant challenges and I talk a bit about how they can be addressed in The Incentives Align and at the end of the section Example App Questionnaire. I imagine there would always be more confidence put in a confirmed case with a code than someone who just answers yes to having cold or flu symptoms recently.
Also, for confirmed cases, as part of contact tracing, the CDC sometimes identifies a site of concern where a patient might recall that something particularly infectious happened before they were aware they were sick. For example: “Oh. I remember that a few days ago, I sneezed quite forcefully and unexpectedly at my favorite buffet, so I couldn’t cover my nose. Oh, and then again on the way home on the BART! I’m so sorry.” Tracing multiple paths backwards, you might get a lot of data from a single event.
1) How likely are you to catch the virus at all just by being in the same area/frequenting the same shops as somebody infected? My impression from the Western cases so far was that infections generally occurred among close contacts; this risk obviously changes when more infected people are around, but it should still be estimated to decide whether such an app would be worth it.
I imagine the CDC could already have some estimates for this, which they might use in contact tracing. And it might turn out that contact tracing is enough to solve the problem. It seems to be working well right now in the U.S.
But if not, a not-so-educated guess for a general outline of the calculation of risk for the app might be: (1) Close contact. You were in the same location as a possibly infectious person, within maybe 10 minutes of them being there. This is higher risk. (2) Semi-close contact. If the virus might live on surfaces for a few hours, the close contact risk distribution tapers off over a few hours. (3) Infectiousness. This isn’t binary, so the infectious person’s distribution also peaks sometime around their first symptoms and tapers off over a few days until the tail reaches some maximum.
A heat map that changes with time could reflect this information, and any individual’s risk could be calculated by integrating over it. If a location is suddenly found to have had multiple infectious people in it in the last few weeks, and that level of precision were possible, you could run multiple iterations of this calculation to account for each user’s (small) risk of having contracted the virus from their own interactions, given an estimate of the time between when a person contracts the virus and when they become infectious. This is harder, but if everything else works, it’s possible.
Writing this out in words is long, but the actual summation/integral is not too complicated. It’s a combination of science and hack-y guesses. But this seems true to me of almost all engineering.
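To make the outline above a bit more concrete, here is a minimal Python sketch of the kind of summation I have in mind. Everything in it is an assumption for illustration: the constants, the exponential and Gaussian curve shapes, the `Encounter` fields, and the function names are hypothetical guesses, not epidemiological estimates or anything the app actually implements. It also replaces the continuous integral over a heat map with a simple sum over discrete encounters.

```python
import math
from dataclasses import dataclass

# All constants are illustrative guesses, not epidemiological estimates.
CLOSE_CONTACT_WINDOW_H = 10 / 60  # "within maybe 10 minutes", in hours
SURFACE_DECAY_H = 3.0             # hypothetical decay scale for surface risk, in hours
INFECTIOUS_SPREAD_DAYS = 3.0      # hypothetical width of the infectiousness curve
MAX_INFECTIOUS_DAYS = 14.0        # hypothetical cutoff where the tail ends
BASE_TRANSMISSION = 0.05          # hypothetical risk for peak close contact


@dataclass
class Encounter:
    hours_after: float      # user's visit time minus the infectious person's visit time
    days_from_onset: float  # infectious person's visit relative to first symptoms


def contact_weight(hours_after: float) -> float:
    """1.0 for close contact, then tapering off over a few hours."""
    if abs(hours_after) <= CLOSE_CONTACT_WINDOW_H:
        return 1.0
    if hours_after < 0:
        return 0.0  # the user was there well before the infectious person
    return math.exp(-(hours_after - CLOSE_CONTACT_WINDOW_H) / SURFACE_DECAY_H)


def infectiousness(days_from_onset: float) -> float:
    """Peaks at first symptoms and tapers off over a few days (Gaussian shape assumed)."""
    if abs(days_from_onset) > MAX_INFECTIOUS_DAYS:
        return 0.0
    return math.exp(-0.5 * (days_from_onset / INFECTIOUS_SPREAD_DAYS) ** 2)


def encounter_risk(e: Encounter) -> float:
    """Risk contributed by a single path crossing."""
    return BASE_TRANSMISSION * contact_weight(e.hours_after) * infectiousness(e.days_from_onset)


def total_risk(encounters) -> float:
    """Chance of at least one transmission, treating encounters as independent."""
    p_none = 1.0
    for e in encounters:
        p_none *= 1.0 - encounter_risk(e)
    return 1.0 - p_none


if __name__ == "__main__":
    history = [
        Encounter(hours_after=0.1, days_from_onset=0.0),  # close contact at peak
        Encounter(hours_after=4.0, days_from_onset=2.0),  # same place, hours later
    ]
    print(f"estimated risk: {total_risk(history):.4f}")
```

Treating encounters as independent is itself a hack-y guess, and a real version would need curve shapes and constants fitted to whatever data becomes available.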
0) Even basic questions about the virus and how it spreads are still unanswered, like how infectious one is during the incubation period. This makes more advanced questions regarding a risk score difficult to answer.
I agree. And there are a lot of resources being put into research on this right now, so in time I hope we have better answers. But even imperfect information could be helpful. See the Q&A with Sukrit Silas. At first I imagined the app could only give a risk score that was very coarse. Just levels 1-7. I’ve commented separately, header Example Score Levels, with an example of what I mean, which I didn’t put in the post because I have no confidence in what it should be. But you might be able to show a decimal point too. I think it’s good not to start out being committed to any kind of risk scoring system until you have a sense of what’s possible.
Thank you for raising the concerns that you did. I appreciate the opportunity to explain more about how I’ve been thinking about these concerns. This is the kind of feedback I was hoping for from posting on EA Forum. I’ll try to address each one.
Example Score Levels
I didn’t put this in the original post nor have I formatted this well because I have no idea what the levels should look like and I’m not committed to them. These are just example score levels for reference:
7 - severe—you have confirmed COVID-19
6 - very high—you are in Wuhan right now, the cruise ship in Japan, or your family member or roommate has had the virus
5 - high—you are in another outbreak in China right now or live near a different community outbreak or in a quarantined rescue from one of these places
4 - medium—you have significantly crossed paths with someone who had the virus and was infectious or you live somewhat near a community outbreak
3 - low—you have crossed paths with someone who may have had the virus or you live or work in an area with more risk of this
2 - very low—you may have crossed paths with someone who may have the virus, but it’s still very unlikely you are infected
1 - negligible—there is no one near you with the virus and you are in a lowest risk area
And maybe
0 - you already had confirmed COVID-19 and recovered
I don’t know this, but I think most people in the U.S. are probably at the lowest level, 1 - negligible risk, right now. It would also be reassuring to see that reflected in a score. They’d only need to listen to general advice (wash your hands more and pay attention to the CDC’s recommendations and updates).
These are pretty coarse levels so they aren’t too hard to calculate. And if you are at levels 5-7, you probably already know. If you’re at level 4, you’ve probably already gotten a call from the CDC (or equivalent). But anyone at a lower level has no idea where they are at. Hence, the app.
I also want to point out that at high levels, things like quarantines, contact tracing and requesting that some people self-quarantine are known to be quite effective. At lower levels, we’re given guidance like washing your hands. It wouldn’t be too surprising that an app that assesses situations in between using this sort of probabilistic contact tracing and gives recommendations for things like individual self-quarantine could also be somewhat effective without being too disruptive.
“And before I start with my concerns, Western governments are already kind of doing a similar thing: They identify contacts of infected people, including people who dined at the same restaurant at a similar time, in order to test them.”
This is called contact tracing. In the Introduction, I refer to contact tracing, but I think I missed an opportunity to define what it is, so thank you for pointing this out. What I’m proposing, like you say, is similar. It’s like an automatic, probabilistic form of contact tracing aided by a lot of GPS data.
I posted on EA Forum about a month ago with an idea for a privacy-focused contact tracing app and started a volunteer team to complete it. We now have about a 20-person team working on Slack. We’ve made great progress and it’s almost done. You can see our progress and arguments for the intervention here:
https://www.covid-watch.org/articles/
And my original EA forum post is here:
https://forum.effectivealtruism.org/posts/8chk6DHZXctGHtNoz/covid-19-risk-assessment-app-idea-for-vetting-and-discussion
If you’re interested in collaborating, please see our collaborate page here:
https://www.covid-watch.org/collaborate.html
[Edit: Formerly at https://www.covid19risk.com]