Can you give an indication of how common the problem is? (ie how often do papers get lost/deleted?) My intuition says not very often, and when it does happen it’s most likely to be the least useful papers, but I could believe my intuition is wrong.
Thanks for the question! I should have provided context. Under new executive orders, entire databases of open-source public academic data are being deleted. Efforts to retain access are fairly disparate and hard to keep track of, whilst the datasets are too big for lone people to download, archive, or host.
For example, here’s a short excerpt of just some of the deletions since yesterday (started collating to keep track in the masterdoc, hoping to make a website/distribution etc):
PAPERS AND TOPICS DELETED or UNAVAILABLE: (as of 2/2/25) AND ALTERNATIVE SOURCES (IF AVAILABLE)
Broad topics:
HIV and Sexual Health
https://www.cdc.gov/hiv/testing/index.html
HIV hub https://www.cdc.gov/hiv/site.html
Contraceptive guidelines https://www.cdc.gov/contraception/hcp/contraceptive-guidance/
STI treatment guidelines https://www.cdc.gov/std/treatment-guidelines/adolescents.htm
Discussing HIV and Sexual Health resources https://www.cdc.gov/healthyyouth/healthservices/infobriefs/birth_control_information.htm
STIs in adolescents treatments https://www.cdc.gov/std/treatment-guidelines/adolescents.htm
Up from 17:46 GMT 2/2
Contraception guidance for healthcare providers https://www.cdc.gov/contraception/hcp/contraceptive-guidance/
Preventing HPV in women https://www.cdc.gov/hiv/prevention/index.html
STI statistics https://www.cdc.gov/sti-statistic
Gender and diversity
Disability inclusion
Youth and childhood
Diseases and outbreaks, global health
Health disparities in TB, HIV, STDs and hepatitis https://www.cdc.gov/health-disparities-hiv-std-tb-hepatitis/
Vaccines
Vaccine guidance https://www.cdc.gov/vaccines/hcp/acip-recs/index.html
Misc
A-Z Index of Birth Defects https://www.cdc.gov/ncbddd/sitemap.html
Intellectual Disabilities Information Hub https://www.cdc.gov/ncbddd/developmentaldisabilities/facts-about-intellectual-disability.html
Contact CDC https://www.cdc.gov/contact/wcms-auto-sitemap.xml
Cancer screening hub https://www.cdc.gov/wcms-auto-sitemap-root-cancerscreening.xm
Covid treatment sitemap
Covid vaccine information https://www.cdc.gov/covidvaccines
Archived content I could find https://archive.cdc.gov/#/results?q=covid%20vaccine&start=0&rows=10
VetoViolence site https://vetoviolence.cdc.gov/apps/maintenance/
TO SORT: Deletions and removals
INCLUSIVE PRACTICES FOR HELPING STUDENTS THRIVE
YOUTH RISK BEHAVIOUR SURVEILLANCE SYSTEM
PREVENTING CHRONIC DISEASE | SEXUAL RISK FACTORS
SEXUAL HEALTH RISK ADVISORY CONCERNS
Also down was AtlasPlus, an interactive tool that lets users analyze CDC data on HIV, STDs, TB and viral hepatitis, and the CDC’s Social Vulnerability Index, data that helps researchers and public policy leaders identify communities that are vulnerable to the effects of disasters and public health emergencies.
A page about food safety during pregnancy called “Safer Food Choices for Pregnant People” was also removed.
CDC with surveillance data on HIV, viral hepatitis, STDs and TB. Also gone missing: a page with basic information about HIV testing. The CDC’s Social Vulnerability Index, a tool that assesses community resilience in the event of natural disaster was also taken down.
For the first time in 60 years, the MMWR (Morbidity and Mortality Weekly Report) wasn’t published
Vaccine info sheets
As of Friday afternoon, several CDC pages related to HIV were down, including the CDC’s HIV index page, testing page, datasets, national surveillance reports and causes pages.
Many of the CDC’s sites related to LGBTQ youth were also removed, including pages that mentioned LGBT children’s risk of suicide, those focused on creating safe schools for LGBTQ youth and a page focused on health disparities among LGBTQ youth.
The site for the Youth Risk Behavior Surveillance System — a long-running survey that tracks health behaviors among high school students in the United States — said “The page you’re looking for was not found.”
Several webpages from the Centers for Disease Control and Prevention with references to LGBTQ+ health were no longer available. A page from the HHS Office for Civil Rights outlining the rights of LGBTQ+ people in health care settings was also gone as of Friday. The website of the National Institutes of Health’s Sexual & Gender Minority Research Office disappeared.
Checked the 8 links in the first section and they’ve all been archived on the publicly accessible Internet Archive for at least half a year. There’s also browser tooling to access those archives more quickly.
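For anyone who wants to repeat that kind of check in bulk, here is a minimal sketch using the Internet Archive's Wayback Machine availability endpoint. The function names are made up for illustration, and the response shape handled here (an `archived_snapshots` object with a `closest` entry) is what that endpoint returns as far as I know; treat it as an assumption and verify against a live response before relying on it.

```python
import json
import urllib.parse
import urllib.request

# Wayback Machine availability endpoint (returns JSON).
WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL for one page (hypothetical helper name)."""
    params = {"url": page_url}
    if timestamp:
        # YYYYMMDD: ask for the snapshot closest to this date.
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urllib.parse.urlencode(params)


def closest_snapshot(response):
    """Return (snapshot_url, timestamp) from a parsed response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def check(page_url, timestamp=None):
    """Network call: look up the closest archived snapshot for one page."""
    query = availability_url(page_url, timestamp)
    with urllib.request.urlopen(query, timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `check("https://www.cdc.gov/hiv/testing/index.html", "20250120")` would return the snapshot nearest that date, or `None` if nothing is archived. Note that a snapshot existing says nothing about whether it is the latest version of a page that gets republished regularly.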
Thank you so much for being proactive! It’s true that partial archives of some CDC datasets have been made, but the issue is that the content is usually dynamic, in the sense that guidelines get republished each week (or each day for outbreaks) and are updated continually. Furthermore, IA is working on archiving datasets, but downloading or using them only brings the static dataset without necessarily capturing the actual sitemap schema for navigation. Plus IA and EOT are great, but they are begging people to help out, to decentralise our dependence on them and to provide alternatives if they get targeted.
At the very least, the hope is that the most critical information for day-to-day functioning can be reported and provided.
For example, HIV prescribing guidelines for clinicians and NGOs valid since last week have been put onto the doc, and vaccine information sheets valid from 28 Jan 2025 have also been put onto the doc if anyone needs them.
But thank you for looking into it 💕
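To illustrate the point about republished guidelines and sitemaps, here is a rough sketch of how a volunteer might read a sitemap file (in the standard sitemaps.org XML format) and flag pages that are new or changed since the last run, so they can be re-archived. The function names are hypothetical, and it assumes the sitemap includes `lastmod` dates, which not every site provides.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text):
    """Return a list of (loc, lastmod) pairs; lastmod is None if absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for node in root.findall("sm:url", NS):
        loc = node.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = node.findtext("sm:lastmod", default=None, namespaces=NS)
        if loc:
            entries.append((loc, lastmod.strip() if lastmod else None))
    return entries


def changed_since(entries, previous):
    """List pages that are new, or whose lastmod differs from the last run.

    `previous` maps each page URL to the lastmod value recorded last time.
    """
    return [
        loc
        for loc, lastmod in entries
        if loc not in previous or previous[loc] != lastmod
    ]
```

Saving the `(loc, lastmod)` pairs after each run and passing them back in as `previous` gives a simple list of what needs re-capturing. Keeping a copy of the sitemap itself alongside the pages also preserves the navigation structure that a static dataset download loses.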