Apr 26, 2026
Attention-decay in pandemic surveillance is an emerging-disease phenomenon
Taggart Tufte
Multi-signal pandemic surveillance combines wastewater, search-query, information-seeking, and clinical signals under the assumption that adding sources improves detection. We test this assumption across the multi-year endemic transition of COVID-19 and contrast against influenza. Across attention-based signals (Google Trends, Wikipedia) we find 5–23× variance compression after the first major COVID-19 wave but no compression across flu seasons; wastewater is the only signal type with stable variance across both diseases. The diagnosis is a novelty cycle in public attention specific to emerging pathogens. We additionally report a corpus-scale negative result on LLM-prompt surveillance using WildChat-4.8M (3.2M conversations), and a system-design proposal for privacy-preserving aggregate releases.
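As a rough illustration of what a "variance compression" figure like the 5–23× above could mean, the sketch below divides a signal's variance before a chosen split point by its variance afterwards. This is only one plausible reading and not necessarily the report's method, which may normalise against incidence or use different windows. The function name `variance_compression`, the `split_index` parameter, and the weekly values are hypothetical and do not come from the report.

```python
from statistics import pvariance


def variance_compression(signal, split_index):
    """Return var(signal[:split_index]) / var(signal[split_index:]).

    A value above 1 means the later window varies less than the earlier one.
    Hypothetical helper for illustration only; not the report's estimator.
    """
    early = signal[:split_index]
    late = signal[split_index:]
    if len(early) < 2 or len(late) < 2:
        raise ValueError("each window needs at least two observations")
    late_var = pvariance(late)
    if late_var == 0:
        raise ValueError("later window has zero variance; ratio is undefined")
    return pvariance(early) / late_var


# Made-up weekly attention index: a large first wave followed by a flat tail.
attention = [10, 80, 100, 60, 20, 12, 18, 22, 15, 13]
ratio = variance_compression(attention, split_index=5)
print(f"variance compression: {ratio:.1f}x")
```

Under this reading, a signal whose variance is stable across periods, as the abstract reports for wastewater, would give a ratio close to 1.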
This was an exceptionally well-executed and well-presented project, with very clear explanations and compelling data visualizations. It was also an impressive amount of work to have completed in the timeframe.
On style and presentation, my only gripe is that the write-up began in a formal, academic register and then shifted to a more 'bloggy' tone, though the clear explanations and signposting made up for that.
Other points for improvement were 1) relevance to the problem set by the Track and 2) methodology and results that felt somewhat overcomplicated or contrived. On point 1), the work mainly considers how signals change during an outbreak that became much less catastrophic over time, rather than contributing to a true early-warning system or initial detection. This is useful from a surveillance perspective, but considerably less useful at the margin than work addressing pre-outbreak warning systems would be. It would have been interesting to test whether any of the input signals reflected the changing virology and epidemiology of the COVID-19 variants (e.g. changes in symptoms, lethality, or transmissibility, where good data exist for a given variant).
On point 2), while impressive, the project often seemed to describe fairly 'common sense' findings, e.g. people became 'COVID-fatigued' and stopped searching for things on Google. There was a lot of fairly advanced mathematics and statistics in this study that was outside my expertise, so I would have liked to see more application of Occam's razor: some simpler descriptive statistics might have told a similar story and been more accessible to a wider audience.
The report finds that online user signals of attention to a particular pathogen do not reflect ground-truth levels of incidence in the long run. It notes that the historical decay of public attention to COVID-19 relative to its incidence appears to be a phenomenon stemming from its prior novelty as an emerging pandemic. It highlights that wastewater surveillance remains a reliable signal of a pathogen's incidence, unlike online attention signals, which may not always track incidence.
The code repository link in the PDF submission did not work, at least on the submission review platform, and the repository could not be found on the author's personal GitHub page. The report appears to have been generated at least in large part with LLM assistance, though it did not include a note on how AI technologies were used. The implications of the findings are not highlighted enough; with a document of this length, it is especially important to make the takeaways clear.
Cite this work
@misc {
  title={(HckPrj) Attention-decay in pandemic surveillance is an emerging-disease phenomenon},
  author={Taggart Tufte},
  date={4/26/26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


