Abstract
Forecasting the short-term spread of an ongoing disease outbreak poses a challenge owing to the complexity of contributing factors, some of which can be characterized through interlinked, multi-modality variables, and the intersection of public policy and human behavior. Here we introduce PandemicLLM, a framework with multi-modal large language models (LLMs) that reformulates real-time forecasting of disease spread as a text-reasoning problem, with the ability to incorporate real-time, complex, non-numerical information. This approach, through an artificial intelligence–human cooperative prompt design and time-series representation learning, encodes multi-modal data for LLMs. The model is applied to the COVID-19 pandemic, and trained to utilize textual public health policies, genomic surveillance, spatial and epidemiological time-series data, and is tested across all 50 states of the United States for a duration of 19 months. PandemicLLM opens avenues for incorporating various pandemic-related data in heterogeneous formats and shows performance benefits over existing models.
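To make the idea of treating forecasting as a text-reasoning problem more concrete, the following is a minimal sketch of how heterogeneous inputs for one state and week might be assembled into a single prompt. It is not the authors' prompt design: the `WeeklySnapshot` fields, the wording of the prompt and the helper names are all hypothetical, and the numeric series is simply verbalized here, whereas the abstract describes time-series representation learning.

```python
from dataclasses import dataclass


@dataclass
class WeeklySnapshot:
    """Hypothetical container for one state's inputs in a given week."""
    state: str
    policy_note: str                    # free-text public health policy summary
    variant_note: str                   # free-text genomic surveillance summary
    hospitalization_rates: list[float]  # weekly rates per 100,000, oldest first


def describe_trend(rates: list[float]) -> str:
    """Turn a numeric series into a coarse verbal description.

    This is a deliberate simplification for illustration only; it compares
    the newest value with the oldest one and ignores everything in between.
    """
    if len(rates) < 2:
        return "has too little history to describe"
    change = rates[-1] - rates[0]
    if change > 0:
        return "has been rising"
    if change < 0:
        return "has been falling"
    return "has been flat"


def build_prompt(snapshot: WeeklySnapshot) -> str:
    """Combine textual and numeric inputs into one plain-text prompt."""
    trend = describe_trend(snapshot.hospitalization_rates)
    lines = [
        f"State: {snapshot.state}",
        f"Policy: {snapshot.policy_note}",
        f"Variants: {snapshot.variant_note}",
        f"Hospitalization rate per 100,000 {trend}.",
        "Question: how will hospitalizations change next week?",
    ]
    return "\n".join(lines)
```

The point of the sketch is only that policies, variant reports and epidemiological series can share one textual interface; how each modality is actually encoded is described in the full article.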
Data availability
All data utilized in this study derive from publicly accessible sources. Hospitalization data were collected from the US Department of Health and Human Services (ref. 33) and reported case data were sourced from the Johns Hopkins University CSSE COVID-19 Dashboard (ref. 34). Vaccination data were obtained from the US CDC Vaccine Tracker (ref. 36). For spatial data, demographic information was sourced from the US Census Bureau (ref. 37), healthcare system data from the Commonwealth Fund (ref. 39), and presidential election results from the Federal Elections Commission (ref. 41). Public health policy data were acquired from the Oxford COVID-19 Government Response Tracker (ref. 43). Official reports on variants were collected from the World Health Organization (WHO) and the European Centre for Disease Prevention and Control (ECDC), with estimated variant proportions sourced from the CDC (ref. 30). The data are available at an archived repository (ref. 52) and at https://github.com/miemieyanga/PandemicLLM. Source data for Figs. 3–5 and Extended Data Fig. 1 are provided with this paper.
Code availability
All code is written in Python 3.11.5 and is publicly accessible at an archived repository (ref. 52) and at https://github.com/miemieyanga/PandemicLLM.
References
Giordano, G. et al. Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy. Nat. Med. 26, 855–860 (2020).
Li, X. et al. Wastewater-based epidemiology predicts COVID-19-induced weekly new hospital admissions in over 150 USA counties. Nat. Commun. 14, 4548 (2023).
Du, H. et al. Incorporating variant frequencies data into short-term forecasting for COVID-19 cases and deaths in the USA: a deep learning approach. EBioMedicine 89, 104482 (2023).
Reich, N. G. et al. Collaborative hubs: making the most of predictive epidemic modeling. Am. J. Public Health 112, 839–842 (2022).
Rosenfeld, R. & Tibshirani, R. J. Epidemic tracking and forecasting: lessons learned from a tumultuous year. Proc. Natl Acad. Sci. USA 118, e2111456118 (2021).
Hsiang, S. et al. The effect of large-scale anti-contagion policies on the COVID-19 pandemic. Nature 584, 262–267 (2020).
Li, J., Lai, S., Gao, G. F. & Shi, W. The emergence, genomic diversity and global spread of SARS-CoV-2. Nature 600, 408–418 (2021).
Nixon, K. et al. An evaluation of prospective COVID-19 modelling studies in the USA: from data to science translation. Lancet Digit. Health 4, e738–e747 (2022).
Castro, M., Ares, S., Cuesta, J. A. & Manrubia, S. The turning point and end of an expanding epidemic cannot be precisely forecast. Proc. Natl Acad. Sci. USA 117, 26190–26196 (2020).
Ioannidis, J. P., Cripps, S. & Tanner, M. A. Forecasting for COVID-19 has failed. Int. J. Forecast. 38, 423–438 (2022).
Telenti, A. et al. After the pandemic: perspectives on the future trajectory of COVID-19. Nature 596, 495–504 (2021).
Nepomuceno, M. R. et al. Besides population age structure, health and other demographic factors can contribute to understanding the COVID-19 burden. Proc. Natl Acad. Sci. USA 117, 13881–13883 (2020).
Ruggeri, K. et al. A synthesis of evidence for policy from behavioural science during COVID-19. Nature 625, 134–147 (2023).
Searls, D. B. The language of genes. Nature 420, 211–217 (2002).
Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).
Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature 619, 357–362 (2023).
Thirunavukarasu, A. J. et al. Large language models in medicine. Nat. Med. 29, 1930–1940 (2023).
Bzdok, D. et al. Data science opportunities of large language models for neuroscience and biomedicine. Neuron 112, 698–717 (2024).
Williams, R., Hosseinichimeh, N., Majumdar, A. & Ghaffarzadegan, N. Epidemic modeling with generative agents. Preprint at https://arxiv.org/abs/2307.04986 (2023).
Gruver, N., Finzi, M., Qiu, S. & Wilson, A. G. Large language models are zero-shot time series forecasters. Adv. Neural Inf. Process. Syst. 36, 19622–19635 (2023).
Covid Data Tracker. CDC https://covid.cdc.gov/covid-data-tracker/#maps_new-admissions-percent-change-state (2024).
Nixon, K. et al. Real-time COVID-19 forecasting: challenges and opportunities of model performance and translation. Lancet Digit. Health 4, e699–e701 (2022).
Touvron, H. et al. LLaMA 2: open foundation and fine-tuned chat models. Preprint at https://arxiv.org/abs/2307.09288 (2023).
Rufibach, K. Use of Brier score to assess binary predictions. J. Clin. Epidemiol. 63, 938–939 (2010).
Leung, K., Wu, J. T. & Leung, G. M. Real-time tracking and prediction of COVID-19 infection using digital proxies of population mobility and mixing. Nat. Commun. 12, 1501 (2021).
Cramer, E. Y. et al. Evaluation of individual and ensemble probabilistic forecasts of COVID-19 mortality in the United States. Proc. Natl Acad. Sci. USA 119, e2113561119 (2022).
Lopez, V. K. et al. Challenges of COVID-19 case forecasting in the US, 2020–2021. PLoS Comput. Biol. 20, e1011200 (2024).
Kalia, K., Saberwal, G. & Sharma, G. The lag in SARS-CoV-2 genome submissions to GISAID. Nat. Biotechnol. 39, 1058–1060 (2021).
TAG-VE statement on Omicron sublineages BQ.1 and XBB. WHO https://www.who.int/news/item/27-10-2022-tag-ve-statement-on-omicron-sublineages-bq.1-and-xbb (2024).
Ma, K. C. Genomic surveillance for SARS-CoV-2 variants: circulation of Omicron lineages—United States, January 2022–May 2023. MMWR Morb. Mortal. Wkly Rep. 72, 651–656 (2023).
Bertozzi, A. L., Franco, E., Mohler, G., Short, M. B. & Sledge, D. The challenges of modeling and forecasting the spread of COVID-19. Proc. Natl Acad. Sci. USA 117, 16732–16738 (2020).
Dong, E. et al. The Johns Hopkins University Center for Systems Science and Engineering COVID-19 dashboard: data collection process, challenges faced, and lessons learned. Lancet Infect. Dis. 22, e370–e376 (2022).
COVID-19 reported patient impact and hospital capacity by facility. Department of Health and Human Services https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/anag-cw7u/about_data (2024).
Dong, E., Du, H. & Gardner, L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect. Dis. 20, 533–534 (2020).
Du, H., Saiyed, S. & Gardner, L. M. Association between vaccination rates and COVID-19 health outcomes in the United States: a population-level statistical analysis. BMC Public Health 24, 220 (2024).
COVID Data Tracker. CDC https://covid.cdc.gov/covid-data-tracker/#vaccine-delivery-coverage (2024).
State population totals and components of change: 2020–2023. US Census Bureau https://www.census.gov/data/tables/time-series/demo/popest/2020s-state-total.html#v2022 (2024).
Dowd, J. B. et al. Demographic science aids in understanding the spread and fatality rates of COVID-19. Proc. Natl Acad. Sci. USA 117, 9696–9698 (2020).
Radley, D., Collins, S. & Hayes, S. The Commonwealth Fund 2019 Scorecard on State Health System Performance (The Commonwealth Fund, 2022).
Bollyky, T. J. et al. Assessing COVID-19 pandemic policies and behaviours and their economic and educational trade-offs across US states from Jan 1, 2020, to July 31, 2022: an observational analysis. Lancet 401, 1341–1360 (2023).
Federal Elections 2020. FEC https://www.fec.gov/introduction-campaign-finance/election-results-and-voting-information/federal-elections-2020/ (2024).
Haug, N. et al. Ranking the effectiveness of worldwide COVID-19 government interventions. Nat. Hum. Behav. 4, 1303–1312 (2020).
Hale, T. et al. A global panel database of pandemic policies (Oxford COVID-19 Government Response Tracker). Nat. Hum. Behav. 5, 529–538 (2021).
Stockdale, J. E., Liu, P. & Colijn, C. The potential of genomics for infectious disease forecasting. Nat. Microbiol. 7, 1736–1743 (2022).
Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 5998–6008 (2017).
OpenAI et al. GPT-4 Technical Report. Preprint at https://arxiv.org/abs/2303.08774 (2024).
Fu, Y., Peng, H., Sabharwal, A., Clark, P. & Khot, T. Complexity-based prompting for multi-step reasoning. In The Eleventh International Conference on Learning Representations (2023).
Liang, J. et al. Code as policies: language model programs for embodied control. In 2023 IEEE International Conference on Robotics and Automation (ICRA) 9493–9500 (IEEE, 2023).
Touvron, H. et al. LLaMA: open and efficient foundation language models. Preprint at https://arxiv.org/abs/2302.13971 (2023).
Gage, P. A new algorithm for data compression. C Users J. 12, 23–38 (1994).
Liu, H., Li, C., Wu, Q. & Lee, Y. J. Visual instruction tuning. Adv. Neural Inf. Process. Syst. 36, 34892–34916 (2023).
Du, H. & Zhao, Y. Advancing real-time infectious disease forecasting using large language models. Zenodo https://doi.org/10.5281/zenodo.14788491 (2025).
Hadfield, J. et al. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics 34, 4121–4123 (2018).
Acknowledgements
This work was supported by NSF Award ID 2229996 (L.M.G.), cooperative agreement CDC-RFA-FT-23-0069 (L.M.G.), NOA: 6 NU38FT000012-01, from the CDC Center for Forecasting and Outbreak Analytics (L.M.G.), Merck KGaA Future Insight Prize (L.M.G.), NSF Award ID 2112562 (Y.C.) and ARO W911NF-23-2-0224 (Y.C.). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the funding agencies.
Author information
Contributions
H.D., Y.Z., J.Z. and H.F.Y. conceptualized and designed the study. H.D. and S.X. collected the data. H.D. processed the data and designed prompts. H.D., J.Z. and Y.Z. performed the experiments. S.X. ran the baseline models. H.D., J.Z., Y.Z., S.X. and H.F.Y. prepared the figures. H.D., J.Z., Y.Z. and H.F.Y. analyzed the results. H.D., J.Z., Y.Z. and H.F.Y. wrote the initial draft. L.M.G., Y.C., X.L. and H.F.Y. provided guidance and feedback for the study. L.M.G., X.L. and H.F.Y. revised the paper. L.M.G. and Y.C. acquired the funding. Y.C. provided computational resources. All authors prepared the final version of the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Computational Science thanks Rafael de Andrade Moral and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available. Primary Handling Editor: Ananya Rastogi, in collaboration with the Nature Computational Science team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Overview of 1- and 3-week Targets Design and Experimental Setups for the PandemicLLMs.
The black line shows the average COVID-19 hospitalization rate per 100,000 people in the U.S. over time. Each dot represents the hospitalization rate for a specific state in a particular week, and its color corresponds to the state’s resulting HTC category for that week. Three PandemicLLMs were developed, trained and validated on data ending in June 2021 (PandemicLLM-1), December 2021 (PandemicLLM-2) and September 2022 (PandemicLLM-3), respectively. Each model was then evaluated, without retraining, on data spanning from its end date to February 2023. Panel (a) visualizes 1-week targets, while panel (b) shows 3-week targets. Dates are in the format year-month-day.
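The experimental setup in this caption amounts to a time-based split with three training cutoffs and a shared evaluation end. A minimal sketch follows, under stated assumptions: the caption gives only months, so the exact cutoff days (taken here as the last day of each month) and the evaluation end day are assumptions, and `split_by_cutoff` and the record layout are hypothetical rather than taken from the authors' repository.

```python
from datetime import date

# Training and validation cutoffs named in the caption. The day within each
# month is an assumption (last day of the month), as is the evaluation end day.
CUTOFFS = {
    "PandemicLLM-1": date(2021, 6, 30),
    "PandemicLLM-2": date(2021, 12, 31),
    "PandemicLLM-3": date(2022, 9, 30),
}
EVALUATION_END = date(2023, 2, 28)


def split_by_cutoff(records, cutoff, evaluation_end=EVALUATION_END):
    """Split (week_date, payload) records into fitting and held-out sets.

    Records dated on or before `cutoff` are used for training and validation.
    Later records, up to and including `evaluation_end`, are held out for
    evaluation, so the model is never retrained on them.
    """
    fit = [r for r in records if r[0] <= cutoff]
    held_out = [r for r in records if cutoff < r[0] <= evaluation_end]
    return fit, held_out
```

Because the held-out window always starts right after the cutoff, a model with an earlier cutoff is evaluated over a longer period than one with a later cutoff.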
Supplementary information
Supplementary Information
Supplementary Figs. 1–43, Results and Tables 1–9.
Source data
Source Data Fig. 3
Model performance across all 50 states during the evaluated periods.
Source Data Fig. 4
Confidence threshold and model accuracy, along with confusion matrices.
Source Data Fig. 5
Variant proportions over time, and model performance and confidence over time.
Source Data Extended Data Fig. 1
The hospitalization rates for each state, recorded on a weekly basis.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Du, H., Zhao, Y., Zhao, J. et al. Advancing real-time infectious disease forecasting using large language models. Nat Comput Sci 5, 467–480 (2025). https://doi.org/10.1038/s43588-025-00798-6
DOI: https://doi.org/10.1038/s43588-025-00798-6