Advancing real-time infectious disease forecasting using large language models

A preprint version of the article is available at arXiv.

Abstract

Forecasting the short-term spread of an ongoing disease outbreak poses a challenge owing to the complexity of contributing factors, some of which can be characterized through interlinked, multi-modality variables, and the intersection of public policy and human behavior. Here we introduce PandemicLLM, a framework with multi-modal large language models (LLMs) that reformulates real-time forecasting of disease spread as a text-reasoning problem, with the ability to incorporate real-time, complex, non-numerical information. This approach, through an artificial intelligence–human cooperative prompt design and time-series representation learning, encodes multi-modal data for LLMs. The model is applied to the COVID-19 pandemic, and trained to utilize textual public health policies, genomic surveillance, spatial and epidemiological time-series data, and is tested across all 50 states of the United States for a duration of 19 months. PandemicLLM opens avenues for incorporating various pandemic-related data in heterogeneous formats and shows performance benefits over existing models.
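The idea of recasting a forecast as a text-reasoning problem can be illustrated with a minimal sketch. The schema, field names and prompt wording below are hypothetical and are not the authors' prompt design. The sketch also simplifies one point: it formats the time series as plain text, whereas the abstract describes encoding it through time-series representation learning.

```python
from dataclasses import dataclass


@dataclass
class StateWeek:
    """One state's inputs for one forecast week (hypothetical schema)."""
    state: str
    week_ending: str                    # ISO date string
    hospitalization_rates: list[float]  # recent weekly rates per 100,000
    policy_summary: str                 # free-text public health policy description
    variant_note: str                   # free-text genomic surveillance note


def build_prompt(record: StateWeek) -> str:
    """Render numeric and textual inputs as a single text prompt."""
    series = ", ".join(f"{r:.1f}" for r in record.hospitalization_rates)
    return (
        f"State: {record.state}\n"
        f"Week ending: {record.week_ending}\n"
        f"Recent weekly hospitalizations per 100,000: {series}\n"
        f"Public health policy: {record.policy_summary}\n"
        f"Genomic surveillance: {record.variant_note}\n"
        "Question: how will the hospitalization trend change next week?"
    )


# Illustrative placeholder values only, not real surveillance data.
example = StateWeek(
    state="Example State",
    week_ending="2022-01-08",
    hospitalization_rates=[1.2, 1.5, 2.0],
    policy_summary="Mask mandate in effect.",
    variant_note="A new variant is rising.",
)
print(build_prompt(example))
```

Keeping every modality in one prompt string is what would let non-numerical inputs, such as policy text, sit alongside the epidemiological series in this kind of setup.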

Fig. 1: Overview of PandemicLLM’s pandemic data streams and pipeline.
Fig. 2: Summary of the AI–human cooperative prompt design.
Fig. 3: PandemicLLM performance evaluation.
Fig. 4: Trustworthiness for PandemicLLMs.
Fig. 5: A comparative analysis with and without the real-time genomic surveillance information.

Data availability

All data used in this study derive from publicly accessible sources. Hospitalization data were collected from the US Department of Health and Human Services (ref. 33), and reported case data were sourced from the Johns Hopkins University CSSE COVID-19 Dashboard (ref. 34). Vaccination data were obtained from the US CDC Vaccine Tracker (ref. 36). For spatial data, demographic information was sourced from the US Census Bureau (ref. 37), healthcare system data from the Commonwealth Fund (ref. 39), and presidential election results from the Federal Election Commission (ref. 41). Public health policy data were acquired from the Oxford COVID-19 Government Response Tracker (ref. 43). Official reports on variants were collected from the World Health Organization (WHO) and the European Centre for Disease Prevention and Control (ECDC), with estimated variant proportions sourced from the CDC (ref. 30). The data are available at an archived repository (ref. 52) and at https://github.com/miemieyanga/PandemicLLM. Source data for Figs. 3–5 and Extended Data Fig. 1 are provided with this paper.

Code availability

All code is written in Python 3.11.5 and is publicly accessible at an archived repository (ref. 52) and at https://github.com/miemieyanga/PandemicLLM.

References

  1. Giordano, G. et al. Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy. Nat. Med. 26, 855–860 (2020).

  2. Li, X. et al. Wastewater-based epidemiology predicts COVID-19-induced weekly new hospital admissions in over 150 USA counties. Nat. Commun. 14, 4548 (2023).

  3. Du, H. et al. Incorporating variant frequencies data into short-term forecasting for COVID-19 cases and deaths in the USA: a deep learning approach. EBioMedicine 89, 104482 (2023).

  4. Reich, N. G. et al. Collaborative hubs: making the most of predictive epidemic modeling. Am. J. Public Health 112, 839–842 (2022).

  5. Rosenfeld, R. & Tibshirani, R. J. Epidemic tracking and forecasting: lessons learned from a tumultuous year. Proc. Natl Acad. Sci. USA 118, e2111456118 (2021).

  6. Hsiang, S. et al. The effect of large-scale anti-contagion policies on the COVID-19 pandemic. Nature 584, 262–267 (2020).

  7. Li, J., Lai, S., Gao, G. F. & Shi, W. The emergence, genomic diversity and global spread of SARS-CoV-2. Nature 600, 408–418 (2021).

  8. Nixon, K. et al. An evaluation of prospective COVID-19 modelling studies in the USA: from data to science translation. Lancet Digit. Health 4, e738–e747 (2022).

  9. Castro, M., Ares, S., Cuesta, J. A. & Manrubia, S. The turning point and end of an expanding epidemic cannot be precisely forecast. Proc. Natl Acad. Sci. USA 117, 26190–26196 (2020).

  10. Ioannidis, J. P., Cripps, S. & Tanner, M. A. Forecasting for COVID-19 has failed. Int. J. Forecast. 38, 423–438 (2022).

  11. Telenti, A. et al. After the pandemic: perspectives on the future trajectory of COVID-19. Nature 596, 495–504 (2021).

  12. Nepomuceno, M. R. et al. Besides population age structure, health and other demographic factors can contribute to understanding the COVID-19 burden. Proc. Natl Acad. Sci. USA 117, 13881–13883 (2020).

  13. Ruggeri, K. et al. A synthesis of evidence for policy from behavioural science during COVID-19. Nature 625, 134–147 (2023).

  14. Searls, D. B. The language of genes. Nature 420, 211–217 (2002).

  15. Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).

  16. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature 619, 357–362 (2023).

  17. Thirunavukarasu, A. J. et al. Large language models in medicine. Nat. Med. 29, 1930–1940 (2023).

  18. Bzdok, D. et al. Data science opportunities of large language models for neuroscience and biomedicine. Neuron 112, 698–717 (2024).

  19. Williams, R., Hosseinichimeh, N., Majumdar, A. & Ghaffarzadegan, N. Epidemic modeling with generative agents. Preprint at https://arxiv.org/abs/2307.04986 (2023).

  20. Gruver, N., Finzi, M., Qiu, S. & Wilson, A. G. Large language models are zero-shot time series forecasters. Adv. Neural Inf. Process. Syst. 36, 19622–19635 (2023).

  21. Covid Data Tracker. CDC https://covid.cdc.gov/covid-data-tracker/#maps_new-admissions-percent-change-state (2024).

  22. Nixon, K. et al. Real-time COVID-19 forecasting: challenges and opportunities of model performance and translation. Lancet Digit. Health 4, e699–e701 (2022).

  23. Touvron, H. et al. LLaMA 2: open foundation and fine-tuned chat models. Preprint at https://arxiv.org/abs/2307.09288 (2023).

  24. Rufibach, K. Use of Brier score to assess binary predictions. J. Clin. Epidemiol. 63, 938–939 (2010).

  25. Leung, K., Wu, J. T. & Leung, G. M. Real-time tracking and prediction of COVID-19 infection using digital proxies of population mobility and mixing. Nat. Commun. 12, 1501 (2021).

  26. Cramer, E. Y. et al. Evaluation of individual and ensemble probabilistic forecasts of COVID-19 mortality in the United States. Proc. Natl Acad. Sci. USA 119, e2113561119 (2022).

  27. Lopez, V. K. et al. Challenges of COVID-19 case forecasting in the US, 2020–2021. PLoS Comput. Biol. 20, e1011200 (2024).

  28. Kalia, K., Saberwal, G. & Sharma, G. The lag in SARS-CoV-2 genome submissions to GISAID. Nat. Biotechnol. 39, 1058–1060 (2021).

  29. TAG-VE statement on Omicron sublineages BQ.1 and XBB. WHO https://www.who.int/news/item/27-10-2022-tag-ve-statement-on-omicron-sublineages-bq.1-and-xbb (2024).

  30. Ma, K. C. Genomic surveillance for SARS-CoV-2 variants: circulation of Omicron lineages—United States, January 2022–May 2023. MMWR Morb. Mortal. Wkly Rep. 72, 651–656 (2023).

  31. Bertozzi, A. L., Franco, E., Mohler, G., Short, M. B. & Sledge, D. The challenges of modeling and forecasting the spread of COVID-19. Proc. Natl Acad. Sci. USA 117, 16732–16738 (2020).

  32. Dong, E. et al. The Johns Hopkins University Center for Systems Science and Engineering COVID-19 dashboard: data collection process, challenges faced, and lessons learned. Lancet Infect. Dis. 22, e370–e376 (2022).

  33. COVID-19 reported patient impact and hospital capacity by facility. Department of Health and Human Services https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/anag-cw7u/about_data (2024).

  34. Dong, E., Du, H. & Gardner, L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect. Dis. 20, 533–534 (2020).

  35. Du, H., Saiyed, S. & Gardner, L. M. Association between vaccination rates and COVID-19 health outcomes in the United States: a population-level statistical analysis. BMC Public Health 24, 220 (2024).

  36. COVID Data Tracker. CDC https://covid.cdc.gov/covid-data-tracker/#vaccine-delivery-coverage (2024).

  37. State population totals and components of change: 2020–2023. US Census Bureau https://www.census.gov/data/tables/time-series/demo/popest/2020s-state-total.html#v2022 (2024).

  38. Dowd, J. B. et al. Demographic science aids in understanding the spread and fatality rates of COVID-19. Proc. Natl Acad. Sci. USA 117, 9696–9698 (2020).

  39. Radley, D., Collins, S. & Hayes, S. The Commonwealth Fund 2019 Scorecard on State Health System Performance (The Commonwealth Fund, 2022).

  40. Bollyky, T. J. et al. Assessing COVID-19 pandemic policies and behaviours and their economic and educational trade-offs across US states from Jan 1, 2020, to July 31, 2022: an observational analysis. Lancet 401, 1341–1360 (2023).

  41. Federal Elections 2020. FEC https://www.fec.gov/introduction-campaign-finance/election-results-and-voting-information/federal-elections-2020/ (2024).

  42. Haug, N. et al. Ranking the effectiveness of worldwide COVID-19 government interventions. Nat. Hum. Behav. 4, 1303–1312 (2020).

  43. Hale, T. et al. A global panel database of pandemic policies (Oxford COVID-19 Government Response Tracker). Nat. Hum. Behav. 5, 529–538 (2021).

  44. Stockdale, J. E., Liu, P. & Colijn, C. The potential of genomics for infectious disease forecasting. Nat. Microbiol. 7, 1736–1743 (2022).

  45. Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 5998–6008 (2017).

  46. OpenAI et al. GPT-4 Technical Report. Preprint at https://arxiv.org/abs/2303.08774 (2024).

  47. Fu, Y., Peng, H., Sabharwal, A., Clark, P. & Khot, T. Complexity-based prompting for multi-step reasoning. In The Eleventh International Conference on Learning Representations (2023).

  48. Liang, J. et al. Code as policies: language model programs for embodied control. In 2023 IEEE International Conference on Robotics and Automation (ICRA) 9493–9500 (IEEE, 2023).

  49. Touvron, H. et al. LLaMA: open and efficient foundation language models. Preprint at https://arxiv.org/abs/2302.13971 (2023).

  50. Gage, P. A new algorithm for data compression. C Users J. 12, 23–38 (1994).

  51. Liu, H., Li, C., Wu, Q. & Lee, Y. J. Visual instruction tuning. Adv. Neural Inf. Process. Syst. 36, 34892–34916 (2023).

  52. Du, H. & Zhao, Y. Advancing real-time infectious disease forecasting using large language models. Zenodo https://doi.org/10.5281/zenodo.14788491 (2025).

  53. Hadfield, J. et al. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics 34, 4121–4123 (2018).

Acknowledgements

This work was supported by NSF Award ID 2229996 (L.M.G.), cooperative agreement CDC-RFA-FT-23-0069 (L.M.G.), NOA: 6 NU38FT000012-01, from the CDC Center for Forecasting and Outbreak Analytics (L.M.G.), Merck KGaA Future Insight Prize (L.M.G.), NSF Award ID 2112562 (Y.C.) and ARO W911NF-23-2-0224 (Y.C.). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the funding agencies.

Author information

Contributions

H.D., Y.Z., J.Z. and H.F.Y. conceptualized and designed the study. H.D. and S.X. collected the data. H.D. processed the data and designed prompts. H.D., J.Z. and Y.Z. performed the experiments. S.X. ran the baseline models. H.D., J.Z., Y.Z., S.X. and H.F.Y. prepared the figures. H.D., J.Z., Y.Z. and H.F.Y. analyzed the results. H.D., J.Z., Y.Z. and H.F.Y. wrote the initial draft. L.M.G., Y.C., X.L. and H.F.Y. provided guidance and feedback for the study. L.M.G., X.L. and H.F.Y. revised the paper. L.M.G. and Y.C. acquired the funding. Y.C. provided computational resources. All authors prepared the final version of the paper.

Corresponding authors

Correspondence to Yiran Chen, Lauren M. Gardner or Hao ‘Frank’ Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks Rafael de Andrade Moral and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available. Primary Handling Editor: Ananya Rastogi, in collaboration with the Nature Computational Science team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1: Overview of the 1- and 3-week target design and experimental setups for the PandemicLLMs.

The black line shows the average COVID-19 hospitalization rate per 100,000 people in the U.S. plotted over time. Each data point (dot) represents the hospitalization rate for a specific state at a particular week. The color of each dot corresponds to the state’s resulting HTC category for that week. Three PandemicLLMs were developed, each trained and validated on data ending in June 2021 (PandemicLLM-1), December 2021 (PandemicLLM-2), and September 2022 (PandemicLLM-3), respectively. All models were subsequently evaluated on a dataset spanning from their respective end dates to February 2023, without retraining. Panel (a) visualizes 1-week targets, while panel (b) shows 3-week targets. Dates are in the format year-month-day.
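The evaluation setup in this caption (train and validate up to a cutoff, then evaluate on later weeks without retraining) can be sketched as a simple temporal split. This is an illustrative sketch, not the authors' code: the caption gives only the month and year of each cutoff, so the end-of-month days, the `split_weeks` helper and the example dates are assumptions.

```python
from datetime import date

# Training/validation cutoffs per model. The caption gives only month and year;
# the end-of-month days used here are assumptions.
CUTOFFS = {
    "PandemicLLM-1": date(2021, 6, 30),
    "PandemicLLM-2": date(2021, 12, 31),
    "PandemicLLM-3": date(2022, 9, 30),
}
EVAL_END = date(2023, 2, 28)  # assumed to be the last day of February 2023


def split_weeks(
    weeks: list[date], cutoff: date, eval_end: date = EVAL_END
) -> tuple[list[date], list[date]]:
    """Split week-ending dates into (train/validation, evaluation) lists.

    Weeks on or before the cutoff go to training/validation; weeks after the
    cutoff and up to eval_end go to evaluation; later weeks are dropped.
    """
    train = [w for w in weeks if w <= cutoff]
    evaluation = [w for w in weeks if cutoff < w <= eval_end]
    return train, evaluation


# Hypothetical week-ending dates, for illustration only.
weeks = [date(2021, 6, 26), date(2021, 7, 3), date(2023, 3, 4)]
train, evaluation = split_weeks(weeks, CUTOFFS["PandemicLLM-1"])
print(train, evaluation)
```

Splitting strictly by date keeps every evaluation week later than anything seen in training, which is the property a real-time forecasting evaluation needs.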

Supplementary information

Supplementary Information

Supplementary Figs. 1–43, Results and Tables 1–9.

Reporting Summary

Peer Review File

Source data

Source Data Fig. 3

Model performance across all 50 states during the evaluated periods.

Source Data Fig. 4

Confidence threshold and model accuracy, along with confusion matrices.

Source Data Fig. 5

Variant proportions over time, together with model performance and confidence over time.

Source Data Extended Data Fig. 1

The hospitalization rates for each state, recorded on a weekly basis.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Du, H., Zhao, Y., Zhao, J. et al. Advancing real-time infectious disease forecasting using large language models. Nat Comput Sci 5, 467–480 (2025). https://doi.org/10.1038/s43588-025-00798-6

  • DOI: https://doi.org/10.1038/s43588-025-00798-6
