Abstract
Chest radiography frequently serves as baseline imaging for most lung diseases (ref. 1). Deep learning has great potential for automating the interpretation of chest radiography (ref. 2). However, existing chest radiographic deep learning models are limited in diagnostic scope, generalizability, adaptability, robustness and extensibility. To overcome these limitations, we have developed Ark+, a foundation model applied to chest radiography and pretrained by cyclically accruing and reusing the knowledge from heterogeneous expert labels in numerous datasets. Ark+ excels in diagnosing thoracic diseases. It expands the diagnostic scope and addresses potential misdiagnosis. It can adapt to evolving diagnostic needs and respond to novel diseases. It can learn rare conditions from a few samples and transfer to new diagnostic settings without training. It tolerates data biases and long-tailed distributions, and it supports federated learning to preserve privacy. All code and pretrained models have been released, so that Ark+ is open for fine-tuning, local adaptation and improvement. It is extensible to several modalities; thus, it is a foundation model for medical imaging. The exceptional capabilities of Ark+ stem from our insight: aggregating various datasets diversifies the patient populations and accrues knowledge from many experts to yield unprecedented performance while reducing annotation costs (ref. 3). The development of Ark+ reveals that open models trained by accruing and reusing knowledge from heterogeneous expert annotations with a multitude of public (big or small) datasets can surpass the performance of proprietary models trained on large data. We hope that our findings will inspire more researchers to share code and datasets or to federate privacy-preserving data to create open foundation models with diverse, global expertise and patient populations, thus accelerating open science and democratizing AI for medicine.
Data availability
All data are publicly available as follows: MIMIC-CXR (https://physionet.org/content/mimic-cxr-jpg/2.0.0/); CheXpert (https://stanfordmlgroup.github.io/competitions/chexpert/); NIH ChestX-ray14 (https://nihcc.app.box.com/v/ChestXray-NIHCC/folder/36938765345); RSNA Pneumonia Detection Challenge (www.kaggle.com/c/rsna-pneumonia-detection-challenge); VinDr-CXR (https://vindr.ai/datasets/cxr); Shenzhen Hospital X-ray set (https://data.lhncbc.nlm.nih.gov/public/Tuberculosis-Chest-X-ray-Datasets/Shenzhen-Hospital-CXR-Set/index.html); CXR-LT (https://physionet.org/content/cxr-lt-iccv-workshop-cvamd/1.1.0/); ChestDR (https://springernature.figshare.com/articles/dataset/ChestDR_Thoracic_Diseases_Screening_in_Chest_Radiography/22302775); SIIM-ACR (www.kaggle.com/c/siim-acr-pneumothorax-segmentation); TBX-11K (www.kaggle.com/datasets/usmanshams/tbx-11); Mendeley-V2 (www.kaggle.com/datasets/andrewmvd/pediatric-pneumonia-chest-xray); NODE21 (https://node21.grand-challenge.org/) and COVIDxCXR-3 (https://github.com/lindawangg/COVID-Net).
Code availability
The pretrained Ark+ model and source code are publicly available via GitHub at https://github.com/jlianglab/Ark.
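For readers who want to start from the released weights, the following is a minimal sketch (not the official API) of loading an Ark+ checkpoint into a Swin Transformer backbone with timm. The backbone variant, input resolution, checkpoint file name and state-dict keys below are assumptions for illustration only; consult the GitHub repository for the actual usage.

```python
# Minimal sketch (assumptions, not the official API): load a released Ark+ checkpoint
# into a Swin Transformer backbone with timm. File name and state-dict keys are hypothetical.
import timm
import torch

# Swin backbone without a classification head, used here purely as a feature extractor.
model = timm.create_model("swin_base_patch4_window7_224", pretrained=False, num_classes=0)

checkpoint = torch.load("ark_plus_teacher.pth", map_location="cpu")  # hypothetical file name
state_dict = checkpoint.get("teacher", checkpoint)                   # hypothetical key layout
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")

# The encoder can now be fine-tuned on a target task or used as a frozen feature extractor.
embedding = model(torch.randn(1, 3, 224, 224))  # (1, feature_dim) pooled representation
```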
References
Broder, J. S. Diagnostic Imaging for the Emergency Physician (ed. Broder, J. S.) Ch. 5, 185–296 (W. B. Saunders, 2011).
Çallı, E., Sogancioglu, E., van Ginneken, B., van Leeuwen, K. G. & Murphy, K. Deep learning for chest X-ray analysis: a survey. Med. Image Anal. 72, 102125 (2021).
Tajbakhsh, N., Roth, H., Terzopoulos, D. & Liang, J. Guest editorial annotation-efficient deep learning: the holy grail of medical imaging. IEEE Trans. Med. Imaging 40, 2526–2533 (2021).
Hosny, A., Parmar, C., Quackenbush, J., Schwartz, L. H. & Aerts, H. J. Artificial intelligence in radiology. Nat. Rev. Cancer 18, 500–510 (2018).
Zhou, Y. et al. A foundation model for generalizable disease detection from retinal images. Nature 622, 156–163 (2023).
Huang, Z., Bianchi, F., Yuksekgonul, M., Montine, T. J. & Zou, J. A visual–language foundation model for pathology image analysis using medical Twitter. Nat. Med. 29, 2307–2316 (2023).
Christensen, M., Vukadinovic, M., Yuan, N. & Ouyang, D. Vision–language foundation model for echocardiogram interpretation. Nat. Med. 30, 1481–1488 (2024).
Tiu, E. et al. Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning. Nat. Biomed. Eng. 6, 1399–1406 (2022).
Zhang, X., Wu, C., Zhang, Y., Xie, W. & Wang, Y. Knowledge-enhanced visual-language pre-training on chest radiology images. Nat. Commun. 14, 4542 (2023).
Sellergren, A. B. et al. Simplified transfer learning for chest radiography models using less data. Radiology 305, 454–465 (2022).
Azizi, S. et al. Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging. Nat. Biomed. Eng. 7, 756–779 (2023).
Xu, S. et al. ELIXR: towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders. Preprint at arxiv.org/abs/2308.01317 (2023).
Basdevant, A. et al. Towards a framework for openness in foundation models: proceedings from the Columbia Convening on openness in artificial intelligence. Preprint at arxiv.org/abs/2405.15802 (2024).
Ma, D., Pang, J., Gotway, M. B. & Liang, J. Foundation Ark: accruing and reusing knowledge for superior and robust performance. In Proc. International Conference on Medical Image Computing and Computer-Assisted Intervention (eds Greenspan, H. et al.) 651–662 (Springer, 2023).
Liu, Z. et al. Swin transformer: hierarchical vision transformer using shifted windows. In Proc. IEEE/CVF International Conference on Computer Vision (eds Hassner, T. et al.) 10012–10022 (IEEE, 2021).
Velan, S. S. Benchmarking and Boosting Localizers for Chest X-rays. Master’s thesis, Arizona State Univ. (2024).
Saravanan, M. Benchmarking and Boosting of 3D Segmentation Models. Master’s thesis, Arizona State Univ. (2024).
Islam, N. U. et al. Foundation X: integrating classification, localization, and segmentation through lock-release pretraining strategy for chest X-ray analysis. In Proc. IEEE/CVF Winter Conference on Applications of Computer Vision (eds Biswas, S. et al.) 3647–3656 (IEEE, 2025).
Wang, X. et al. ChestX-ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (eds Cucchiara, R. et al.) 2097–2106 (IEEE, 2017).
Pérez-García, F. et al. Exploring scalable medical image encoders beyond text supervision. Nat. Mach. Intell. 7, 119–130 (2025).
Ma, D. et al. Benchmarking and boosting transformers for medical image classification. In Proc. MICCAI Workshop on Domain Adaptation and Representation Transfer (eds Kamnitsas, K. et al.) 12–22 (Springer, 2022).
Cho, K. et al. CheSS: chest X-ray pre-trained model via self-supervised contrastive learning. J. Digit. Imaging 36, 902–910 (2023).
Kang, M. et al. Label-assemble: leveraging multiple datasets with partial labels. In Proc. 20th International Symposium on Biomedical Imaging (eds Salvado, O. et al.) 1–5 (IEEE, 2023).
Lee, J. et al. Deep learning for rare disease: a scoping review. J. Biomed. Inform. 135, 104227 (2022).
Wang, Y., Yao, Q., Kwok, J. T. & Ni, L. M. Generalizing from a few examples: a survey on few-shot learning. ACM Comput. Surv. 53, 1–34 (2020).
Holste, G. et al. CXR-LT: multi-label long-tailed classification on chest X-rays. PhysioNet 5, 19 (2023).
Zhou, S. K. et al. A review of deep learning in medical imaging: imaging traits, technology trends, case studies with progress highlights, and future promises. Proc. IEEE 109, 820–838 (2021).
Wang, D. et al. A real-world dataset and benchmark for foundation model adaptation in medical image classification. Sci. Data 10, 574 (2023).
Zhang, L. et al. Generalizing deep learning for medical image segmentation to unseen domains via deep stacked transformation. IEEE Trans. Med. Imaging 39, 2531–2540 (2020).
Cohen, J. P. et al. TorchXRayVision: a library of chest X-ray datasets and models. In Proc. International Conference on Medical Imaging with Deep Learning (eds Konukoglu, E. et al.) 231–249 (PMLR, 2022).
Glocker, B., Jones, C., Roschewitz, M. & Winzeck, S. Risk of bias in chest radiography deep learning foundation models. Radiol. Artif. Intell. 5, e230060 (2023).
Seyyed-Kalantari, L., Zhang, H., McDermott, M. B., Chen, I. Y. & Ghassemi, M. Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations. Nat. Med. 27, 2176–2182 (2021).
Larrazabal, A. J., Nieto, N., Peterson, V., Milone, D. H. & Ferrante, E. Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis. Proc. Natl Acad. Sci. USA 117, 12592–12594 (2020).
Irvin, J. et al. CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. In Proc. AAAI Conference on Artificial Intelligence, Vol. 33 (eds Hentenryck, P. V. & Zhou, Z. H.) 590–597 (AAAI, 2019).
Wang, L., Lin, Z. Q. & Wong, A. COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images. Sci. Rep. 10, 19549 (2020).
Liu, F. et al. A medical multimodal large language model for future pandemics. npj Digit. Med. 6, 226 (2023).
Xiao, J., Bai, Y., Yuille, A. & Zhou, Z. Delving into masked autoencoders for multi-label thorax disease classification. In Proc. IEEE/CVF Winter Conference on Applications of Computer Vision (eds Crandall, D. et al.) 3588–3600 (IEEE, 2023).
Van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Acosta, J. N., Falcone, G. J., Rajpurkar, P. & Topol, E. J. Multimodal biomedical AI. Nat. Med. 28, 1773–1784 (2022).
Soenksen, L. R. et al. Integrated multimodal artificial intelligence framework for healthcare applications. npj Digit. Med. 5, 149 (2022).
Ye, M., Fang, X., Du, B., Yuen, P. C. & Tao, D. Heterogeneous federated learning: state-of-the-art and research challenges. ACM Comput. Surv. 56, 1–44 (2023).
Nguyen, H. Q. et al. VinDr-CXR: an open dataset of chest X-rays with radiologist’s annotations. Sci. Data 9, 429 (2022).
Stein, A. et al. RSNA Pneumonia Detection Challenge. Kaggle https://kaggle.com/competitions/rsna-pneumonia-detection-challenge (2018).
Jaeger, S. et al. Two public chest X-ray datasets for computer-aided screening of pulmonary diseases. Quant. Imaging Med. Surg. 4, 475 (2014).
Johnson, A. E. et al. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci. Data 6, 317 (2019).
Tajbakhsh, N. et al. Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans. Med. Imaging 35, 1299–1312 (2016).
Zawacki, A. et al. SIIM-ACR pneumothorax segmentation. Kaggle https://kaggle.com/competitions/siim-acr-pneumothorax-segmentation (2019).
Sogancioglu, E. et al. Nodule detection and generation on chest X-rays: NODE21 challenge. IEEE Trans. Med. Imaging 43, 2839–2853 (2024).
Goldbaum, M., Kermany, D. & Zhang, K. Labeled optical coherence tomography (OCT) and chest X-ray images for classification. Mendeley Data https://doi.org/10.17632/rscbjbr9sj.2 (2018).
Liu, Y., Wu, Y.-H., Ban, Y., Wang, H. & Cheng, M.-M. Rethinking computer-aided tuberculosis diagnosis. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (eds Liu, C. et al.) 2646–2655 (IEEE, 2020).
Khosla, P. et al. Supervised contrastive learning. In Proc. 33rd Advances in Neural Information Processing Systems (eds Larochelle, H. et al.) 18661–18673 (Curran Associates, 2020).
Oquab, M. et al. DINOv2: learning robust visual features without supervision. Trans. Mach. Learn. Res. https://openreview.net/forum?id=a68SUt6zFt (2024).
Xie, Z. et al. SimMIM: a simple framework for masked image modeling. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (eds Dana, K. et al.) 9653–9663 (IEEE, 2022).
Chen, X., Fan, H., Girshick, R. & He, K. Improved baselines with momentum contrastive learning. Preprint at arxiv.org/abs/2003.04297 (2020).
Cohen, J. P., Hashir, M., Brooks, R. & Bertrand, H. On the limits of cross-___domain generalization in automated X-ray prediction. In Proc. Medical Imaging with Deep Learning (eds Arbel, T. et al.) 136–155 (PMLR, 2020).
Unal, I. Defining an optimal cut-point value in ROC analysis: an alternative approach. Comput. Math. Methods Med. 2017, 3762651 (2017).
Jennewein, D. M. et al. The Sol supercomputer at Arizona State University. In Proc. Practice and Experience in Advanced Research Computing (eds Sinkovits, R. & Romanella, A.) 296–301 (ACM, 2023).
Song, C., Granqvist, F. & Talwar, K. Flair: federated learning annotated image repository. In Proc. 35th Advances in Neural Information Processing Systems (eds Koyejo, S. et al.) 37792–37805 (Curran Associates, 2022).
Yan, R. et al. Label-efficient self-supervised federated learning for tackling data heterogeneity in medical imaging. IEEE Trans. Med. Imaging 42, 1932–1943 (2023).
Acknowledgements
We thank M. R. Hosseinzadeh Taher for suggesting the term ‘zero-shot transfer’, A. Iyengar for assisting in evaluating the performance of KAD on the zero-shot transfer task, and T. Christenson and E. Sheets for assisting in revising the manuscript. This research has been supported in part by ASU and Mayo Clinic through a seed grant and an innovation grant and in part by the NIH (Award No. R01HL128785). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. This work used GPUs provided in part by the ASU Research Computing Core Facility (ref. 57) and in part by the Bridges-2 at Pittsburgh Supercomputing Center and the Anvil at Purdue University through allocation MED220025 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) programme, which is supported by the National Science Foundation (Grant Nos. 2138259, 2138286, 2138307, 2137603 and 2138296). We also acknowledge Google for granting us access to CXR Foundation and ELIXR API, which enabled us to generate the embeddings for the target datasets. We gratefully thank S. Antani, L. A. Celi, C. P. Langlotz, H. Q. Nguyen and R. Summers for allowing us to use their de-identified chest radiographs from the datasets Shenzhen (ref. 44), MIMIC-CXR (ref. 45), CheXpert (ref. 34), VinDr-CXR (ref. 42) and ChestX-ray14 (ref. 19; RSNA Pneumonia, ref. 43), respectively, in the illustrations of this article. We particularly acknowledge the Stanford Office of Technology Licensing for permitting the use of three de-identified chest radiographs from the CheXpert dataset (Material) as shown in Fig. 1. The Material (CheXpert) is copyrighted ©2025 The Board of Trustees of the Leland Stanford Junior University. Permission to use this material was granted by Stanford University, which reserves all rights in the material. The two pieces of cover artwork were created by V. Alrich and L. J. P. Liang, respectively, with conceptual input from J. Liang. The content of this paper is covered by pending patents.
Author information
Contributions
D.M. and J.L. contributed to the conception and design of the work. D.M. and J.P. contributed to the data acquisition and organization. D.M. contributed to the technical implementation, evaluation pipeline, results analysis and visualization for this work. J.P. contributed to the bias study experiments. M.B.G. provided the clinical inputs to the research. D.M. and J.L. contributed to drafting the manuscript, and all authors contributed to revising the manuscript. J.L. and M.B.G. secured the funding for the project. J.L. provided vision, insight and guidance for the research.
Ethics declarations
Competing interests
D.M., J.P., J.L. and M.B.G. hold several patents.
Peer review
Peer review information
Nature thanks Namkug Kim and Andrew Sellergren for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Illustration of training pipeline of Ark+.
Ark+ is built on a teacher-student framework augmented with multi-task heads, each corresponding to a specific task, and employs cyclic pretraining to iteratively accrue and reuse knowledge. At each iteration, the student model sequentially scans datasets (tasks) one by one for one epoch, learning from expert annotations through the task-specific head. The knowledge accrued by the student is accumulated into the teacher via exponential moving averages (EMA), enabling the teacher to guide the student in subsequent tasks. To reinforce the feedback loop between the student and teacher, after their encoders, a projector is introduced to map the representations to the same feature space via the consistency loss. The projected representation also serves as the embedding for linear-probing in our evaluation. After pretraining, the accumulated knowledge in the teacher can be reused and transferred to target tasks. Differing from the previous design in Ark (ref. 14), Ark+ feeds the teacher with the resized original image instead of random cropping. This update in data augmentation ensures the teacher provides a consistent and steady supervisory signal for computing the consistency loss, thereby accelerating training and enhancing performance. Images adapted with permission from ref. 19, IEEE.
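To make the pipeline concrete, the following is a minimal sketch of the cyclic pretraining loop under the description above. The names (student_encoder, projector, task_heads, task_loaders) are illustrative placeholders rather than the released Ark+ implementation, the multi-label loss is only one example of a task-specific objective, and the optimizer is assumed to cover the student encoder, the projector and all heads.

```python
# Minimal sketch of cyclic pretraining: a student learns each task through its own head,
# a teacher accumulates the student's knowledge via epoch-wise EMA, and a projector maps
# both representations into a shared space for a consistency loss. Names are illustrative.
import copy
import torch
import torch.nn.functional as F

def ema_update(teacher, student, momentum=0.999):
    """Accumulate the student's knowledge into the teacher via exponential moving averages."""
    with torch.no_grad():
        for t_param, s_param in zip(teacher.parameters(), student.parameters()):
            t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)

def pretrain(student_encoder, projector, task_heads, task_loaders, epochs, optimizer):
    teacher_encoder = copy.deepcopy(student_encoder)     # teacher starts as a copy of the student
    teacher_projector = copy.deepcopy(projector)
    for p in list(teacher_encoder.parameters()) + list(teacher_projector.parameters()):
        p.requires_grad_(False)                          # teacher is updated only via EMA

    for _ in range(epochs):                              # one cyclic iteration
        for task, loader in task_loaders.items():        # scan datasets (tasks) one by one
            head = task_heads[task]
            for cropped, original, labels in loader:     # student sees a random crop,
                student_feat = student_encoder(cropped)  # teacher sees the resized original image
                with torch.no_grad():
                    teacher_feat = teacher_encoder(original)

                # Supervised loss through the task-specific head (expert labels).
                task_loss = F.binary_cross_entropy_with_logits(head(student_feat), labels)

                # Consistency loss in the shared projected feature space.
                consistency = F.mse_loss(projector(student_feat), teacher_projector(teacher_feat))

                loss = task_loss + consistency
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()

        # Epoch-wise EMA: the teacher accumulates what the student learned this cycle.
        ema_update(teacher_encoder, student_encoder)
        ema_update(teacher_projector, projector)

    return teacher_encoder                               # the teacher carries the accrued knowledge
```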
Extended Data Fig. 2 Illustrating a federated learning scenario by distributing Ark+’s training across three local sites.
Ark+ can be federated by deploying a (local) Ark+ at each site to protect privacy and distribute training. In this setup, each local site trains its own Ark+ with all its available data, employing the same cyclic training strategy to train the student and the same epoch-wise EMA to update the teacher. After completing a round of local training, all sites send their student weights to a central server, where weights are averaged to aggregate these local models into a “master” model, consolidating knowledge from all sites. This master model is then distributed back to the local sites, allowing iterative learning and continuous improvement of the local teacher model. For simplicity, the projectors and multi-task heads are omitted from the illustration.
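A minimal sketch of one federated round under this description is shown below. The site objects and their train_locally routine are hypothetical stand-ins for the local cyclic training and epoch-wise EMA, and equal weighting across sites is assumed when forming the master model.

```python
# Minimal sketch of the federated scenario: every site trains its own Ark+ student locally,
# the server averages the student weights into a "master" model, and the master is sent back.
# Site objects and train_locally() are hypothetical placeholders, not the released code.
import copy
import torch

def federated_average(site_state_dicts):
    """Average state dicts received from all sites (FedAvg-style, equal weighting)."""
    master = copy.deepcopy(site_state_dicts[0])
    for key, value in master.items():
        if value.is_floating_point():  # average weights; keep integer buffers from site 0
            master[key] = torch.stack([sd[key] for sd in site_state_dicts], dim=0).mean(dim=0)
    return master

def federated_round(sites, master_state):
    """One communication round: local cyclic training at every site, then server-side averaging."""
    local_states = []
    for site in sites:
        site.student.load_state_dict(master_state)   # start from the current master model
        site.train_locally()                         # same cyclic training + epoch-wise EMA as above
        local_states.append(copy.deepcopy(site.student.state_dict()))
    return federated_average(local_states)           # new master, redistributed for the next round
```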
Extended Data Fig. 3 Illustration of how the embeddings for COVID-19, Pneumonia and Normal evolve in t-SNE (ref. 38).
From Ark+ (a) to Ark++covid (b), an upgraded Ark+ model is created by incrementally and continually pretraining Ark+ with the COVID-19 diagnostic task. Ark++covid has more distinct embeddings for the three conditions, revealing its newly acquired capacity for capturing the features specific to COVID-19. This capability can be further enhanced through fine-tuning (c). d–g illustrate how the embeddings for COVID-19, Pneumonia and Normal evolve in t-SNE as the pretrained Ark+ (d) is continually fine-tuned with increasing numbers of samples (e–g). Ark+ obtains distinguishable embeddings when the training data reach 3,000 samples, representing 10% of the full training set (f). This highlights Ark+’s ability to efficiently develop distinct feature representations, markedly enhancing its diagnostic accuracy and adaptability to new information.
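For panels of this kind, the following is a minimal sketch that reduces embeddings to two dimensions with scikit-learn's t-SNE. The embeddings and labels passed in are random placeholders; the real ones come from the pretrained Ark+ projector described in Extended Data Fig. 1.

```python
# Minimal sketch: project embeddings for Normal, Pneumonia and COVID-19 into 2-D with t-SNE
# and plot one colour per class. Replace the random stand-in arrays with real Ark+ embeddings.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(embeddings, labels, class_names=("Normal", "Pneumonia", "COVID-19")):
    """embeddings: (N, D) array of projector outputs; labels: (N,) integer class indices."""
    coords = TSNE(n_components=2, perplexity=30, init="pca", random_state=0).fit_transform(embeddings)
    for idx, name in enumerate(class_names):
        mask = labels == idx
        plt.scatter(coords[mask, 0], coords[mask, 1], s=5, label=name)
    plt.legend()
    plt.axis("off")
    plt.show()

# Example call with random stand-in features and labels.
plot_tsne(np.random.randn(300, 768).astype(np.float32), np.random.randint(0, 3, size=300))
```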
Supplementary information
Supplementary Methods
The supplementary methods include three ablation studies to highlight the advantages of the Ark+ framework. These studies examine the multi-task head design (A.1) and the cyclic training approach (A.2). Furthermore, to demonstrate the performance improvements achieved with Ark+, fine-tuning baselines initialized from an ImageNet-pretrained model are provided for comparison (A.3).
Supplementary Data
The supplementary data includes the source data for the experimental results presented in the article, along with results for underperforming methods that were omitted from the result figures to maintain clarity and visual appeal.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ma, D., Pang, J., Gotway, M.B. et al. A fully open AI foundation model applied to chest radiography. Nature (2025). https://doi.org/10.1038/s41586-025-09079-8