Abstract
Chest radiography frequently serves as baseline imaging for most lung diseases (ref. 1). Deep learning has great potential for automating the interpretation of chest radiography (ref. 2). However, existing chest radiographic deep learning models are limited in diagnostic scope, generalizability, adaptability, robustness and extensibility. To overcome these limitations, we have developed Ark+, a foundation model applied to chest radiography and pretrained by cyclically accruing and reusing the knowledge from heterogeneous expert labels in numerous datasets. Ark+ excels in diagnosing thoracic diseases. It expands the diagnostic scope and addresses potential misdiagnosis. It can adapt to evolving diagnostic needs and respond to novel diseases. It can learn rare conditions from a few samples and transfer to new diagnostic settings without training. It tolerates data biases and long-tailed distributions, and it supports federated learning to preserve privacy. All code and pretrained models have been released, so that Ark+ is open for fine-tuning, local adaptation and improvement. It is extensible to several modalities; thus, it is a foundation model for medical imaging. The exceptional capabilities of Ark+ stem from our insight: aggregating various datasets diversifies the patient populations and accrues knowledge from many experts to yield unprecedented performance while reducing annotation costs (ref. 3). The development of Ark+ reveals that open models trained by accruing and reusing knowledge from heterogeneous expert annotations with a multitude of public (big or small) datasets can surpass the performance of proprietary models trained on large data. We hope that our findings will inspire more researchers to share code and datasets or to federate privacy-preserving data to create open foundation models with diverse, global expertise and patient populations, thus accelerating open science and democratizing AI for medicine.
Data availability
All data are publicly available as follows: MIMIC-CXR (https://physionet.org/content/mimic-cxr-jpg/2.0.0/); CheXpert (https://stanfordmlgroup.github.io/competitions/chexpert/); NIH ChestX-ray14 (https://nihcc.app.box.com/v/ChestXray-NIHCC/folder/36938765345); RSNA Pneumonia Detection Challenge (www.kaggle.com/c/rsna-pneumonia-detection-challenge); VinDr-CXR (https://vindr.ai/datasets/cxr); Shenzhen Hospital X-ray set (https://data.lhncbc.nlm.nih.gov/public/Tuberculosis-Chest-X-ray-Datasets/Shenzhen-Hospital-CXR-Set/index.html); CXR-LT (https://physionet.org/content/cxr-lt-iccv-workshop-cvamd/1.1.0/); ChestDR (https://springernature.figshare.com/articles/dataset/ChestDR_Thoracic_Diseases_Screening_in_Chest_Radiography/22302775); SIIM-ACR (www.kaggle.com/c/siim-acr-pneumothorax-segmentation); TBX-11K (www.kaggle.com/datasets/usmanshams/tbx-11); Mendeley-V2 (www.kaggle.com/datasets/andrewmvd/pediatric-pneumonia-chest-xray); NODE21 (https://node21.grand-challenge.org/) and COVIDxCXR-3 (https://github.com/lindawangg/COVID-Net).
Code availability
The pretrained Ark+ model and source code are publicly available via GitHub at https://github.com/jlianglab/Ark.
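For readers who want to start from the released weights, the following is a minimal sketch (not the official API) of loading an Ark+ checkpoint into a Swin Transformer backbone with timm. The backbone variant, input resolution, checkpoint file name and state-dict keys below are assumptions for illustration only; consult the GitHub repository for the actual usage.

```python
# Minimal sketch (assumptions, not the official API): load a released Ark+ checkpoint
# into a Swin Transformer backbone with timm. File name and state-dict keys are hypothetical.
import timm
import torch

# Swin backbone without a classification head, used here purely as a feature extractor.
model = timm.create_model("swin_base_patch4_window7_224", pretrained=False, num_classes=0)

checkpoint = torch.load("ark_plus_teacher.pth", map_location="cpu")  # hypothetical file name
state_dict = checkpoint.get("teacher", checkpoint)                   # hypothetical key layout
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")

# The encoder can now be fine-tuned on a target task or used as a frozen feature extractor.
embedding = model(torch.randn(1, 3, 224, 224))  # (1, feature_dim) pooled representation
```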
References
Broder, J. S. Diagnostic Imaging for the Emergency Physician (ed. Broder, J. S.) Ch. 5, 185–296 (W. B. Saunders, 2011).
Çallı, E., Sogancioglu, E., van Ginneken, B., van Leeuwen, K. G. & Murphy, K. Deep learning for chest X-ray analysis: a survey. Med. Image Anal. 72, 102125 (2021).
Tajbakhsh, N., Roth, H., Terzopoulos, D. & Liang, J. Guest editorial annotation-efficient deep learning: the holy grail of medical imaging. IEEE Trans. Med. Imaging 40, 2526–2533 (2021).
Hosny, A., Parmar, C., Quackenbush, J., Schwartz, L. H. & Aerts, H. J. Artificial intelligence in radiology. Nat. Rev. Cancer 18, 500–510 (2018).
Zhou, Y. et al. A foundation model for generalizable disease detection from retinal images. Nature 622, 156–163 (2023).
Huang, Z., Bianchi, F., Yuksekgonul, M., Montine, T. J. & Zou, J. A visual–language foundation model for pathology image analysis using medical Twitter. Nat. Med. 29, 2307–2316 (2023).
Christensen, M., Vukadinovic, M., Yuan, N. & Ouyang, D. Vision–language foundation model for echocardiogram interpretation. Nat. Med. 30, 1481–1488 (2024).
Tiu, E. et al. Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning. Nat. Biomed. Eng. 6, 1399–1406 (2022).
Zhang, X., Wu, C., Zhang, Y., Xie, W. & Wang, Y. Knowledge-enhanced visual-language pre-training on chest radiology images. Nat. Commun. 14, 4542 (2023).
Sellergren, A. B. et al. Simplified transfer learning for chest radiography models using less data. Radiology 305, 454–465 (2022).
Azizi, S. et al. Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging. Nat. Biomed. Eng. 7, 756–779 (2023).
Xu, S. et al. ELIXR: towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders. Preprint at arxiv.org/abs/2308.01317 (2023).
Basdevant, A. et al. Towards a framework for openness in foundation models: proceedings from the Columbia Convening on openness in artificial intelligence. Preprint at arxiv.org/abs/2405.15802 (2024).
Ma, D., Pang, J., Gotway, M. B. & Liang, J. Foundation Ark: accruing and reusing knowledge for superior and robust performance. In Proc. International Conference on Medical Image Computing and Computer-Assisted Intervention (eds Greenspan, H. et al.) 651–662 (Springer, 2023).
Liu, Z. et al. Swin transformer: hierarchical vision transformer using shifted windows. In Proc. IEEE/CVF International Conference on Computer Vision (eds Hassner, T. et al.) 10012–10022 (IEEE, 2021).
Velan, S. S. Benchmarking and Boosting Localizers for Chest X-rays. Master’s thesis, Arizona State Univ. (2024).
Saravanan, M. Benchmarking and Boosting of 3D Segmentation Models. Master’s thesis, Arizona State Univ. (2024).
Islam, N. U. et al. Foundation X: integrating classification, localization, and segmentation through lock-release pretraining strategy for chest X-ray analysis. In Proc. IEEE/CVF Winter Conference on Applications of Computer Vision (eds Biswas, S. et al.) 3647–3656 (IEEE, 2025).
Wang, X. et al. ChestX-ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (eds Cucchiara, R. et al.) 2097–2106 (IEEE, 2017).
Pérez-García, F. et al. Exploring scalable medical image encoders beyond text supervision. Nat. Mach. Intell. 7, 119–130 (2025).
Ma, D. et al. Benchmarking and boosting transformers for medical image classification. In Proc. MICCAI Workshop on Domain Adaptation and Representation Transfer (eds Kamnitsas, K. et al.) 12–22 (Springer, 2022).
Cho, K. et al. CheSS: chest X-ray pre-trained model via self-supervised contrastive learning. J. Digit. Imaging 36, 902–910 (2023).
Kang, M. et al. Label-assemble: leveraging multiple datasets with partial labels. In Proc. 20th International Symposium on Biomedical Imaging (eds Salvado, O. et al.) 1–5 (IEEE, 2023).
Lee, J. et al. Deep learning for rare disease: a scoping review. J. Biomed. Inform. 135, 104227 (2022).
Wang, Y., Yao, Q., Kwok, J. T. & Ni, L. M. Generalizing from a few examples: a survey on few-shot learning. ACM Comput. Surv. 53, 1–34 (2020).
Holste, G. et al. CXR-LT: multi-label long-tailed classification on chest X-rays. PhysioNet 5, 19 (2023).
Zhou, S. K. et al. A review of deep learning in medical imaging: imaging traits, technology trends, case studies with progress highlights, and future promises. Proc. IEEE 109, 820–838 (2021).
Wang, D. et al. A real-world dataset and benchmark for foundation model adaptation in medical image classification. Sci. Data 10, 574 (2023).
Zhang, L. et al. Generalizing deep learning for medical image segmentation to unseen domains via deep stacked transformation. IEEE Trans. Med. Imaging 39, 2531–2540 (2020).
Cohen, J. P. et al. TorchXRayVision: a library of chest X-ray datasets and models. In Proc. International Conference on Medical Imaging with Deep Learning (eds Konukoglu, E. et al.) 231–249 (PMLR, 2022).
Glocker, B., Jones, C., Roschewitz, M. & Winzeck, S. Risk of bias in chest radiography deep learning foundation models. Radiol. Artif. Intell. 5, e230060 (2023).
Seyyed-Kalantari, L., Zhang, H., McDermott, M. B., Chen, I. Y. & Ghassemi, M. Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations. Nat. Med. 27, 2176–2182 (2021).
Larrazabal, A. J., Nieto, N., Peterson, V., Milone, D. H. & Ferrante, E. Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis. Proc. Natl Acad. Sci. USA 117, 12592–12594 (2020).
Irvin, J. et al. CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. In Proc. AAAI Conference on Artificial Intelligence, Vol. 33 (eds Hentenryck, P. V. & Zhou, Z. H.) 590–597 (AAAI, 2019).
Wang, L., Lin, Z. Q. & Wong, A. COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images. Sci. Rep. 10, 19549 (2020).
Liu, F. et al. A medical multimodal large language model for future pandemics. npj Digit. Med. 6, 226 (2023).
Xiao, J., Bai, Y., Yuille, A. & Zhou, Z. Delving into masked autoencoders for multi-label thorax disease classification. In Proc. IEEE/CVF Winter Conference on Applications of Computer Vision (eds Crandall, D. et al.) 3588–3600 (IEEE, 2023).
Van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Acosta, J. N., Falcone, G. J., Rajpurkar, P. & Topol, E. J. Multimodal biomedical AI. Nat. Med. 28, 1773–1784 (2022).
Soenksen, L. R. et al. Integrated multimodal artificial intelligence framework for healthcare applications. npj Digit. Med. 5, 149 (2022).
Ye, M., Fang, X., Du, B., Yuen, P. C. & Tao, D. Heterogeneous federated learning: state-of-the-art and research challenges. ACM Comput. Surv. 56, 1–44 (2023).
Nguyen, H. Q. et al. VinDr-CXR: an open dataset of chest X-rays with radiologist’s annotations. Sci. Data 9, 429 (2022).
Stein, A. et al. RSNA Pneumonia Detection Challenge. Kaggle https://kaggle.com/competitions/rsna-pneumonia-detection-challenge (2018).
Jaeger, S. et al. Two public chest X-ray datasets for computer-aided screening of pulmonary diseases. Quant. Imaging Med. Surg. 4, 475 (2014).
Johnson, A. E. et al. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci. Data 6, 317 (2019).
Tajbakhsh, N. et al. Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans. Med. Imaging 35, 1299–1312 (2016).
Zawacki, A. et al. SIIM-ACR pneumothorax segmentation. Kaggle https://kaggle.com/competitions/siim-acr-pneumothorax-segmentation (2019).
Sogancioglu, E. et al. Nodule detection and generation on chest X-rays: NODE21 challenge. IEEE Trans. Med. Imaging 43, 2839–2853 (2024).
Goldbaum, M., Kermany, D. & Zhang, K. Labeled optical coherence tomography (OCT) and chest X-ray images for classification. Mendeley Data https://doi.org/10.17632/rscbjbr9sj.2 (2018).
Liu, Y., Wu, Y.-H., Ban, Y., Wang, H. & Cheng, M.-M. Rethinking computer-aided tuberculosis diagnosis. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (eds Liu, C. et al.) 2646–2655 (IEEE, 2020).
Khosla, P. et al. Supervised contrastive learning. In Proc. 33rd Advances in Neural Information Processing Systems (eds Larochelle, H. et al.) 18661–18673 (Curran Associates, 2020).
Oquab, M. et al. DINOv2: learning robust visual features without supervision. Trans. Mach. Learn. Res. https://openreview.net/forum?id=a68SUt6zFt (2024).
Xie, Z. et al. SimMIM: a simple framework for masked image modeling. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (eds Dana, K. et al.) 9653–9663 (IEEE, 2022).
Chen, X., Fan, H., Girshick, R. & He, K. Improved baselines with momentum contrastive learning. Preprint at arxiv.org/abs/2003.04297 (2020).
Cohen, J. P., Hashir, M., Brooks, R. & Bertrand, H. On the limits of cross-___domain generalization in automated X-ray prediction. In Proc. Medical Imaging with Deep Learning (eds Arbel, T. et al.) 136–155 (PMLR, 2020).
Unal, I. Defining an optimal cut-point value in ROC analysis: an alternative approach. Comput. Math. Methods Med. 2017, 3762651 (2017).
Jennewein, D. M. et al. The Sol supercomputer at Arizona State University. In Proc. Practice and Experience in Advanced Research Computing (eds Sinkovits, R. & Romanella, A.) 296–301 (ACM, 2023).
Song, C., Granqvist, F. & Talwar, K. Flair: federated learning annotated image repository. In Proc. 35th Advances in Neural Information Processing Systems (eds Koyejo, S. et al.) 37792–37805 (Curran Associates, 2022).
Yan, R. et al. Label-efficient self-supervised federated learning for tackling data heterogeneity in medical imaging. IEEE Trans. Med. Imaging 42, 1932–1943 (2023).
Acknowledgements
We thank M. R. Hosseinzadeh Taher for suggesting the term ‘zero-shot transfer’, A. Iyengar for assisting in evaluating the performance of KAD on the zero-shot transfer task, and T. Christenson and E. Sheets for assisting in revising the manuscript. This research has been supported in part by ASU and Mayo Clinic through a seed grant and an innovation grant and in part by the NIH (Award No. R01HL128785). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. This work used GPUs provided in part by the ASU Research Computing Core Facility (ref. 57) and in part by the Bridges-2 at Pittsburgh Supercomputing Center and the Anvil at Purdue University through allocation MED220025 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) programme, which is supported by the National Science Foundation (Grant Nos. 2138259, 2138286, 2138307, 2137603 and 2138296). We also acknowledge Google for granting us access to CXR Foundation and ELIXR API, which enabled us to generate the embeddings for the target datasets. We gratefully thank S. Antani, L. A. Celi, C. P. Langlotz, H. Q. Nguyen and R. Summers for allowing us to use their de-identified chest radiographs from the datasets Shenzhen (ref. 44), MIMIC-CXR (ref. 45), CheXpert (ref. 34), VinDr-CXR (ref. 42) and ChestX-ray14 (ref. 19; RSNA Pneumonia, ref. 43), respectively, in the illustrations of this article. We particularly acknowledge the Stanford Office of Technology Licensing for permitting the use of three de-identified chest radiographs from the CheXpert dataset (Material) as shown in Fig. 1. The Material (CheXpert) is copyrighted ©2025 The Board of Trustees of the Leland Stanford Junior University. Permission to use this material was granted by Stanford University, which reserves all rights in the material. The two pieces of cover artwork were created by V. Alrich and L. J. P. Liang, respectively, with conceptual input from J. Liang. The content of this paper is covered by pending patents.
Author information
Contributions
D.M. and J.L. contributed to the conception and design of the work. D.M. and J.P. contributed to the data acquisition and organization. D.M. contributed to the technical implementation, evaluation pipeline, results analysis and visualization for this work. J.P. contributed to the bias study experiments. M.B.G. provided the clinical inputs to the research. D.M. and J.L. contributed to drafting the manuscript, and all authors contributed to revising the manuscript. J.L. and M.B.G. secured the funding for the project. J.L. provided vision, insight and guidance for the research.
Ethics declarations
Competing interests
D.M., J.P., J.L. and M.B.G. hold several patents.
Peer review
Peer review information
Nature thanks Namkug Kim and Andrew Sellergren for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Illustration of training pipeline of Ark+.
Ark+ is built on a teacher-student framework augmented with multi-task heads, each corresponding to a specific task, and employs cyclic pretraining to iteratively accrue and reuse knowledge. At each iteration, the student model sequentially scans datasets (tasks) one by one for one epoch, learning from expert annotations through the task-specific head. The knowledge accrued by the student is accumulated into the teacher via exponential moving averages (EMA), enabling the teacher to guide the student in subsequent tasks. To reinforce the feedback loop between the student and teacher, after their encoders, a projector is introduced to map the representations to the same feature space via the consistency loss. The projected representation also serves as the embedding for linear-probing in our evaluation. After pretraining, the accumulated knowledge in the teacher can be reused and transferred to target tasks. Differing from the previous design in Ark (ref. 14), Ark+ feeds the teacher with the resized original image instead of random cropping. This update in data augmentation ensures the teacher provides a consistent and steady supervisory signal for computing the consistency loss, thereby accelerating training and enhancing performance. Images adapted with permission from ref. 19, IEEE.
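To make the pipeline concrete, the following is a minimal sketch of the cyclic pretraining loop under the description above. The names (student_encoder, projector, task_heads, task_loaders) are illustrative placeholders rather than the released Ark+ implementation, the multi-label loss is only one example of a task-specific objective, and the optimizer is assumed to cover the student encoder, the projector and all heads.

```python
# Minimal sketch of cyclic pretraining: a student learns each task through its own head,
# a teacher accumulates the student's knowledge via epoch-wise EMA, and a projector maps
# both representations into a shared space for a consistency loss. Names are illustrative.
import copy
import torch
import torch.nn.functional as F

def ema_update(teacher, student, momentum=0.999):
    """Accumulate the student's knowledge into the teacher via exponential moving averages."""
    with torch.no_grad():
        for t_param, s_param in zip(teacher.parameters(), student.parameters()):
            t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)

def pretrain(student_encoder, projector, task_heads, task_loaders, epochs, optimizer):
    teacher_encoder = copy.deepcopy(student_encoder)     # teacher starts as a copy of the student
    teacher_projector = copy.deepcopy(projector)
    for p in list(teacher_encoder.parameters()) + list(teacher_projector.parameters()):
        p.requires_grad_(False)                          # teacher is updated only via EMA

    for _ in range(epochs):                              # one cyclic iteration
        for task, loader in task_loaders.items():        # scan datasets (tasks) one by one
            head = task_heads[task]
            for cropped, original, labels in loader:     # student sees a random crop,
                student_feat = student_encoder(cropped)  # teacher sees the resized original image
                with torch.no_grad():
                    teacher_feat = teacher_encoder(original)

                # Supervised loss through the task-specific head (expert labels).
                task_loss = F.binary_cross_entropy_with_logits(head(student_feat), labels)

                # Consistency loss in the shared projected feature space.
                consistency = F.mse_loss(projector(student_feat), teacher_projector(teacher_feat))

                loss = task_loss + consistency
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()

        # Epoch-wise EMA: the teacher accumulates what the student learned this cycle.
        ema_update(teacher_encoder, student_encoder)
        ema_update(teacher_projector, projector)

    return teacher_encoder                               # the teacher carries the accrued knowledge
```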
Extended Data Fig. 2 Illustrating a federated learning scenario by distributing Ark+’s training across three local sites.
Ark+ can be federated by deploying a (local) Ark+ at each site to protect privacy and distribute training. In this setup, each local site trains its own Ark+ with all its available data, employing the same cyclic training strategy to train the student and the same epoch-wise EMA to update the teacher. After completing a round of local training, all sites send their student weights to a central server, where weights are averaged to aggregate these local models into a “master” model, consolidating knowledge from all sites. This master model is then distributed back to the local sites, allowing iterative learning and continuous improvement of the local teacher model. For simplicity, the projectors and multi-task heads are omitted from the illustration.
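A minimal sketch of one federated round under this description is shown below. The site objects and their train_locally routine are hypothetical stand-ins for the local cyclic training and epoch-wise EMA, and equal weighting across sites is assumed when forming the master model.

```python
# Minimal sketch of the federated scenario: every site trains its own Ark+ student locally,
# the server averages the student weights into a "master" model, and the master is sent back.
# Site objects and train_locally() are hypothetical placeholders, not the released code.
import copy
import torch

def federated_average(site_state_dicts):
    """Average state dicts received from all sites (FedAvg-style, equal weighting)."""
    master = copy.deepcopy(site_state_dicts[0])
    for key, value in master.items():
        if value.is_floating_point():  # average weights; keep integer buffers from site 0
            master[key] = torch.stack([sd[key] for sd in site_state_dicts], dim=0).mean(dim=0)
    return master

def federated_round(sites, master_state):
    """One communication round: local cyclic training at every site, then server-side averaging."""
    local_states = []
    for site in sites:
        site.student.load_state_dict(master_state)   # start from the current master model
        site.train_locally()                         # same cyclic training + epoch-wise EMA as above
        local_states.append(copy.deepcopy(site.student.state_dict()))
    return federated_average(local_states)           # new master, redistributed for the next round
```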
Extended Data Fig. 3 Illustration of how the embeddings for COVID-19, Pneumonia and Normal evolve in t-SNE (ref. 38).
From Ark+ (a) to Ark++covid (b), an upgraded Ark+ model is created by incrementally and continually pretraining Ark+ with the COVID-19 diagnostic task. Ark++covid has more distinct embeddings for the three conditions, revealing its newly acquired capacity for capturing the features specific to COVID-19. This capability can be further enhanced through fine-tuning (c). d–g illustrate how the embeddings for COVID-19, Pneumonia and Normal evolve in t-SNE as the pretrained Ark+ (d) is continually fine-tuned with increasing numbers of samples (e–g). Ark+ obtains distinguishable embeddings when the training data reach 3,000 samples, representing 10% of the full training set (f). This highlights Ark+’s ability to efficiently develop distinct feature representations, markedly enhancing its diagnostic accuracy and adaptability to new information.
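For panels of this kind, the following is a minimal sketch that reduces embeddings to two dimensions with scikit-learn's t-SNE. The embeddings and labels passed in are random placeholders; the real ones come from the pretrained Ark+ projector described in Extended Data Fig. 1.

```python
# Minimal sketch: project embeddings for Normal, Pneumonia and COVID-19 into 2-D with t-SNE
# and plot one colour per class. Replace the random stand-in arrays with real Ark+ embeddings.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(embeddings, labels, class_names=("Normal", "Pneumonia", "COVID-19")):
    """embeddings: (N, D) array of projector outputs; labels: (N,) integer class indices."""
    coords = TSNE(n_components=2, perplexity=30, init="pca", random_state=0).fit_transform(embeddings)
    for idx, name in enumerate(class_names):
        mask = labels == idx
        plt.scatter(coords[mask, 0], coords[mask, 1], s=5, label=name)
    plt.legend()
    plt.axis("off")
    plt.show()

# Example call with random stand-in features and labels.
plot_tsne(np.random.randn(300, 768).astype(np.float32), np.random.randint(0, 3, size=300))
```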
Supplementary information
Supplementary Methods
The supplementary methods include three ablation studies to highlight the advantages of the Ark+ framework. These studies examine the multi-task head design (A.1) and the cyclic training approach (A.2). Furthermore, to demonstrate the performance improvements achieved with Ark+, fine-tuning baselines initialized from an ImageNet-pretrained model are provided for comparison (A.3).
Supplementary Data
The supplementary data includes the source data for the experimental results presented in the article, along with results for underperforming methods that were omitted from the result figures to maintain clarity and visual appeal.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ma, D., Pang, J., Gotway, M.B. et al. A fully open AI foundation model applied to chest radiography. Nature (2025). https://doi.org/10.1038/s41586-025-09079-8