Introduction

Surgery is the optimal treatment for most pulmonary lesions, particularly lung cancer, the leading cause of cancer-related mortality worldwide. With an estimated annual global demand exceeding one million operations1,2, the increasing prevalence of early-stage lung cancer3,4 and a growing shortage of thoracic surgeons5 necessitate innovative solutions to enhance surgical efficiency and safety. Precise preoperative planning is the foundation for achieving this objective6, with accurate identification of anatomical variants and appropriate selection of operation procedures being critical steps.

Two-dimensional (2D) computed tomography (CT) remains the primary tool for preoperative planning in lung surgery. However, the inherent limitations of 2D imaging hinder intuitive visualization of complex anatomical structures, potentially leading to misidentification, particularly within the distal pulmonary vasculature, where anatomical variations are increasingly complex7. Three-dimensional (3D) reconstructions offer improved spatial understanding, facilitating more accurate surgical planning and execution7,8. Similar benefits have been observed in other surgical specialties, including urological9, oesophagogastric10, head and neck11, and pancreatic surgeries12. Despite these advantages, the routine adoption of 3D reconstructions remains constrained by the time-intensive nature of manual image segmentation13, resulting in utilization rates below 25% in major operations despite recognized benefits11.

The integration of artificial intelligence (AI) algorithms represents a critical advancement in 3D reconstruction, providing time-efficient and accurate models that match or exceed the performance of manual methods7,8,13. This has led to investigations into the clinical impact of AI-driven 3D models on both preoperative and perioperative outcomes. Our prior pilot study indicated that with AI-driven 3D reconstruction, surgeons can achieve an accuracy of 85% in identifying anatomical variants7, compared to 78% using 2D CT13. Wang et al. 14 have reported that 3D reconstruction may reduce operation time by 12.4%, decrease stapler reload by 13.4%, and lower air leakage ratio by 61.5%. Similarly, Li et al. 8 have reported a 17.2% reduction in operation time with AI-driven 3D reconstruction. However, contradictory evidence has also been reported: a randomized controlled trial (RCT) found no significant difference in operative times with or without AI-driven 3D reconstruction15, raising questions about the magnitude and consistency of perioperative benefits. This discrepancy, potentially attributable to limited statistical power in the RCT, underscores the need to rigorously evaluate the direct impact of AI-driven 3D reconstruction on preoperative planning before extrapolating to downstream perioperative outcomes.

To address this issue, we conducted a retrospective multi-center, multi-reader, multi-case (MRMC) study. In this study, we show the effectiveness of an AI-driven 3D reconstruction system (referred to as the AI-3D system) in improving the accuracy of anatomical structure identification and operation procedure selection during preoperative planning for anatomical lobectomy and segmentectomy.

Results

Baseline characteristics of the MRMC study

A total of 450 patients from three independent medical centers were consecutively enrolled according to inclusion and exclusion criteria, spanning the period from July 2021 to January 2022. From this cohort, 140 cases were randomly selected. The median age of the patient cohort was 58 years, with an interquartile range of 48.75 to 65 years. Among the participants, there were 76 women (54.3%) and 64 men (45.7%). Regarding the surgical procedures, 62 patients underwent lobectomy, while 78 patients underwent segmentectomy. The nodules were distributed across all lobes, with 49 (35%) located in the right upper lobe, 4 (2.9%) in the right middle lobe, 24 (17.1%) in the right lower lobe, 39 (27.9%) in the left upper lobe, and 24 (17.1%) in the left lower lobe. The most frequently observed histological type of lesion was invasive adenocarcinoma, accounting for 83 cases (59.3%). A comprehensive overview of the baseline characteristics is shown in Table 1.

Table 1 Characteristics of the enrolled patients

Ten thoracic surgeons from three medical centers participated as readers. The readers were aged 34 to 45 years and had been practicing for 6 to 19 years (Table S1). A schematic of the study is shown in Fig. 1.

Fig. 1: Schematic illustration of the multi-reader, multi-case (MRMC) study.

A total of 16 thoracic surgeons from three top-tier hospitals in China participated in this study. Three surgeons collected 450 patients, who were enrolled according to the inclusion and exclusion criteria. A total of 140 cases were randomly selected from the eligible cohort. Three expert surgeons from the three centers established the gold standard based on collected surgical logs, videos, CT scans, and manually constructed 3D reconstructions. The other 10 surgeons were randomly divided into two groups and took part in the MRMC study, in which two rounds of fully crossed preoperative planning simulations were conducted with an interval of 4 weeks, either with or without the aid of AI-derived 3D reconstructions of pulmonary vessels.

Anatomical variant identification

In the primary analysis, AI-3D assistance exhibited a superior case-wise median accuracy of 0.87 compared to 0.78 without AI-3D in anatomical variant identification (p < 0.01, Fig. 2A). This improvement corresponded to a 41% reduction in identification error (RR = 0.59, 95% CI = 0.56 – 0.63). Consistent improvement was observed across 10 readers (Fig. 2B), 5 lobes (Fig. S2A) and 90% (35 / 39) anatomical structures (Fig. 2C, and Supplementary figs. S2B–D). An illustrative case demonstrating this improvement is shown in Fig. S2E and its result in Table S2. These results indicate that AI-3D assistance significantly enhances surgeons’ ability to accurately identify anatomical variants during preoperative planning.

Fig. 2: The anatomical variant identification analysis.

A Significant improvement of the overall case-wise accuracy of anatomical variant identification. (** stands for statistical significance p < 0.01. Two-sided Mann-Whitney test was used). B Similar degree of improvement of the case-wise accuracy of anatomical variant identification in each reader. C The improvement in accuracy among 35 / 39 anatomical structures. D Positive correlation existed between variant prevalence and identification accuracy. (The error bar stands for the 95% CI of the regression line). E Higher accuracy in anatomical variant identification was observed with AI-3D assistance compared with the control method across different variant prevalence. The difference tended to be larger in lower prevalence variants. (The error bar stands for the 95% CI of the regression line.) Source data are provided as a Source Data file.

A positive correlation was found between identification accuracy and variant prevalence (R² = 0.68, 95% CI = 0.59 – 0.76, Fig. 2D). Interestingly, the improvement with the AI-3D system tended to be larger for variants with lower prevalence than for those with higher prevalence (Fig. 2E), indicating a potential benefit of AI-3D in identifying less common variants (Table S3, S4).

Identification errors in anatomical variants can arise from two sources: misidentification of the variant and missing values. The latter result from incorrect selection of the operation procedure, which can introduce potential bias into the analysis. To address this issue, we excluded cases with unanswered structures and re-evaluated the impact of AI-3D assistance on anatomical variant identification. Notably, the beneficial effects of AI remained evident after this exclusion (AI vs Ctrl: 0.86 vs 0.79, RR = 0.64, 95% CI = 0.57 – 0.82, Fig. S2F, S2G), demonstrating that the improvements in accuracy are robust and not merely artifacts of procedural errors.

Operation procedure selection

The accuracy of operation procedure selection improved from 0.77 to 0.85 with AI-3D assistance (estimated improvement 0.08, 95% CI = 0.04 – 0.12, Fig. 3A, B). This improvement corresponded to a 35% reduction in error (RR = 0.65, 95% CI = 0.54 – 0.77). In a binary comparison of the selection between lobectomy and segmentectomy, a 0.04 (95% CI = 0.01 – 0.07) increase in accuracy was noted under AI-3D assistance (Fig. 3C). The improvement was more pronounced in lobectomy than in segmentectomy cases (Fig. S3A), as AI-3D assistance corrected more selections than it misled in lobectomy cases (Fig. S3B). An illustrative case is shown in Fig. S3C and its result in Table S5.
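The link between the reported accuracies and the relative risk (RR) of error can be checked with simple arithmetic. The following is a minimal sketch, assuming that RR is approximated as the ratio of error rates (1 − accuracy) computed from the rounded accuracies quoted in the text; the function name `error_relative_risk` is hypothetical, and the published RRs and confidence intervals come from the full statistical analysis rather than from this shortcut.

```python
def error_relative_risk(acc_with: float, acc_without: float) -> float:
    """Approximate RR of error as (1 - accuracy with aid) / (1 - accuracy without aid)."""
    if not (0.0 <= acc_with <= 1.0 and 0.0 <= acc_without < 1.0):
        raise ValueError("accuracies must lie in [0, 1], and acc_without must be below 1")
    return (1.0 - acc_with) / (1.0 - acc_without)


# Rounded accuracies quoted in the Results (with AI-3D, without AI-3D).
reported = {
    "anatomical variant identification": (0.87, 0.78),
    "operation procedure selection": (0.85, 0.77),
}

for task, (with_ai, without_ai) in reported.items():
    rr = error_relative_risk(with_ai, without_ai)
    print(f"{task}: RR = {rr:.2f}, error reduction = {1 - rr:.0%}")
```

Under this approximation the two ratios round to 0.59 and 0.65, in line with the RRs reported for the two tasks.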

Fig. 3: The operation procedure selection analysis.

A The accuracy of operation procedure selection was improved by AI-3D assistance. B The accuracy of operation procedure selection of each reader was improved by AI-3D assistance. (n = 140, the error bar stands for the 95% CI). C Binary selection between lobectomy and segmentectomy was improved by AI-3D assistance. D The heatmap demonstration of error types of resection extent determination with or without AI-3D. E Insufficient and mistaken resection were profoundly reduced with AI-3D assistance. (n = 1400, the error bar stands for the upper side of the 95% CI, the lower side is hidden to avoid overlaying numbers presented in the chart). Source data are provided as a Source Data file.

A comprehensive error analysis revealed that the improvement in operation procedure selection with AI-3D assistance was concentrated in certain error types (Fig. 3D, and Supplementary figs. S3D, S3E). Mistaken resection, defined as resecting the wrong lesion, was reduced by 73% (RR = 0.27, 95% CI = 0.16 – 0.45, Fig. 3E). Insufficient resection, characterized by inadequate resection margin, was decreased by 51% (RR = 0.49, 95% CI = 0.38 – 0.70, Fig. 3E). In contrast, the reduction in excessive resection, which involves removing more lung than necessary, was minimal at 2% (RR = 0.98, 95% CI = 0.77 – 1.26, Fig. 3E). These findings highlight the primary role of AI in operation procedure selection: it effectively pinpoints the target lesion and minimizes the risk of insufficient resection due to misinterpretation of resection margins.

Time consumption, confidence, and interobserver agreement

AI-3D assistance reduced the median time consumption for surgery planning by 63 seconds (95% CI = 42 – 78), representing a 25% decrease (RR = 0.75, 95% CI = 0.69 – 0.80; Fig. 4A, B). Notably, cases that initially required longer planning times were associated with lower accuracy, but these cases showed more substantial improvements with AI-3D assistance, indicating that the AI system is particularly beneficial in more challenging scenarios (Fig. S4). The inference time for the 3D reconstruction itself was 233.76 ± 75.08 seconds. While a direct comparison is beyond the scope of this study, prior experience suggests manual segmentation could take up to 30 minutes.

Fig. 4: The time consumption and confidence analysis.

A The overall time consumption was decreased by AI-3D assistance. B Time consumption difference between AI-3D and 2D approach among 10 readers. C The confidence of operation procedure selection was improved by AI-3D assistance. (n for anatomical variant identification are 11,903 with AI-3D and 11,471 with 2D; n for operation approach selection is 1,382 with or without AI-3D; the error bar stands for the 95% CI). D Positive correlation was observed between operation procedure selection and readers’ confidence level. (The error bar stands for the 95% CI of the regression line). E Positive correlation was observed between anatomical variant identification and readers’ confidence level. (The error bar stands for the 95% CI of the regression line). Source data are provided as a Source Data file.

In the post-hoc analysis of confidence, cases were categorized as confident (confidence score = 100) and non-confident (confidence score < 100), allowing for a binary analysis. The use of AI notably increased the proportion of confident cases in both anatomical variant identification (RR = 1.31, 95% CI = 1.29 – 1.34, Fig. 4C) and operation procedure selection (0.67 vs 0.57, RR = 1.17, 95% CI = 1.10 – 1.25, Fig. 4C). Higher confidence was correlated with higher accuracy in both anatomical variant identification and operation procedure selection (Fig. 4D, E).

Agreement among readers was significantly improved by AI-3D assistance in both anatomical variant identification (κ = 0.43 vs 0.33) and operation procedure planning (κ = 0.76 vs 0.70) (Table 2).

Table 2 Interobserver agreement

Satisfaction with the AI-3D system

As for the independent performance of the model, the overall satisfaction was 99%. Among the 1% (14/1400) of cases reported with unsatisfactory reconstruction performance, 43% (6/14) involved structures masked by the pulmonary lesion, 29% (4/14) involved unsatisfactory distal branches, 14% (2/14) were classification errors (pulmonary veins misidentified as arteries), and 14% (2/14) were unclear demonstrations reported without detailed description. Unsatisfactory reconstruction had minimal influence on the accuracy (0.82) of identifying anatomical variants. The high satisfaction rate underscores the usability and reliability of the AI-3D system in clinical practice.

Discussion

In this study, we highlighted the pronounced positive impact of AI-3D assistance in improving anatomical variant identification and operation procedure selection across medical centers. Our results demonstrated accuracy improvements averaging around 10%, leading to a substantial reduction in errors ranging from 36% to 52%. Such enhancements are crucial for surgeons, as even minor errors can have severe repercussions for patient outcomes. Therefore, the reduction in errors achieved by the AI-3D system is both statistically significant and clinically meaningful.

AI has emerged as a crucial factor in augmenting the accessibility of 3D reconstruction in medical imaging. Traditional computer vision methods were hindered by several drawbacks: they were time-intensive, often requiring 0.5 to 2 hours for reconstruction16,17, which was impractical in busy clinical environments; the reconstruction quality was often impaired by artifacts18 such as staircase effects or blockiness that required manual correction; and accuracy was limited, especially in non-contrast CT scans, as these methods depend on the contrast among arteries, veins and soft tissue. The integration of AI has addressed these issues by profoundly enhancing efficiency, decreasing artifacts, and improving accuracy in non-contrast CT16,19,20. In our study, the time required for the reconstruction process was reduced by 20-fold, and image interpretation time was decreased by 25%. Compared to traditional manual reconstruction, these enhancements saved more than 30 minutes in the planning process. Furthermore, the reconstruction quality and accuracy were satisfactory in 99% of cases. These findings suggest that the integration of AI algorithms has strengthened 3D reconstruction, making it an accessible and reliable auxiliary tool for thoracic surgery planning.

The indication of using AI-3D should be considered from both patients’ and surgeons’ perspectives. While our results showed significant benefits across all patients, certain subpopulations may particularly benefit from AI-3D assistance. Patients with low prevalence anatomical variants, cases that typically require longer planning times, and scenarios where surgeons have lower confidence levels are more prone to inaccuracies. These factors may serve as practical indicators for utilizing AI-3D reconstruction algorithms. However, given the potentially devastating consequences of misidentifying anatomical structures, such as the mis-ligation of arteries or massive bleeding, we advocate using the AI-3D system in both lobectomy and segmentectomy cases, especially in more complex and unfamiliar procedures, such as complex segmentectomies.

Our study showed that AI-3D assistance is beneficial for surgeons across all experience levels. Although some studies have demonstrated that less experienced clinicians (e.g., trainees) benefit more from AI assistance than their more experienced peers21,22,23, our results showed a significant impact on surgeons who have been practicing for 6 – 19 years, ranging from junior to senior attendings alike. Notably, interobserver agreement significantly improved with AI-3D assistance, indicating that the system helps decrease errors and unify surgical decisions regardless of experience level. Furthermore, the improvement observed in our study quantitatively aligns with findings from previous publications8,20, suggesting that this approach may be widely applicable among various surgeons.

Although no unforeseen complications introduced by human-AI interaction were observed, intrinsic risks in applying AI in medicine should not be neglected. Several studies have underscored an increase in sensitivity with a decrease in specificity21,24,25,26,27. This was particularly evident in early AI applications in thoracic surgery, such as pulmonary nodule detection. These algorithms struggled to minimize false positives. For convenience, rule-based constraints were usually implemented, sacrificing the flexibility and generalizability of the algorithm16. In our study, the risk of misleading surgeons in the selection of operation procedures, especially for potential segmentectomy candidates, should be noted. For anatomical variant identification, in contrast, the AI-3D system outputs the anatomical structure instead of a binary conclusion. The output serves as an intermediate step in the surgery planning process and is subsequently interpreted by the surgeon, which minimizes the risk of misleading. This success highlights the advantage of the human-in-the-loop paradigm, where AI works collaboratively with human expertise rather than replacing it28.

This study has several limitations. First, while the identification of anatomical variants is crucial for surgical planning, it does not directly translate to improved surgical outcomes, such as reduced intraoperative bleeding, shorter operative times, or fewer complications. Previous studies have yielded mixed results regarding the impact of AI-driven 3D reconstruction on surgical outcomes8,15. For instance, while Li et al.8 reported improvements in operative times, the only randomized controlled trial (RCT) to date found no significant difference in operative times with or without AI-3D assistance15. This conclusion, however, may be due to an underpowered study design. The hypothesized 14% decrease in operative time is larger than the degree of the benefit we observed in preoperative planning (8% increase in identification accuracy). Additionally, the predominance of simple segmentectomy cases in the RCT likely diluted the potential effects of AI-assisted preoperative planning. Furthermore, most errors in identifying anatomical variants during planning are corrected intraoperatively, which may further obscure the impact of AI-3D assistance on surgical outcomes. Nevertheless, it is reasonable to infer that improved preoperative understanding of anatomical structures could improve surgical outcomes. In cases involving interlobular arteries or veins, prior knowledge could prevent bleeding caused by blunt dissection of the interlobular fissure; in segmentectomy cases, the misidentification of segmental pulmonary arteries and veins may cause mis-ligation of these structures. A better understanding of anatomical structures may reduce the need for intraoperative observation and potentially shorten operative times, as was noted in Li et al.’s study8. Future research, including RCTs for a certain subset of patients, is required to establish a more definitive relationship between AI-3D assistance and patient outcomes.

Second, the MRMC study underrepresented rare anatomical variants (e.g., independent upper pulmonary vein, bronchus suis, pulmonary artery sling), and the algorithm’s and surgeons’ performance in such cases remains to be evaluated. A dedicated database of these rare variants is currently being compiled for future validation.

Third, the selection of surgical procedure was based on a resection margin no less than the diameter of the nodule on chest CT. However, in real-world clinical practice, the ideal resection margin is more closely related to the pathological margin and the local recurrence rate. These important considerations fall beyond the scope of this study and warrant further investigation in future research.

Fourth, although exploratory analyses suggested a greater benefit for less common variants and less confident cases, these analyses were not pre-specified. Furthermore, the current sample size may be insufficient to definitively establish the presence of these beneficial subgroups. While these trends offer valuable potential directions for future research, they require confirmation in a larger, pre-specified study.

Finally, the observed improvements in our study were less pronounced than clinically anticipated. This may be due to the inherent limitations of MRMC design, which may not fully capture the complexities of real-world clinical workflows, including data interpretation sequence, workload, teamwork dynamics, and consequences of misinterpretation. These factors can introduce bias in assessing the perceived value of AI-3D reconstruction29,30. Nevertheless, the MRMC design offers a valuable balance between controlling for bias and reflecting real-world variability, enabling a rigorous evaluation of the AI-3D system’s clinical value30.

In conclusion, the AI-3D system enhances the accuracy of anatomical variant identification during preoperative planning for lung surgery. This improvement, coupled with increased accuracy in operation procedure selection and a reduction in planning time, benefits surgeons across a range of experience levels. While these findings suggest the potential for improved surgical outcomes and expanded surgical capacity, further research, including prospective trials incorporating perioperative patient outcomes, is necessary to definitively establish the clinical value of the AI-3D system.

Methods

Ethics statement

The study was approved by the Institutional Review Boards of Peking University People’s Hospital (2021PHA038-002), Shanghai Pulmonary Hospital (21Q035XW-1), and the Second Xiangya Hospital of Central South University (2021-K380). A waiver of consent was granted due to the retrospective nature and minimal risk of the study. All data were anonymized to protect patients’ privacy.

Study design and participants

This retrospective, multi-center, MRMC study (Fig. 1) was designed to evaluate the effectiveness of an automatic surgery planning software for chest computed tomography (CT) images (the AI-3D system). The primary objective was to compare the accuracy of thoracic surgeons in identifying anatomical structures during preoperative planning for anatomical lobectomy and segmentectomy, with and without the assistance of the AI-3D system.

Patients were retrospectively enrolled from three centers: Peking University People’s Hospital, Shanghai Pulmonary Hospital, and the Second Xiangya Hospital of Central South University, between July 2021 and January 2022. Eligible patients were adults aged 18 years or older who had undergone anatomical lobectomy or segmentectomy and had high-quality chest CT images in Digital Imaging and Communications in Medicine (DICOM) format with a slice thickness of ≤ 2 mm and no interval reconstruction. Exclusion criteria included patients with prior lung surgery or chest trauma, poor-quality imaging due to non-standard scanning or motion artifacts, trans-lobar invasion of target nodules, or incomplete clinical data (Supplementary note 1).

A total of 450 patients from 3 centers were retrospectively collected based on inclusion and exclusion criteria by 3 thoracic surgeons. From this pool, 140 patients were randomly selected to constitute the reader study dataset.

Ten board-certified thoracic surgeons participated as readers in the study. All readers held at least the title of attending physician, had a minimum of one year of experience in thoracoscopic surgery, possessed medical practitioner qualification certificates, and had Good Clinical Practice certification.

An expert panel of three senior thoracic surgeons, independent of the readers, established the gold standard by reviewing 2D CT, manual 3D reconstruction, surgical videos and logs. The surgical margin was suggested to be no less than the diameter of the nodule on chest CT. The panel consisted of thoracic surgeons with associate senior clinical titles or above. Disagreements were resolved through arbitration by a third expert to establish the gold standard.

AI-3D system for thoracic surgery planning

The AI-3D system (InferOperate Thorax) was developed using deep learning algorithms (Supplementary note 2, Fig. S1). For bronchial segmentation, a patch-based three-dimensional U-Net architecture was employed. Lung vessel segmentation combined 2.5-dimensional and three-dimensional models to optimize performance and computational efficiency. Specifically, the 2.5D model segmented pulmonary blood vessels within the lung parenchyma, while the 3D network focused on mediastinal vessels. A region-growing algorithm extended the 3D segmentation results into the peripheral lungs (Fig. S1). The segmented bronchi and vessels were converted into triangular meshes using the marching cubes algorithm, and ray casting with depth peeling was used for efficient 3D rendering to assist surgical planning. A previously developed lesion detection module was embedded in the system.
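To illustrate the general idea of region growing mentioned above, the following is a minimal sketch and not the implementation used in the AI-3D system. It assumes binary voxel masks stored as nested lists indexed as [z][y][x], 6-connectivity, and a simple rule that a voxel is added when it touches the grown region and belongs to a candidate mask (for example, the peripheral vessel mask). The function name `region_grow` and both mask arguments are hypothetical.

```python
from collections import deque


def region_grow(seed_mask, candidate_mask):
    """Extend seed voxels through 6-connected voxels that are set in candidate_mask."""
    depth, height, width = len(seed_mask), len(seed_mask[0]), len(seed_mask[0][0])
    grown = [[[bool(seed_mask[z][y][x]) for x in range(width)]
              for y in range(height)] for z in range(depth)]
    # Start the breadth-first search from every seed voxel.
    queue = deque((z, y, x)
                  for z in range(depth) for y in range(height) for x in range(width)
                  if grown[z][y][x])
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in offsets:
            nz, ny, nx = z + dz, y + dy, x + dx
            inside = 0 <= nz < depth and 0 <= ny < height and 0 <= nx < width
            if inside and not grown[nz][ny][nx] and candidate_mask[nz][ny][nx]:
                grown[nz][ny][nx] = True
                queue.append((nz, ny, nx))
    return grown


# Toy 1 x 1 x 5 volume: the seed reaches the two adjacent candidate voxels,
# but not the isolated candidate at the far end.
seed = [[[1, 0, 0, 0, 0]]]
candidate = [[[0, 1, 1, 0, 1]]]
print(region_grow(seed, candidate)[0][0])
```

In practice the growth criterion would typically depend on image intensity or model probability rather than a fixed mask; the fixed mask is used here only to keep the sketch self-contained.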

The algorithmic performance of the system is detailed in Supplementary note 2.

Randomization and procedures

The study employed a two-stage, fully crossed, crossover design to reduce bias. Readers were randomly assigned to two groups using a computer-generated randomization schedule. In the first stage, Group A performed preoperative planning for all cases with AI-3D assistance, while Group B planned without AI-3D assistance. After a washout period of at least 28 days to mitigate memory bias, the two groups crossed over in the second stage: Group A planned without AI-3D assistance, and Group B with AI-3D assistance. The sequence of case presentation was randomized in each stage, and readers were blinded to the surgery actually performed, to further minimize potential biases.
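The crossover allocation described above can be sketched as follows. This is an illustrative sketch only: the seed, the function name `assign_crossover`, the reader and case identifiers, and the choice to reshuffle the case order separately for every reader in every stage are assumptions, since the text specifies only that group assignment was computer-generated and that case order was randomized in each stage.

```python
import random


def assign_crossover(readers, cases, seed=0):
    """Split readers into two equal groups and build a two-stage crossover schedule."""
    rng = random.Random(seed)  # fixed seed (assumed) so the schedule is reproducible
    shuffled = list(readers)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    group_a, group_b = shuffled[:half], shuffled[half:]

    schedule = {}
    # Stage 1: Group A reads with AI-3D, Group B without; stage 2 swaps the conditions.
    for stage, (cond_a, cond_b) in enumerate([("AI-3D", "2D"), ("2D", "AI-3D")], start=1):
        for reader in shuffled:
            order = list(cases)
            rng.shuffle(order)  # assumed: independent case order per reader and stage
            condition = cond_a if reader in group_a else cond_b
            schedule[(stage, reader)] = (condition, order)
    return group_a, group_b, schedule


readers = [f"reader_{i}" for i in range(1, 11)]   # ten readers, hypothetical IDs
cases = [f"case_{i:03d}" for i in range(1, 141)]  # 140 cases, hypothetical IDs
group_a, group_b, schedule = assign_crossover(readers, cases)
print(len(group_a), len(group_b), schedule[(1, group_a[0])][0], schedule[(2, group_a[0])][0])
```

Because every reader sees every case under both conditions, the design is fully crossed; the washout period between stages is an operational constraint and is not modelled here.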

Prior to the study, readers received standardized training on the use of the AI-3D system and the assessment procedures. Training included operating the software, interpreting AI-3D reconstructions, and completing the standardized surgical planning forms.

Readers assessed each case using a standardized surgical planning form focused on identifying anatomical variants and selecting the operation procedure (Supplementary note 3). The time taken for each planning session was recorded in seconds using a screen timing device. For each question, readers assigned a confidence score based on their certainty in their planning decisions, with higher scores reflecting greater confidence. After each use of the AI-3D system, readers completed a questionnaire to evaluate user satisfaction on a 5-point Likert scale.

End points

The primary endpoint was the case-wise accuracy of anatomical structure identification. Accuracy per case was calculated as the number of anatomical structures correctly identified by the reader divided by the total number of anatomical structures determined by the expert panel.

Secondary endpoints included 1) Time efficiency: The time required for anatomical structure identification during preoperative planning, measured from the start of planning to the completion of the anatomical structure identification form; 2) Surgical procedure selection accuracy: The consistency rate between the readers’ surgical procedure determinations and those of the expert panel; 3) User satisfaction: Surgeons’ satisfaction levels with the AI-3D system, assessed using a Likert scale ranging from 1 (very dissatisfied) to 5 (very satisfied).

Statistical analysis

Based on a pilot study involving four thoracic surgeons and 20 patients, which showed an effect size of 0.13 in the accuracy difference of identifying anatomical structures with and without the AI-3D system, a conservative effect size of 0.7 within each of the ten participating surgeons was assumed. Using these parameters, it was determined that at least 112 chest CT scans were required to achieve 80% power at a two-sided significance level of 5%. To account for potential dropouts and missing data, 140 cases were included.

The Dorfman-Berbaum-Metz-Hillis (DBMH) method was used to analyze the effectiveness of the AI-3D system in the multi-center MRMC study. The null hypothesis was that the readers’ average identification accuracy with AI-3D assistance was less than or equal to that without AI-3D assistance. Continuous data were compared using t-tests for normal distributions (presented as mean ± standard deviation) and Mann-Whitney tests for skewed distributions (presented as median [P25, P75]). Simple linear regression was used for regression tasks, with Weighted Least Squares (WLS) regression employed when data count disparities were too large. Fleiss’ Kappa statistic was used to assess interobserver agreement among multiple readers. P < 0.05 was considered statistically significant. Data were analyzed using SPSS 23.0, SAS 9.4, Python, and RStudio. An independent review committee assessed the primary and secondary endpoints of the MRMC in parallel with the investigator.
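For readers unfamiliar with Fleiss’ Kappa, the following is a minimal sketch of the standard formula in plain Python. It is not the code used in the study; the function name `fleiss_kappa` and the toy rating table are hypothetical. The input is a table in which each row is one rated item and each column is one category, with every cell holding the number of readers who chose that category; every item must be rated by the same number of readers.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table counts[item][category] of rater counts."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    # Observed agreement: mean over items of the share of agreeing rater pairs.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items

    # Chance agreement from the overall proportion of ratings in each category.
    n_categories = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_categories)]
    p_expected = sum((t / (n_items * n_raters)) ** 2 for t in totals)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy table: 3 items, 4 readers, 2 categories, with perfect agreement on each item.
print(fleiss_kappa([[4, 0], [0, 4], [4, 0]]))
```

Values near 1 indicate strong agreement among readers and values near 0 indicate agreement no better than chance, which is the scale on which the reported κ values should be read.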

Missing data were addressed as follows: In the primary endpoint analysis assessing the accuracy of anatomical variant identification, 918 of 24,400 (3.8%) instances had missing identifications. These were considered misidentifications, reflecting a worst-case scenario. An exploratory analysis excluding these cases was also conducted to assess the robustness of the primary endpoint. Additional missing data points were excluded from their respective analyses, including time consumption data (24/2,800, 0.9%), confidence ratings for operation procedure selection (18/2,800, 0.6%) and for variant identification (1,026/24,400, 4.2%).
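The two ways of handling missing identifications can be illustrated with a small sketch of the case-wise accuracy calculation. This is a simplified illustration, not the study’s analysis code: the function name `case_accuracy`, the structure labels, and the answer strings are hypothetical, and the "exclude" policy is assumed here to drop unanswered structures from the denominator, which is one possible reading of the exploratory analysis.

```python
def case_accuracy(answers, gold, missing="worst_case"):
    """Case-wise accuracy: correctly identified structures / structures counted.

    answers: dict mapping structure name to the reader's answer (None if unanswered).
    gold:    dict mapping structure name to the expert panel's answer.
    missing: "worst_case" counts unanswered structures as wrong;
             "exclude" (assumed policy) leaves them out of the denominator.
    """
    if missing not in ("worst_case", "exclude"):
        raise ValueError("missing must be 'worst_case' or 'exclude'")
    correct = 0
    counted = 0
    for structure, truth in gold.items():
        answer = answers.get(structure)
        if answer is None and missing == "exclude":
            continue
        counted += 1
        if answer is not None and answer == truth:
            correct += 1
    return correct / counted if counted else float("nan")


# Hypothetical case with four structures: two correct, one wrong, one unanswered.
gold = {"A1": "typical", "A2": "variant", "V1": "typical", "B1": "variant"}
answers = {"A1": "typical", "A2": "variant", "V1": "variant", "B1": None}
print(case_accuracy(answers, gold, "worst_case"), case_accuracy(answers, gold, "exclude"))
```

Comparing the two policies on the same data shows how much of an apparent accuracy difference is driven by unanswered items rather than by wrong answers.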

Study registration

This study, conducted for National Medical Products Administration product registration, was not registered in a public database to protect trade secrets. The clinical study complied with Good Clinical Practice provisions and completed the premarket clinical study in three centers. The product registration acceptance number is CQZ2301043. Data entry and management were completed by the Statistics Teaching and Research Department of Capital Medical University, while data analysis and statistical analysis reports were completed by the Biostatistics Department of the School of Public Health of Peking University, both independent third parties.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.