Introduction

Worldwide, colorectal cancer (CRC) is a highly prevalent cancer type, with millions of new cases being reported annually1. While advances in surgical techniques and adjuvant therapies have significantly improved patient survival outcomes2,3,4, the post-treatment quality of life (QoL) of patients with CRC has received insufficient attention. Bowel resection and anastomosis can result in lifestyle changes that negatively affect patients’ QoL5,6. Moreover, there has been a recent emphasis on the importance of dietary management and physical activity in patients with CRC. Physical activity reduces the risk of CRC recurrence, while a high-insulinogenic diet is associated with an increased risk of recurrence and mortality7,8,9. Therefore, postoperative lifestyle modification in patients with CRC could further improve patient outcomes and QoL. However, implementing personalized lifestyle modifications using conventional postoperative management protocols presents a significant challenge.

The time interval for routine post-surgery surveillance for patients with CRC is every 6 to 12 months, for up to 5 years10. In this context, it is challenging for clinicians to obtain comprehensive information regarding patients’ lifestyles, making personalized consultations impractical. To overcome time and distance obstacles, portable devices such as smartphones have been proposed for patient management11,12. Technological advances have allowed daily information collection via digital devices, including step counts, calories consumed, dietary habits, and real-time patient feedback. Accordingly, Chung et al. reported that the use of smartphone applications and wearable armbands significantly improved distress levels12. Moreover, mobile applications have been applied to patients with CRC before and after surgery to enhance self-management, nutrition and diet management, education, and mental health13,14,15.

Interest in, and evidence on, the use of digital interventions for improving healthcare have increased significantly. In 2022 alone, nearly 10,000 related papers were registered in the National Center for Biotechnology Information Literature Resources (search keyword “digital intervention” & “health”), which is thrice the number of papers registered since 2019. Currently, more than 50,000 medical applications exist. Previously, we proposed using smartphone applications to assess patients’ daily lifestyles and offer personalized advice16. We selected three mobile health (mHealth) applications with different functions and goals. While one application guides users based on cancer-specific information, the other two applications focus on weight control by recording body weight, caloric intake, and exercise routines. These applications have distinct characteristics and were included to evaluate the overall impact of easily accessible commercial mHealth applications (whether cancer-specific or not) on QoL. However, no solid evidence has demonstrated that mobile applications meet the high expectations of digital healthcare. Several studies have reported that digital healthcare has an insufficient effect or no effect17,18,19. Recent studies have reported the possibility of CRC patients utilizing mobile devices at home after surgery to detect complications or monitor physical activity; however, the results of these studies show that the objective benefits of mHealth devices for clinical outcomes have yet to be established20,21,22. To date, systematic reviews have shown that the wide range of digital healthcare platforms has not been extensively evaluated; thus, the merits or demerits of such applications remain inconclusive23,24. Accordingly, high-quality research on the effectiveness of digital healthcare is warranted.

This study presents the 6-month outcomes of a previously proposed trial. Specifically, we aimed to evaluate the effectiveness of a smartphone healthcare application intended to assist patients in managing their lifestyles following curative surgery for CRC. The results offer insights into the potential of digital interventions in encouraging healthy lifestyles and improving outcomes after CRC surgery.

Results

Patient enrollment and follow-up

Initially, 579 patients were eligible for screening, of whom 259 were excluded: 13 declined to participate, and 246 met the exclusion criteria. Ultimately, 320 patients were included in the study. The first patient was enrolled on November 4, 2020, while the last patient was enrolled on November 5, 2021. Baseline values were complete for all 320 patients. At the 6-month follow-up, 298 (93.1%) patients visited the outpatient clinic for evaluation, with 277 (86.6%) patients completing the questionnaires. The most common reason for loss to follow-up was the COVID-19 pandemic, which was surging at the time of the trial (Fig. 1).

Fig. 1: Flowchart of the study’s screening and inclusion process.

Groups A, B, C, and D had 81, 80, 79, and 80 participants, respectively. Baseline characteristics were comparable across all groups. The EQ-5D index scores at baseline were likewise similar across groups, although they were slightly higher in Groups C and D. This trend of higher QoL scores was also observed on the SF-12 and HINT-8, albeit without statistical significance. Other than the significantly higher triglyceride levels in Group C, the metabolic parameters were comparable across groups (Table 1). Postoperative morbidity and mortality rates were comparable across all groups (Table 2). Early postoperative complications occurred in 11, 15, 13, and 9 patients in Groups A to D, respectively (P = 0.567). Ileus was the most common early complication. Each group had 4 or 5 complications classified as Clavien-Dindo Grade 3 or higher (P = 0.669), and no deaths occurred.

Table 1 Clinicopathological characteristics of the study participants
Table 2 Postoperative complications

Clinical outcomes at 6 months postoperative

At six months postoperative, no significant between-group differences were observed for the EQ-5D index scores (F = 0.452, P = 0.716). Additionally, although the EQ-5D index score showed greater improvement in all three intervention groups than in the control group at six months postoperative, the difference was not statistically significant (F (3,272) = 1.872, P = 0.134; Fig. 2). Similarly, the improvement on the other QoL measurement tools (SF-12, FACT-C, and HINT-8) did not significantly differ between the intervention and control groups (Table 3).

Fig. 2: Box plots comparing the EQ-5D dimension results for each intervention and control group.

(a) EQ-5D index score at six months postoperative; (b) changes in the EQ-5D index scores from baseline to six months postoperative.

Table 3 Comparison of questionnaire scores between baseline and six-month follow-up

Among the metabolic parameters and fat/muscle areas, the repeated-measures ANOVA revealed a significant difference in the skeletal muscle area (SMA; F (3,291) = 2.692, P = 0.046). The post hoc analysis that compared the intervention groups with the control group showed that Group C had a significantly greater improvement in the SMA at six months postoperative (4.02 cm² vs. –4.21 cm²). The improvements in the other outcomes did not significantly differ between the intervention and control groups (Table 4).

Table 4 Comparison of metabolic parameters between baseline and six-month follow-up

Subgroup analysis according to compliance with application usage

Login data from the six-month study period were available for 235 (97.9%) patients in the intervention groups. The mean number of days of application usage over the study period was 41.0 (SD = 19.2). No significant differences were found between the patients who used the application more (n = 117) and less (n = 116) than the mean value in terms of age (mean (SD), 54.1 (8.58) years vs. 54.9 (8.97) years, P = 0.514) or sex distribution (male 45.4% vs. 54.6%, P = 0.098). Subsequently, we performed a subgroup analysis comparing patients who used the application more than the mean value with the control group. The repeated-measures ANOVA revealed a significant between-group difference in the EQ-5D index scores, while the post hoc analysis did not show significant differences in the pairwise comparisons (Supplementary Table 1). Similar to the ITT analysis, among the metabolic parameters, only the SMA showed a significant difference. Post hoc analysis showed significantly greater improvement in Group C’s SMA at six months postoperative compared with the control group (18.94 cm² vs. –4.21 cm²; Supplementary Table 2). An identical analysis was conducted using the login data from the first 12 weeks of the study period, which yielded results similar to those at 6 months (Supplementary Tables 3, 4).

The three intervention groups showed distinct compliance patterns. Groups A and B had a higher percentage of patients in the long-use cluster (45.7% and 24%, respectively) than Group C (5.0%; Supplementary Fig. 1).

Discussion

This study examined the impact of digital intervention through mobile applications designed to help patients with CRC manage their lifestyle following curative surgical resection. We assessed the effects of three different mHealth applications in a prospective cohort of individuals with CRC. The current short-term results indicate that using mobile applications did not significantly affect the EQ-5D index score. However, we found a significant improvement in the SMA in the intervention group that used an application (Second Doctor©) that provided coaching to patients based on cancer-specific information.

Recently, interest in digital interventions in healthcare has risen, partly because of the COVID-19 pandemic, which made remote medical care essential during quarantine. While Internet-mediated interventions using stationary desktop computers were the primary focus in the initial studies on digital interventions, this approach has been noted for its reduced effectiveness owing to decreased patient adherence over time25. Recently, health-related mobile applications have become a research focus, and various studies have demonstrated their effectiveness in facilitating communication between clinicians and patients26,27. Research has shown promising results regarding the efficacy of mobile digital interventions for patients with breast cancer, including an increase in step counts (P < 0.0001) and a decrease in distress scores (P = 0.009)26. However, this remains to be established in patients with CRC, given the limited number of published randomized or retrospective studies assessing mHealth applications.

The lack of significant improvement in our study suggested that the selected mobile applications had a limited impact on the overall well-being of patients with CRC, which could have occurred for multiple reasons. The primary function of mobile applications is to offer patients continuous clinical support and provide medical personnel with self-documented or automated real-time data from patients. This process can be complex, given the substantial differences in lifestyle changes and individual responses to digital interventions. We observed significant differences in adherence to each application; additionally, improvements in clinical outcomes differed according to the type of application used. Studies evaluating digital interventions have raised similar concerns regarding the importance of user compliance with applications28,29. Factors such as interface design (e.g., larger display, illustrations) and other motivational components have been found to influence the user-friendliness of applications30.

Elucidating why there was greater user compliance with Noom© than with Second Doctor©, as well as why Second Doctor© was the only application that showed significant improvement in clinical outcomes, could offer valuable insights for future interventions. We found that age and sex were not related to application compliance. We speculate that the easily accessible interface of Noom© and the personalized coaching based on CRC information from Second Doctor© contributed to better results. Other factors, such as the fact that WalkOn© is free of charge or the discontinuation of the free-of-charge period for Noom© after 12 weeks, might have affected the outcomes. In a different study, Chin et al. assessed patients using the Noom© application for 19 months, with a median duration of app usage of 267 days (interquartile range 182)31. While that study shows a relatively longer use of the application, it selectively included patients who used the application twice or more monthly for six consecutive months. This indicates that users with low compliance were excluded, unlike in the present study. Other studies also enrolled patients who actively logged on to mHealth applications32. This difference in the inclusion criteria makes it difficult to directly compare application usage reported in prior studies with that in the present study, in which patients were randomly assigned to applications and could decide to use the applications at their discretion. Furthermore, forcing patients to use such applications can be extremely taxing, and even using login data as the compliance index can be controversial (simply logging into the application may not necessarily indicate patient engagement with its content). Overall, the patients randomly assigned to the intervention groups showed relatively low compliance, with an average of only 41 days of application login over the 6-month period. Therefore, the study’s non-significant results should be interpreted cautiously. The importance of compliance with such applications warrants further investigation in future studies.

We do not recommend generalizing the current findings to the overall effect of mHealth applications, as only three applications were included in our study. As mentioned previously, the applications had different goals and were not specifically designed for patients who have undergone surgery for CRC. Step counts, caloric intake, and weight loss data were not collected to assess the direct cause-and-effect relationship with cancer surgery. Consequently, this study could not estimate the step count required to have an impact. Although data for the anthropometric index (i.e., BMI, waist circumference, and fat/muscle area in abdominal CT) were collected, these were not the primary outcomes, and they were not anticipated to have an evident positive effect in this short-term outcome report. A different study design would be necessary to assess the effects of each application in increased detail, involving the collection of a more exact step count, strict documentation of caloric intake and exercise, and even mandatory application usage with a method to evaluate a representative index for adherence to provided functions. Related to this, a study investigating the WalkOn© application confirmed that a step count of approximately 8,683 per week was correlated with a 0.77-point decrease in distress scores26. Although this finding does not confirm a significant effect of step counts in CRC patients, it presents evidence of the beneficial effects of increased step counts. Challenges remain in studying such digital healthcare devices. A systematic review identified the limitations of mHealth applications, notably that inaccurate data input can lead to misinterpretation and that few studies have addressed application “use and adherence.”33 Nonetheless, in this study, we aimed to assess the impact of easily accessible commercial mHealth applications on patients’ QoL through a holistic approach using various questionnaires.

The EQ-5D is widely used to assess overall health-related QoL in diverse patients. Its five dimensions (i.e., mobility, self-care, usual activities, pain/discomfort, and anxiety/depression) are considered reliable for comparing the overall health status among patients with various diseases34,35. However, this questionnaire has limitations, including limited sensitivity to change and a lack of specificity36,37. In cases where improvements reach the ceiling of the scale or intervention effects are subtle, the score may not adequately capture the patient’s true health status. Moreover, it cannot assess disease-specific manifestations in detail, such as bowel habit changes following colorectal surgery. The negative findings of the present trial may, in part, be attributed to an inappropriate designation of the primary endpoint. Postoperative complications have been shown to be associated with QoL following colorectal surgery38,39. Not only life-threatening complications, such as anastomotic leakage, but also chronic functional disorders, including low anterior resection syndrome (LARS), impact QoL after bowel surgery40,41. The current study showed comparable early and late complication rates, as well as comparable rates of individual morbidities such as anastomotic leakage or LARS, among the four groups, suggesting minimal unanticipated influence from postoperative complications on the primary outcome. However, it should be noted that stratified randomization for this study did not account for morbidity; therefore, the influence of postoperative complications on the outcomes cannot be entirely excluded. The comparable incidence of postoperative morbidity among the four groups appears coincidental rather than the result of statistical adjustment. Future studies should consider selecting alternative, objective primary endpoints representing CRC-specific functional outcomes, such as LARS scores, alongside health-related QoL questionnaires, within a stratified study population that accounts for postoperative morbidity.

This study has some limitations. The first pertains to the period during which the study was conducted. COVID-19 infection rates in the Republic of Korea, which surged for the second time starting in August 2020, remained high throughout the study period. Numerous studies have reported decreased physical activity during the pandemic42, particularly among older adults in the Republic of Korea43. This may have significantly impacted the intervention groups since quarantine measures and reduced social activities likely disrupted recommended physical activity levels. Additionally, the voluntary use of the applications may have contributed to the lack of significant differences between the intervention and control groups. The limited number of patients who actively used the applications may have weakened the per-protocol analysis of the impact of mHealth applications. Further, the highly homogeneous population of the Republic of Korea may limit the generalizability of compliance with digital healthcare across different races and ethnicities. Moreover, including both rectal cancer and colon cancer could have influenced the results, as changes in bowel habits are known to be more frequent in patients who have undergone rectal resection, in addition to the creation of ostomies. When interpreting the results, it is important to note that the sample size calculation was derived from a previous study that included only breast cancer patients. This may have influenced the findings of our study, which included CRC patients. During study protocol development, we found few studies reporting EQ-5D index scores from patients who underwent colorectal surgery. Although Kameyama et al. reported a similar EQ-5D index score of 0.867 (range 0.324-1.000), the study included a small number of patients (n = 30), and the full study manuscript could not be obtained; only the abstract was available44. Despite these limitations, the strength of the present study lies in its randomized design and the use of objective measures alongside self-reported questionnaire outcomes. Future studies involving large randomized controlled trials are required across different populations to investigate the efficacy of mobile applications in improving the QoL in patients with CRC.

In conclusion, although the intervention groups did not show significant improvements in the primary outcome and most secondary outcomes, our findings demonstrate the potential benefits of digital intervention in enhancing the health of CRC survivors, especially concerning muscle health. However, varying and low compliance with each application must be considered when interpreting this study’s findings. As such, further studies are warranted to determine the importance of compliance with mobile applications. This study contributes to the growing body of evidence on digital healthcare interventions for cancer survivors, and it suggests that, despite high expectations, the efficacy of mobile applications remains unclear.

Materials and methods

Study design

The study design followed a previously published protocol16. A randomized controlled trial design was implemented for a planned follow-up period of 18 postoperative months. Both male and female patients with CRC from across the country were recruited from a university-affiliated referral medical center. This study was approved by the Institutional Review Board (IRB) of Asan Medical Center (IRB no. 2020–1015) and has been reported in accordance with the Consolidated Standards of Reporting Trials guidelines45. Informed consent was obtained from all participants. The patients were informed about the detailed study objectives and invited to participate voluntarily. This study is registered in the Clinical Research Information Service of the Republic of Korea (KCT0005447). The study was audited by the IRB of Asan Medical Center during the enrollment period.

Participants

Patients diagnosed with CRC between November 2020 and November 2021 were eligible for screening and included in the study if they met the following criteria: (1) were aged 20–70 years; (2) had a pathologically confirmed diagnosis of adenocarcinoma in the colon or rectum; (3) had undergone surgical resection with curative intent; and (4) were of Korean nationality and ethnicity. Patients aged 70 years and older were excluded based on two studies reporting information and communication technology anxiety among this population46,47. Patients were also ineligible if they had a distant metastasis, were candidates for neoadjuvant treatment (either chemotherapy or radiotherapy), had a planned permanent ostomy, were pregnant or lactating, had been diagnosed with inflammatory bowel disease or with a malignancy in another organ within the previous 5 years, or were non-ambulant16.

Randomization and intervention

The patients were randomly assigned to one of four study groups using computer-generated random numbers. Subsequently, stratified randomization was performed based on age (≤40 years vs. >40 years) and sex assigned at birth. Groups A (active assurance), B (walking), and C (cancer-specific) were the intervention groups, each being matched with a different type of mHealth application. Group D served as the control group (Fig. 1). Since the usability of applications is crucial for patients’ adherence to provided recommendations, various mHealth applications using different platforms were selected because user-friendliness can vary and affect outcomes.
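The allocation scheme described above can be illustrated with a minimal Python sketch. It assumes permuted blocks of four within each age-by-sex stratum, which the text does not specify; the class name `StratifiedBlockRandomizer`, the `stratum` helper, and the seed are hypothetical and do not represent the trial's actual allocation software.

```python
import random
from collections import defaultdict

GROUPS = ["A", "B", "C", "D"]  # three intervention groups plus control (D)


def stratum(age, sex):
    """Return the stratum key: age band (<=40 vs. >40) and sex assigned at birth."""
    return ("<=40" if age <= 40 else ">40", sex)


class StratifiedBlockRandomizer:
    """Hypothetical allocator: permuted blocks of four within each stratum."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)    # computer-generated random numbers
        self.pending = defaultdict(list)  # unused allocations per stratum

    def assign(self, age, sex):
        queue = self.pending[stratum(age, sex)]
        if not queue:
            # Start a new block containing each group exactly once.
            block = GROUPS[:]
            self.rng.shuffle(block)
            queue.extend(block)
        return queue.pop()


randomizer = StratifiedBlockRandomizer(seed=2020)
first_block = [randomizer.assign(age=55, sex="F") for _ in range(4)]
print(first_block)
```

Under this assumption, every four consecutive patients within a stratum are spread evenly over Groups A to D, which keeps group sizes balanced within each age and sex category.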

The active assurance group (Group A) was assigned to use the Noom© application, which is a paid global mobile application (Noom, Inc., New York, NY, USA). Noom© users can continuously record their body weight, diet patterns, and exercise routines. Both human advisors and artificial intelligence use this information to provide regular supervision to users through text messages (Supplementary Fig. 2). The basic walking group (Group B) was assigned to use WalkOn© (Swallaby Co., Ltd., Seoul, Republic of Korea), a free application that records users’ step count and walking intensity. This application allows users to check their ranking compared to other users and create communities within the application for motivation (Supplementary Fig. 3). The cancer-specific group (Group C) was assigned to use the Second Doctor© application (Medi Plus Solution Co., Ltd., Seoul, Republic of Korea), a paid application specifically designed for patients with cancer. Users of this application can select their specific condition (e.g., CRC) and upload basic information along with details about the treatment they have received. The Second Doctor© application offers one-on-one consultations with human coaches, including nutritionists, and patients can automatically record their daily step count, consumed calories, heartbeat, and sleep patterns using a smart band (Supplementary Fig. 4)16. The patients in the control group received routine education during hospitalization, in which nurses and colorectal surgeons provided basic information about CRC. Moreover, upon discharge, they received booklets that contained diet recommendations for 30 postoperative days (Supplementary Fig. 5).

Outcomes

The primary outcome was the index score from the European Quality of Life-5 Dimensions (EQ-5D), a self-administered paper-based questionnaire assessing health-related QoL. Total scores are converted to a range from 0 to 1, with higher scores representing better health status48.

The secondary endpoints included health-related QoL calculated from multiple instruments (i.e., Health-related Quality of Life Instrument with 8 Items [HINT-8]49, 12-Item Short Form Survey [SF-12]50, and Functional Assessment of Cancer Therapy-Colorectal [FACT-C]51) and metabolic parameters (i.e., blood pressure, body mass index, skeletal muscle area, fat area, waist circumference, fasting glucose, HbA1c, triglycerides, and high-density lipoprotein cholesterol).

Clinicopathologic data, including age, sex assigned at birth, comorbidities, tumor location, type of operation, pathologic stage, and adjuvant treatments, were collected from the participants’ electronic medical records. Fat/muscle areas were assessed using abdominal computed tomography (CT) and calculated as previously described52. Postoperative morbidity and mortality were recorded and classified as early (within 30 days of surgery) or late (within 90 days of surgery) postoperative complications. Complications were graded according to the Clavien-Dindo grading system53. Major complications were categorized as Grade 3 or higher, and minor complications as Grade 1 or 2.

Outcomes were collected at the preoperative baseline and every six months until the end of the study. Each application provided data regarding patient login histories, which allowed for the evaluation of compliance with the use of the mobile application.

Statistical analysis

The sample size was calculated based on the difference in EQ-5D scores at six months postoperative. A previous study on health-related QoL reported EQ-5D index scores of 0.86 (standard deviation (SD) = 0.1) and 0.95 (SD = 0.08) in patients with breast cancer and a healthy age-matched population, respectively, at six months. The significant difference in EQ-5D index score is considered to be between 0.03 and 0.07 (refs. 54,55). Accordingly, we hypothesized a significant difference in EQ-5D at six months postoperative (P < 0.05)56. Patients were randomized into four groups, with any Type 1 errors corrected accordingly. The Type 1 error margin was set at 0.008 in accordance with the study design. Considering a power of 0.8, we estimated that each group would require at least 63 patients to detect a significant difference. To account for an anticipated dropout rate of approximately 20%, the required sample size was inflated to 63 × 100/80 = 78.75 patients and rounded up to 80 patients for each group.
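The arithmetic behind these figures can be checked with a short Python sketch. The dropout inflation follows the formula given in the text; the reading of the 0.008 error margin as 0.05 divided by the six pairwise comparisons among four groups is our assumption, as the text states only that the margin follows from the study design. The function names are hypothetical.

```python
import math
from itertools import combinations


def pairwise_corrected_alpha(n_groups, family_alpha=0.05):
    """Assumed Bonferroni-style correction over all pairwise group comparisons."""
    n_comparisons = len(list(combinations(range(n_groups), 2)))
    return family_alpha / n_comparisons


def inflate_for_dropout(n_required, dropout_percent):
    """Inflate a per-group sample size, as in the text: n * 100 / (100 - dropout)."""
    return n_required * 100 / (100 - dropout_percent)


alpha_per_test = pairwise_corrected_alpha(4)  # four groups -> six comparisons
n_inflated = inflate_for_dropout(63, 20)      # 63 x 100/80
n_minimum = math.ceil(n_inflated)             # smallest whole number of patients

print(f"alpha per comparison: {alpha_per_test:.3f}")
print(f"inflated sample size: {n_inflated}, minimum per group: {n_minimum}")
```

The smallest whole number satisfying the inflated requirement is 79; the trial's target of 80 per group sits just above it.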

Intention-to-treat (ITT) analysis was conducted to compare the intervention groups with the control group. Variables and outcomes were analyzed using a chi-square or Fisher’s exact test. Continuous data were analyzed by conducting Student’s t test or Mann–Whitney U test, as appropriate. Changes at six months postoperative relative to the baseline values were compared between the intervention and control groups using a linear mixed model. Between-group differences in outcome values were compared by conducting a repeated-measures ANOVA. Post hoc analysis using Tukey’s B was performed to determine pairwise comparisons of means that influenced the significant difference identified from the repeated-measures ANOVA. All continuous variables were expressed as mean values with SD, and P values < 0.05 were considered to indicate statistical significance. Missing values were imputed using multiple imputations with SPSS® version 22.0 (IBM, Armonk, NY, USA) based on clinicopathological characteristics, questionnaire scores, and metabolic parameters at two different time points. All questionnaire scores were complete at baseline; however, they were missing for 43 patients at the 6-month follow-up (13.4%). Five imputed datasets were created for analysis.
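The text does not state how results from the five imputed datasets were combined. A common approach is pooling by Rubin's rules, sketched below in Python under that assumption. The function name `pool_rubin` and all numeric values are hypothetical illustrations, not trial data.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-dataset estimates and their squared standard errors (Rubin's rules).

    Returns the pooled estimate and the total variance, which combines the
    average within-imputation variance and the between-imputation variance.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations (m - 1)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total


# Hypothetical between-group differences in EQ-5D change, one per imputed dataset.
example_estimates = [0.10, 0.12, 0.08, 0.11, 0.09]
# Hypothetical squared standard errors of those differences.
example_variances = [0.0004] * 5

pooled, total_variance = pool_rubin(example_estimates, example_variances)
print(pooled, total_variance ** 0.5)
```

The between-imputation term widens the pooled standard error to reflect uncertainty introduced by the missing questionnaire scores.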

Subgroup analysis was performed according to the patients’ compliance with using each application. K-means clustering was performed using Python with three variables (days used, mean days used, and the time difference between the last date of login and the first date of login) to classify the patients into three clusters based on their usage: short, medium, and long.
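The clustering step can be sketched in plain Python as follows. This is an illustration only: the library, feature scaling, and initialization used in the study are not reported, so the z-score standardization, the deterministic seeding by total days used, the interpretation of "mean days used" as mean days per week, and the usage records themselves are all assumptions.

```python
from statistics import mean, pstdev

# Hypothetical records: (days used, mean days used per week, days between
# first and last login). These are invented values, not trial data.
usage = [
    (3, 0.1, 5), (5, 0.2, 10), (8, 0.3, 14), (6, 0.2, 20),
    (40, 1.5, 90), (45, 1.7, 100), (38, 1.5, 85), (50, 1.9, 110),
    (120, 4.6, 178), (140, 5.4, 180), (110, 4.2, 175), (150, 5.8, 182),
]


def standardize(rows):
    """Z-score each feature so that no single scale dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, init_centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = list(init_centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels


z = standardize(usage)
# Deterministic seeds: patients with the lowest, median, and highest days used.
order = sorted(range(len(usage)), key=lambda i: usage[i][0])
seeds = [z[order[0]], z[order[len(order) // 2]], z[order[-1]]]
labels = kmeans(z, seeds)

# Name clusters by their mean number of days used.
mean_days = {c: mean(usage[i][0] for i, lab in enumerate(labels) if lab == c)
             for c in set(labels)}
ranked = sorted(mean_days, key=mean_days.get)
names = dict(zip(ranked, ["short", "medium", "long"]))
clusters = [names[lab] for lab in labels]
print(clusters)
```

Naming the clusters after the fact by mean days used avoids relying on the arbitrary cluster indices that K-means returns.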