Introduction

Colorectal cancer (CRC) is the third most commonly diagnosed cancer and the second leading cause of cancer death worldwide, accounting in 2020 for up to 10% of new cancer cases and 9% of cancer deaths1. Night shift work was classified as probably carcinogenic to humans (Group 2A) in the 2019 report by the International Agency for Research on Cancer2. Beyond previously studied associations between night shift work and CRC, epidemiological evidence linking other metrics of circadian disruption with risk of CRC in the general population is emerging. A dose–response meta-analysis of six studies in the United States showed that long, but not short, sleep duration relative to 7–8 h of sleep was associated with a 21% increased risk of developing CRC3. In a more recent nested case–control study from Taiwan, participants (166 cases) with sleep disorders (having three or more outpatient diagnoses according to ICD-9) were found to have a 29% increased risk of developing CRC compared to those without4.

Genome-wide association studies (GWASs) have identified a large number of genetic variants robustly associated with sleep traits. According to recently published GWASs, 13.7% of chronotype (morning or evening preference) variance was explained by additive genetic variants5. Likewise, the heritability of insomnia was estimated at 16.7%6 and independently at 2.6%7 and of sleep duration at 9.8%8. Although data linking chronotype and insomnia with CRC risk are limited, Mendelian randomization (MR), a common approach to address inherent biases in observational studies and investigate potential causal associations provided that certain assumptions are met, has also been widely adopted to study the association of sleep-related traits with several outcomes, including breast9,10 and prostate cancer10, as well as known risk factors of CRC, such as adiposity11 and type 2 diabetes mellitus (T2DM)12. MR studies have reported that morning preference was associated with increased T2DM risk13, alcohol consumption14, educational attainment15, and reduced risks of breast9,10 and prostate cancer10; insomnia was associated with an increased T2DM risk, higher body mass index (BMI) and a decrease in educational attainment7; increased sleep duration was associated with an increased risk of breast cancer9 and a higher BMI in children8 and, lastly, short sleep duration was associated with an increased CRC risk using data from UK Biobank (UKB) based on 5,486 CRC cases16. To our knowledge, there is no other MR study that has investigated the potential association between sleep-related traits and CRC risk.

The aim of the current study was to assess whether genetically predicted chronotype, insomnia and sleep duration are associated with CRC risk in males, females and overall, and according to CRC anatomical subsites, using two-sample MR and leveraging data from the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO), the Colorectal Cancer Transdisciplinary Study (CORECT) and the Colon Cancer Family Registry (CCFR). We also investigated whether genetic variants associated with short sleep, long sleep and accelerometer-derived measures of sleep traits support the primary associations of self-reported sleep traits with CRC risk. We accounted for potential pleiotropy due to BMI, alcohol use, educational attainment, smoking and T2DM, phenotypes with a high frequency of associations with genome-wide significant sleep trait variants according to the PhenoScanner database. Furthermore, we used genetic variants identified in sex-specific sleep trait GWASs to assess the extent to which findings differed by sex, as suggested by observational findings17,18,19,20,21 on the effect of sleep traits on CRC outcomes and also by evidence from molecular studies reporting a differential timing of clock gene expression between men and women22.

Methods

Sleep trait GWASs

A GWAS Catalog search (https://www.ebi.ac.uk/gwas) was conducted in December 2020 for published GWASs on sleep traits. Genetic variants reported to be independently associated with self-reported chronotype5, insomnia symptoms6,7 and sleep duration8 at genome-wide significance (P < 5 × 10–8) in participants of European ancestry were selected from the largest GWASs with adequate summary association data.

Self-reported chronotype was assessed at recruitment in UKB23 and 23andMe24 via a single question: “Do you consider yourself to be?” and “Are you naturally a night person or a morning person?”, respectively, and several possible answers, as described in more detail elsewhere5. To maximise statistical power, Jones et al. dichotomized chronotype into morning or evening preference and used results from both UKB and 23andMe in the GWAS meta-analysis (n = 697,828). Summary statistics were adjusted for age, sex, genotyping platform (UKB, 23andMe), study centre (UKB) and principal components (23andMe). Overall, 351 independent genetic variants at P < 5 × 10–8 were identified5.

Self-reported insomnia symptoms were assessed in UKB23 in response to the question: “Do you have trouble falling asleep at night, or do you wake up in the middle of the night?”, and similar questions in the Nord-Trøndelag Health Study (HUNT)25 and the Partners HealthCare Biobank (Partners Biobank)26. Lane et al. performed two GWAS meta-analyses for frequent insomnia symptoms (n = 254,767) and for any insomnia symptoms (n = 532,378). The models were adjusted for age, sex, genetic ancestry, and genotyping array6, and a total of 57 independent variants at P < 5 × 10–8 were identified. Summary statistics for insomnia complaints measured with similar questionnaires were also obtained from a large-scale GWAS meta-analysis of up to 1,331,010 European-ancestry individuals from UKB (n = 386,533) and 23andMe (n = 944,477) and overall 250 lead variants at P < 5 × 10–8 were identified7.

Self-reported sleep duration was assessed at recruitment in UKB23 with the question: “About how many hours sleep do you get in every 24 h? (please include naps)” and was treated as a continuous variable. Dashti et al. performed a GWAS among UKB participants (n = 446,118) and identified 78 independent variants after adjustment for age, sex, 10 principal components, genotyping array, and genetic correlation matrix. In addition, a fixed-effects meta-analysis of the UKB and the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) (n = 493,298)27 identified 13 additional variants. Overall, 91 independent variants at P < 5 × 10–8 emerged from this study8.

Genetic variant selection

Out of the 351 chronotype variants5, the 57 and 250 insomnia symptom variants6,7 and the 91 continuous sleep duration variants8, we excluded those that were not available in the CRC datasets, were palindromic with minor allele frequency ≥ 0.3, were in linkage disequilibrium (LD) (R² ≥ 0.01) or had an F-statistic below 10, the conventional threshold for weak instrument bias28, based on the strength of their association with the sleep trait. The final numbers of variants used in the primary analysis were 290 for chronotype, 38 and 190 for insomnia and 76 for sleep duration, with 1% genetic variant overlap observed between chronotype and insomnia by Jansen et al. (2019), 2% between insomnia by Lane et al. (2019) and insomnia by Jansen et al. (2019), and no overlap in genetic variants between the rest of the exposures. Genetic variant selection is shown in Supplementary Figure S1.
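The exclusion criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Variant` record, the function names and the approximation of the per-variant F-statistic as (beta / se)² are hypothetical choices made for the example, not the authors' pipeline, and LD clumping is omitted because it requires a reference panel.

```python
from dataclasses import dataclass

@dataclass
class Variant:                # hypothetical record for one exposure GWAS hit
    rsid: str
    effect_allele: str
    other_allele: str
    eaf: float                # effect allele frequency
    beta: float               # variant-exposure association
    se: float                 # standard error of beta

PALINDROMIC = {frozenset("AT"), frozenset("CG")}

def f_statistic(v):
    """Approximate per-variant F-statistic as (beta / se) squared (an assumption)."""
    return (v.beta / v.se) ** 2

def is_ambiguous_palindrome(v, maf_threshold=0.3):
    """True for A/T or C/G variants whose minor allele frequency is >= threshold."""
    alleles = frozenset((v.effect_allele.upper(), v.other_allele.upper()))
    maf = min(v.eaf, 1.0 - v.eaf)
    return alleles in PALINDROMIC and maf >= maf_threshold

def select_instruments(variants, outcome_rsids, min_f=10.0):
    """Keep variants present in the outcome data, not ambiguous, and with F >= min_f."""
    kept = []
    for v in variants:
        if v.rsid not in outcome_rsids:
            continue          # not available in the CRC dataset
        if is_ambiguous_palindrome(v):
            continue          # strand cannot be resolved from allele frequency
        if f_statistic(v) < min_f:
            continue          # likely weak instrument
        kept.append(v)
    return kept
```

In practice the surviving variants would still need to be clumped for LD before use as instruments.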

Colorectal cancer GWASs

The association of the genetic instruments for chronotype, insomnia and sleep duration with CRC risk was obtained from the GECCO, CORECT and CCFR consortia29,30,31. Further information on the characteristics of the contributing studies is provided in Supplementary Table S1. All CRC cases were defined as colorectal neoplasia and confirmed by pathological reports, medical records or death certificate information. In the analysis of both sexes combined, data on 58,221 CRC cases and 67,694 controls were available for 16,900,397 autosomal variants. GWAS results were also available by sex, with data on 59,889 females and 66,026 males overall, and by tumour anatomical subsite, with summary association data on 31,083 colon, 13,857 proximal colon, 15,306 distal colon and 15,775 rectal cancer cases.

Statistical power

An online tool available at http://cnsgenomics.com/shiny/mRnd/ was employed to perform statistical power calculations32. For an odds ratio (OR) of 1.10 (or its reciprocal 0.91) per 1 standard deviation (SD) change in the exposure and a type 1 error of 5%, the statistical power (Supplementary Table S2) and the minimum detectable OR at 80% power were calculated (Supplementary Table S3).
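As a rough guide to how such a power figure arises, the sketch below uses a standard asymptotic approximation for a binary outcome, in which the variance of the MR estimate on the log-odds scale is about 1 / (N · p · (1 − p) · R²), with N the total sample size, p the case fraction and R² the exposure variance explained by the instruments. It is not necessarily the calculation performed by the online tool, and the function name and the R² value in the usage line are assumptions made for illustration.

```python
from math import log, sqrt
from statistics import NormalDist

def mr_power_binary(n_total, case_fraction, r2, odds_ratio, alpha=0.05):
    """Approximate power to detect `odds_ratio` per 1 SD change in the exposure.

    Assumes a standardized exposure and ignores the negligible
    probability of rejecting in the wrong tail.
    """
    b = abs(log(odds_ratio))                                   # effect on log-odds scale
    precision = n_total * case_fraction * (1 - case_fraction) * r2
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)               # two-sided critical value
    return NormalDist().cdf(b * sqrt(precision) - z_crit)

# Sample sizes from the CRC GWAS described above; r2 = 0.01 is a made-up value.
power = mr_power_binary(58_221 + 67_694, 58_221 / (58_221 + 67_694), 0.01, 1.10)
```

Because the approximation depends only on the absolute log odds ratio, an OR of 1.10 and its reciprocal give the same power.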

Primary MR analyses

Prior to MR analysis, the data were harmonized so that all variants were oriented to associate with the exposure in the same direction and the effect and other alleles were aligned between the instrument-exposure (GX) and the instrument-outcome (GY) datasets, ensuring that both were coded from the same strand. Pearson’s correlation coefficient of the effect allele frequencies (EAFs) between datasets before and after harmonization was estimated as a quality control measure of the harmonization process33 (Supplementary Tables S6, S15, S22, S43). Two-sample MR was conducted using summary association data from published GWASs to assess whether genetically predicted sleep traits (chronotype, insomnia and sleep duration) are associated with CRC risk overall and according to sex and anatomical subsites. The Wald estimates for each variant were meta-analysed using an inverse-variance weighted (IVW) method. The fixed-effects IVW model assumes absence of horizontal pleiotropy and will return an unbiased estimate provided that all variants are valid instrumental variables (IVs)34. Correspondingly, the IVW random-effects model will return an unbiased estimate in the presence of balanced horizontal pleiotropy35.
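A minimal sketch of the harmonization and IVW steps described above, assuming summary statistics held in plain dictionaries. The function names and data layout are hypothetical, strand flips are not handled, first-order weights are used, and the random-effects model is the multiplicative one with the dispersion floored at 1. The published analyses used the R packages named later in the Methods.

```python
import math

def harmonise(exposure, outcome):
    """Align outcome effects to the exposure effect allele.

    Both inputs map rsid -> (effect_allele, other_allele, beta, se).
    Variants whose alleles cannot be matched are dropped.
    """
    rows = []
    for rsid, (ea_x, oa_x, bx, sx) in exposure.items():
        if rsid not in outcome:
            continue
        ea_y, oa_y, by, sy = outcome[rsid]
        if (ea_y, oa_y) == (oa_x, ea_x):
            by = -by                     # outcome coded on the other allele
        elif (ea_y, oa_y) != (ea_x, oa_x):
            continue                     # allele mismatch, cannot align
        if bx < 0:
            bx, by = -bx, -by            # orient to the exposure-increasing allele
        rows.append((rsid, bx, sx, by, sy))
    return rows

def ivw(rows, random_effects=True):
    """Inverse-variance weighted meta-analysis of per-variant Wald ratios."""
    ratios = [by / bx for _, bx, _, by, _ in rows]
    weights = [(bx / sy) ** 2 for _, bx, _, _, sy in rows]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    if random_effects and len(rows) > 1:
        dispersion = max(1.0, q / (len(rows) - 1))
        se *= math.sqrt(dispersion)      # inflate SE only under over-dispersion
    return estimate, se, q

def to_odds_ratio(estimate, se, z=1.96):
    """Convert a log-odds estimate to an OR with a 95% confidence interval."""
    return math.exp(estimate), math.exp(estimate - z * se), math.exp(estimate + z * se)
```

With the dispersion floored at 1, the random-effects and fixed-effects standard errors coincide whenever the estimates are under-dispersed.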

Sensitivity analyses for the assessment of MR assumptions

To evaluate instrument strength and potential violation of the first MR assumption, i.e., that the genetic variants must be associated with the sleep trait of interest36,37, the F-statistic for each variant-trait association was calculated38. According to the second and third MR assumptions, the genetic variants should not be associated with any confounder of the sleep trait-CRC relationship and should not influence CRC via any pathway other than the sleep trait36,37. To account for potential horizontal pleiotropy, five additional MR methods were applied, each providing a valid MR estimate under a different combination of assumptions. MR-Egger provides an unbiased causal effect estimate even if the third MR assumption is violated and all the variants are invalid IVs, provided that the InSIDE assumption of independence between the horizontal pleiotropic effects and the variant-exposure effects is met39. Instrument strength and variability of the genetic instrument-exposure association due to potential violation of the NO Measurement Error (NOME) assumption were also inspected via the I²GX statistic40. The weighted median returns a valid estimate when at least 50% of the weight of the genetic variants comes from valid IVs41. The weighted mode is valid under the Zero Modal Pleiotropy Assumption (ZEMPA), according to which, out of all clusters of variants with similar effects, the largest is the group of valid IVs42. The contamination mixture is a likelihood-based method that identifies groups of variants with similar estimates, an indicator of the existence of distinct mechanisms in the exposure-outcome relationship, and provides robust estimates under the ZEMPA assumption43. To detect potential outlying genetic variants, we implemented the MR pleiotropy residual sum and outlier test (MR-PRESSO), a method that identifies and excludes outliers, applying a random-effects IVW model44. Cochran’s Q statistic for heterogeneity was also computed34.
To further graphically assess the presence of pleiotropy, we generated diagnostic scatter, funnel and forest plots. To explore potential associations between the genetic instruments with secondary phenotypic traits that could indicate possible violations in the second or third MR assumptions, we queried the PhenoScanner database (http://www.phenoscanner.medschl.cam.ac.uk/) using a P threshold of 1 × 10–5 and tabulated the resulting associations per variant. All analyses were conducted in the statistical software R version 4.2.0 using the packages MendelianRandomization45 and MRCIEU/TwoSampleMR36.
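Several of the sensitivity statistics named above reduce to short computations on harmonized summary statistics. The sketch below is illustrative only: the function names are hypothetical, inputs are assumed to be lists already oriented so that the variant-exposure effects (`bx`) are positive, first-order weights are used, and the unweighted form of the I²GX statistic is shown.

```python
import math

def mr_egger(bx, by, sy):
    """Weighted regression of by on bx with an intercept (weights 1 / sy^2).

    Needs at least three variants. Returns intercept, its SE, slope, its SE.
    """
    w = [1 / s ** 2 for s in sy]
    sw = sum(w)
    xbar = sum(wi * x for wi, x in zip(w, bx)) / sw
    ybar = sum(wi * y for wi, y in zip(w, by)) / sw
    sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, bx))
    sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, bx, by))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum(wi * (y - intercept - slope * x) ** 2 for wi, x, y in zip(w, bx, by))
    phi = max(1.0, rss / (len(bx) - 2))        # dispersion floored at 1
    return (intercept, math.sqrt(phi * (1 / sw + xbar ** 2 / sxx)),
            slope, math.sqrt(phi / sxx))

def cochran_q(bx, by, sy):
    """Cochran's Q of the Wald ratios around the IVW estimate, with its df."""
    ratios = [y / x for x, y in zip(bx, by)]
    w = [(x / s) ** 2 for x, s in zip(bx, sy)]
    ivw = sum(wi * r for wi, r in zip(w, ratios)) / sum(w)
    return sum(wi * (r - ivw) ** 2 for wi, r in zip(w, ratios)), len(bx) - 1

def i2_gx(bx, sx):
    """Unweighted I^2_GX: share of variation in bx not due to measurement error."""
    w = [1 / s ** 2 for s in sx]
    mean = sum(wi * x for wi, x in zip(w, bx)) / sum(w)
    q = sum(wi * (x - mean) ** 2 for wi, x in zip(w, bx))
    return max(0.0, (q - (len(bx) - 1)) / q) if q > 0 else 0.0

def weighted_median(bx, by, sy):
    """Weighted median of the Wald ratios with linear interpolation."""
    ratios = [y / x for x, y in zip(bx, by)]
    weights = [(x / s) ** 2 for x, s in zip(bx, sy)]
    if len(ratios) == 1:
        return ratios[0]
    order = sorted(range(len(ratios)), key=lambda j: ratios[j])
    r = [ratios[j] for j in order]
    w = [weights[j] for j in order]
    total, cum, s = sum(w), 0.0, []
    for wj in w:
        cum += wj
        s.append((cum - 0.5 * wj) / total)     # standardized mid-point cumulative weight
    below = max(j for j, sj in enumerate(s) if sj < 0.5)
    frac = (0.5 - s[below]) / (s[below + 1] - s[below])
    return r[below] + (r[below + 1] - r[below]) * frac
```

An MR-Egger intercept that differs from zero is read as evidence of directional pleiotropy, and a low I²GX signals that the MR-Egger slope may be diluted towards the null.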

Secondary analyses using alternative exposures/instruments

To account for potential non-linear associations of sleep duration with CRC risk, we performed two-sample MR using 27 genome-wide significant variants for short sleep (< 7 h vs. 7–8 h; n = 106,192 cases and 305,742 controls) and 8 variants for long sleep (> 8 h vs. 7–8 h; n = 34,184 cases and 305,742 controls) in participants of European ancestry from UKB8.

Given that all sleep traits were based on self-reported information, we assessed the robustness of our findings using accelerometer-derived equivalents of sleep measures in up to 85,670 participants from UKB. We used genetic variants for seven accelerometer-derived sleep traits representative of chronotype [L5 timing defined as the midpoint (number of hours since the previous midnight) of the least active five hours of each day, reflects whether a person goes to bed earlier or later in the day (6 variants); M10 timing defined as the midpoint of the most-active 10 h, reflects whether a person is most active earlier or later in the day (1 variant); sleep midpoint (1 variant)], insomnia [number of sleep episodes (21 variants)] and sleep duration [actigraphy nocturnal sleep duration (11 variants); sleep efficiency (5 variants); diurnal inactivity (2 variants)] as described elsewhere46. The genetic correlation of self-reported and accelerometer-derived measures has been investigated in a previous study, according to which the correlation coefficient is equal to 0.903 between chronotype and L5 timing, 0.095 between any insomnia and the number of sleep episodes, and 0.430 between sleep duration and actigraphy nocturnal sleep duration9. The observed overlap in genetic variants was approximately 1% between L5 timing and chronotype, and 1%, 0.5%, 1% and 0.5% between insomnia by Jansen et al. (2019) and L5 timing, number of sleep episodes, actigraphy nocturnal sleep duration and sleep efficiency, respectively. There was no overlap between the rest of accelerometer-derived measures and any self-reported sleep trait.

Traits with a high frequency of associations with genome-wide significant variants for sleep traits in the PhenoScanner database, i.e., BMI, alcohol use, educational attainment, lifetime smoking and T2DM, were followed up in multivariable Mendelian randomization (MVMR) analysis to assess direct effects of sleep traits independent of potential confounders and to evaluate potential pleiotropy arising from those factors47. For the MVMR estimates to be valid, the genetic variants should be robustly associated with at least one of the exposures in the model (relevance), be independent of all confounders of all the exposure-outcome associations (exchangeability) and be independent of the outcome conditional on the exposures and confounders (exclusion restriction). Details on the GWASs used in MVMR analysis are available in Supplementary Table S28 and summary association data for the corresponding genetic instruments are included in Supplementary Tables S29-S33. To test for instrument strength in the MVMR setting, the conditional F-statistic was calculated, with values below 10 suggesting probable weak instrument bias48. To further test instrument strength, a modified version of Cochran’s Q statistic was calculated under the null hypothesis that IVs do not sufficiently account for any of the variation in the exposures. Also, to test for pleiotropy, an adjusted version of Cochran’s Q statistic (QA), under the null hypothesis of no heterogeneity in the genetic variants (Supplementary Table S34), as well as the MVMR MR-Egger regression49 (Supplementary Table S35) were used. MVMR analysis was conducted using the R packages MendelianRandomization45 and MRPracticals49, R version 4.2.0. MVMR findings were followed up in a bi-directional MR framework to decipher the direction of the association of sleep traits with factors of interest from the MVMR analyses (Supplementary Tables S36-S41).
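The core of a multivariable IVW estimate can be illustrated as a weighted regression, without an intercept, of the variant-outcome effects on the variant-exposure effects of two traits at once. The sketch below is a simplified two-exposure version with a hypothetical function name. It does not compute the conditional F-statistic or the modified Q statistics described above, and it is not the implementation in the R packages used by the authors.

```python
import math

def mvmr_ivw_two_exposures(bx1, bx2, by, sy):
    """Direct effects of two exposures on the outcome (weights 1 / sy^2).

    bx1, bx2: variant effects on exposure 1 (e.g. a sleep trait) and
    exposure 2 (e.g. a suspected pleiotropic trait); by, sy: variant
    effects on the outcome and their standard errors.
    """
    w = [1 / s ** 2 for s in sy]
    s11 = sum(wi * a * a for wi, a in zip(w, bx1))
    s22 = sum(wi * b * b for wi, b in zip(w, bx2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, bx1, bx2))
    t1 = sum(wi * a * y for wi, a, y in zip(w, bx1, by))
    t2 = sum(wi * b * y for wi, b, y in zip(w, bx2, by))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("exposure effects are collinear")
    beta1 = (s22 * t1 - s12 * t2) / det        # solve the 2x2 normal equations
    beta2 = (s11 * t2 - s12 * t1) / det
    rss = sum(wi * (y - beta1 * a - beta2 * b) ** 2
              for wi, a, b, y in zip(w, bx1, bx2, by))
    phi = max(1.0, rss / (len(by) - 2))        # dispersion floored at 1
    se1 = math.sqrt(phi * s22 / det)
    se2 = math.sqrt(phi * s11 / det)
    return beta1, se1, beta2, se2
```

Here `beta1` is the effect of the first exposure holding the second fixed, which is the sense in which MVMR estimates a direct effect.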

The genetic instruments used in all previous primary and secondary analyses were identified in the corresponding GWAS meta-analyses for chronotype, insomnia and sleep duration, in which results for females and males were combined. We also carried out MR analyses using instruments identified in sex-specific GWASs from UKB. Following LD-clumping, 32 variants were associated at P < 5 × 10–8 with chronotype in males, 65 with chronotype in females, 10 with frequent insomnia symptoms in males, 15 with frequent insomnia symptoms in females, 7 with sleep duration in males, and 16 with sleep duration in females. After relaxing the threshold to P < 5 × 10–6 in supplementary analysis to increase the number of instruments, 199, 126, 90, 58, 80 and 72 variants were identified, respectively.

Results

An outline of our study design is presented in Figure 1. Associations of the genetic instruments with the exposures/outcomes are shown in Supplementary Tables S4, S5 (primary analyses) and Supplementary Tables S13, S14, S21, S29-S33, S36, S42 (secondary analyses). Figure 2 shows all primary MR associations. Table 1 depicts all primary IVW MR associations and two sensitivity analyses for the assessment of MR assumptions (Weighted median, MR-Egger). The random-effects IVW estimate is reported throughout the manuscript (ORIVW), unless otherwise specified where there is under-dispersion in the association estimates (ORIVW-fixed). All two-sample MR and corresponding sensitivity analyses are detailed in Supplementary Tables S7-S12 (primary analyses) and Supplementary Tables S16-S20, S23-S27, S34, S35, S37-S41, S44-S48 (secondary analyses). Diagnostic scatter, funnel and forest plots for the primary analyses are shown in supplementary figures (Supplementary Fig. S2-S148).

Fig. 1

Outline of the study design and data sources for the primary and secondary analyses performed. BMI: body mass index; CCFR: Colon Cancer Family Registry; CHARGE: Cohorts for Heart and Aging Research in Genomic Epidemiology; CORECT: Colorectal Cancer Transdisciplinary Study; CRC: colorectal cancer; GECCO: Genetics and Epidemiology of Colorectal Cancer Consortium; GWAS: Genome-wide association study; HUNT: Nord-Trøndelag Health Study; L5: midpoint of the least active five hours of each day; M10: midpoint of the most-active 10 h; MR: Mendelian randomization; MR-PRESSO: MR pleiotropy residual sum and outlier test; MVMR: Multivariable Mendelian randomization; Partners Biobank: Partners HealthCare Biobank; T2DM: type 2 diabetes mellitus; UKB: UK Biobank.

Fig. 2

Mendelian randomization (MR) estimates for the association of sleep traits with colorectal cancer outcomes in primary MR analyses. Forest plot of two-sample MR random-effects inverse-variance weighted estimates for the association between sleep traits and risk of colorectal, colon, proximal colon, distal colon, and rectal cancer in both sexes combined and in sex-stratified analyses. CI: confidence interval; NA: not applicable; OR: odds ratio.

Table 1 Two-sample Mendelian randomization estimates for the association between sleep traits and risk of colorectal, colon, proximal colon, distal colon, and rectal cancer in both sexes combined and in sex-stratified analyses.

Primary MR analyses of sleep traits and CRC and assessment of MR assumptions

The F-statistics for the eligible genetic variants did not indicate the presence of weak instruments (F-statistic range = 30–430 for chronotype, 16–217 for insomnia, 30–221 for sleep duration) (Supplementary Table S4).

Two-sample MR suggested a 13% lower risk of CRC in men with morning preference in comparison with evening preference chronotype [ORIVW = 0.87, 95% confidence interval (CI) = 0.78, 0.97, P = 0.01]. This finding was consistent in site-specific analysis of colon cancer (ORIVW = 0.86, 95% CI = 0.76, 0.98, P = 0.02) and rectal cancer (ORIVW-fixed = 0.87, 95% CI = 0.75, 1.00, P = 0.04) in males. However, there was no evidence of an association for chronotype with CRC risk in women (ORIVW = 0.99, 95% CI = 0.89, 1.09, P = 0.78), with 83.2% of the variation in the effect estimates between men and women being attributed to heterogeneity rather than chance (Cochran’s Q = 5.97, Cochran’s Q test P = 0.02). Similarly, there was no evidence of an association in both sexes combined (ORIVW = 0.97, 95% CI = 0.92, 1.02, P = 0.19) or in site-specific analyses (Table 1, Supplementary Table S7 and Fig. 2). There was strong evidence for heterogeneity among the causal estimates that cannot be explained by sampling variation alone, suggesting potential violation of the MR assumptions (Supplementary Table S9; smallest Cochran’s Q test P = 6.80 × 10–11). Some indication of horizontal pleiotropy based on the MR-Egger intercept test was evident in the association of morning preference with CRC in females (intercept P = 0.04, ORMR-Egger = 0.77, 95% CI = 0.60, 0.99, P = 0.04) (Table 1, Supplementary Tables S7, S8); however, the association was not supported by any other primary or sensitivity analyses. Results were consistent in terms of direction in most of the applied sensitivity analyses (Table 1, Supplementary Table S7), although the association of chronotype with CRC, colon and rectal cancer in males attenuated in the weighted median analysis. The MR-PRESSO analysis identified two outliers for chronotype (i.e., rs45597035, rs6007594) (Supplementary Table S10).
A PhenoScanner search revealed that the variants used as genetic instruments for MR analyses (Supplementary Table S12) and the outlying genetic instruments from MR-PRESSO (Supplementary Table S11) were associated at P < 5 × 10–5 with several secondary phenotypic traits relevant to CRC. However, after removal of the outlying variants, the IVW effect estimates for the association of chronotype with CRC (MR-PRESSO ORIVW = 0.89, 95% CI = 0.78, 0.99, P = 0.02) and with colon cancer in males (MR-PRESSO ORIVW = 0.87, 95% CI = 0.74, 1.00, P = 0.03) remained largely unchanged (Supplementary Table S10).
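The 83.2% figure quoted for the comparison of the male and female chronotype estimates is consistent with the usual I² statistic computed from the reported Cochran's Q. Assuming the standard definition and one degree of freedom for a comparison of two estimates:

```latex
I^2 = \frac{Q - \mathrm{df}}{Q} = \frac{5.97 - 1}{5.97} \approx 0.832
```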

There was no evidence of an association for genetically predicted frequent/any insomnia symptoms with CRC or its subsites in both sexes combined and in sex-stratified analyses (Table 1, Supplementary Table S7 and Fig. 2). The results were largely consistent in sensitivity analyses (Table 1, Supplementary Tables S7-S10). Some indication of horizontal pleiotropy based on the MR-Egger intercept test was evident in the association of insomnia with proximal colon cancer in both sexes combined (intercept P = 0.01, ORMR-Egger = 0.46, 95% CI = 0.26, 0.84, P = 0.01) and with rectal cancer in females (intercept P = 0.03, ORMR-Egger = 1.69, 95% CI = 1.09, 2.61, P = 0.02) (Table 1, Supplementary Tables S7, S8); however, these associations were not consistent among other sensitivity analysis methods.

Two-sample IVW MR analysis found some evidence of an association of sleep duration with rectal cancer in males (ORIVW = 0.70, 95% CI = 0.50, 0.98 per hour increase, P = 0.04); however, this finding was not supported by sensitivity analyses. There was no evidence of an association for genetically predicted sleep duration with CRC or its subsites in both sexes combined and in other sex-stratified analyses (Table 1, Supplementary Table S7 and Fig. 2).

Secondary analyses

Two-sample MR showed no clear evidence that either short (ORIVW = 1.02, 95% CI = 0.75, 1.38 per doubling of genetic liability for short sleep duration, P = 0.91) or long sleep duration (ORIVW-fixed = 1.03, 95% CI = 0.91, 1.17 per doubling of genetic liability for long sleep duration, P = 0.62) compared to 7–8 h of sleep were associated with CRC in both sexes combined (F-statistic range = 16–57 for short sleep duration, 16–46 for long sleep duration). Likewise, there was no evidence of an association for either short or long sleep duration and CRC in females (Supplementary Table S16). Corresponding data in males were not available. The results were consistent in sensitivity analyses (Supplementary Tables S16-S20).
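Estimates for binary exposures such as short or long sleep are obtained per unit change in the log-odds of the exposure. A common convention, assumed here because the text does not state how the rescaling was done, is to multiply the log-odds estimate by ln 2 ≈ 0.693 so that the odds ratio refers to a doubling of the odds (liability) of the exposure:

```latex
\mathrm{OR}_{\text{per doubling}} = \exp\!\left(\ln 2 \cdot \hat{\beta}_{\text{per unit log-odds}}\right)
```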

In analyses using variants associated with accelerometer-derived equivalents of sleep measures (F-statistic range = 30–155), there was no evidence of an association of CRC with L5 timing (ORIVW-fixed = 1.04, 95% CI = 0.86, 1.27 per hour elapsed since previous midnight, P = 0.67), M10 timing (ORIVW-fixed = 1.23, 95% CI = 0.68, 2.23 per hour elapsed since previous midnight, P = 0.49), sleep midpoint (Wald ratio estimator = 1.39, 95% CI = 0.70, 2.76 per hour increase, P = 0.34), actigraphy nocturnal sleep duration (ORIVW-fixed = 0.95, 95% CI = 0.81, 1.12 per hour increase, P = 0.53), sleep efficiency (ORIVW-fixed = 0.97, 95% CI = 0.76, 1.25 per 1% increase, P = 0.83), diurnal inactivity (Wald ratio estimator = 1.32, 95% CI = 0.89, 1.97 per hour increase, P = 0.17) or number of sleep episodes (ORIVW-fixed = 0.96, 95% CI = 0.84, 1.09 per sleep episode, P = 0.51), either using the IVW method or in sensitivity analyses. We further assessed the association of accelerometer-derived measures with colon and rectal cancer, but the results did not change (Supplementary Tables S23-S27). Although the findings for L5, M10 and sleep midpoint have wide CIs, they are consistent with the association of evening preference with an increased CRC risk.

Results from the MVMR analysis on CRC in both sexes combined were largely unchanged in terms of direction and increased in magnitude when accounting for BMI, alcohol, smoking or T2DM compared to the unadjusted association estimates. After adjusting for educational attainment, a direct effect of chronotype on CRC was observed (OReducational attainment-adj = 0.80, 95% CI = 0.65, 0.98, P = 0.03) (Supplementary Table S34). In bi-directional MR analysis, an SD increase in educational attainment was associated with a decrease in morning preference chronotype and CRC risk (ORIVW = 0.98, 95% CI = 0.97, 1.00, P = 0.01 and ORIVW = 0.75, 95% CI = 0.69, 0.82, P = 3.88 × 10–10, respectively), and morning compared to evening preference was associated with a decrease in years of education by 0.98 SD (Supplementary Table S37). The bi-directional association between chronotype and educational attainment supports that morning preference is associated with a lower educational attainment and vice versa; however, it does not preclude a potential role of educational attainment as a confounder in the association of chronotype with CRC.

Finally, the association of morning preference with a decrease in CRC risk in males was consistent (but with wider CIs) in MR analyses using instruments identified in sex-specific GWASs from UKB (Supplementary Table S44).

Discussion

In the current study, we employed genetic variants as IVs in a two-sample MR framework to investigate potential associations of three modifiable sleep traits, namely chronotype (reflecting the endogenous circadian rhythm), insomnia symptoms and sleep duration, with CRC risk. Morning compared with evening preference was suggestively associated with lower CRC and colon cancer risk among men. This finding was consistent in some, but not all, sensitivity analysis methods. Site-specific analyses and secondary analyses using alternative exposures/instruments predominantly supported this association, although with wider CIs. There was no evidence of an association between chronotype and CRC risk in women or in both sexes combined. There was also no evidence of an association of insomnia or sleep duration with CRC or its subsites.

The epidemiological evidence for an association between sleep traits and CRC risk is limited and, in most cases, inconsistent. The suggestive association of morning preference with a decreased risk of CRC and colon cancer in men is supported by previous findings from a population-based case–control study, according to which colon cancer risk was elevated among men who ever worked at night compared to men who never worked at night17, and a cohort of Korean male professional emergency responders (ERs), whose standardized incidence ratios showed an increase with reference to the general population18. The adverse effect of light-at-night exposure, inherent to night-shift work, on carcinogenesis is aligned with animal study observations50. In addition, the lack of an association between chronotype and CRC risk in women is in accordance with an earlier female cohort of radio and telegraph operators working evening or night shifts19, and with two female cohorts of nurses working rotating night shifts, the Nurses’ Health Study (NHS)20 and the NHS220,21. In a previous NHS analysis, a positive association between night shift work for over 15 years and CRC risk was documented51; however, in a more recent analysis of the NHS cohort by Papantoniou et al. (2018), which expanded both the follow-up time and sample size, the former relationship was attenuated20. It is worth noting that circadian misalignment and its potential adverse effects caused by night shift work might vary depending on diurnal preference, with Vetter et al. (2015) reporting an increased risk of T2DM in women with a timing discrepancy between their chronotype and work schedule52. Moreover, according to the Bureau of Labor Statistics of the U.S. Department of Labor, 15% of full-time wage and salary employees worked on alternative shifts in 2004, but only 11.5% of them worked a non-day schedule out of personal preference53.
In this context, shift work should be used with prudence as a proxy for chronotype, for lack of a more suitable measure. Epidemiological evidence evaluating the effect of chronotype per se on carcinogenesis is scarce, and existing epidemiological studies have insufficiently evaluated the impact of participants’ socioeconomic status on the potential link between shift work and colorectal malignancies. Chronotype could also act as an effect modifier for shift work, i.e., the magnitude of the effect of shift work on cancer risk might differ depending on an individual’s chronotype. The speculated difference in the effect of chronotype by sex is supported by molecular studies on clock gene expression, according to which daily rhythms of PER2, PER3 and ARNTL1 genes are delayed in men compared to women22, as well as by evidence supporting an earlier pineal melatonin secretion in women compared to men relative to their sleep time54. Thus, the observed association of morning preference with CRC and colon cancer risk in men could potentially be attributed to inherent sex differences and not to sampling variation alone.

We observed no association between insomnia and risk of CRC in the current study using both self-reported and accelerometer-derived measures of insomnia. The existing literature is limited to examining insomnia symptoms as a consequence of disease in cancer patients or survivors and is accordingly short on exposure data prior to the onset of the malignancy. Therefore, exposure data collected long before cancer diagnosis are needed to assess whether insomnia is a potential risk factor for CRC in an observational setting.

No evidence of an association between sleep duration and CRC risk was observed in this comprehensive MR study, in agreement with part of the literature. Specifically, two cohorts in women21,55, one cohort in both sexes combined56, and an MR study using data from the UKB16 reported no evidence of an association between sleep duration and CRC. Our secondary analyses using accelerometer-derived measures for sleep duration showed no evidence for an association of sleep duration equivalents with CRC, colon and rectal cancer, and our analysis assessing potential non-linear effects of sleep duration also indicated that neither short nor long sleep duration was associated with CRC risk relative to 7–8 h of sleep. These findings partly support the results from a recent MR study of 367,586 UKB participants reporting no evidence of an effect for long sleep (P = 0.66) and 48% higher odds of CRC (P = 0.01) for short sleep duration. However, this study included only 5,486 CRC cases from UKB and its results were not supported in replication analysis using data from the FinnGen consortium16. In contrast to our findings, some observational studies reported an adverse effect of sleep duration. Among those, three are case–control studies57,58,59, prone to recall bias and further limitations due to their design, and two are cohorts but with data on specific population groups (postmenopausal women receiving hormone replacement therapy and overweight individuals who snored) and therefore ambiguous in terms of generalizability60,61. Moreover, according to a previous study, there is only a moderate correlation between self-reported and measured sleep duration62, introducing potential information bias in the emerging role of sleep duration as a potential risk factor for CRC. Consequently, the observed variation in the literature could be partially attributed to potentially high levels of participant misclassification, and hence future studies should focus on the accurate characterization of sleep traits.

To our knowledge, this study constitutes the first comprehensive MR investigation of the association of five self-reported sleep traits and seven accelerometer-derived equivalents with the risk of CRC and its subsites in both sex-combined and sex-stratified analyses. Among its principal strengths is the implementation of MR, a methodology which counteracts by design some of the main limitations inherent to observational studies, namely exposure measurement error, reverse causation, and residual confounding, provided that certain assumptions are met36,37. Although it is not possible to rule out violations of the MR assumptions, we addressed this issue with several sensitivity analyses that differ in statistical power and in the assumptions they require, and we based the interpretation of our findings on a thorough joint assessment of these methods. To account for potential heterogeneity, we implemented additional analyses stratified by sex and by CRC anatomical subsite. We further conducted analyses to explore potential pleiotropic effects and to detect outlying variants using MR-PRESSO. MVMR was implemented to assess the direct effects of sleep traits on CRC and to account for potential horizontal pleiotropy due to suspected confounders arising from our PhenoScanner search. Another key strength is the use of a large number of genetic variants robustly associated with sleep traits and identified in meta-analysis GWASs of well-established epidemiological cohorts with the largest available sample size to date, as well as the inclusion of colorectal cancer data from the GECCO, CORECT and CCFR consortia. In the case of insomnia, two meta-analytic summary association datasets were used to account for winner’s curse (i.e., the effect sizes for the genetic variants being larger in the discovery sample than in the general population): the published GWAS by Jansen et al. (2019)7 and the second largest available dataset by Lane et al. (2019)6. However, there is an overlap of UKB participants in these datasets. As genetic variants for all analyses were identified in GWASs of both sexes combined, we also examined whether our findings were supported by genetic variants identified in sex-specific sleep trait GWASs. To assist the reproducibility of results and the evaluation of the harmonization process, all post-harmonization datasets are available as supplementary material.

One potential limitation of the current study is the use of self-reported exposure data in the underlying GWASs. To mitigate this source of bias, we conducted secondary analyses using accelerometer-derived equivalents to the self-reported traits as more objective measures. However, given the limited number of variants associated with accelerometer-derived sleep traits, the smaller datasets and the lack of sex-specific data, the resulting effect estimates were of low precision. In addition, the use of summary data restricts the examination of potential non-linear effects and the stratification of the exposure-outcome association by other potential risk factors. However, we did try to account for potential deviation from linearity using instruments robustly associated with short and long sleep duration. Also, the use of smaller datasets to assess the site-specific effects of sleep traits may have limited the precision of the corresponding analyses. Our study was also restricted to participants of European ancestry, so further studies are required to explore the associations of sleep traits in non-European ancestry groups. Another limitation related to the selection of participants is a small sample overlap between the exposure and outcome datasets. However, according to simulation studies, the resulting bias should be negligible, especially in the absence of weak genetic instruments63. It should be noted that although we did not explicitly correct for multiple testing, we interpreted our findings conservatively to account for the large number of exposures and outcomes evaluated in this study. Furthermore, results from MVMR and bi-directional analyses suggested a potential role of educational attainment in the chronotype-CRC association, which remains to be investigated in future studies.

This study provides little to no evidence of an association between genetically predicted sleep traits and CRC risk. The suggestive association observed between genetically predicted chronotype and CRC in men was consistent in direction in some sensitivity and secondary analyses. However, given the lack of precision of these analyses and the aforementioned limitations of our study, any observations should be interpreted tentatively. Considering the ever-increasing circadian rhythm dysregulation imposed by modern lifestyle and the emerging evidence supporting its role in carcinogenesis, it is important to expand upon our findings and investigate potential underlying biological mechanisms that might inform future interventions for CRC prevention.