Germline biallelic pathogenic variants in the genes encoding the DNA base excision repair (BER) glycosylases MUTYH, NTHL1 and MBD4 cause recessive hereditary cancer syndromes characterised by gastrointestinal adenomatous polyposis and an increased risk of colorectal cancer (CRC) and other tumour types [1,2,3]. The associated syndromes differ in prevalence, tumour spectrum, and the type of mutations accumulated in the tumours, the latter being a consequence of the glycosylase-specific defect in DNA repair (Table 1).

Table 1 Characteristics of the polyposis syndromes caused by biallelic germline pathogenic variants in base excision repair (BER) glycosylases.

The MBD4-associated neoplasia syndrome (MANS) is a rare recessive condition. Considering the reported cases (11 carriers in 8 families) [2, 4,5,6], the MANS-associated phenotype is characterised by acute myeloid leukaemia (AML) or myelodysplastic syndrome (MDS) [7/11 carriers; mean age at diagnosis 35 (range: 30–49)]; colorectal polyposis [9/11 carriers] diagnosed in adulthood (in their 30s), with variable expressivity, and associated with increased CRC risk [2/11 carriers]; schwannomas [3/11 carriers]; and uveal melanoma (UVM) [2/11 carriers] (Table S1). Other phenotypes observed in biallelic carriers include papillary thyroid cancer, lymphoma, meningioma, breast cancer, ovarian germ cell tumours, and benign lesions such as upper gastrointestinal polyps, making this a multi-tumour syndrome that resembles the NTHL1 tumour syndrome [7] but has a different extracolonic tumour spectrum. Based on the reported phenotypes, clinical surveillance in MANS should focus on the high AML risk and the gastrointestinal manifestations [2]. Given the recurrence of UVM and schwannomas, routine ophthalmological exams [8] and monitoring for schwannomas may be recommended. Based on the response of MBD4-deficient UVMs [9,10,11], MANS tumours may be considered good candidates for immune checkpoint inhibitor-based therapies.

Sequencing data from MANS-associated colorectal adenomas revealed a somatic mutational spectrum enriched in CpG>TpG mutations, a consequence of the failure to repair G:T mismatches resulting from deamination of 5-methylcytosine. This generates a mutational signature, SBS96, that largely resembles SBS1 [2, 12] and is present in tumours with either inherited or acquired MBD4 mutations.

Heterozygous germline MBD4 pathogenic variants are associated with genetic predisposition to UVM. In these cases, the associated tumours show loss of the wildtype MBD4 allele and, as in MANS, an elevated mutation burden enriched in CpG>TpG mutations (signature SBS96) [9, 10, 13]. Palles et al. identified four individuals heterozygous for loss-of-function MBD4 variants among 1,611 patients affected with ≥10 colorectal adenomas, familial and/or early-onset CRC, or CRC in combination with other tumours. Adenomas developed by the heterozygotes did not show the typical mutational spectrum of MBD4-deficient tumours [2]. Nevertheless, these results neither established nor excluded an association between heterozygous pathogenic variants and increased CRC risk.

To expand the knowledge about MANS and the role of heterozygous MBD4 variants in CRC and polyposis predisposition, we studied the germline mutational status of MBD4 in patients affected with CRC (n = 543; data source: TCGA), adenomatous polyposis (n = 177; source: Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL), or tumours that fit the syndrome’s phenotypic spectrum (n = 8; source: Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL). Somatic analysis of the TCGA CRC cohort had been performed in two pan-cancer analyses that considered the identification of loss-of-function MBD4 variants and/or MBD4-deficient tumours [4, 9]. In the in-house cohorts, the MBD4 coding exons and exon-intron boundaries were sequenced by direct automated (Sanger) sequencing. Variants with a population minor allele frequency (MAF) <0.5% were considered. Patients and methods are described in Supplementary Materials and Methods. We also reviewed the MANS patients reported in the literature and re-analysed the sequencing data available from five MBD4-deficient tumours, including three MANS-associated adenomas [3, 5,6,7] and two TCGA cancers [9].

No loss-of-function germline MBD4 variants were detected in any of the 728 patients studied. No rare variants (gnomAD non-cancer MAF <0.5%) were identified in either the 177 adenomatous polyposis patients or the eight probands with MANS-suspected phenotypes, confirming the extremely low frequency of MBD4 pathogenic variants as a cause of polyposis and the rarity of MANS. Eight of the 543 TCGA CRC patients analysed harboured rare germline missense variants in MBD4: one in homozygosis and the other seven in heterozygosis. Only two of the five variants identified, c.368C>T;p.(Ser123Leu) and c.1400A>G;p.(Asn467Ser), had REVEL [14] pathogenicity scores >0.500 (Table 2).
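The two-step filter described above (population MAF <0.5%, then REVEL >0.500 as the in silico pathogenicity prediction) can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, the function names and the three toy entries are hypothetical and do not reproduce the values in Table 2.

```python
from dataclasses import dataclass
from typing import List

MAF_CUTOFF = 0.005    # 0.5%, the rarity cutoff stated in the text
REVEL_CUTOFF = 0.500  # REVEL score above which a variant is predicted pathogenic


@dataclass
class Variant:
    """Hypothetical minimal record for one germline variant."""
    label: str
    maf: float    # population minor allele frequency, as a fraction (0.005 = 0.5%)
    revel: float  # REVEL score, between 0 and 1


def rare_variants(variants: List[Variant]) -> List[Variant]:
    """Keep variants whose population MAF is below the rarity cutoff."""
    return [v for v in variants if v.maf < MAF_CUTOFF]


def predicted_pathogenic(variants: List[Variant]) -> List[Variant]:
    """Keep variants whose REVEL score exceeds the cutoff."""
    return [v for v in variants if v.revel > REVEL_CUTOFF]


# Toy entries with invented values, for illustration only.
toy = [
    Variant("toy_A", maf=0.0200, revel=0.700),  # too common, filtered out
    Variant("toy_B", maf=0.0017, revel=0.540),  # rare and above the REVEL cutoff
    Variant("toy_C", maf=0.0001, revel=0.120),  # rare but predicted benign
]

rare = rare_variants(toy)
flagged = predicted_pathogenic(rare)
print([v.label for v in rare])
print([v.label for v in flagged])
```

A REVEL score above the cutoff is only one line of evidence; as the rest of the text shows, population and functional data can still argue against pathogenicity.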

Table 2 Rare (population MAF <0.5%) germline variants in MBD4 identified in the patients included in the study.

MBD4 c.181T>C;p.(Cys61Arg), predicted benign, was identified in homozygosis in a patient diagnosed with sigmoid colon cancer at age 68 (TCGA-D5-6924), and in heterozygosis in a 69-year-old CRC patient (TCGA-AA-3685). Two additional homozygous individuals, one of them a >80-year-old cancer patient, were reported in the Genome Aggregation Database (gnomAD v.2.1.1). No effect of this variant on MBD4 glycosylase activity was observed in a previous study [15].

An Asian patient diagnosed with CRC at age 46 (TCGA-CA-6718) was heterozygous for a predicted pathogenic variant, c.368C>T;p.(Ser123Leu), which is relatively common in individuals of East Asian origin (MAF: 0.5%; source: gnomAD). Three patients, diagnosed with CRC in their 70s (TCGA-AA-3869, TCGA-AA-3950, TCGA-DM-A28K), were heterozygous for c.1400A>G;p.(Asn467Ser) (REVEL: 0.540; gnomAD non-cancer MAF: 0.17%; ~0.3% in Europeans). The presence of two homozygous individuals in the gnomAD v.2.1.1 non-cancer dataset, together with the experimental evidence showing no effect on MBD4 glycosylase activity [13], suggests a non-pathogenic nature for c.1400A>G. Two additional variants, predicted benign, were identified in the other two CRC patients (Table 2). A somatic MBD4 alteration (potential second hit) was detected in the CRC of the c.368C>T;p.(Ser123Leu) heterozygote but not in the other patients’ tumours.

MBD4-deficient tumours accumulate a relatively high number of somatic mutations due to defective DNA repair, with an average tumour mutational burden (TMB) of ~10 mutations per Mb (mut/Mb) (range: 2.7–24.2) (Tables S1 and S2). Exome sequencing data were obtained from TCGA for the CRCs developed by the eight heterozygous or homozygous MBD4 variant carriers identified. The highest TMBs were found in the CRCs developed by the c.368C>T;p.(Ser123Leu) and c.1160C>T;p.(Ser387Leu) heterozygotes (TCGA-CA-6718, ~100 mut/Mb; TCGA-D5-6530, 24.4 mut/Mb). These tumours were deficient in polymerase proofreading and mismatch repair (MMR), respectively, which caused the elevated TMB, as shown by the associated tumour mutational signatures (Table S3). The other six tumours had TMBs <5 mut/Mb (Table 2).
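For orientation, TMB is conventionally the number of somatic mutations divided by the megabases of sequence assessed. A minimal sketch is given below; the function name and the example numbers (350 mutations over 35 Mb) are hypothetical, and the size of the callable exome used in the study is not stated in the text, so it is left as a parameter.

```python
def tumour_mutational_burden(n_somatic_mutations: int, callable_megabases: float) -> float:
    """Return TMB in mutations per megabase (mut/Mb).

    callable_megabases is the amount of sequence in which mutations could
    be called; its value is an assumption supplied by the caller.
    """
    if n_somatic_mutations < 0:
        raise ValueError("mutation count cannot be negative")
    if callable_megabases <= 0:
        raise ValueError("callable region must be larger than 0 Mb")
    return n_somatic_mutations / callable_megabases


# Hypothetical example: 350 somatic mutations called over 35 Mb of exome.
example_tmb = tumour_mutational_burden(350, 35.0)
print(f"{example_tmb:.1f} mut/Mb")
```

Because the denominator depends on the capture kit and on coverage filters, TMB values are only comparable when they are computed over the same callable region.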

An agnostic analysis of tumour mutational signatures using FitMS through the web-based application Signal (all Cancer Reference Signatures, not selecting by tumour type) [16] was performed on the eight TCGA CRCs from the MBD4 variant carriers identified here, five previously reported MBD4-deficient tumours (three MANS adenomas, one MBD4-deficient TCGA UVM, and one MBD4-deficient TCGA glioblastoma), and 43 randomly selected TCGA CRCs. While the five MBD4-deficient tumours reached SBS96 contributions of ~100% (range: 97–100%), all six polymerase proofreading-proficient and MMR-proficient CRCs from MBD4 variant carriers had SBS96 contributions of 32% to 69%, similar to the profiles of the 43 TCGA CRCs (Fig. S1 and Table S3). Because SBS96 closely resembles SBS1, which is ubiquitously detected in tumours and associated with ageing, we next restricted the analysis to those two signatures. While the five MBD4-deficient tumours had 100% SBS96 contribution (0% SBS1), the eight TCGA CRCs developed by the MBD4 variant carriers preferentially harboured SBS1 over SBS96 (SBS1 contribution: 73–100%; SBS96 contribution: 0% in all except for one MMR-deficient CRC with 17% SBS96 contribution), confirming the absence of MBD4 deficiency (Fig. S2 and Table S3).
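The idea behind restricting the fit to two signatures can be illustrated with a constrained least-squares sketch. This is not FitMS and does not reproduce the analysis above: it assumes the tumour profile is a convex mixture of two known signatures, uses hypothetical four-channel vectors instead of the real 96-channel SBS1 and SBS96 profiles, and all function names are invented for the example.

```python
from typing import List, Sequence, Tuple


def normalise(values: Sequence[float]) -> List[float]:
    """Scale a non-negative vector so that it sums to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("vector must have a positive sum")
    return [v / total for v in values]


def fit_two_signatures(
    profile: Sequence[float],
    sig_a: Sequence[float],
    sig_b: Sequence[float],
) -> Tuple[float, float]:
    """Estimate the contributions (w_a, w_b), with w_a + w_b = 1, that
    minimise the squared error between the normalised profile and
    w_a * sig_a + w_b * sig_b.
    """
    if not (len(profile) == len(sig_a) == len(sig_b)):
        raise ValueError("all vectors must have the same number of channels")
    m, a, b = normalise(profile), normalise(sig_a), normalise(sig_b)
    diff = [x - y for x, y in zip(a, b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("the two signatures are identical")
    # Closed-form minimiser of ||(m - b) - w * (a - b)||^2, clipped to [0, 1].
    w_a = sum((mi - bi) * d for mi, bi, d in zip(m, b, diff)) / denom
    w_a = min(1.0, max(0.0, w_a))
    return w_a, 1.0 - w_a


# Hypothetical 4-channel signatures and mutation counts (toy values only).
toy_sig_a = [0.7, 0.1, 0.1, 0.1]
toy_sig_b = [0.4, 0.3, 0.2, 0.1]
toy_counts = [475, 250, 175, 100]  # built as 25% toy_sig_a + 75% toy_sig_b

w_a, w_b = fit_two_signatures(toy_counts, toy_sig_a, toy_sig_b)
print(f"signature A: {w_a:.2f}, signature B: {w_b:.2f}")
```

When two signatures are very similar, as SBS96 and SBS1 are, the difference vector is small and the estimate becomes sensitive to sampling noise, which is one reason to interpret intermediate contributions cautiously.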

The MBD4 variants identified in this study were classified following the ACMG/AMP guidelines [17], taking into consideration: presence of homozygous individuals in the gnomAD non-cancer dataset (BS2); in silico pathogenicity predictions (BP4, PP3); effect on MBD4 glycosylase activity (BS3, PS3); and predominance (or not) of tumour SBS96 over SBS1 when the variant is in homozygosis or there is a clear somatic second hit in MBD4 (BP5, PP4). This classification indicates that c.181T>C;p.(Cys61Arg) and c.1400A>G;p.(Asn467Ser) are benign, while the other three are variants of unknown significance (Table 2). Nonetheless, our findings, mostly supported by the tumour mutational spectra, indicate that none of the variants identified here had an impact on CRC development.

Most germline MBD4 pathogenic variants reported, either biallelic or monoallelic, are loss-of-function variants scattered along the MBD4 sequence (Fig. 1). In-frame deletions and missense pathogenic variants occur at the C-terminal half of the protein, where the glycosylase domain for DNA repair is located. A relevant proportion of truncating variants also affect the protein C-terminal part, likely not inducing RNA nonsense-mediated decay but affecting the integrity or structural stability of the glycosylase domain, highlighting the relevance of this region in the associated cancer predisposition syndromes. An in vitro assay that evaluates MBD4 glycosylase activity [15, 18] may be used to characterise variants of unknown significance in this gene, particularly when tumour sequencing data are unavailable or non-informative.

Fig. 1: Location within the MBD4 protein sequence (PFAM domains) of the pathogenic variants identified to date.

Homozygous and compound heterozygous variants identified in MANS patients (Table S1) are represented in the upper part of the figure, while heterozygous variants associated with monoallelic cancer predisposition (Table S4) are represented in the lower part. Black dots represent truncating mutations, including nonsense variants, frameshift deletions or insertions, and splice-site variants. Grey dots indicate missense variants and in-frame deletions or insertions. Position of the protein domains: methyl-CpG binding domain (MBD), amino acids 80–149; DNA glycosylase, amino acids 461–535.

Due to the extreme rarity of MANS, as confirmed in our study, including MBD4 in the multi-gene panels used to study cancer predisposition in clinical practice will likely not increase the diagnostic yield. Nevertheless, if MBD4 is not included in the diagnostic gene panel, we recommend analysing it individually when either the phenotype (any combination of MDS/AML, polyposis, CRC, UVM and/or schwannomas) or the tumour molecular characteristics (~100% SBS96 contribution) suggest an MBD4-related genetic aetiology. A predominant contribution of tumour SBS96 (close to 100%; >90%) may be used as pathogenicity-supporting evidence for variant interpretation.

In conclusion, our negative results confirm that MANS is an extremely infrequent syndrome and show that the few rare heterozygous (missense) MBD4 germline variants identified in our study are not driver carcinogenic events leading to tumour MBD4 deficiency.