Abstract
Identifying genetic risk factors for highly heterogeneous disorders such as epilepsy remains challenging. Here we present, to our knowledge, the largest whole-exome sequencing study of epilepsy to date, with more than 54,000 human exomes, comprising 20,979 deeply phenotyped patients from multiple genetic ancestry groups with diverse epilepsy subtypes and 33,444 controls, to investigate rare variants that confer disease risk. These analyses implicate seven individual genes, three gene sets and four copy number variants at exome-wide significance. Genes encoding ion channels show strong association with multiple epilepsy subtypes, including epileptic encephalopathies and generalized and focal epilepsies, whereas most other gene discoveries are subtype specific, highlighting distinct genetic contributions to different epilepsies. Combining results from rare single-nucleotide/short insertion and deletion variants, copy number variants and common variants, we offer an expanded view of the genetic architecture of epilepsy, with growing evidence of convergence among different genetic risk loci on the same genes. Top candidate genes are enriched for roles in synaptic transmission and neuronal excitability, particularly postnatally and in the neocortex. We also identify shared rare variant risk between epilepsy and other neurodevelopmental disorders. Our data can be accessed via an interactive browser, hopefully facilitating diagnostic efforts and accelerating the development of follow-up studies.
Data availability
We provide summary-level data at the variant and gene level in an online browser for visualization and download (https://epi25.broadinstitute.org/). There are no restrictions on the aggregated data released on the browser. Full results from the exome-wide burden analysis are also available in Supplementary Data 1 and 4. WES data from Epi25 cohorts are available via the NHGRI’s controlled-access AnVIL platform (https://anvilproject.org/; dbGaP accession number: phs001489). Data availability of non-Epi25 control cohorts is provided in the Supplementary Information. Source data are provided with this paper.
Publicly available datasets analyzed in this study include:
Gene family: https://zenodo.org/records/3582386
CORUM protein complexes: https://mips.helmholtz-muenchen.de/corum/
Protein Data Bank: https://www.rcsb.org/
(Structure analyzed in Fig. 3c: https://www.rcsb.org/structure/6x3z)
BrainSpan: https://www.brainspan.org/
Gene Ontology: https://geneontology.org/
Code availability
No custom code was used in this study. For sequence data generation, we used GATK version 3.4 and version 3.6 (GATK nightly-2015-07-31-g3c929b0, 3.4-89-ge494930 and 3.6-0-g89b7209), Picard version 1.1431 and VerifyBamID version 1.0.0. Sample and variant QC was performed using functions in Hail 0.1 and 0.2 (website: https://www.hail.is; documentation: https://hail.is/docs/0.1/ and https://hail.is/docs/0.2/; GitHub repository: https://github.com/hail-is/hail). Variant annotation was performed using the Ensembl Variant Effect Predictor (VEP) version 85 tool as implemented in Hail 0.1 with the LOFTEE annotation provided as default (https://github.com/konradjk/loftee/tree/27b0040f524348baa7f3257f1ce58993529e09ef). For phenotyping data, case record forms were hosted on the REDCap platform version 14 and entered into the Epi25 data repository (https://github.com/Epi25/epi25-edc). For gene burden analysis, we used the R (version 3.6.1) package logistf version 1.26.0 (https://cran.r-project.org/web/packages/logistf/index.html) to implement the Firth regression model. Additional processing and visualization were performed using R functions in the tidyverse library version 1.3.0 (https://www.tidyverse.org/packages/).
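The gene burden tests rely on Firth (bias-reduced) logistic regression, which the study ran through the R package logistf. For readers unfamiliar with the method, the following Python sketch illustrates the idea; it is not the study's code and does not reproduce logistf's interface. The function name `firth_logistic`, the toy data and the Newton-type update using the unpenalized Fisher information are assumptions made for illustration. In a real burden test the design matrix would hold the URV count plus covariates such as sex and ancestry.

```python
import numpy as np

def firth_logistic(X, y, max_iter=100, tol=1e-8):
    """Firth-penalized logistic regression (illustrative sketch).

    Iterates on the Firth-modified score X'(y - p + h * (0.5 - p)),
    where h holds the leverages of the weighted hat matrix.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # fitted probabilities
        w = p * (1.0 - p)                     # logistic variance weights
        info = X.T @ (X * w[:, None])         # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Diagonal of W^(1/2) X (X'WX)^(-1) X' W^(1/2)
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        score = X.T @ (y - p + h * (0.5 - p))  # Firth-modified score
        step = info_inv @ score
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Hypothetical toy data with complete separation: the carrier column
# (second column) perfectly predicts case status, so ordinary maximum
# likelihood would diverge while the Firth estimate stays finite.
X_toy = np.array([[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]])
y_toy = np.array([0, 0, 0, 1, 1, 1])
beta_toy = firth_logistic(X_toy, y_toy)
print(beta_toy)
```

For this saturated toy table the penalty acts like adding half an observation to each cell, which is why the estimate remains finite. That property makes the approach attractive for ultra-rare variants, where many genes have zero carriers among controls.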
Acknowledgements
We thank the Epi25 principal investigators (PIs), local staff overseeing individual cohorts and all of the individuals with epilepsy and their families who participated in Epi25 for their commitment to this international collaboration. This work is part of the Centers for Common Disease Genomics (CCDG) program, funded by the National Human Genome Research Institute (NHGRI) and the National Heart, Lung, and Blood Institute. CCDG-funded Epi25 research activities at the Broad Institute, including genomic data generation in the Broad Genomics Platform, were supported by NHGRI grant UM1 HG008895 (PIs: E.S.L., S.B.G., M.J.D. and S.K.). Genome Sequencing Program efforts were also supported by NHGRI grant U01HG009088. A supplemental grant for Epi25 phenotyping was supported by ‘Epi25 Clinical Phenotyping R03’, National Institutes of Health R03NS108145 (PIs: D.H.L. and S.F.B.). Additional support for analysis was provided by National Institute of Neurological Disorders and Stroke grant R01NS106104 (PI: C.C.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We also thank the Stanley Center for Psychiatric Research at the Broad Institute for supporting the genomic data generation. Additional funding sources and acknowledgment of individual cohorts are listed in the supplementary materials.
Author information
Contributions
All authors contributed to patient phenotyping data and sample collection or to analyses. Roles in specific committees of the project are listed in the supplementary materials.
Ethics declarations
Competing interests
B.M.N. is a member of the scientific advisory boards of Deep Genomics and Neumora. No other authors have competing interests to declare.
Peer review
Peer review information
Nature Neuroscience thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Results from burden analysis of synonymous URVs.
a,b, Burden of synonymous URVs at the individual-gene (a) and the gene-set (b) level. The observed −log10-transformed P values are plotted against the expectation given a uniform distribution. Burden analyses are performed across four epilepsy groups – 1,938 DEEs, 5,499 GGE, 9,219 NAFE, and 20,979 epilepsy-affected individuals combined – versus 33,444 controls. P values are computed using a Firth logistic regression model testing the association between the case-control status and the number of URVs (two-sided); the red dashed line indicates exome-wide significance P = 3.4 × 10⁻⁷ after Bonferroni correction (see Methods).
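The quantile-quantile comparison described in this caption can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name `qq_points` and the plotting positions (i − 0.5)/n for the expected uniform quantiles are assumptions, and other conventions such as i/(n + 1) are equally common.

```python
import numpy as np

def qq_points(p_values):
    """Return (expected, observed) -log10 P values for a QQ plot.

    Under the null hypothesis P values are uniform on (0, 1), so the
    i-th smallest of n P values is expected near (i - 0.5) / n.
    """
    p = np.sort(np.asarray(p_values, dtype=float))  # ascending P values
    n = p.size
    expected = -np.log10((np.arange(1, n + 1) - 0.5) / n)
    observed = -np.log10(p)
    return expected, observed

# Exome-wide significance threshold quoted in the caption, on the
# -log10 scale used for the dashed line.
threshold = -np.log10(3.4e-7)

# Hypothetical P values for illustration only.
expected, observed = qq_points([0.5, 0.05, 0.005])
```

Points lying along the diagonal indicate well-calibrated test statistics, which is the expected behavior for synonymous variants used as a negative control.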
Extended Data Fig. 2 Spatiotemporal expression of 13 exome-wide significant genes in the human brain.
Expression values (log2[TPM + 1]) are normalized to the mean for each BrainSpan sample; each dot represents the expression value of a particular gene in a sample collected in a particular brain region and developmental time (from early fetal to adulthood: N = 47/5/5/9/5/4, 69/6/6/7/5/4, 19/2/1/2/1/2, 27/2/2/2/2/3, 30/2/3/2/3/3, 41/3/4/3/4/5, 30/3/3/1/1/3, 36/3/3/2/2/4, and 63/6/6/6/6/6 neocortex/hippocampus/amygdala/striatum/thalamus/cerebellum samples, respectively). LOESS smooth curves are plotted for each brain region across developmental time.
Extended Data Fig. 3 Distributions of URVs from this study and de novo variants from other NDD studies on the same genes.
Schematic protein plots of nine genes that are significant in both our epilepsy cohort (DEE: developmental and epileptic encephalopathy; EPI: all-epilepsy combined) and previous large-scale WES studies of severe developmental disorders (DD) and/or autism spectrum disorder (ASD) are shown. Asterisk indicates recurring URVs in epilepsy; recurring de novo variants in DD/ASD as well as detailed variant information are provided in Supplementary Data 13.
Extended Data Fig. 4 Results from genetic ancestry- and sex-specific burden analyses.
a, The numbers of epilepsy cases (orange) and controls (blue) by genetic ancestry. b, Comparison of protein-truncating (left) and damaging missense (right) URV burden in the top ten genes from the primary analysis (‘All’) across genetic ancestry subgroups. Red color indicates enrichment in cases (log[OR] > 1), with an asterisk indicating nominal significance (P ≤ 0.05; see Supplementary Data 14 for exact P values). P values are computed using a Firth logistic regression model testing the association between the case-control status and the number of URVs (two-sided). c, Genetic ancestry-specific burden of URVs in established epilepsy genes (N = 171 curated by the Genetic Epilepsy Syndromes [GMS] panel with a known monogenic/X-linked cause), constrained genes (N = 1,917 scored by the loss-of-function observed/expected upper bound fraction [LOEUF] metric as the most constrained 10% genes), and constrained genes excluding established epilepsy genes (N = 1,813). Overall, different ancestral groups show at least partially shared burden of deleterious URVs in these gene sets. In a-c, NFE: Non-Finnish European (Ncase = 16,040, Ncontrol = 25,641), AFR: African (Ncase = 1,598, Ncontrol = 2,592), AMR: Admixed American (Ncase = 480, Ncontrol = 3,106), EAS: East Asian (Ncase = 1,698, Ncontrol = 1,215), FIN: Finnish (Ncase = 926, Ncontrol = 537), SAS: South Asian (Ncase = 237, Ncontrol = 353). d, Sex-specific burden of URVs in established epilepsy genes. Burden analyses are performed for three gene sets described in c, with an additional set of 37 X-linked GMS epilepsy genes, across four epilepsy groups (female: NDEE = 811, NGGE = 4,807, NNAFE = 3,511, NEPI(all) = 11,372, Ncontrol = 18,144; male: NDEE = 997, NGGE = 2,579, NNAFE = 4,395, NEPI(all) = 10,397, Ncontrol = 15,302). There is an overall trend of shared URV burden between female and male subgroups in these gene sets. In c and d, the dot represents the log odds ratio and the error bars represent the 95% confidence intervals of the point estimates. For presentation purposes, error bars that exceed a large log odds ratio value are capped, indicated by arrows at the end of the error bars (see Supplementary Data 14 and 15 for exact values). e, Comparison of sex-specific burden of protein-truncating URVs at the level of individual genes. For each gene, the −log10-transformed P value from the female subgroup analysis (y-axis) is plotted against that from the male subgroup analysis (x-axis). Top ten genes with URV burden in epilepsy are labeled for each subgroup, with genes on the sex chromosomes colored in blue. The red dashed line indicates exome-wide significance P = 3.4 × 10⁻⁷ after Bonferroni correction.
Extended Data Fig. 5 Results from burden analysis of protein-truncating and damaging missense URVs combined.
a, Joint burden of protein-truncating and damaging missense URVs at the individual-gene level. The observed −log10-transformed P values are plotted against the expectation given a uniform distribution. Burden analyses are performed across four epilepsy groups – 1,938 DEEs, 5,499 GGE, 9,219 NAFE, and 20,979 epilepsy-affected individuals combined – versus 33,444 controls. P values are computed using a Firth logistic regression model testing the association between the case-control status and the number of URVs (two-sided); the red dashed line indicates exome-wide significance P = 3.4 × 10⁻⁷ after Bonferroni correction (see Methods). b, Comparison of the joint burden in a with the burden of protein-truncating URVs. The odds ratio (OR) of protein-truncating plus damaging missense URVs (y-axis) and that of protein-truncating URVs alone (x-axis) are compared. Each dot represents a gene with nominally significant enrichment (OR > 0 and P ≤ 0.05) of either protein-truncating URVs or the two variant classes combined.
Extended Data Fig. 6 URV discovery and burden results across Epi25 data collection.
a, Increase in the number of protein-truncating and damaging missense URVs discovered in epilepsy genes with a known monogenic cause. b, Increase in the number of monogenic epilepsy genes identified with a protein-truncating or damaging missense URV. In a and b, variant/gene count is plotted against the year of Epi25 data collection; the total number of epilepsy cases analyzed in each year is indicated in parentheses. c, URV burden of previously top-ranked genes in this study. The odds ratio of protein-truncating URVs in genes from this study (y-axis) and the prior Epi25 publication (x-axis) are compared. Each dot represents one of the top ten genes implicated by our previous burden analysis (across three epilepsy subtypes). Genes with a known monogenic/X-linked cause are labeled and colored in purple. d, Increase in the total, non-European ancestry, and effective sample size in this study over our previous publications. The effective sample size is computed as 4/(1/Ncase + 1/Ncontrol). e,f, The sample size required for well-powered gene burden testing. The percentage of genes powered to detect significant URV burden (Fisher’s exact P ≤ 0.05) at different effect sizes (e) and case:control ratios (f) is shown as a function of log-scaled sample size of epilepsy cases. Lighter color indicates smaller effect size (weaker burden), which requires a larger sample size to detect. The gray vertical line indicates the current sample size of 20,979 cases. In e, horizontal lines indicate 80% and 50% detection power, and vertical dashed lines indicate the estimated number of cases required to achieve 80% at the benchmarked effect sizes. In f, dashed and dotted curves indicate power estimation with increased control:case ratios from 1.6 (in this study) to 3.2 and 6.4, respectively; horizontal lines indicate the estimated power achieved by doubling and quadrupling the number of controls at the current sample size of cases. g, Epilepsy subtype-specific burden of URVs in established epilepsy genes (N = 171 curated by the Genetic Epilepsy Syndromes [GMS] panel with a known monogenic/X-linked cause), constrained genes (N = 1,917 scored by the loss-of-function observed/expected upper bound fraction [LOEUF] metric as the most constrained 10% genes), and constrained genes excluding established epilepsy genes (N = 1,813). Burden analyses are performed across three epilepsy subtypes – 1,938 DEEs, 5,499 GGE, and 9,219 NAFE – versus 33,444 controls. Protein-truncating and damaging missense URVs from DEEs exhibit the strongest enrichment in epilepsy panel genes, while all epilepsy subtypes show significant enrichment in constrained genes even after excluding the panel genes. No enrichment is observed for synonymous URVs. The dot represents the log odds ratio and the error bars represent the 95% confidence intervals of the point estimates.
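The effective sample size formula quoted for panel d, 4/(1/Ncase + 1/Ncontrol), can be evaluated directly. The short sketch below applies it to the full-cohort counts given in the caption; the function name `effective_sample_size` is an assumption chosen for illustration.

```python
def effective_sample_size(n_case, n_control):
    """Effective sample size of a case-control design: 4 / (1/Ncase + 1/Ncontrol)."""
    return 4.0 / (1.0 / n_case + 1.0 / n_control)

# Full cohort: 20,979 cases versus 33,444 controls.
n_eff_all = effective_sample_size(20979, 33444)

# A balanced design returns the total sample size (here 2 * 1,000),
# which is why unbalanced designs have an effective size below their total.
n_eff_balanced = effective_sample_size(1000, 1000)
```

With these counts the effective sample size is roughly 51,600, somewhat below the total of 54,423 because cases and controls are not equally represented.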
Supplementary information
Supplementary Information
Supplementary Tables 1–3, Supplementary Figs. 1–5, Supplementary Subjects and Methods, Supplementary Acknowledgments and Supplementary References
Supplementary Data 1–15
Supplementary Data 1 | Results from exome-wide gene-based burden analysis of URVs. a,b, Burden of protein-truncating (a) and damaging missense (b) URVs in each protein-coding gene with at least one epilepsy or control carrier. For each variant class, burden analyses were performed across four epilepsy groups—1,938 DEEs, 5,499 GGE, 9,219 NAFE and 20,979 epilepsy-affected individuals combined (‘EPI’)—versus 33,444 controls. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry.
Supplementary Data 2 | Results from burden analysis of GATOR1 genes. Burden of protein-truncating URVs in GATOR1 complex and the three GATOR1-encoding genes (DEPDC5, NPRL3 and NPRL2), analyzed separately for familial (n = 1,162) and non-familial (n = 8,014) NAFE cases. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry.
Supplementary Data 3 | List of damaging missense URVs in SLC6A1 and GABRB3. Damaging missense URVs identified in SLC6A1 and GABRB3 with at least one epilepsy carrier; ‘novel’ indicates that the variant has not been previously reported. Coordinates are on GRCh38.
Supplementary Data 4 | Results from exome-wide gene-set-based burden analysis of URVs. a,b, Burden of protein-truncating (a) and damaging missense (b) URVs in each gene set (gene family/protein complex) with at least one epilepsy or control carrier. For each variant class, burden analyses were performed across four epilepsy groups—1,938 DEEs, 5,499 GGE, 9,219 NAFE and 20,979 epilepsy-affected individuals combined (‘EPI’)—versus 33,444 controls. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry.
Supplementary Data 5 | List of protein-truncating URVs in GATOR1 genes. Protein-truncating URVs identified in GATOR1-encoding genes (DEPDC5, NPRL3 and NPRL2) with at least one NAFE carrier; ‘novel’ indicates that the variant has not been previously reported. Coordinates are on GRCh38.
Supplementary Data 6 | Results from burden analysis of GABAA receptor complex. Burden of damaging missense URVs in the (α1)2(β2)2(γ2) GABAA receptor complex with respect to its structural domain; ECD, extracellular domain; TMD, transmembrane domain; TMD-2, the second TMD that forms the ion channel pore. For each domain, burden analyses were performed across three epilepsy groups—1,938 DEEs, 5,499 GGE and 9,219 NAFE—versus 33,444 controls. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry.
Supplementary Data 7 | Results from protein structural analysis of ion channel complexes. ddG values of missense URVs across 16 ion channel protein complexes with experimentally resolved 3D structures available (‘PDB_ID’). A higher absolute ddG value suggests a more deleterious effect on protein stability; positive and negative values suggest destabilizing and stabilizing effects, respectively. Coordinates are on GRCh38.
Supplementary Data 8 | Results from burden analysis of ion channel complexes by ddG. Burden of damaging missense URVs in 16 ion channel protein complexes stratified by ddG. ddG ≥ 1 and ddG ≤ −1 were applied to define destabilizing and stabilizing missense URVs, respectively; |ddG| ≥ 1 comprises both. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry.
Supplementary Data 9 | Results from burden analysis of GABAA receptor complex by ddG. Burden of (de)stabilizing missense (|ddG| ≥ 1) URVs in the (α1)2(β2)2(γ2) GABAA receptor complex with respect to its structural domain; ECD, extracellular domain; TMD, transmembrane domain. ddG ≥ 1 and ddG ≤ −1 were applied to define destabilizing and stabilizing missense URVs, respectively; |ddG| ≥ 1 comprises both. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry.
Supplementary Data 10 | Results from exome-wide burden analysis of rare CNVs. a–d, GD-based burden of CNVs (a), gene-based burden of CNV deletions (b), CNV deletions plus protein-truncating URVs (c) and CNV duplications (d) with at least one epilepsy or control carrier. For each variant class, burden analyses were performed on the subset of samples that passed CNV calling QC, across four epilepsy groups—1,743 DEEs, 4,980 GGE, 8,425 NAFE and 18,963 epilepsy-affected individuals combined (‘EPI’)—versus 29,804 controls. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry.
Supplementary Data 11 | Results from burden analysis of GGE GWAS genes. a, Gene-set-based burden of URVs in 23 genes implicated by GGE GWAS loci. Burden analyses were performed across four variant classes and two epilepsy groups—5,499 GGE and 9,219 NAFE—versus 33,444 controls. b, Gene-based burden of protein-truncating URVs in 14 GGE GWAS genes with enrichment in GGE (‘GGE_logOR’ > 0). P values were computed using a Firth logistic regression model with adjustment for sex and ancestry.
Supplementary Data 12 | Results from functional analysis of candidate epilepsy genes. a, List of candidate epilepsy genes with a prenatal (n = 43) or postnatal (n = 50) expression preference in the human brain. Ten prenatal genes with a TF function are indicated in ‘prenatal_TF’, and their regulatory targets overlapping with the postnatal genes are listed in ‘postnatal_targets’. b,c, GO terms enriched for the 43 prenatal and 50 postnatal genes (b) and all regulatory target genes linked to the 10 prenatal TFs (c). GO enrichment analysis was performed via the Gene Ontology Enrichment Analysis webserver (http://geneontology.org/).
Supplementary Data 13 | Results from burden analysis of NDD genes. a, Gene-set-based burden of URVs in genes implicated by WES of severe DDs (n = 285), ASD (n = 185) and SCZ (n = 32) and on the subsets of mutually exclusive genes (that is, 196 DD-only, 99 ASD-only and 22 SCZ-only genes). b, Gene-based burden of URVs in genes analyzed in a. In a and b, P values were computed using a Firth logistic regression model with adjustment for sex and ancestry. c,d, Protein-truncating and damaging missense variants identified in nine genes that are significant in both this and other NDD WES studies (c) and in KDM6B (d).
Supplementary Data 14 | Results from ancestry-specific burden analysis of URVs. a,b, Burden of protein-truncating (a) and damaging missense (b) URVs in each protein-coding gene with at least one epilepsy or control carrier. c, Gene-set-based burden of URVs in established epilepsy genes (n = 171 curated by the GMS panel), constrained genes (n = 1,917 scored by the loss-of-function observed/expected upper bound fraction (LOEUF) metric) and constrained genes excluding established epilepsy genes (n = 1,813). Burden analyses were performed across six genetic ancestry groups—16,040/25,641, 1,598/2,592, 480/3,106, 1,698/1,215, 926/537 and 237/353 case/control—of Non-Finnish European (NFE), African (AFR), Admixed American (AMR), East Asian (EAS), Finnish (FIN) and South Asian (SAS) samples, respectively. P values were computed using a Firth logistic regression model with adjustment for sex.
Supplementary Data 15 | Results from sex-specific burden analysis of URVs. a,b, Burden of protein-truncating (a) and damaging missense (b) URVs in each protein-coding gene with at least one epilepsy or control carrier. c, Gene-set-based burden of URVs in established epilepsy genes (n = 171 curated by the GMS panel), X-linked GMS genes (n = 37), constrained genes (n = 1,917 scored by the LOEUF metric) and constrained genes excluding established epilepsy genes (n = 1,813). Burden analyses were performed for female and male subgroups separately, across four epilepsy groups—868/1,070 DEEs, 3,251/2,248 GGE, 4,818/4,401 NAFE and 11,001/9,978 all-epilepsy combined (‘EPI’) female/male cases—versus 18,143/15,301 female/male controls. P values were computed using a Firth logistic regression model with adjustment for ancestry.
Source data
Source Data Figs. 1–7 and Extended Data Figs 1–6
Source Data Statistical Source Data for Figs. 1–7 and Extended Data Figs. 1–6
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Epi25 Collaborative. Exome sequencing of 20,979 individuals with epilepsy reveals shared and distinct ultra-rare genetic risk across disorder subtypes. Nat Neurosci 27, 1864–1879 (2024). https://doi.org/10.1038/s41593-024-01747-8
DOI: https://doi.org/10.1038/s41593-024-01747-8