Exome sequencing of 20,979 individuals with epilepsy reveals shared and distinct ultra-rare genetic risk across disorder subtypes

Abstract

Identifying genetic risk factors for highly heterogeneous disorders such as epilepsy remains challenging. Here we present, to our knowledge, the largest whole-exome sequencing study of epilepsy to date, with more than 54,000 human exomes, comprising 20,979 deeply phenotyped patients from multiple genetic ancestry groups with diverse epilepsy subtypes and 33,444 controls, to investigate rare variants that confer disease risk. These analyses implicate seven individual genes, three gene sets and four copy number variants at exome-wide significance. Genes encoding ion channels show strong association with multiple epilepsy subtypes, including epileptic encephalopathies and generalized and focal epilepsies, whereas most other gene discoveries are subtype specific, highlighting distinct genetic contributions to different epilepsies. Combining results from rare single-nucleotide/short insertion and deletion variants, copy number variants and common variants, we offer an expanded view of the genetic architecture of epilepsy, with growing evidence of convergence among different genetic risk loci on the same genes. Top candidate genes are enriched for roles in synaptic transmission and neuronal excitability, particularly postnatally and in the neocortex. We also identify shared rare variant risk between epilepsy and other neurodevelopmental disorders. Our data can be accessed via an interactive browser, hopefully facilitating diagnostic efforts and accelerating the development of follow-up studies.

Fig. 1: Results from gene-based burden analysis of URVs.
Fig. 2: Results from gene-set-based burden analysis of URVs.
Fig. 3: Protein structural analysis of missense URVs in ion channel genes.
Fig. 4: Convergence of CNV deletions and protein-truncating URVs in gene-based burden.
Fig. 5: Epilepsy genetic architecture from large-scale genetic association studies.
Fig. 6: Functional analysis of candidate epilepsy genes.
Fig. 7: Shared rare variant risk between epilepsy and other NDDs.

Data availability

We provide summary-level data at the variant and gene level in an online browser for visualization and download (https://epi25.broadinstitute.org/). There are no restrictions on the aggregated data released on the browser. Full results from the exome-wide burden analysis are also available in Supplementary Data 1 and 4. WES data from Epi25 cohorts are available via the NHGRI’s controlled-access AnVIL platform (https://anvilproject.org/; dbGaP accession number: phs001489). Data availability of non-Epi25 control cohorts is provided in the Supplementary Information. Source data are provided with this paper.

Publicly available datasets analyzed in this study include:

Gene family: https://zenodo.org/records/3582386

CORUM protein complexes: https://mips.helmholtz-muenchen.de/corum/

Protein Data Bank: https://www.rcsb.org/

(Structure analyzed in Fig. 3c: https://www.rcsb.org/structure/6x3z)

BrainSpan: https://www.brainspan.org/

Gene Ontology: https://geneontology.org/

ChEA3: https://maayanlab.cloud/chea3/

Code availability

No custom code was used in this study. For sequence data generation, we used GATK version 3.4 and version 3.6 (GATK nightly-2015-07-31-g3c929b0, 3.4-89-ge494930 and 3.6-0-g89b7209), Picard version 1.1431 and VerifyBamID version 1.0.0. Sample and variant QC was performed using functions in Hail 0.1 and 0.2 (website: https://www.hail.is; documentation: https://hail.is/docs/0.1/ and https://hail.is/docs/0.2/; GitHub repository: https://github.com/hail-is/hail). Variant annotation was performed using the Ensembl Variant Effect Predictor (VEP) version 85 tool as implemented in Hail 0.1 with the LOFTEE annotation provided as default (https://github.com/konradjk/loftee/tree/27b0040f524348baa7f3257f1ce58993529e09ef). For phenotyping data, case record forms were hosted on the REDCap platform version 14 and entered into the Epi25 data repository (https://github.com/Epi25/epi25-edc). For gene burden analysis, we used the R (version 3.6.1) package logistf version 1.26.0 (https://cran.r-project.org/web/packages/logistf/index.html) to implement the Firth regression model. Additional processing and visualization were performed using R functions in the tidyverse library version 1.3.0 (https://www.tidyverse.org/packages/).

Acknowledgements

We thank the Epi25 principal investigators (PIs), local staff overseeing individual cohorts and all of the individuals with epilepsy and their families who participated in Epi25 for their commitment to this international collaboration. This work is part of the Centers for Common Disease Genomics (CCDG) program, funded by the National Human Genome Research Institute (NHGRI) and the National Heart, Lung, and Blood Institute. CCDG-funded Epi25 research activities at the Broad Institute, including genomic data generation in the Broad Genomics Platform, were supported by NHGRI grant UM1 HG008895 (PIs: E.S.L., S.B.G., M.J.D. and S.K.). Genome Sequencing Program efforts were also supported by NHGRI grant U01HG009088. A supplemental grant for Epi25 phenotyping was supported by ‘Epi25 Clinical Phenotyping R03’, National Institutes of Health R03NS108145 (PIs: D.H.L. and S.F.B.). Additional support for analysis was provided by National Institute of Neurological Disorders and Stroke grant R01NS106104 (PI: C.C.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We also thank the Stanley Center for Psychiatric Research at the Broad Institute for supporting the genomic data generation. Additional funding sources and acknowledgment of individual cohorts are listed in the supplementary materials.

Author information

Contributions

All authors contributed to patient phenotyping data and sample collection or to analyses. Roles in specific committees of the project are listed in the supplementary materials.

Corresponding authors

Correspondence to Benjamin M. Neale or Samuel F. Berkovic.

Ethics declarations

Competing interests

B.M.N. is a member of the scientific advisory boards of Deep Genomics and Neumora. No other authors have competing interests to declare.

Peer review

Peer review information

Nature Neuroscience thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Results from burden analysis of synonymous URVs.

a,b, Burden of synonymous URVs at the individual-gene (a) and the gene-set (b) level. The observed −log10-transformed P values are plotted against the expectation given a uniform distribution. Burden analyses are performed across four epilepsy groups – 1,938 DEEs, 5,499 GGE, 9,219 NAFE, and 20,979 epilepsy-affected individuals combined – versus 33,444 controls. P values are computed using a Firth logistic regression model testing the association between the case-control status and the number of URVs (two-sided); the red dashed line indicates exome-wide significance P = 3.4 × 10^−7 after Bonferroni correction (see Methods).
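As an illustration of the kind of test described in this legend, the following is a minimal Python sketch of Firth-penalized logistic regression with one predictor (a per-individual URV count) and an intercept. It is not the logistf code used in the study, and it omits the covariates the study adjusted for; the function name `firth_logistic` and the carrier counts are hypothetical. The example uses a gene with no control carriers, a setting in which ordinary maximum likelihood has no finite estimate but the Firth penalty yields one.

```python
import math

def firth_logistic(x, y, max_iter=100, tol=1e-10):
    """Fit logit P(y=1) = b0 + b1*x with Firth's penalized likelihood.

    x: URV count per individual; y: 1 for case, 0 for control.
    Returns (b0, b1); b1 is the penalized log odds ratio per URV.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Fisher information X'WX for the design matrix [1, x]
        s0 = sum(w)
        s1 = sum(wi * xi for wi, xi in zip(w, x))
        s2 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = s0 * s2 - s1 * s1
        i00, i01, i11 = s2 / det, -s1 / det, s0 / det  # inverse of X'WX
        # Diagonal of the hat matrix (leverage of each individual)
        h = [wi * (i00 + 2 * i01 * xi + i11 * xi * xi) for wi, xi in zip(w, x)]
        # Firth-modified score residuals
        r = [yi - pi + hi * (0.5 - pi) for yi, pi, hi in zip(y, p, h)]
        u0 = sum(r)
        u1 = sum(ri * xi for ri, xi in zip(r, x))
        # Fisher scoring step
        d0 = i00 * u0 + i01 * u1
        d1 = i01 * u0 + i11 * u1
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

# Hypothetical counts: 5 carriers among 100 cases, 0 among 200 controls
x = [1] * 5 + [0] * 95 + [0] * 200
y = [1] * 100 + [0] * 200
b0, b1 = firth_logistic(x, y)
```

For a single binary predictor, the Firth estimate should coincide with the log odds ratio obtained after adding 0.5 to each cell of the 2 × 2 table, which offers a quick sanity check on the fit.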

Extended Data Fig. 2 Spatiotemporal expression of 13 exome-wide significant genes in the human brain.

Expression values (log2[TPM + 1]) are normalized to the mean for each BrainSpan sample; each dot represents the expression value of a particular gene in a sample collected in a particular brain region and developmental time (from early fetal to adulthood: N = 47/5/5/9/5/4, 69/6/6/7/5/4, 19/2/1/2/1/2, 27/2/2/2/2/3, 30/2/3/2/3/3, 41/3/4/3/4/5, 30/3/3/1/1/3, 36/3/3/2/2/4, and 63/6/6/6/6/6 neocortex/hippocampus/amygdala/striatum/thalamus/cerebellum samples, respectively). LOESS smooth curves are plotted for each brain region across developmental time.

Extended Data Fig. 3 Distributions of URVs from this study and de novo variants from other NDD studies on the same genes.

Schematic protein plots of nine genes that are significant in both our epilepsy cohort (DEE: developmental and epileptic encephalopathy; EPI: all-epilepsy combined) and previous large-scale WES studies of severe developmental disorders (DD) and/or autism spectrum disorder (ASD) are shown. Asterisk indicates recurring URVs in epilepsy; recurring de novo variants in DD/ASD as well as detailed variant information are provided in Supplementary Data 13.

Extended Data Fig. 4 Results from genetic ancestry- and sex-specific burden analyses.

a, The numbers of epilepsy cases (orange) and controls (blue) by genetic ancestry. b, Comparison of protein-truncating (left) and damaging missense (right) URV burden in the top ten genes from the primary analysis (‘All’) across genetic ancestry subgroups. Red color indicates enrichment in cases (log[OR] > 0), with an asterisk indicating nominal significance (P ≤ 0.05; see Supplementary Data 14 for exact P values). P values are computed using a Firth logistic regression model testing the association between the case-control status and the number of URVs (two-sided). c, Genetic ancestry-specific burden of URVs in established epilepsy genes (N = 171 curated by the Genetic Epilepsy Syndromes [GMS] panel with a known monogenic/X-linked cause), constrained genes (N = 1,917 scored by the loss-of-function observed/expected upper bound fraction [LOEUF] metric as the most constrained 10% of genes), and constrained genes excluding established epilepsy genes (N = 1,813). Overall, different ancestral groups show at least partially shared burden of deleterious URVs in these gene sets. In a–c, NFE: Non-Finnish European (Ncase = 16,040, Ncontrol = 25,641), AFR: African (Ncase = 1,598, Ncontrol = 2,592), AMR: Admixed American (Ncase = 480, Ncontrol = 3,106), EAS: East Asian (Ncase = 1,698, Ncontrol = 1,215), FIN: Finnish (Ncase = 926, Ncontrol = 537), SAS: South Asian (Ncase = 237, Ncontrol = 353). d, Sex-specific burden of URVs in established epilepsy genes. Burden analyses are performed for three gene sets described in c, with an additional set of 37 X-linked GMS epilepsy genes, across four epilepsy groups (female: NDEE = 868, NGGE = 3,251, NNAFE = 4,818, NEPI(all) = 11,001, Ncontrol = 18,143; male: NDEE = 1,070, NGGE = 2,248, NNAFE = 4,401, NEPI(all) = 9,978, Ncontrol = 15,301). There is an overall trend of shared URV burden between female and male subgroups in these gene sets.
In c and d, the dot represents the log odds ratio and the error bars represent the 95% confidence intervals of the point estimates. For presentation purposes, error bars that exceed a large log odds ratio value are capped, indicated by arrows at the end of the error bars (see Supplementary Data 14 and 15 for exact values). e, Comparison of sex-specific burden of protein-truncating URVs at the level of individual genes. For each gene, the −log10-transformed P value from the female subgroup analysis (y-axis) is plotted against that from the male subgroup analysis (x-axis). Top ten genes with URV burden in epilepsy are labeled for each subgroup, with genes on the sex chromosomes colored in blue. The red dashed line indicates exome-wide significance P = 3.4 × 10^−7 after Bonferroni correction.

Extended Data Fig. 5 Results from burden analysis of protein-truncating and damaging missense URVs combined.

a, Joint burden of protein-truncating and damaging missense URVs at the individual-gene level. The observed −log10-transformed P values are plotted against the expectation given a uniform distribution. Burden analyses are performed across four epilepsy groups – 1,938 DEEs, 5,499 GGE, 9,219 NAFE, and 20,979 epilepsy-affected individuals combined – versus 33,444 controls. P values are computed using a Firth logistic regression model testing the association between the case-control status and the number of URVs (two-sided); the red dashed line indicates exome-wide significance P = 3.4 × 10^−7 after Bonferroni correction (see Methods). b, Comparison of the joint burden in a with the burden of protein-truncating URVs. The odds ratio (OR) of protein-truncating plus damaging missense URVs (y-axis) and that of protein-truncating URVs alone (x-axis) are compared. Each dot represents a gene with nominally significant enrichment (OR > 1 and P ≤ 0.05) of either protein-truncating URVs or the two variant classes combined.

Extended Data Fig. 6 URV discovery and burden results across Epi25 data collection.

a, Increase in the number of protein-truncating and damaging missense URVs discovered in epilepsy genes with a known monogenic cause. b, Increase in the number of monogenic epilepsy genes identified with a protein-truncating or damaging missense URV. In a and b, variant/gene count is plotted against the year of Epi25 data collection; the total number of epilepsy cases analyzed in each year is indicated in parentheses. c, URV burden of previously top-ranked genes in this study. The odds ratio of protein-truncating URVs in genes from this study (y-axis) and the prior Epi25 publication (x-axis) are compared. Each dot represents one of the top ten genes implicated by our previous burden analysis (across three epilepsy subtypes). Genes with a known monogenic/X-linked cause are labeled and colored in purple. d, Increase in the total, non-European ancestry, and effective sample size in this study over our previous publications. The effective sample size is computed as 4/(1/Ncase + 1/Ncontrol). e,f, The sample size required for well-powered gene burden testing. The percentage of genes powered to detect significant URV burden (Fisher’s exact P ≤ 0.05) at different effect sizes (e) and case:control ratios (f) is shown as a function of log-scaled sample size of epilepsy cases. Lighter color indicates smaller effect size (weaker burden), which requires a larger sample size to detect. The gray vertical line indicates the current sample size of 20,979 cases. In e, horizontal lines indicate 80% and 50% detection power, and vertical dashed lines indicate the estimated number of cases required to achieve 80% power at the benchmarked effect sizes. In f, dashed and dotted curves indicate power estimation with increased control:case ratios from 1.6 (in this study) to 3.2 and 6.4, respectively; horizontal lines indicate the estimated power achieved by doubling and quadrupling the number of controls at the current sample size of cases.
g, Epilepsy subtype-specific burden of URVs in established epilepsy genes (N = 171 curated by the Genetic Epilepsy Syndromes [GMS] panel with a known monogenic/X-linked cause), constrained genes (N = 1,917 scored by the loss-of-function observed/expected upper bound fraction [LOEUF] metric as the most constrained 10% genes), and constrained genes excluding established epilepsy genes (N = 1,813). Burden analyses are performed across three epilepsy subtypes – 1,938 DEEs, 5,499 GGE, and 9,219 NAFE – versus 33,444 controls. Protein-truncating and damaging missense URVs from DEEs exhibit the strongest enrichment in epilepsy panel genes, while all epilepsy subtypes show significant enrichment in constrained genes even after excluding the panel genes. No enrichment is observed for synonymous URVs. The dot represents the log odds ratio and the error bars represent the 95% confidence intervals of the point estimates.
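The effective sample size formula given for panel d can be evaluated directly. The short Python sketch below applies it to the case and control counts reported in this study; the function name `effective_sample_size` is a hypothetical helper, and the second call only illustrates the doubled-control scenario mentioned for panel f rather than reproducing any analysis from the paper.

```python
def effective_sample_size(n_case, n_control):
    """Effective sample size as defined for panel d: 4 / (1/Ncase + 1/Ncontrol)."""
    return 4.0 / (1.0 / n_case + 1.0 / n_control)

# Case and control counts reported in this study
n_eff = effective_sample_size(20_979, 33_444)

# Illustrative scenario: same cases, twice as many controls
n_eff_2x = effective_sample_size(20_979, 2 * 33_444)
```

With this definition, a balanced design of N cases and N controls has an effective sample size of 2N, and adding controls to a fixed number of cases can never raise the value above 4 × Ncase, which is one way to see why extra controls give diminishing returns.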

Supplementary information

Supplementary Information

Supplementary Tables 1–3, Supplementary Figs. 1–5, Supplementary Subjects and Methods, Supplementary Acknowledgments and Supplementary References

Reporting Summary

Supplementary Data 1–15

Supplementary Data 1 | Results from exome-wide gene-based burden analysis of URVs. a,b, Burden of protein-truncating (a) and damaging missense (b) URVs in each protein-coding gene with at least one epilepsy or control carrier. For each variant class, burden analyses were performed across four epilepsy groups—1,938 DEEs, 5,499 GGE, 9,219 NAFE and 20,979 epilepsy-affected individuals combined (‘EPI’)—versus 33,444 controls. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry. Supplementary Data 2 | Results from burden analysis of GATOR1 genes. Burden of protein-truncating URVs in GATOR1 complex and the three GATOR1-encoding genes (DEPDC5, NPRL3 and NPRL2), analyzed separately for familial (n = 1,162) and non-familial (n = 8,014) NAFE cases. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry. Supplementary Data 3 | List of damaging missense URVs in SLC6A1 and GABRB3. Damaging missense URVs identified in SLC6A1 and GABRB3 with at least one epilepsy carrier; ‘novel’ indicates that the variant has not been previously reported. Coordinates are on GRCh38. Supplementary Data 4 | Results from exome-wide gene-set-based burden analysis of URVs. a,b, Burden of protein-truncating (a) and damaging missense (b) URVs in each gene set (gene family/protein complex) with at least one epilepsy or control carrier. For each variant class, burden analyses were performed across four epilepsy groups—1,938 DEEs, 5,499 GGE, 9,219 NAFE and 20,979 epilepsy-affected individuals combined (‘EPI’)—versus 33,444 controls. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry. Supplementary Data 5 | List of protein-truncating URVs in GATOR1 genes. Protein-truncating URVs identified in GATOR1-encoding genes (DEPDC5, NPRL3 and NPRL2) with at least one NAFE carrier; ‘novel’ indicates that the variant has not been previously reported. Coordinates are on GRCh38. 
Supplementary Data 6 | Results from burden analysis of GABAA receptor complex. Burden of damaging missense URVs in the (α1)2(β2)2(γ2) GABAA receptor complex with respect to its structural ___domain; ECD, extracellular ___domain; TMD, transmembrane ___domain; TMD-2, the second TMD that forms the ion channel pore. For each ___domain, burden analyses were performed across three epilepsy groups—1,938 DEEs, 5,499 GGE and 9,219 NAFE—versus 33,444 controls. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry. Supplementary Data 7 | Results from protein structural analysis of ion channel complexes. ddG values of missense URVs across 16 ion channel protein complexes with experimentally resolved 3D structures available (‘PDB_ID’). A higher absolute ddG value suggests a more deleterious effect on protein stability; positive and negative values suggest destabilizing and stabilizing effects, respectively. Coordinates are on GRCh38. Supplementary Data 8 | Results from burden analysis of ion channel complexes by ddG. Burden of damaging missense URVs in 16 ion channel protein complexes stratified by ddG. ddG ≥ 1 and ddG ≤ −1 were applied to define destabilizing and stabilizing missense URVs, respectively; |ddG| ≥ 1 comprises both. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry. Supplementary Data 9 | Results from burden analysis of GABAA receptor complex by ddG. Burden of (de)stabilizing missense (|ddG| ≥ 1) URVs in the (α1)2(β2)2(γ2) GABAA receptor complex with respect to its structural ___domain; ECD: extracellular ___domain, TMD: transmembrane ___domain. ddG≥1 and ddG≤-1 are applied to define destabilizing and stabilizing missense URVs, respectively; |ddG| ≥ 1 comprises both. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry. Supplementary Data 10 | Results from exome-wide burden analysis of rare CNVs. 
a–d, GD-based burden of CNVs (a), gene-based burden of CNV deletions (b), CNV deletions plus protein-truncating URVs (c) and CNV duplications (d) with at least one epilepsy or control carrier. For each variant class, burden analyses were performed on the subset of samples that passed CNV calling QC, across four epilepsy groups (1,743 DEEs, 4,980 GGE, 8,425 NAFE and 18,963 epilepsy-affected individuals combined, ‘EPI’) versus 29,804 controls. P values were computed using a Firth logistic regression model with adjustment for sex and ancestry.

Supplementary Data 11 | Results from burden analysis of GGE GWAS genes. a, Gene-set-based burden of URVs in 23 genes implicated by GGE GWAS loci. Burden analyses were performed across four variant classes and two epilepsy groups (5,499 GGE and 9,219 NAFE) versus 33,444 controls. b, Gene-based burden of protein-truncating URVs in 14 GGE GWAS genes with enrichment in GGE (‘GGE_logOR’ > 0). P values were computed using a Firth logistic regression model with adjustment for sex and ancestry.

Supplementary Data 12 | Results from functional analysis of candidate epilepsy genes. a, List of candidate epilepsy genes with a prenatal (n = 43) or postnatal (n = 50) expression preference in the human brain. Ten prenatal genes with a TF function are indicated in ‘prenatal_TF’, and their regulatory targets overlapping with the postnatal genes are listed in ‘postnatal_targets’. b,c, GO terms enriched for the 43 prenatal and 50 postnatal genes (b) and all regulatory target genes linked to the 10 prenatal TFs (c). GO enrichment analysis was performed via the Gene Ontology Enrichment Analysis webserver (http://geneontology.org/).

Supplementary Data 13 | Results from burden analysis of NDD genes. a, Gene-set-based burden of URVs in genes implicated by WES of severe DDs (n = 285), ASD (n = 185) and SCZ (n = 32) and in the subsets of mutually exclusive genes (that is, 196 DD-only, 99 ASD-only and 22 SCZ-only genes).
b, Gene-based burden of URVs in genes analyzed in a. In a and b, P values were computed using a Firth logistic regression model with adjustment for sex and ancestry. c,d, Protein-truncating and damaging missense variants identified in nine genes that are significant in both this and other NDD WES studies (c) and in KDM6B (d).

Supplementary Data 14 | Results from ancestry-specific burden analysis of URVs. a,b, Burden of protein-truncating (a) and damaging missense (b) URVs in each protein-coding gene with at least one epilepsy or control carrier. c, Gene-set-based burden of URVs in established epilepsy genes (n = 171 curated by the GMS panel), constrained genes (n = 1,917 scored by the loss of function observed/expected upper bound fraction (LOEUF) metric) and constrained genes excluding established epilepsy genes (n = 1,813). Burden analyses were performed across six genetic ancestry groups, with case/control counts as follows: Non-Finnish European (NFE; 16,040/25,641), African (AFR; 1,598/2,592), Admixed American (AMR; 480/3,106), East Asian (EAS; 1,698/1,215), Finnish (FIN; 926/537) and South Asian (SAS; 237/353). P values were computed using a Firth logistic regression model with adjustment for sex.

Supplementary Data 15 | Results from sex-specific burden analysis of URVs. a,b, Burden of protein-truncating (a) and damaging missense (b) URVs in each protein-coding gene with at least one epilepsy or control carrier. c, Gene-set-based burden of URVs in established epilepsy genes (n = 171 curated by the GMS panel), X-linked GMS genes (n = 37), constrained genes (n = 1,917 scored by the LOEUF metric) and constrained genes excluding established epilepsy genes (n = 1,813). Burden analyses were performed for female and male subgroups separately, across four epilepsy groups (868/1,070 DEEs, 3,251/2,248 GGE, 4,818/4,401 NAFE and 11,001/9,978 all-epilepsy combined, ‘EPI’, female/male cases) versus 18,143/15,301 female/male controls.
P values were computed using a Firth logistic regression model with adjustment for ancestry.
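
The ddG cut-offs stated for Supplementary Data 8 and 9 (ddG ≥ 1 destabilizing, ddG ≤ −1 stabilizing, |ddG| ≥ 1 comprising both) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name, the ‘neutral’ label for variants between the cut-offs and the example values are all hypothetical and serve only to show how the thresholds partition variants.

```python
from collections import Counter


def classify_ddg(ddg, threshold=1.0):
    """Label a missense variant by its ddG using the cut-offs given in the captions."""
    if ddg >= threshold:
        return "destabilizing"   # ddG >= 1
    if ddg <= -threshold:
        return "stabilizing"     # ddG <= -1
    return "neutral"             # hypothetical label; the captions do not name this group


# Hypothetical ddG values, for illustration only (not taken from the study)
example_ddg = [2.3, -1.0, 0.4, 1.0, -0.2]

labels = [classify_ddg(value) for value in example_ddg]
counts = Counter(labels)

# |ddG| >= 1 comprises both the destabilizing and the stabilizing classes
n_either = counts["destabilizing"] + counts["stabilizing"]
print(labels, n_either)
```

Both cut-offs are inclusive here, matching the ‘≥’ and ‘≤’ signs in the captions, so a variant with ddG of exactly 1 or −1 falls into the |ddG| ≥ 1 class.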

Source data

Source Data Figs. 1–7 and Extended Data Figs. 1–6 | Statistical source data for Figs. 1–7 and Extended Data Figs. 1–6.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Epi25 Collaborative. Exome sequencing of 20,979 individuals with epilepsy reveals shared and distinct ultra-rare genetic risk across disorder subtypes. Nat Neurosci 27, 1864–1879 (2024). https://doi.org/10.1038/s41593-024-01747-8

  • DOI: https://doi.org/10.1038/s41593-024-01747-8
