Introduction

Serum alanine aminotransferase (ALT), aspartate aminotransferase (AST), and gamma-glutamyl transferase (GGT) levels are widely used in clinical tests to evaluate liver injury or function. Hepatocellular or cholestatic injury caused by various conditions, including hepatitis virus infection, malignancy, autoimmune diseases, alcohol intake, obesity, and drugs, increases the serum levels of these liver enzymes1,2. Although AST and ALT are abundant in the canalicular membrane of hepatocytes, these enzymes, especially AST, also exist in extrahepatic tissues such as cardiac muscle, skeletal muscle, kidney, and brain. Therefore, serum AST levels are also elevated in individuals with myocardial infarction or myositis. In contrast, because the amount of ALT is lower in other tissues, ALT elevation is relatively specific to hepatocellular injury. Elevated serum GGT levels usually reflect bile stasis, but serum GGT levels are also elevated during the detoxification of xenobiotics, including alcohol3, and in patients treated with enzyme-inducing anticonvulsant drugs4.

Additionally, several genetic variants have been shown to abolish these enzyme activities and significantly affect the results of clinical measurements of liver enzymes5,6, because the serum levels of these enzymes are usually determined by measuring their enzyme activities in clinical practice. Therefore, it is expected that the serum liver enzyme levels are affected by genetic factors related or unrelated to liver injury5,6,7.

During the last two decades, genome-wide association studies (GWAS) have been extensively performed worldwide, and a substantial number of loci were successfully identified to be associated with common diseases and quantitative traits8,9,10. Large-scale GWAS for serum liver enzyme levels have also been performed, which identified 954 loci in European11,12, 521 loci in trans-ancestry13, and 284 loci in Japanese studies10.

The Ryukyu Archipelago is located southwest of the Japanese Islands and comprises dozens of islands. People from the Ryukyu Archipelago have been shown to possess unique genetic backgrounds that differ from those of mainland Japan14,15,16,17,18. However, no GWAS has been conducted specifically on the Ryukyu population. Because GWAS using genetically isolated populations have identified novel loci with larger effect sizes, even in smaller sample sizes19,20, a GWAS focused on the Ryukyu population is important for elucidating genetic factors relevant not only to the Ryukyu population but also to other populations.

In this study, we conducted the first GWAS for serum liver enzyme levels in the Ryukyu population, using participants from two independent Japanese study groups: Okinawa Bioinformation Bank (OBi)16,17,21 and BioBank Japan (BBJ)22,23,24.

Results

Population structure analyses

The overall study design is shown in Supplementary Fig. 1. We analyzed 10,285 individuals registered in the OBi using principal component analysis (PCA), along with individuals in the 1000 Genomes Project (1KGP) phase 3 reference panel (AFR, n = 661; AMR, n = 347; EAS, n = 504; EUR, n = 503; SAS, n = 489), and defined 10,136 individuals as the Japanese population (Supplementary Fig. 2b). Subsequent PCA combined with uniform manifold approximation and projection (UMAP) for the Japanese population revealed that the population was further divided into five clusters (Supplementary Fig. 2c). Information on the birthplace of the grandparents of each participant indicated that one cluster consisted of people from mainland Japan, the Hondo cluster (OBi Hondo, n = 1421), and the remaining four consisted of people from the Ryukyu Archipelago, the Ryukyu cluster (OBi Ryukyu, n = 8715) (Supplementary Fig. 2c).

We also performed PCA on 184,092 participants registered in BBJ and 1KGP Phase 3 and removed 100 individuals defined as outliers not belonging to JPT or CHB (Supplementary Fig. 2d). From the PCA results of the remaining 183,992 individuals in BBJ, we defined two clusters: BBJ Ryukyu and BBJ Hondo (Supplementary Fig. 2e).

GWAS for serum liver enzyme levels in the Ryukyu population

The clinical characteristics of the Ryukyu participants included in the GWAS are presented in Table 1. We identified 13 independent loci with genome-wide significant associations (P < 5 × 10−8) across the three serum liver enzyme traits (Fig. 1, Table 2, Supplementary Tables 1, 2, 3): three for ALT (n = 15,224), namely rs3747207 in PNPLA3, rs3782886 in BRAP, and rs117595134 in HMMR/HMMR-AS1; four for AST (n = 15,203), namely rs76850691 in GOT1, rs3747207 in PNPLA3, rs369827206 in MRC1, and rs147992802 near NAA25; and six for GGT (n = 14,496), namely rs4049918 in GGT1, rs5760109 near MIF-AS1, rs58367757 in HNF1A, rs7678352 in ZNF827, rs11066325 in PTPN11, and rs7519043 near EPHA2 (Fig. 2, Supplementary Fig. 3). All of these loci except HMMR/HMMR-AS1 have been previously reported to be associated with serum liver enzyme levels; the association of the HMMR/HMMR-AS1 locus with ALT or other liver enzyme traits has not been previously reported.

Table 1 Characteristics of the participants.
Fig. 1

Results of meta-analyses of genome-wide association studies for serum liver enzyme levels in the Ryukyu population. Manhattan plots (left) and quantile–quantile plots (right) of GWAS for serum (a) ALT, (b) AST, and (c) GGT levels. Red lines indicate the genome-wide significance threshold (P = 5 × 10−8). Green points correspond to variants that attained genome-wide significance. The novel locus is labeled in red.

Table 2 Loci achieving genome-wide significant association (P < 5 × 10–8) with serum liver enzyme levels in the Ryukyu population.
Fig. 2

A regional plot for the association of the HMMR/HMMR-AS1 locus with serum ALT values in the Ryukyu population. The lead variant is shown as a purple diamond. The other variants are colored according to the extent of LD estimated in 3256 Japanese WGS samples from the BBJ and 2504 individuals from the 1KGP phase 3.

The minor A-allele of rs117595134 within HMMR/HMMR-AS1 was observed only in Japanese individuals in the 1000 Genomes database (5.3% in JPT; Supplementary Table 4). In this study, the A-allele frequencies were significantly higher in the Ryukyu population than in the Hondo population (P = 2.59 × 10−6; 6.0%, 6.0%, 2.9%, and 2.8% in OBi Ryukyu, BBJ Ryukyu, OBi Hondo, and BBJ Hondo, respectively; Table 3). Additionally, the absolute effect sizes (β) of the A-allele were significantly larger in the Ryukyu population than in the Hondo population (P for heterogeneity = 2.16 × 10−4; β = −0.133, −0.128, 0.015, and −0.012 in OBi Ryukyu, BBJ Ryukyu, OBi Hondo, and BBJ Hondo, respectively; Table 3).

Table 3 Associations of rs117595134 with serum ALT levels in the Ryukyu and the Hondo population.

Replication study for previously reported loci in the Ryukyu population

We conducted replication studies of the loci reported in the Korean Genome and Epidemiology Study (KoGES)25 and UK Biobank11,12. Among the 24 loci associated with ALT in 126 K Korean individuals in KoGES, two loci showed genome-wide significant association (P < 5.00 × 10−8), and nine loci showed a significant association in the Ryukyu population (P < 2.08 × 10−3 = 0.05/24). Similarly, three out of 25 loci for AST and four out of 43 loci for GGT showed genome-wide significant association and eight loci for AST and 13 loci for GGT showed significant associations in the Ryukyu population (P < 2.00 × 10−3 = 0.05/25, P < 1.16 × 10−3 = 0.05/43). In total, 77 of 80 signals (20 of 22 in ALT, 22 of 22 in AST, and 35 of 36 in GGT) showed the same directions of effect (P = 1.21 × 10−4 for ALT, P = 4.77 × 10−7 for AST, and P = 1.08 × 10−9 for GGT, P = 1.41 × 10−19 in total, binomial test) (Supplementary Table 5).
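The direction-of-effect comparison above can be checked with an exact binomial (sign) test. The following is a minimal sketch, assuming a two-sided test against a null concordance probability of 0.5; the function name is illustrative and the exact test settings used in the study are not stated in the text. Under these assumptions, the printed P-values agree with those reported for the KoGES loci.

```python
from math import comb


def two_sided_sign_test(concordant: int, total: int) -> float:
    """Exact two-sided binomial P-value under H0: P(same direction) = 0.5."""
    # With p = 0.5 the null distribution is symmetric, so the two-sided
    # P-value is twice the tail of the more extreme side, capped at 1.
    k = max(concordant, total - concordant)
    tail = sum(comb(total, i) for i in range(k, total + 1)) / 2 ** total
    return min(1.0, 2 * tail)


# Concordance counts reported in the text for the KoGES replication.
koges_counts = {"ALT": (20, 22), "AST": (22, 22), "GGT": (35, 36), "Total": (77, 80)}
for trait, (same, n) in koges_counts.items():
    print(f"{trait}: {same}/{n} concordant, P = {two_sided_sign_test(same, n):.2e}")
```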

Among the 247 loci associated with ALT in 437 K European individuals in the UK Biobank, one locus showed a genome-wide significant association, and four loci showed a significant association in the Ryukyu population (P < 2.02 × 10−4 = 0.05/247). Regarding European-derived loci for AST in 388 K Europeans and GGT in 437 K Europeans, one locus for AST and 5 loci for GGT showed genome-wide significant association, and 3 of 336 loci for AST and 12 of 371 loci for GGT showed significant association in the Ryukyu population (P < 1.49 × 10−4 = 0.05/336 for AST, P < 1.35 × 10−4 = 0.05/371 for GGT) (Supplementary Tables 6, 7, 8). In total, 541 of 716 signals (142 of 185 in ALT, 174 of 249 in AST, and 225 of 282 in GGT) showed the same directions of effect (P = 1.47 × 10−13 for ALT, P = 3.11 × 10−10 for AST, and P = 8.94 × 10−25 for GGT, P = 2.50 × 10−44 in total, binomial test).

Gene-based analyses using MAGMA

Next, we performed gene-based association analyses with MAGMA using data from the GWAS meta-analyses in the Ryukyu population and identified 22 genes significantly associated with liver enzyme traits: five for ALT, three for AST, and 14 for GGT (P < 2.63 × 10−6 = 0.05/18,980, Supplementary Tables 9, 10, 11, Supplementary Fig. 4). These genes are located within ± 500 kb of the individual loci that have been previously reported in GWAS for serum liver enzyme levels. Of these, the association of two genes, CHCHD10 and C22orf15, with GGT has not been previously reported by gene-based analyses. We conducted gene-set analyses using MAGMA but could not identify any curated gene sets or Gene Ontology terms significantly associated with ALT, AST, or GGT (Supplementary Tables 12, 13, 14). Tissue expression enrichment analyses using MAGMA revealed that the genes for GGT, identified through gene-based analysis, were significantly enriched in the liver tissue. In contrast, no significant enrichment was observed for genes associated with ALT and AST (Supplementary Tables 15, 16, 17, Supplementary Fig. 5).

Discussion

In this study, we conducted GWAS meta-analyses using the Ryukyu population and found 13 loci with genome-wide significant associations for three serum liver enzyme traits: one novel locus, HMMR/HMMR-AS1 for ALT, and 12 known loci.

GWAS have been extensively performed to identify the genetic loci associated with common diseases or quantitative traits, such as laboratory tests. Thousands of genetic loci have been reported to be associated with serum liver enzyme levels, including ALT, AST, and GGT, through GWAS using nearly a million participants, mostly individuals of European origin. A large-scale GWAS for the liver enzyme traits in the Japanese population has also been performed, and an additional 130 loci associated with ALT, AST, and GGT have been identified10,23,24.

In this study, we performed a GWAS for serum liver enzyme levels focusing on the Ryukyu population for the first time. Although the Japanese population is considered genetically homogeneous, principal component analyses of genome-wide single nucleotide polymorphism genotypes have shown that it comprises two genetically distinct groups: Hondo, from mainland Japan, and Ryukyu, from the Ryukyu Archipelago located southwest of the main Japanese islands14,15. Additionally, people from the Ryukyu Archipelago have been reported to be further divided into several sub-groups16,17. Therefore, the Ryukyu population has a unique genetic background that differs from those of other regions. GWAS of isolated populations with unique genetic backgrounds have been reported to be useful for identifying novel loci with larger effect sizes even in smaller sample sizes, such as TBC1D4 for type 2 diabetes in Greenlandic Inuits19 or CREBRF for body mass index in Samoans20. This suggests that GWAS using populations with unique genetic backgrounds, such as the Ryukyu population, are useful and important for identifying additional novel loci even if the sample size is not very large.

Our GWAS meta-analysis of ALT identified a novel locus, with lead SNP rs117595134, within HMMR/HMMR-AS1 on chromosome 5. In this study, the A-allele frequencies were significantly higher in the Ryukyu population than in the Hondo population, and the absolute effect sizes (β) of the A-allele were also significantly larger in the Ryukyu population than in the Hondo population. Therefore, the higher allele frequency and larger effect size for this variant observed in the Ryukyu population indicate that HMMR/HMMR-AS1 is likely to be a Ryukyu population-specific locus associated with ALT levels. We further searched whole-genome sequencing data from 254 individuals belonging to the Ryukyu population for functional variants in high linkage disequilibrium with rs117595134 (r2 ≥ 0.6), but found no variant within ± 5 Mb of rs117595134 with a significant deleterious effect as evaluated using Combined Annotation Dependent Depletion software (Supplementary Table 18).

This variant is located in an intron of HMMR and HMMR-AS1. HMMR encodes a hyaluronan-mediated motility receptor, also known as CD168 or RHAMM26. HMMR has been reported to be overexpressed in various tumor tissues, such as breast cancer27 and gastrointestinal cancer28, and has been implicated in promoting carcinogenesis29. Additionally, HMMR was recently reported as a critical gene associated with both nonalcoholic steatohepatitis and hepatocellular carcinoma (HCC), and silencing HMMR reduces HCC cell proliferation and tumor growth while also regulating cell cycle progression30,31. However, no evidence suggests a role for HMMR in directly or indirectly regulating serum ALT levels. Further studies are required to elucidate the molecular mechanisms through which this locus regulates serum ALT levels.

This study has several limitations. First, although this was the largest GWAS of serum liver enzyme traits in the Ryukyu population, the sample size was still limited. Therefore, further studies are required to validate the association of the novel locus. Because the statistical power to detect a significant association of rs117595134 with ALT values in BBJ Hondo was insufficient (14.4% for α = 0.05, Supplementary Table 19), additional replication sets with a larger sample size or a larger effect size would be required to validate the association of this locus. Second, precise information on liver diseases, such as a history of hepatitis B or C viral infection or the presence of diseases including metabolic dysfunction-associated fatty liver disease and HCC, was lacking for the OBi participants. Third, the characteristics of participants differed between OBi and BBJ. BBJ is a hospital-based disease cohort, whereas OBi is a population-based study, and the mean age of BBJ participants is higher than that of OBi participants. These differences may have reduced the statistical power of the meta-analysis. Fourth, some databases for in silico analyses have been built from European data, which reduces the statistical power of these analyses.

In conclusion, we performed a GWAS for serum liver enzyme levels in the Ryukyu population for the first time and found several independent loci with genome-wide significant associations, including a novel locus associated with serum ALT. The lead variant located in the intron of HMMR/HMMR-AS1, rs117595134, is specific to the Japanese population, especially the Ryukyu population. Further studies are required to elucidate the biological mechanisms of this locus that contribute to the regulation of serum ALT levels.

Methods

Participants, genotyping, and imputation

We analyzed individuals registered in Okinawa Bioinformation Bank (OBi), which is a population-based study in Okinawa Prefecture established in 2016 (refs. 16,17,21). Individual values for laboratory tests, including serum ALT, serum AST, and serum GGT; anthropometric measurements, including height, body weight, and body mass index; and lifestyle questionnaire information, including alcohol consumption and smoking status, were obtained from the results of health check-ups. Genomic DNA was extracted from peripheral leukocytes or saliva using a standard procedure, and SNP genotyping was performed using the Infinium Asian Screening Array v1.0 BeadChip (Illumina, Inc. CA, USA).

We also analyzed participants registered in BioBank Japan (BBJ), a hospital-based cohort study consisting of patients with 47 diseases from 12 medical institutions in Japan22,23,24. Clinical information, including laboratory test results (serum ALT, AST, and GGT), anthropometric measures (height, body weight, and body mass index), lifestyle questionnaire responses (alcohol drinking behavior), and the status of the 47 target diseases, was obtained from the clinical records. Individuals with liver diseases, including liver cirrhosis, hepatitis B or C virus infection, hepatocellular carcinoma, gallbladder carcinoma, or cholangiocarcinoma, were excluded from the analyses. Genotyping was performed using Illumina Infinium Omni Express, Human Exome, Infinium Omni Express Exome v1.0, or Infinium Omni Express Exome v1.2 (Illumina Inc, CA, USA).

For quality control (QC), we removed individuals who met any of the following criteria: (i) non-Japanese ancestry (outliers determined using PCA with 1KGP phase 3 as a reference panel), (ii) sex mismatch between clinical and genotypic information, (iii) sample call rate < 0.98, or (iv) being one member of a pair of identical twins or duplicate samples determined using identity-by-descent (PI_HAT > 0.95). For QC of SNPs, we excluded SNPs with (i) SNP call rate < 0.98, (ii) a genotype distribution deviating from Hardy–Weinberg equilibrium (P ≤ 1.0 × 10−6), or (iii) minor allele frequency < 0.01. The QC process was performed using PLINK v1.9.
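The SNP-level filters can be illustrated with a small sketch that applies the three thresholds to the genotype counts of a single variant. This is not the PLINK implementation used in the study: the function name and inputs are hypothetical, and Hardy–Weinberg equilibrium is assessed here with a 1-degree-of-freedom chi-square goodness-of-fit approximation rather than an exact test.

```python
from math import erfc, sqrt


def variant_qc(n_aa: int, n_ab: int, n_bb: int, n_missing: int,
               min_call_rate: float = 0.98, min_maf: float = 0.01,
               hwe_p_cutoff: float = 1.0e-6) -> dict:
    """Apply call-rate, MAF, and HWE filters to one biallelic variant."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return {"call_rate": 0.0, "maf": 0.0, "p_hwe": 1.0, "keep": False}

    call_rate = called / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1 - freq_a)

    # Chi-square approximation to the HWE test (assumption: an exact test
    # would normally be preferred, especially for rare variants).
    expected = (called * freq_a ** 2,
                2 * called * freq_a * (1 - freq_a),
                called * (1 - freq_a) ** 2)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_hwe = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df

    # Exclude when P <= cutoff, mirroring the threshold stated in the text.
    keep = call_rate >= min_call_rate and maf >= min_maf and p_hwe > hwe_p_cutoff
    return {"call_rate": call_rate, "maf": maf, "p_hwe": p_hwe, "keep": keep}


# Hypothetical genotype counts for one variant.
print(variant_qc(n_aa=360, n_ab=480, n_bb=160, n_missing=5))
```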

The genotyped data were phased using Eagle (version 2.3) and imputed using linkage disequilibrium (LD) information for 3,256 Japanese WGS from the BBJ and 2,504 individuals from the 1KGP (phase3v5)32 using Minimac4 software. We excluded variants with low imputation quality (Rsq < 0.7) and low minor allele frequency (minor allele frequency < 0.005).

GWAS and GWAS meta-analyses

To perform GWAS for serum liver enzyme levels, we standardized serum ALT, AST, and GGT values using the following procedure: (i) individual values were transformed to common logarithms; (ii) the distributions of log-transformed values were evaluated in females and males separately, and samples more than 4 SD above or below the mean were removed; (iii) linear regression analysis was performed, adjusting for age, sex, principal components 1 to 10, drinking status (OBi: none, 0; sometimes, 1; almost every day, 2; BBJ: no, 0; yes, 1), and the status of each of the 42 diseases (unaffected, 0; affected, 1; BBJ only); and (iv) the residuals from the regression analyses were z-standardized and used in the GWAS as quantitative values (Supplementary Fig. 6). We conducted GWAS for serum liver enzyme levels using a linear mixed model in BOLT-LMM (v.2.3.6)33, which effectively accounts for population stratification and cryptic relatedness by modeling the candidate SNP as a fixed effect and modeling the environmental effect and the polygenic effect of all SNPs other than the tested SNP as random effects.
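The four-step phenotype standardization can be sketched as follows. This is an illustrative outline using only the Python standard library, not the pipeline used in the study: the function names and simulated data are hypothetical, the 4 SD limit is taken relative to the sex-specific mean, and the regression is a plain ordinary least squares fit solved through the normal equations.

```python
import random
from math import log10
from statistics import mean, stdev


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_residuals(y, covariates):
    """Residuals of y regressed on an intercept plus the covariate columns."""
    design = [[1.0] + list(row) for row in covariates]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)  # fails if covariates are collinear
    return [yi - sum(b * xi for b, xi in zip(beta, r)) for r, yi in zip(design, y)]


def normalize_phenotype(values, sexes, covariates, sd_limit=4.0):
    """Return (kept indices, z-scores) following steps (i) to (iv)."""
    logged = [log10(v) for v in values]                      # (i)
    keep = []
    for sex in sorted(set(sexes)):                           # (ii) per sex
        idx = [i for i, s in enumerate(sexes) if s == sex]
        mu = mean(logged[i] for i in idx)
        sd = stdev(logged[i] for i in idx)
        keep += [i for i in idx if abs(logged[i] - mu) <= sd_limit * sd]
    keep.sort()
    resid = ols_residuals([logged[i] for i in keep],         # (iii)
                          [covariates[i] for i in keep])
    mu, sd = mean(resid), stdev(resid)
    return keep, [(r - mu) / sd for r in resid]              # (iv)


# Simulated example: 200 individuals, covariates are age and sex only.
random.seed(0)
sexes = [i % 2 for i in range(200)]
ages = [30 + (i * 7) % 40 for i in range(200)]
values = [10 ** (1.2 + 0.1 * s + 0.002 * a + random.gauss(0, 0.1))
          for s, a in zip(sexes, ages)]
values[0] = 1.0e6  # an implausible measurement that step (ii) should drop
covs = [[a, s] for a, s in zip(ages, sexes)]
kept, z = normalize_phenotype(values, sexes, covs)
print(len(kept), round(mean(z), 6), round(stdev(z), 6))
```

In a real analysis the covariate rows would also carry the principal components, drinking status, and disease indicators listed above.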

We performed meta-analyses using the inverse-variance weighted method with a fixed-effects model and assessed heterogeneity using Cochran’s Q-test with METAL software (v.2020-05-05)34. The genomic inflation factors (λGC) of the GWAS meta-analyses in the Ryukyu population for ALT, AST, and GGT were 1.03, 1.04, and 1.06, respectively (Fig. 1), and the LDSC intercepts were 1.0109, 1.0097, and 1.0398, respectively (Supplementary Table 20). These results indicate that there was no significant inflation attributable to population stratification or other confounding factors. We defined independent genome-wide significant loci as genomic positions within ± 500 kb from the lead variant. We searched the GWAS catalog for the genome-wide significant loci of this study, and if no variant within ± 1 Mb of a lead variant had been reported as associated with the corresponding serum liver enzyme level, we defined the locus as a novel locus associated with that liver enzyme trait. We annotated the lead variants with rsIDs from the dbSNP database and with their closest genes using ANNOVAR software35.
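The inverse-variance weighted fixed-effects estimate and Cochran's Q statistic can be written out in a few lines. The sketch below is a generic illustration rather than the METAL implementation; the function names are hypothetical, and the standard errors in the example are invented for demonstration (only the two effect sizes are taken from the Ryukyu cohorts in Table 3).

```python
from math import erfc, exp, gamma, sqrt


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0 or df < 1:
        return 1.0
    # Closed forms for df = 1 and df = 2, then the incomplete-gamma recurrence
    # Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / gamma(a + 1) with t = x / 2.
    p, a = (erfc(sqrt(x / 2)), 0.5) if df % 2 else (exp(-x / 2), 1.0)
    while 2 * a < df:
        p += (x / 2) ** a * exp(-x / 2) / gamma(a + 1)
        a += 1
    return p


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis with Cochran's Q."""
    weights = [1 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = sqrt(1 / sum(weights))
    p = erfc(abs(beta / se) / sqrt(2))  # two-sided normal P-value
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    return {"beta": beta, "se": se, "p": p, "q": q,
            "p_het": chi2_sf(q, len(betas) - 1)}


# Effect sizes for OBi Ryukyu and BBJ Ryukyu from the text; the standard
# errors are hypothetical placeholders, not values from the study.
result = fixed_effects_meta(betas=[-0.133, -0.128], ses=[0.030, 0.045])
print({k: f"{v:.3g}" for k, v in result.items()})
```

With two cohorts, Q is compared against a chi-square distribution with one degree of freedom.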

LD score regression

We conducted LD Score Regression v1.0.136 using GWAS meta-analysis data for ALT, AST and GGT in the Ryukyu population. For the regression, we used the East Asian LD scores and summary statistics of high-quality common SNPs present in the HapMap 3 reference panel for each trait.

Gene-based and gene-set association analyses

Gene-based and gene-set association analyses were conducted using MAGMA v1.08, as implemented in FUMA v1.6.137. We used the summary statistics from the meta-analyses in this study, which included 8,413,895 variants for ALT, 8,412,728 for AST, and 8,395,794 for GGT. The variant coordinates were converted from GRCh38 (hg38) to GRCh37 (hg19) using the lift-over method (LiftOver; https://genome.sph.umich.edu/wiki/LiftOver). Individual variants were mapped to 18,980 protein-coding genes in the hg19 build, and gene-based P-values were calculated. Information on LD from the 1KGP phase 3 EAS was used, and the gene windows were set to extend from 2 kb upstream to 1 kb downstream of the genes. The statistical significance threshold for the gene-based analysis was set at a Bonferroni-corrected P < 2.63 × 10−6 (0.05/18,980). In the gene-set analysis, 17,009 gene sets (curated gene sets and Gene Ontology terms) from MSigDB v7.0 were assessed, and the significance threshold was set at P < 2.94 × 10−6 (0.05/17,009). We conducted tissue expression enrichment analyses using the MAGMA tool and RNA sequencing data from 54 different tissue types sourced from the GTEx v8.
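The variant-to-gene mapping and the Bonferroni thresholds can be illustrated with a short sketch. It is not MAGMA's annotation code: the gene names and coordinates are hypothetical, and the window is assumed to be strand-aware, so that "upstream" lies before the start of a plus-strand gene and after the end of a minus-strand gene.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int   # leftmost coordinate, regardless of strand
    end: int     # rightmost coordinate, regardless of strand
    strand: str  # "+" or "-"


def gene_window(gene: Gene, upstream: int = 2_000, downstream: int = 1_000):
    """Return the (left, right) window around a gene, respecting its strand."""
    if gene.strand == "+":
        return gene.start - upstream, gene.end + downstream
    return gene.start - downstream, gene.end + upstream


def map_variants(variants, genes):
    """Map (variant_id, chrom, pos) tuples to every gene whose window contains them."""
    mapping = {}
    for gene in genes:
        left, right = gene_window(gene)
        hits = [vid for vid, chrom, pos in variants
                if chrom == gene.chrom and left <= pos <= right]
        if hits:
            mapping[gene.name] = hits
    return mapping


def bonferroni(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical genes and variants, for illustration only.
genes = [Gene("GENE_A", "22", 10_000, 20_000, "+"),
         Gene("GENE_B", "22", 30_000, 40_000, "-")]
variants = [("v1", "22", 8_500), ("v2", "22", 21_500), ("v3", "22", 41_500),
            ("v4", "22", 28_500), ("v5", "5", 10_500)]
print(map_variants(variants, genes))
print(f"gene-based threshold: {bonferroni(0.05, 18_980):.2e}")
print(f"gene-set threshold:   {bonferroni(0.05, 17_009):.2e}")
```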

In silico functional annotation

The functional category was annotated using Combined Annotation Dependent Depletion (CADD) v1.638 to evaluate variant pathogenicity.

Statistical power calculation

Power calculations for the association of rs117595134 within the novel locus were performed for a range of allele frequencies and effect sizes, based on an additive genetic model, using the Quanto program (version 1.2.3)39.
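For intuition, the power to detect an additive effect on a standardized quantitative trait can be approximated with a normal approximation. The sketch below is not the Quanto algorithm and is not expected to reproduce the values in Supplementary Table 19; the function name, sample sizes, and effect size are hypothetical, and the trait is assumed to have unit variance.

```python
from statistics import NormalDist


def gwas_power(n: int, maf: float, beta: float, alpha: float = 0.05) -> float:
    """Approximate two-sided power for an additive effect on a unit-variance trait."""
    # Fraction of trait variance explained under Hardy-Weinberg proportions.
    r2 = 2 * maf * (1 - maf) * beta ** 2
    ncp = n * r2 / (1 - r2)  # non-centrality parameter of the 1-df test
    shift = ncp ** 0.5
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic falls in either rejection region.
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


# Hypothetical sample sizes and effect size; 0.028 is the A-allele
# frequency reported for BBJ Hondo in the text.
for n in (2_000, 20_000, 100_000):
    print(n, round(gwas_power(n, maf=0.028, beta=-0.05), 3))
```

The calculation makes explicit why a low allele frequency combined with a small effect size yields low power unless the sample size is large.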

Software

Manhattan and quantile–quantile plots were drawn using R software version 4.0.5 [https://CRAN.R-project.org/], and regional association plots were generated using LocusZoom40.