Introduction

Among the small millets, finger millet (Eleusine coracana L. Gaertn.) stands out for its nutritional value. Rich in calcium, dietary fiber, phytochemicals and essential amino acids1, this allotetraploid crop (2n = 4x = 36, AABB) is mostly grown under rainfed conditions. Although the crop originated in Africa, India is its leading producer, with 1.22 million hectares under cultivation, producing 1.7 million tons at a productivity of 1724 kg per hectare and contributing 60% of global production2. With the increasing impact of climate change, climate-resilient crops such as finger millet have become indispensable3. The success of finger millet cultivation hinges on seed quality and longevity, necessitating precise methods to assess the physiological traits that determine seed quality for propagation and subsequent growth stages4.

Studies on seed longevity traits are important in finger millet, as the crop is primarily grown in rainfed systems. Early seedling vigor and longevity enhance seedling establishment, maintain moisture during grain filling, and ensure long-term viability. Seed longevity, being a quantitative trait, varies both within and among species5. During seed storage, the inevitable aging process reduces the quality of stored seeds, leading to losses of about 25%6,7,8.

Seed longevity and aging directly impact agricultural sustainability, particularly when farmers store seeds for extended periods. Prolonged storage under various traditional conditions can significantly affect seed viability. This reduction in viability leads to decreased germination rates and consequently, lower crop yields. Moreover, with the ongoing challenges of global warming, the decline in seed longevity is expected to become a critical concern for farmers who rely on farm-saved seeds for future planting seasons rather than purchasing new seeds at a higher cost9.

Understanding the genetic mechanisms underlying seed longevity and seedling vigor-related traits is essential for developing improved cultivars suited to diverse environmental conditions. The accelerated aging (AA) test stands out as an efficient method for evaluating seed vigor. The AA test provides valuable insights into the storage potential of seed lots and their emergence in the field within a short span10. Studies have shown that results from AA tests predict field emergence under stressful soil conditions more accurately than standard germination tests11. The process of seed aging typically involves initial damage to cellular membranes, followed by compromised cellular repair and biosynthetic processes, resulting in a reduced rate of germination, slower seedling growth, heightened susceptibility to environmental stresses, diminished field emergence and ultimately loss of viability6. The AA vigor test finds wide application across different crops, including soybean (Glycine max L. Merrill)12, wheat (Triticum aestivum L. Thell)13 and corn (Zea mays L.)14. Studies on artificial aging, which mimics the aging process under storage conditions, have revealed the intricate genetic and environmental factors influencing seed longevity in wheat15 and barley16.

Advancements in genomics technologies have accelerated next-generation breeding efforts in crops17. Genomic technologies such as genotyping-by-sequencing (GBS) identified 2,977 SNPs across genetically diverse finger millet accessions and facilitated the mapping of SNPs associated with seed protein content18. A recent chromosomal-level genome assembly of finger millet, with a genome size of 1.1 Gb, has significantly enhanced the accuracy of genome mapping19 and created an opportunity for a comprehensive understanding of the genetic basis underlying key traits. Genome-wide association studies (GWAS) reveal the genetic architecture of traits by analyzing historical recombination events in genetically diverse populations, allowing accurate identification of genomic regions linked to specific traits20. The GWAS approach has been extensively used in millets and cereals such as sorghum21, foxtail millet22, pearl millet23,24 and maize25,26.

Understanding of the genetic basis of traits related to seed germination and seedling growth is limited in finger millet27, but QTLs and genes have been identified in several crops using various approaches. In tobacco, overexpression of the heat shock factor A9 enhanced seed longevity28. In wheat, two stable QTLs, QlgGR.cas-1A and QlgGR.cas-2B.2, explaining 6.7–11.4% of the phenotypic variation, were identified and associated with seed longevity. The QTL-linked FAR1-related sequence 6-like protein, along with synthase 3, dolichyl-diphosphooligosaccharide-protein glycosyltransferase, glutaminyl-peptide cyclotransferase, and alpha-Amy2/53, were found to be associated with seed longevity29. Eight QTLs for seed vigor were identified in rice from a recombinant inbred line population derived from a cross between ZS97 and MH63. Among these, five QTLs (qSV-1, qSV-5b, qSV-6a, qSV-6b and qSV-11) influenced seedling establishment, while three QTLs (qSV-5a, qSV-5c and qSV-8) specifically influenced germination30.

Considering the importance of understanding the genetics of seed longevity traits, our study aims to develop comprehensive phenotypic data for seed vigor and longevity traits using a genetically diverse panel, and to identify the SNPs associated with these traits and the genes responsible for seed vigor and longevity in finger millet. The SNPs and genes associated with the seed traits could contribute to the productivity gain and sustainability of this ancient and nutri-rich grain in rainfed agriculture.

Results

Statistics of seed phenotypic traits

The descriptive statistics for each trait assessed, under both control and accelerated aging conditions, revealed significant insights into seed germination and seedling growth (Supplementary Table S1). The frequency distribution curve exhibited a pattern closely resembling a normal distribution for germination rate index after accelerated aging (GRIAA), relative germination rate index after accelerated aging (GRIAAR), mean germination time (days) after accelerated aging (MGTAA), relative mean germination time after accelerated aging (MGTAAR), Shoot length (cm) after accelerated aging (SLAA), relative shoot length after accelerated aging (SLAAR) and relative seedling dry weight after accelerated aging (SDWAAR) indicating their polygenic nature (Fig. 1). AA had a detrimental effect on seed performance, leading to decreased seed germination percentage, germination rates, seedling growth and vigor across all the traits over the respective control (Fig. 2). These observations indicated that seeds subjected to AA experienced impaired metabolic activity, potentially due to oxidative stress and loss of membrane integrity during the aging process. The reduced seedling growth and lower vigor indices observed under AA further reflect the cumulative impact of aging on seed health, affecting both physiological and morphological aspects.

Fig. 1
Frequency distribution of seed traits of the finger millet GWAS panel. The X-axis represents the trait values and the Y-axis represents frequency. The traits are organized as follows: (a) GC, GAA, GAAR; (b) GIC, GIAA, GIAAR; (c) GRIC, GRIAA, GRIAAR; (d) MGTC, MGTAA, MGTAAR; (e) RLC, RLAA, RLAAR; (f) SLC, SLAA, SLAAR; (g) SVI1C, SVI1AA, SVI1AAR; (h) SDWC, SDWAA, SDWAAR; (i) SVI2C, SVI2AA, SVI2AAR.

Fig. 2
Boxplot comparison of seed traits. The X-axis represents the trait names and the Y-axis represents their corresponding values. (a) Control vs. accelerated aging; (b) relative values of traits after accelerated aging.

Initial germination of seeds (GC) ranged from 85.33 to 97.53%, with a mean of 93.57%. In comparison, germination after accelerated aging (GAA) showed a broader range of 23.26–95.52%, averaging 80.50%. Relative germination after accelerated aging (GAAR) varied significantly from 25.81 to 103.69%, highlighting the impact of accelerated aging on seed viability at different tolerance levels of genotypes. Germination index (GIC) ranged between 619.11 and 777.26, with an average of 721.01. Germination index after accelerated aging (GIAA) dropped considerably, ranging from 54.08 to 582.01, with a mean of 397.80. This represents nearly a two-fold reduction in germination potential, as reflected in relative germination index after accelerated aging (GIAAR), which averaged 55.13%, with some genotypes experiencing a significant decline (Supplementary Table S1).
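
The relative traits (GAAR, GIAAR and so on) can be read as the aged value expressed as a percentage of the control value. The following is a minimal Python sketch under that assumption; the function name is illustrative, and the definition is inferred from the reported means rather than quoted from the study's methods.

```python
def relative_value(aged: float, control: float) -> float:
    """Aged-seed value as a percentage of the control value (assumed definition)."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * aged / control


# Panel means for the germination index reported in the text (GIAA and GIC).
print(round(relative_value(397.80, 721.01), 2))  # 55.17
```

The ratio of the two panel means (55.17%) is close to, but not identical with, the reported mean GIAAR of 55.13%, as expected if the relative value is computed per genotype and then averaged.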

Mean germination time (MGTC) averaged 1.29 days, with a range of 1.05 to 1.86 days. After the aging treatment, mean germination time (MGTAA) increased significantly to an average of 4.12 days, with a range of 2.38 to 7.30 days. MGTAAR ranged from 229.29 to 472.63%, indicating that a longer period was needed for germination after the aging process. Root length (RLC) averaged 5.93 cm, with values ranging from 3.99 to 8.09 cm. After aging, root length (RLAA) decreased to an average of 4.37 cm, with a range of 1.04 to 7.35 cm. Relative root length after accelerated aging (RLAAR) varied broadly, from 18.94 to 130.46%, indicating a substantial decrease in radicle growth post-aging, with some genotypes showing a severe reduction. Shoot length (SLC) averaged 2.83 cm, with a range of 2.03 to 3.47 cm. Post-aging, SLAA dropped to an average of 1.99 cm, with values ranging from 0.44 to 2.87 cm, indicating a substantial reduction in shoot length due to aging.
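
To make the mean germination time values concrete, the sketch below uses the commonly used definition MGT = Σ(n·t) / Σn, where n is the number of seeds newly germinated on day t. This definition is an assumption here (the text does not state the formula), and the daily counts are hypothetical, chosen only to resemble a fresh and an aged seed lot.

```python
def mean_germination_time(daily_counts: list) -> float:
    """MGT = sum(n_t * t) / sum(n_t); daily_counts[0] is day 1 (assumed definition)."""
    total = sum(daily_counts)
    if total == 0:
        raise ValueError("no seeds germinated")
    weighted_days = sum(day * n for day, n in enumerate(daily_counts, start=1))
    return weighted_days / total


fresh_lot = [80, 15, 5]          # hypothetical: most seeds germinate on day 1
aged_lot = [0, 10, 30, 40, 20]   # hypothetical: germination is delayed and spread out

print(mean_germination_time(fresh_lot))  # 1.25
print(mean_germination_time(aged_lot))   # 3.7
```

A higher value therefore means slower germination, which is why MGTAAR values above 100% indicate a loss of vigor.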

Seedling vigor index-1 (SVI1C) averaged 818.68, with a range of 597.58 to 1066.41. Following aging, seedling vigor index-1 (SVI1AA) decreased to an average of 516.17, ranging from 56.06 to 913.01. Relative seedling vigor index-1 after accelerated aging (SVI1AAR) ranged from 7.76 to 115.42%, with an average of 64.32%, indicating a considerable decline in seedling vigor. Seedling dry weight (SDWC) averaged 2.07 mg, ranging from 1.40 to 3.08 mg. Seedling dry weight after accelerated aging (SDWAA) slightly increased to an average of 2.16 mg, with a range of 0.58 to 3.20 mg. Relative seedling dry weight after accelerated aging (SDWAAR) varied from 34.64 to 141.75%, with some genotypes showing a substantial increase in dry weight despite the aging treatment. Seedling vigor index-2 (SVI2C) averaged 193.68, with values ranging from 127.33 to 281.24. After aging, seedling vigor index-2 (SVI2AA) decreased to an average of 174.52, with a range of 24.97 to 256.89.
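
The two vigor indices can be illustrated with the widely used definitions SVI-1 = germination (%) × seedling length (root + shoot, cm) and SVI-2 = germination (%) × seedling dry weight (mg). These definitions are assumed here rather than taken from the text, although the reported control means are consistent with them. Function names are illustrative.

```python
def vigor_index_1(germination_pct: float, root_cm: float, shoot_cm: float) -> float:
    """SVI-1: germination percentage times total seedling length (assumed definition)."""
    return germination_pct * (root_cm + shoot_cm)


def vigor_index_2(germination_pct: float, dry_weight_mg: float) -> float:
    """SVI-2: germination percentage times seedling dry weight (assumed definition)."""
    return germination_pct * dry_weight_mg


# Control-condition panel means reported in the text (GC, RLC, SLC, SDWC).
print(round(vigor_index_1(93.57, 5.93, 2.83), 2))
print(round(vigor_index_2(93.57, 2.07), 2))
```

Plugging in the panel means gives roughly 819.7 and 193.7, close to the reported SVI1C (818.68) and SVI2C (193.68); small differences are expected because the reported figures average per-genotype indices.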

These results highlight significant variability in seed germination indices and seedling vigor traits under both control and AA conditions. The germination parameters GC, GAA and GAAR differed substantially, reflecting the impact of AA on seed viability, and can help predict the seed longevity potential of genotypes. The germination indices GIC, GIAA, GRIC and GRIAA and their relative values (GIAAR and GRIAAR) reflect the speed of germination. Higher MGTAAR values indicate delayed germination and therefore poorer vigor of the aged seeds of a genotype. Similarly, seedling growth metrics, including root and shoot lengths (RLC and SLC) and their relative measures post-aging (RLAAR and SLAAR), measure how well growth tolerates the aging treatment.

The variance components showed significant differences for all the traits under control and AA conditions (Table 1). GC had a heritability of 61.22%, while GAA and GAAR showed heritability of 90.03% and 86.80%, respectively. Traits related to germination indices viz., GIC, GIAA, and GIAAR had a heritability of 78.33%, 93.06%, and 91.73%, respectively. Mean germination time (MGTC, MGTAA, MGTAAR), root and shoot lengths (RLC, SLC) and seedling vigor indices (SVI1C, SVI2C) had heritability ranging from 80.63 to 95.08%. These results explained the significant genetic basis of seed germination and seedling growth traits, indicating the potential for genetic enhancement of seed vigor and longevity.
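
For readers unfamiliar with how heritability follows from variance components, the sketch below uses the common entry-mean estimator H² = σ²g / (σ²g + σ²e / r), expressed as a percentage. The estimator, the function name and the example variance values are all assumptions for illustration; the text does not state which formula was used.

```python
def broad_sense_heritability(var_genotypic: float, var_error: float, replications: int) -> float:
    """Entry-mean broad-sense heritability in percent (assumed estimator)."""
    if replications < 1:
        raise ValueError("replications must be at least 1")
    var_phenotypic = var_genotypic + var_error / replications
    if var_phenotypic <= 0:
        raise ValueError("phenotypic variance must be positive")
    return 100.0 * var_genotypic / var_phenotypic


# Hypothetical variance components: genotypic 9.0, error 3.0, three replications.
print(broad_sense_heritability(9.0, 3.0, 3))  # 90.0
```

Under this estimator, a trait such as GAA (90.03%) would have genotypic variance about nine times the error variance of a genotype mean, whereas GC (61.22%) has proportionally more non-genetic variation.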

Table 1 Analysis of variance (ANOVA) of 221 finger millet genotypes for 27 seed phenotypic traits.

GIAA showed significant positive correlations with GAA (0.86), GIAAR (0.97), and GRIAA (0.98) (Fig. 3). On the other hand, MGTAA exhibited negative correlations with traits such as SVI1AA (-0.60) and SVI2AA (-0.28). The significant negative correlation between GAA and MGTAA (-0.55) indicates that seeds with higher germination after aging stress tend to germinate earlier. Similarly, the significant negative correlations of GIAA (-0.87) and GRIAA (-0.86) with MGTAA suggest that early germination is associated with better overall seed performance. These relationships provide valuable insights for breeding strategies focused on improving seed viability, early germination and robust seedling growth under aging conditions. The correlations among multiple traits highlight potential genetic linkages and pathways influencing seed development and viability, which could be critical for breeding programs to enhance finger millet resilience and productivity.

Fig. 3
Correlation heatmap of seed traits in finger millet indicating trait relationships.

Statistics of SNP calling

A total of 563,270 SNPs were identified using the GBS approach from a panel of 221 genotypes. After eliminating SNPs with over 10% missing values, 104,907 SNPs were retained. These were then filtered by removing SNPs with more than 10% heterozygosity or a minor allele frequency (MAF) below 3%, leaving 11,832 high-quality SNPs for the genome-wide association study (Fig. 4).
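
The three filtering criteria can be sketched per SNP as follows. The thresholds (10% missing, 10% heterozygosity, 3% MAF) come from the text; the genotype coding (0, 1, 2 alternate-allele dosage with `None` for missing calls), the way each rate is computed, and the function name are assumptions for illustration and do not reproduce the authors' actual pipeline.

```python
from typing import Optional, Sequence

Call = Optional[int]  # assumed coding: 0 = hom ref, 1 = het, 2 = hom alt, None = missing


def passes_filters(calls: Sequence[Call],
                   max_missing: float = 0.10,
                   max_het: float = 0.10,
                   min_maf: float = 0.03) -> bool:
    """Return True if one SNP passes the missing-data, heterozygosity and MAF filters."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return False
    # Step 1: fraction of genotypes with no call at this SNP.
    if (len(calls) - len(observed)) / len(calls) > max_missing:
        return False
    # Step 2: fraction of heterozygous calls among the called genotypes.
    if observed.count(1) / len(observed) > max_het:
        return False
    # Step 3: minor allele frequency from the alternate-allele dosage.
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1 - alt_freq) >= min_maf


good_snp = [0] * 90 + [2] * 10      # hypothetical: MAF 0.10, no missing, no hets
sparse_snp = [0] * 80 + [None] * 20  # hypothetical: 20% missing calls
print(passes_filters(good_snp), passes_filters(sparse_snp))  # True False
```

Applying such a function to every SNP column of a genotype matrix mirrors the order described above: missing data first, then heterozygosity and MAF.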

Fig. 4
Flow diagram describing the in-silico pipeline used for SNP filtering.

The SNP density across the chromosomes was analysed and is presented as a heatmap (Fig. 5; Supplementary Table S2). Chromosome 5B had the highest number of SNPs (1101), followed by chromosome 5A (1054) and chromosome 6A (1014). Chromosome 3B had the fewest SNPs (164), while scaffold regions of the genome contained 50 SNPs.

Fig. 5
SNP density across the chromosomes of finger millet illustrating 11,832 SNPs in 1 Mb window size. The X-axis represents chromosome positions (Mb), and the Y-axis shows chromosome number, with red indicating the highest density and green the lowest.

Linkage disequilibrium (LD)

LD was measured across 11,832 SNP markers among 221 finger millet accessions, revealing that the LD block size at LD50 decay was 3.39 Mb (Fig. 6). SNPs within this distance are considered to belong to the same inheritance block.
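
As background to the LD estimate, pairwise LD between two SNPs is often summarized as r², which for unphased genotype data can be approximated by the squared Pearson correlation of allele dosages. The sketch below shows that calculation; it is an illustrative assumption about the statistic, not the software used in the study, and the genotype vectors are hypothetical.

```python
from typing import Sequence


def ld_r2(snp_a: Sequence[float], snp_b: Sequence[float]) -> float:
    """Squared Pearson correlation between two SNP dosage vectors (0/1/2 coding)."""
    if len(snp_a) != len(snp_b) or len(snp_a) < 2:
        raise ValueError("need two equal-length vectors with at least two genotypes")
    n = len(snp_a)
    mean_a = sum(snp_a) / n
    mean_b = sum(snp_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(snp_a, snp_b))
    var_a = sum((a - mean_a) ** 2 for a in snp_a)
    var_b = sum((b - mean_b) ** 2 for b in snp_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("monomorphic SNP: r2 is undefined")
    return cov * cov / (var_a * var_b)


print(ld_r2([0, 0, 2, 2], [0, 0, 2, 2]))  # 1.0, alleles always inherited together
print(ld_r2([0, 0, 2, 2], [0, 2, 0, 2]))  # 0.0, alleles independent
```

An LD decay curve plots such r² values against the physical distance between SNP pairs; the LD50 distance reported here is the point at which the fitted curve falls to half of its maximum.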

Fig. 6
The linkage disequilibrium (LD) decay plot for 221 diverse finger millet genotypes in the GWAS panel, based on 11,832 SNPs. The X-axis represents the genomic distance in Mb, while the Y-axis displays the LD values. The red line marks the threshold where LD decayed to 50% of its maximum value.

Principal Component Analysis (PCA)

PCA revealed a relatively uniform distribution of genotypes across regions, with some overlap between populations. African and Asian genotypes grouped separately, suggesting the presence of subpopulation structure (Fig. 7a). The PCA based on races showed a uniform distribution and a wider spread across the first three principal components (Fig. 7b).

Fig. 7
Principal components for 221 genotypes illustrating population structure: (a) diverse geographical origins, (b) different races.

Marker trait associations (MTAs)

The GAPIT model selection feature, which integrates both the kinship matrix and PCA, was employed to identify the associations with the traits in a panel of 221 genotypes using 11,832 high-quality SNPs.

A total of 491 MTAs were discovered with a significance -log10(p) value greater than 3. Among the traits, SVI2AA had the highest number of significant MTAs (32) and RLAAR had the fewest (4). The SNP associations for each trait under different treatments, along with their p-values, are presented in Supplementary Table S3. MTAs were further filtered using a Bonferroni threshold of -log10(p) greater than 5.37, which narrowed the set down to 54 SNPs across 13 chromosomes. Table 2 provides detailed information on those Bonferroni-corrected MTAs. Furthermore, Manhattan plots displaying the MTAs and Q-Q plots showing the observed versus expected associations of various seed traits, adjusted for population structure, are presented in Figs. 8, 9 and 10.
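
The Bonferroni threshold can be reproduced from the number of SNPs tested. The sketch below assumes a family-wise significance level of 0.05, which is not stated in the text but yields the reported cutoff of 5.37 for 11,832 SNPs; the function name is illustrative.

```python
import math


def bonferroni_neg_log10(n_tests: int, alpha: float = 0.05) -> float:
    """-log10 of the Bonferroni-corrected p-value cutoff, alpha / n_tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return -math.log10(alpha / n_tests)


print(round(bonferroni_neg_log10(11832), 2))  # 5.37
```

In other words, an association passes this filter only if its p-value is below roughly 0.05 / 11,832 ≈ 4.2 × 10⁻⁶, compared with 10⁻³ for the exploratory threshold of -log10(p) > 3.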

Table 2 Significant marker trait associations at Bonferroni corrected p value for the traits under study.

FM_SNP_10872 was identified as the most significant marker associated with the trait SDWC. The pleiotropic quantitative trait locus (QTL) FM_SNP_9478 on chromosome 7B was associated with multiple traits, including GAA, GAAR, GIAA and GIAAR. GAAR, SDWAAR and SVI2AAR shared a common association with FM_SNP_235 on chromosome 2A. Another pleiotropic locus, FM_SNP_9403, explained 41.94% of phenotypic variance (PV) for SVI2C, 19.79% for SDWC and 17.99% for SDWAA. Comprehensive statistics for the significant SNPs identified are provided in Table 2.

Seed germination percentage

Nine significant markers were identified for QTLs associated with seed germination percentage, with three linked to GAA and six to GAAR. FM_SNP_11315 on chromosome 6B was detected for both germination of treated seeds (GAA) and relative germination after accelerated aging (GAAR). For GAA, three SNPs on chromosomes 5B, 6B and 7B were identified, with FM_SNP_9478 on 7B being highly significant and explaining 31.52% of PV. For GAAR, six SNPs on chromosomes 1A, 1B, 2A, 5B, 6B and 7B were identified, with FM_SNP_235 on 2A and FM_SNP_9478 on 7B explaining 28.36% and 18.63% of PV, respectively.

Fig. 8
Manhattan plot of genome-wide SNPs of (a) GC, GAA, GAAR (b) GIC, GIAA, GIAAR (c) GRIC, GRIAA, GRIAAR. The X-axis shows genomic positions, and the Y-axis shows − log10(p-values). The grey horizontal line indicates the Bonferroni correction threshold. The corresponding Q–Q plot displays observed versus expected − log10(p-values).

Germination indices

The loci FM_SNP_9478 and FM_SNP_9582 on chromosome 7B were detected for both the germination index of treated seeds and the relative germination index after AA treatment. The locus FM_SNP_3796 on chromosome 8A showed the strongest association in treated seeds, explaining 31.12% of PV. For GRIC, only one SNP, FM_SNP_3303 on chromosome 6A, was linked, explaining 46.07% of PV. No significant associations were detected for GRIAA and GRIAAR.

Mean germination time

Significant SNPs associated with mean germination time were identified for control and relative mean germination time under AA treatment. For MGTC, FM_SNP_3303 and FM_SNP_10858 located on chromosomes 6A and 6B explained a PV of 31.86% and 30.14%, respectively. Similarly, FM_SNP_8722 on chromosome 9B showed a significant association and explained 12.03% of PV. For MGTAAR, FM_SNP_9852 on chromosome 5B explained 32.17% of PV.

Fig. 9
Manhattan plot of genome-wide SNPs of (a) MGTC, MGTAA, MGTAAR (b) RLC, RLAA, RLAAR (c) SDWC, SDWAA, SDWAAR. The X-axis shows genomic positions, and the Y-axis shows − log10(p-values). The grey horizontal line indicates the Bonferroni correction threshold. The corresponding Q–Q plot displays observed versus expected − log10(p-values).

Root and shoot length

FM_SNP_8168 on chromosome 7B was associated with RLC, explaining 27.08% of PV, while FM_SNP_3851 on chromosome 8A showed a stronger association, accounting for 39.58% of PV. No significant SNP associations were found for RLAA and RLAAR. FM_SNP_3696 on chromosome 3A was identified for SLC, accounting for 32.48% of PV. No significant SNPs were observed for SLAA and SLAAR.

Seedling dry weight

Among all the traits, seedling dry weight had the highest number of MTAs, with eight SNPs identified for SDWC, three for SDWAA and four for SDWAAR. The locus FM_SNP_9403 on chromosome 1B was consistently detected in control and treated seeds after the Bonferroni correction. For SDWC, FM_SNP_9403 on chromosome 1B and FM_SNP_10872 on chromosome 6B showed strong statistical significance, explaining 19.79% and 21.56% of PV, respectively. Similarly, SDWAA showed significant associations with FM_SNP_4948 on chromosome 3A and FM_SNP_483 on chromosome 5A, which explained 37.48% and 30.75% of PV, respectively. For SDWAAR, FM_SNP_1517 on chromosome 1A was identified as a significant SNP explaining 70.19% of PV, highlighting its substantial role in the seedling response under AA relative to the control.

Fig. 10
Manhattan plot of genome-wide SNPs of (a) SLC, SLAA, SLAAR (b) SVI1C, SVI1AA, SVI1AAR (c) SVI2C, SVI2AA, SVI2AAR. The X-axis shows genomic positions, and the Y-axis shows − log10(p-values). The grey horizontal line indicates the Bonferroni correction threshold. The corresponding Q–Q plot displays observed versus expected − log10(p-values).

Seedling vigor indices

The GWAS model identified three significant SNPs associated with SVI1C but none for SVI1AA and SVI1AAR. FM_SNP_6503, located on chromosome 1A, explained 34.31% of PV, while FM_SNP_3851 on chromosome 8A explained 25.99% of PV. Significant SNPs for the seedling vigor index-2 traits (SVI2C, SVI2AA and SVI2AAR) were primarily identified under control conditions. For SVI2C, FM_SNP_9403 on chromosome 1B explained 41.94% of PV, whereas FM_SNP_8521 on chromosome 9B accounted for 23.58%, highlighting their significant influence on the trait. In contrast, SVI2AAR was linked to FM_SNP_235 on chromosome 2A and FM_SNP_9246 on chromosome 6B, which explained 46.17% and 10.97% of PV, respectively, under rapid aging.

Several pleiotropic loci, namely FM_SNP_9403, FM_SNP_9478, FM_SNP_3851, FM_SNP_9852 and FM_SNP_235, were detected for multiple traits. FM_SNP_9478 on chromosome 7B was significantly associated with GAA, GAAR, GIAA and GIAAR, suggesting a common genetic basis in predicting seed longevity. The correlation study further confirmed that these four traits were significantly and positively associated. FM_SNP_7145 on chromosome 5B and FM_SNP_11315 on chromosome 6B were each linked to both GAA and GAAR, indicating the genetic relationship between these traits. Together, the strong correlations among the seed traits and the presence of common SNPs suggest pleiotropy or closely linked genes influencing multiple phenotypic characteristics in finger millet. These hotspots could serve as valuable markers for improving multiple traits through marker-assisted selection.

The high heritability values for traits viz., GIAA (93.06%), GAAR (86.80%) and GAA (90.03%) indicated a strong genetic control of these seed longevity characteristics. Notably, the SNPs associated with these high heritability traits also explained a significant portion of the phenotypic variation for seed storability potential. SNP FM_SNP_9478 on chromosome 7B accounted for 31.52% of the variation in GAA and 19.19% in GIAA. Similarly, FM_SNP_7145 on chromosome 5B explained 22.67% of the variation in GAA and 9.89% in GAAR, while FM_SNP_11315 on chromosome 6B explained 15.15% of the PV in GAA and 7.54% in GAAR. These findings suggested that the traits with high heritability are also those explaining a high degree of phenotypic variance for seed longevity.

In-silico comparative genomics

Significant SNPs associated with seed quality traits were functionally annotated across different monocot species viz., rice (Oryza sativa), maize (Zea mays), foxtail millet (Setaria italica), sorghum (Sorghum bicolor) and switchgrass (Panicum virgatum) to determine their roles in seed development, metabolism and adaptation. Considering the LD span of 3.39 Mb, genes within a 2 Mb region around the significant SNPs in finger millet were annotated. Among the SNPs identified as significantly associated through GWAS, annotations were obtained for 24 SNPs linked to crucial seed traits across various monocot species (Supplementary Table S4). FM_SNP_6503 and FM_SNP_6470 in sorghum were associated with the light-mediated development protein DET1 and probable protein phosphatase 2C3, respectively. FM_SNP_1517 and FM_SNP_9403 were associated with expansin-A2 and piezo-type mechanosensitive ion channels, indicating their roles in root mechanotransduction. Respiratory burst oxidase homolog protein F in sorghum for FM_SNP_51 and auxin transport protein BIG-like in switchgrass for FM_SNP_6154 were identified. DP-dependent glyceraldehyde-3-phosphate dehydrogenase for FM_SNP_3881, beta-amylase for FM_SNP_9582, and auxin transport protein BIG for FM_SNP_6154 were identified as other important genes. This highlights their potential roles in mitigating accelerated aging effects through regulatory mechanisms in seed development.
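
The window-based candidate gene search can be sketched as a simple interval overlap. The code below assumes the 2 Mb region is centred on the SNP (1 Mb on each side), which the text does not specify; the `Gene` record, the gene names and all coordinates are hypothetical.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # base-pair coordinates, start <= end
    end: int


def genes_near_snp(genes: List[Gene], chrom: str, pos: int,
                   flank_bp: int = 1_000_000) -> List[str]:
    """Names of genes on the same chromosome overlapping pos +/- flank_bp."""
    lo, hi = pos - flank_bp, pos + flank_bp
    return [g.name for g in genes
            if g.chrom == chrom and g.end >= lo and g.start <= hi]


# Hypothetical annotation records and SNP position, for illustration only.
annotation = [
    Gene("gene_A", "1A", 500_000, 510_000),
    Gene("gene_B", "1A", 2_600_000, 2_610_000),
    Gene("gene_C", "1B", 900_000, 910_000),
]
print(genes_near_snp(annotation, "1A", 1_200_000))  # ['gene_A']
```

Because the window is narrower than the 3.39 Mb LD span, this is a conservative choice: genes in the window are plausibly in LD with the associated SNP, but causal genes slightly outside it could be missed.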

GO analysis highlighted enrichment in key biological processes such as embryo development ending in seed dormancy and ribosome biogenesis (Fig. 11a). KEGG pathway analysis identified significant pathways such as ribosome, endocytosis, phagosome and proteasome, emphasizing their importance in maintaining seed vigor and longevity, especially under conditions of accelerated aging (Fig. 11b). This comprehensive analysis enhances our understanding of the genetic basis of seed vigor and longevity traits in finger millet and reveals conserved molecular mechanisms shared with related cereal crops.

Fig. 11
(a) GO analysis illustrating finger millet’s molecular landscape: Cellular components, biological processes and molecular functions. The Y-axis shows the fold enrichment and the X-axis shows highly enriched GO categories. (b) Dot plot visualization of KEGG pathway enrichment: Fold enrichment (x-axis) vs. pathway name (y-axis). Dot size represents gene count and color indicates p-value threshold.

Discussion

Finger millet, known for its climate resilience and nutritional value, plays a vital role in food security, especially in regions facing challenges such as drought and poor soil conditions31. Seed vigor and longevity are critical traits that determine the success of crop establishment and yield. These traits are influenced by a complex interplay of genetic, environmental and physiological factors32. High temperatures, humidity and other external conditions can accelerate seed aging, thereby affecting seed vigor and viability33.

The AA test is designed to simulate long-term storage conditions, allowing us to assess the seed longevity. Seeds with better longevity tend to deteriorate less, while those with poor longevity deteriorate more quickly. Studies across various crops such as rice, maize and Arabidopsis have revealed that seed vigor is controlled by multiple genes, with several QTLs identified that govern traits like germination rate, seedling length and dry weight34. Thirteen novel loci for seed longevity traits were identified using SNP whole genome mapping in wheat5. Advances in functional genomics have shed light on specific genes and pathways involving heat shock proteins, lectin-like receptor kinases and oxidative repair mechanisms that enhance seed vigor under stress conditions34. Despite significant progress, understanding the molecular mechanisms of seed vigor, particularly in finger millet, remains a challenge.

The phenotypic data for the seed longevity and seedling vigor-related traits revealed that the majority of these traits followed a normal distribution, suggesting that they are influenced by both genetic and environmental factors. The normal distribution of the traits GIAA, GIAAR, GRIAA, GRIAAR, RLC, RLAA, RLAAR, SLC, SLAA, SLAAR, SVI1C, SVI1AA, SVI1AAR, SVI2C, SVI2AA and SVI2AAR suggests polygenic inheritance, where trait variation arises from the cumulative effects of multiple genes. Each of these genes contributes small, additive effects without displaying dominance, highlighting a complex genetic architecture governing these traits. In contrast, MGTC and GRIC exhibited skewed distributions, indicating a different genetic control, often characterized by the influence of polygenes with a dominance effect35.

The impact of rapid aging on seed germination and seedling growth across multiple traits indicated a consistent decrease in germination percentage, germination rate indices, seedling growth and vigor metrics, highlighting the vulnerability of seeds to stress conditions36,37. These results align with the general understanding that aging induces oxidative stress, leading to cellular damage and reduced metabolic activity in seeds38.

Seed viability at the time of sowing is a crucial factor in determining crop establishment39. Among all the traits, seed germination percentage indicates the life span potential of a genotype after a certain period of aging during storage. The significant variability observed in germination parameters under control and accelerated aging conditions highlights the sensitivity of seed viability to environmental stress40,41,42. Post aging, the sharp decline in GAA and its relative measure GAAR revealed pronounced differences in seed longevity potential among the finger millet accessions, a key factor contributing to seed vigor during germination and emergence. Furthermore, the post-aging reduction in GIAA and GRIAA and their relative values GIAAR and GRIAAR indicated poor seed performance in terms of both speed and uniformity of germination, which are essential for a uniform crop stand and productivity41,43.

The decline in root and shoot lengths after aging emphasized the impaired seedling establishment capacity under stress conditions, limiting nutrient uptake and overall performance44,45,46. The vigor of a seed refers to its ability to emerge quickly and evenly and develop into a healthy seedling. This includes factors such as germination percentage, germination rate, germination index, root length, shoot length, and dry weight during the seedling growth process47,48. The significant decrease in seedling vigor indices (SVI1AA, SVI2AA) after aging reaffirmed the cumulative impact of oxidative stress on metabolic processes critical for seedling growth and development49. Vigorous seeds maintained rapid and homogeneous germination under artificial aging, whereas non-vigorous seeds showed a marked decline40,50.

The genome-wide SNP distribution revealed higher SNP densities in the telomeric regions than in the pericentromeric regions, suggesting that the telomeric regions are potential hotspots for gene diversity, whereas the lower SNP density in the pericentromeric regions is likely due to reduced recombination rates51. Among the chromosomes, 8B showed particularly high SNP density in its telomeric regions. The longer chromosomes, such as 5B, 6A, 6B, and 9B, had proportionately higher SNP counts. Chromosome 5B, the largest chromosome with a length of 80.35 Mb19, had the highest number of SNPs (1101), whereas chromosome 4A, the smallest chromosome with a length of 41.34 Mb19, had fewer SNPs (609).

LD among populations can be influenced by multiple factors, such as population size, non-random mating, genetic drift, selection pressures, mutation rates, pollination type and recombination frequency. Self-pollinating species such as wheat and rice tend to have longer LD blocks (up to 20 Mb)52. However, limited information on LD decay is available in finger millet53. In this study, LD in finger millet decayed over a distance of 3.39 Mb, indicating the conservation of alleles. A similar trend of high LD has been observed in other self-pollinated polyploid species, such as wheat (5 Mb54, 9.04 Mb55) and peanut (3.78 Mb56). The slow LD decay suggests that breeding strategies need to focus on haplotypes or larger genetic regions, potentially requiring only low-density markers for trait mapping and genomic selection approaches for effective trait improvement in finger millet. On the other hand, high-LD regions linked to target traits are helpful for selecting multiple genes together in marker-assisted selection (MAS)57.

The SNP markers and their associated genes provide valuable insights into the complex regulatory networks governing seed longevity. An in-silico comparison of the significant SNPs identified through GWAS with Oryza sativa, Zea mays, Sorghum bicolor, Panicum virgatum and Setaria italica (Fig. 12, Supplementary Table S4) highlights the conserved genetic mechanisms involved in seed aging and stress response across these cereal crops. The orthologous genes found across the monocot species suggest a high degree of evolutionary conservation in the maintenance of seed longevity and viability under stress conditions58,59,60.

Fig. 12 Circos plot visualizing orthologous gene positions in Eleusine coracana (Ec) compared to Oryza sativa (Os), Zea mays (Zm), Sorghum bicolor (Sb), Setaria italica (Si) and Panicum virgatum (Pv).

The annotation of the significant SNPs identified by the GWAS model in finger millet indicated that several of them were associated with functional genes. Key among these is the light-mediated development protein DET1 (FM_SNP_6503), which is involved in seedling development and has been shown to play important roles in seed dormancy61 and shade avoidance. Probable protein phosphatase 2C3 (FM_SNP_6503) is known to regulate seed dormancy by interacting with key signaling pathways involved in the inhibition of germination. It interacts with the ANAC060 transcription factor, forming a regulatory network between seed dormancy and germination that ensures optimal timing of seedling establishment62. The expansin-A2 gene (FM_SNP_1517) is vital for cell wall modification, enhancing seed coat integrity and water uptake during germination63. This is critical for combating seed aging, as maintaining seed coat structure and permeability preserves seed viability and vigor over extended storage periods. Piezo-type mechanosensitive ion channel homologs (FM_SNP_9403) sense mechanical cues that signal seed maturation and responses to environmental stresses64. Their presence across cereal species highlights their importance in maintaining seed quality and longevity.

Metabolic regulators such as peroxisome biogenesis protein 12 (FM_SNP_90) and malate dehydrogenase (FM_SNP_90) play roles in energy metabolism during seed development and storage65,66. Their activities in managing oxidative stress and maintaining metabolic homeostasis are essential for mitigating aging-related imbalances and preserving seed viability. The respiratory burst oxidase homolog protein F (FM_SNP_51) and antioxidant enzymes such as glutathione reductase 1 (FM_SNP_4948) safeguard cellular integrity against oxidative stress, a primary contributor to seed aging67. Their roles in detoxifying reactive oxygen species (ROS) and maintaining redox homeostasis are crucial for preserving seed vigor over extended storage periods.

Actin-related proteins (FM_SNP_1366) and auxin transporters (FM_SNP_1366) are regulators of cellular dynamics and hormone transport, influencing seed growth and development7,68. The regulatory proteins CBL-interacting protein kinase 32 (FM_SNP_7145) and calcium-dependent protein kinase 27 (FM_SNP_11328) regulate seed germination and early seedling development through stress-responsive pathways, ensuring adaptive responses during seed aging and enhancing seed quality and storage potential69,70. Protein HIRA (FM_SNP_2924), a positive regulator of seed germination71, and serine/threonine-protein kinase STY46 (FM_SNP_2840) contribute to stress adaptation and early seedling growth69, which are crucial for resilience to seed aging.

Spermine synthases (FM_SNP_2799) regulate seed germination, grain size and yield72, and heat shock proteins (FM_SNP_2799) play an important role in metabolic regulation and stress tolerance mechanisms73, ensuring that seeds are primed for optimal growth under diverse conditions and during prolonged storage. Enzymes such as beta-amylases (FM_SNP_9582) and acetyl-CoA carboxylases (FM_SNP_9478) contribute to starch and lipid metabolism, respectively, supporting energy reserves and membrane integrity for seed germination and early seedling growth74. Their roles in energy metabolism and nutrient mobilization are vital for sustaining seed viability and performance during aging.

Acetyl-CoA carboxylase, associated with the pleiotropic SNP FM_SNP_9478, is involved in fatty acid biosynthesis and hormonal signaling pathways, which regulate seed dormancy and influence germination75. These processes are essential for energy storage and membrane integrity in seeds. CBL-interacting protein kinase 32 (FM_SNP_7145) is involved in calcium signaling, which is crucial for regulating physiological processes, including germination. It interacts with MAP kinase pathways and ABA signaling, key regulators of seed dormancy and stress responses76.

The GO terms and KEGG pathways provide significant insights into the molecular and cellular processes associated with seed aging traits. Enriched GO terms for biological processes, such as embryo development ending in seed dormancy and response to light stimulus, highlight developmental and environmental response mechanisms essential for seed viability77. The involvement of ribosome biogenesis and protein refolding indicates a strong emphasis on protein synthesis and stress response, which are crucial during seed aging78. The significant enrichment of the proteasomal protein catabolic process suggests active protein turnover and degradation, vital for maintaining cellular homeostasis in aged seeds79. KEGG pathway analysis supports these findings, with significant pathways such as ribosome, endocytosis and proteasome indicating robust protein synthesis, trafficking and degradation activities. These pathways are essential for maintaining seed vigor and longevity80,81. The role of the antioxidant system in seed priming and aging emphasizes the importance of redox homeostasis and stress responses in seed longevity and germination82.

Conclusion

The study provides valuable insights into the genetic basis of seed vigor and longevity traits in finger millet, which are crucial for crop establishment and productivity, especially in harsh environments. Accelerated aging experiments revealed significant impacts on seed viability and seedling vigor, highlighting the importance of addressing oxidative stress during seed storage. The GWAS approach identified key SNPs and genes linked to important seed longevity and seedling vigor traits. GO and KEGG pathway analyses further elucidated the biological processes and metabolic pathways involved. The MTAs and key genes identified offer avenues for improving seed vigor and longevity, ensuring high-quality seeds for sustainable production systems. This is particularly important for quality seed production and storability, as seeds with better germination efficiency and vigor are more likely to withstand environmental stresses. Integrating genomic insights into breeding can expedite the development of finger millet varieties suited to diverse environmental conditions, enhancing global food and nutritional security and sustainability.

Materials and methods

GWAS panel

A panel of 221 genetically diverse finger millet accessions was selected from the mini-core collection of the ICRISAT gene bank for the GWAS experiment. The accessions originate from four continents, namely Africa (116), Asia (93), Europe (4) and North America (2), with six accessions of unknown origin. The panel encompasses all finger millet races and subraces: vulgaris (143 accessions), plana (40 accessions), elongata (14 accessions), compacta (23 accessions) and one unclassified accession. It also includes traditional landraces (191 accessions) and advanced breeding lines (30 accessions). Supplementary Table S5 provides detailed information on these accessions, including sample locations and race and subrace classifications.

Phenotyping

The GWAS accessions were phenotyped for nine seed traits under two experimental conditions, control and accelerated aging (AA) treatment, with three replications. The traits were seed germination (%), germination index, germination rate index, mean germination time (days), root length (cm), shoot length (cm), seedling vigor index 1, seedling dry weight (mg) and seedling vigor index 2. For each trait, measurements were recorded for the control and the accelerated aging treatment, and their relative values were derived (Table 3). Following prior standardization of the AA test for the diverse finger millet panel, seeds were aged for 17 days in an accelerated aging chamber (Memmert-HPP 108/749, Germany) maintained at a humidity of 90% or above and a temperature of 44 ± 1 °C. The control (fresh) and AA seeds of the 221 accessions were evaluated for seed quality traits following the standard seed germination test83.

Table 3 Summary of the quantitative traits analysed in finger millet accessions under different conditions, along with the corresponding formulas.

Phenotypic data were collected for nine traits under control and treatment conditions and their relative values were estimated (Table 3). The germination percentage (G) represents the total number of seedlings at the end of the test on the 8th day. Germination index (GI) was calculated as GI = (8 × n1) + (7 × n2) + … + (1 × n8), where n1, n2, …, n8 are the numbers of seeds germinated on the first, second and subsequent days until the 8th day, and 8, 7, …, 1 are the weights given to the seeds germinated on the first, second and subsequent days, respectively84. Germination rate index (GRI) was computed as GRI = G1/1 + G2/2 + G3/3 + … + Gn/n (%/day)85, where G1 is the germination percentage × 100 on the first day after sowing, G2 is the germination percentage × 100 on the second day after sowing, and so on until the 8th day, and 1, 2, …, n are the days of the first, second, … and final counts, respectively. Mean germination time (MGT) was calculated as MGT = ∑(n × D) / ∑n (days)86, where n is the number of seeds germinated on day D. Mean root length (RL) (cm) of five normal seedlings was measured from the collar region to the tip of the primary root. Mean shoot length (SL) (cm) of five normal seedlings was measured from the collar region to the tip of the first leaf. Seedling dry weight (SDW) (mg) was measured after drying 10 normal seedlings in a hot air oven maintained at 80 °C for 24 h. Immediately after drying, the seedlings were transferred to a desiccator for half an hour to cool, and the mean dry weight of normal seedlings was then recorded. Seedling vigor index 1 (SVI1) was calculated by multiplying the final germination percentage by the mean seedling length (root + shoot). Seedling vigor index 2 (SVI2) was calculated by multiplying the mean germination percentage by the mean seedling dry weight.
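The germination formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names and the example seed lot are hypothetical, and the GRI function assumes that each G value is the percentage of seeds newly germinated on that day, which the text does not state explicitly.

```python
import math
from typing import Sequence


def germination_index(daily_counts: Sequence[int]) -> int:
    """GI: day-1 counts get the largest weight (8 in an 8-day test), last-day counts get 1."""
    n_days = len(daily_counts)
    return sum((n_days - i) * n for i, n in enumerate(daily_counts))


def germination_rate_index(daily_percent: Sequence[float]) -> float:
    """GRI = G1/1 + G2/2 + ... + Gn/n, in % per day (G assumed to be daily %)."""
    return sum(g / day for day, g in enumerate(daily_percent, start=1))


def mean_germination_time(daily_counts: Sequence[int]) -> float:
    """MGT = sum(n * D) / sum(n), in days; NaN if no seed germinated."""
    total = sum(daily_counts)
    if total == 0:
        return math.nan
    return sum(n * day for day, n in enumerate(daily_counts, start=1)) / total


def seedling_vigor_indices(germ_pct, root_cm, shoot_cm, dry_weight_mg):
    """SVI1 = germination % x seedling length; SVI2 = germination % x dry weight."""
    return germ_pct * (root_cm + shoot_cm), germ_pct * dry_weight_mg


# Hypothetical lot: 50 seeds sown, newly germinated seeds counted on days 1 to 8.
seeds_sown = 50
daily_counts = [0, 5, 20, 15, 5, 0, 0, 0]
daily_percent = [100 * n / seeds_sown for n in daily_counts]
germination_pct = sum(daily_percent)

gi = germination_index(daily_counts)
gri = germination_rate_index(daily_percent)
mgt = mean_germination_time(daily_counts)
svi1, svi2 = seedling_vigor_indices(germination_pct, 4.0, 3.0, 2.5)
```

Early germination raises GI and GRI and lowers MGT, so aged seed lots that germinate late score worse on all three even when the final germination percentage is unchanged.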

Genotyping

DNA extraction

The genomic DNA from 15 to 20 seeds of 221 finger millet genotypes was isolated using the CTAB (Cetyl Trimethyl Ammonium Bromide) extraction method, as described by Murray and Thompson87, with some modifications. Quantification of DNA was performed using Nanodrop 1000 (Thermo Scientific), measuring absorbance at 260 and 280 nm to assess both quality and quantity. Qualitative analysis of the DNA was carried out by agarose gel electrophoresis, where isolated products were resolved on a 1% agarose gel at 80 V for 30 min. Post-electrophoresis, DNA bands were visualized and documented using a gel documentation system and checks for RNA and protein contamination were conducted.

The DNA samples were normalized for GBS to a final concentration of 50 µg. GBS analysis was conducted on lyophilized DNA samples. The resulting GBS libraries from the 221 samples were sequenced on the Illumina NovaSeq 6000 platform using paired-end sequencing with 150 bp read lengths. Raw reads were preprocessed with fastp to retain high-quality reads based on the Q20 parameter. These reads were then aligned to the reference genome Eleusine_coracana_v1.0 (GCA_032690845.1)19 using the Bowtie2 aligner. The alignment produced SAM files containing the mapping information used for variant calling, with an average mapping percentage of 94.9%.

SNP filtering

To call variants, Stacks was run on all BAM files with its default parameters. The Genome Analysis Toolkit (GATK) was then used to refine the SNP dataset and retain only biallelic SNPs. A total of 563,270 SNPs passed the biallelic filter. VCFtools was used to calculate missing percentages and to identify SNPs with high missing values. After removing SNPs with more than 10% missing data or more than 10% heterozygosity, a minor allele frequency (MAF) threshold of 3% was applied, resulting in the retention of 11,832 high-quality SNPs for GWAS analysis.
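The three filtering criteria can be illustrated with a small pure-Python sketch. It is not the Stacks, GATK or VCFtools workflow itself: the genotype coding (0, 1 or 2 copies of the alternate allele, `None` for missing), the function names and the example SNPs are all assumptions made for illustration, and heterozygosity is computed here over called genotypes only.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # 0, 1 or 2 alternate alleles; None = missing call


def site_stats(genotypes: Sequence[Genotype]) -> tuple[float, float, float]:
    """Return (missing rate, heterozygosity rate, minor allele frequency) for one SNP."""
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / len(genotypes)
    if not called:
        return missing_rate, 0.0, 0.0
    het_rate = sum(1 for g in called if g == 1) / len(called)
    alt_freq = sum(called) / (2 * len(called))
    return missing_rate, het_rate, min(alt_freq, 1 - alt_freq)


def passes_filters(
    genotypes: Sequence[Genotype],
    max_missing: float = 0.10,  # thresholds taken from the text
    max_het: float = 0.10,
    min_maf: float = 0.03,
) -> bool:
    """Keep a SNP only if it satisfies all three thresholds."""
    missing_rate, het_rate, maf = site_stats(genotypes)
    return missing_rate <= max_missing and het_rate <= max_het and maf >= min_maf


# Hypothetical SNPs scored on 100 accessions each.
snps = {
    "snp_ok": [0] * 90 + [2] * 10,              # MAF 0.10, no missing, no hets
    "snp_rare": [0] * 99 + [2] * 1,             # MAF 0.01, fails the MAF filter
    "snp_het": [0] * 50 + [1] * 20 + [2] * 30,  # 20% heterozygous calls
    "snp_missing": [None] * 20 + [0] * 40 + [2] * 40,  # 20% missing calls
}

kept = [name for name, genotypes in snps.items() if passes_filters(genotypes)]
```

In a largely self-pollinating crop, a high heterozygosity rate at a SNP often signals paralogous or miscalled sites, which is one common reason for filtering on it alongside missingness and MAF.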

Phenotypic data analysis

The adjusted means across replicates were obtained by fitting mixed linear models (MLMs) and were calculated as best linear unbiased predictions (BLUPs), with replications and genotypes treated as random effects. MLMs were chosen to estimate BLUPs because they account for both fixed and random effects and improve the accuracy of the estimates. They are useful in field experiments for handling complex data structures, repeated measurements and variance components, providing more reliable and precise estimates than simpler models88. The BLUPs for each genotype and the analysis of variance (ANOVA) were estimated using the R package “lme4”89. The BLUPs were obtained using the following model90.

$$Y_{ik} = \mu + R_{i} + G_{k} + \varepsilon_{ik}$$

where Y_ik is the trait of interest; µ is the mean effect; R_i is the effect of the ith replicate; G_k is the effect of the kth genotype; and ε_ik is the error associated with the ith replicate and the kth genotype, which is assumed to be normally and independently distributed with mean zero and homoscedastic variance σ². The broad-sense heritability (h²) was estimated using the formula:

$$h^{2} = \frac{\sigma_{g}^{2}}{\sigma_{g}^{2} + \sigma_{e}^{2}/\mathrm{nreps}}$$

where σ²_g is the genotypic variance, σ²_e is the error variance and nreps is the number of replications.
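To make the model and the heritability formula concrete, the sketch below works through a small balanced example in Python. It is a simplification, not the lme4 fit used in the study: variance components are estimated by the ANOVA (method-of-moments) approach for a balanced genotype × replicate layout, and the BLUP of each genotype is taken as its mean deviation shrunk by h², which holds for balanced data. All function names and data values are hypothetical.

```python
from typing import Sequence


def variance_components(data: Sequence[Sequence[float]]) -> tuple[float, float]:
    """ANOVA estimates of (sigma2_g, sigma2_e); data[k][i] = genotype k, replicate i."""
    g, r = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (g * r)
    geno_means = [sum(row) / r for row in data]
    rep_means = [sum(row[i] for row in data) / g for i in range(r)]

    ms_g = r * sum((m - grand) ** 2 for m in geno_means) / (g - 1)
    ss_e = sum(
        (data[k][i] - geno_means[k] - rep_means[i] + grand) ** 2
        for k in range(g)
        for i in range(r)
    )
    ms_e = ss_e / ((g - 1) * (r - 1))
    # E[MS_G] = sigma2_e + r * sigma2_g, so solve for sigma2_g (floored at zero).
    sigma2_g = max((ms_g - ms_e) / r, 0.0)
    return sigma2_g, ms_e


def broad_sense_heritability(sigma2_g: float, sigma2_e: float, nreps: int) -> float:
    """h2 = sigma2_g / (sigma2_g + sigma2_e / nreps)."""
    return sigma2_g / (sigma2_g + sigma2_e / nreps)


def genotype_blups(data: Sequence[Sequence[float]], h2: float) -> list[float]:
    """Balanced-data BLUPs: genotype mean deviations shrunk toward zero by h2."""
    g, r = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (g * r)
    return [h2 * (sum(row) / r - grand) for row in data]


# Hypothetical trait values: three genotypes (rows) x three replicates (columns).
data = [
    [10.0, 12.0, 11.0],
    [15.0, 15.0, 15.0],
    [18.0, 20.0, 19.0],
]

sigma2_g, sigma2_e = variance_components(data)
h2 = broad_sense_heritability(sigma2_g, sigma2_e, nreps=3)
blups = genotype_blups(data, h2)
```

When h² is close to 1 the BLUPs are nearly identical to the raw genotype mean deviations; as the error variance grows relative to the genotypic variance, the predictions are pulled more strongly toward the overall mean.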

Genome-wide association mapping (GWAS)

Genome-wide linkage disequilibrium (LD) was calculated from the filtered SNP markers using TASSEL v5.091. The LD statistic r² was estimated from pairwise marker comparisons within a sliding window of 50 markers. The LD block size was estimated by plotting r² against the physical distance in base pairs and noting the distance at the half LD decay point. A smoothed LD decay curve was fitted to the TASSEL output using an R script.
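The idea of reading a half-decay distance from pairwise LD values can be sketched as follows. This is not the R script used in the study: instead of fitting a smoothed curve, it averages r² in fixed-width distance bins, and it assumes that the half-decay point is the distance at which the binned mean r² first falls to half of its maximum. The function name, bin width and data are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Optional


def half_decay_distance(
    pairs: Iterable[tuple[int, float]], bin_size: int
) -> Optional[float]:
    """Midpoint (bp) of the first distance bin whose mean r2 is <= half the peak mean r2.

    pairs holds (distance_bp, r2) for marker pairs; returns None if LD never halves.
    """
    sums: dict[int, float] = defaultdict(float)
    counts: dict[int, int] = defaultdict(int)
    for distance_bp, r2 in pairs:
        bin_index = distance_bp // bin_size
        sums[bin_index] += r2
        counts[bin_index] += 1
    if not counts:
        return None

    means = {b: sums[b] / counts[b] for b in sorted(counts)}
    peak_bin = max(means, key=means.get)
    half_of_peak = means[peak_bin] / 2
    for b, mean_r2 in means.items():
        if b > peak_bin and mean_r2 <= half_of_peak:
            return (b + 0.5) * bin_size
    return None


# Hypothetical pairwise LD values: (distance in bp, r2).
pairs = [
    (100_000, 0.9),
    (900_000, 0.7),
    (1_500_000, 0.6),
    (2_500_000, 0.5),
    (3_200_000, 0.38),
    (4_800_000, 0.2),
]

decay_bp = half_decay_distance(pairs, bin_size=1_000_000)
```

The bin width trades resolution against noise: narrow bins locate the half-decay point more finely but need many marker pairs per bin to give stable means.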

The filtered 11,832 high-quality SNPs and BLUPs for each trait were used for the GWAS using the “BLINK” (Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway) model available in GAPIT v392 with kinship and PCA (https://zzlab.net/GAPIT/gapit_help_document.pdf). The BLINK model is considered superior for identifying marker-trait associations (MTAs) and minimizing false positives to uncover true associations93. This model excels in efficiently managing large-scale genomic data and accurately detecting significant associations by integrating Bayesian and frequentist approaches.

The fit of the association model was assessed using a Q-Q plot of expected versus observed −log10(p) values. MTAs were first filtered at a significance threshold of p < 0.05. To avoid false positives, a stringent selection of MTAs was then performed using the Bonferroni correction (p-value cut-off of 0.05/total number of markers). A Manhattan plot was drawn to represent the MTAs. Additionally, pleiotropic SNPs, those associated with more than one trait, were identified.
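With the 11,832 filtered SNPs, the Bonferroni cut-off works out to about 4.2 × 10⁻⁶, or roughly 5.37 on the −log10(p) scale used in Manhattan plots. The sketch below computes this threshold and flags pleiotropic SNPs; the SNP names, trait labels and p-values in the example are hypothetical placeholders, not results from the study.

```python
import math
from collections import defaultdict

N_MARKERS = 11_832  # number of filtered SNPs reported in the text
ALPHA = 0.05


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-marker p-value cut-off that controls the family-wise error rate at alpha."""
    return alpha / n_tests


def pleiotropic_snps(associations, threshold):
    """Map each SNP significant for more than one trait to its sorted trait list."""
    traits_by_snp = defaultdict(set)
    for snp, trait, p_value in associations:
        if p_value <= threshold:
            traits_by_snp[snp].add(trait)
    return {
        snp: sorted(traits)
        for snp, traits in traits_by_snp.items()
        if len(traits) > 1
    }


threshold = bonferroni_threshold(ALPHA, N_MARKERS)
neg_log10_threshold = -math.log10(threshold)

# Hypothetical GWAS output rows: (SNP, trait, p-value).
associations = [
    ("snp_a", "trait_1", 1e-8),
    ("snp_a", "trait_2", 3e-7),
    ("snp_b", "trait_1", 2e-6),
    ("snp_c", "trait_3", 1e-4),  # nominally significant, fails Bonferroni
]

pleiotropic = pleiotropic_snps(associations, threshold)
```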

In-silico comparative analysis

In-silico comparative analysis was conducted for the significant SNPs identified through GWAS in finger millet. Cross-species validation was performed using the Basic Local Alignment Search Tool (BLAST). For each SNP, the genomic region spanning 1 Mb upstream and downstream was extracted and formatted as a FASTA file, which was then queried against the well-annotated monocot species rice (Oryza sativa), maize (Zea mays), foxtail millet (Setaria italica), sorghum (Sorghum bicolor) and switchgrass (Panicum virgatum). The regions showing sequence similarity on the chromosomes of these species were analysed to identify candidate genes. Gene Ontology (GO) enrichment analysis was conducted on the candidate genes to assess their potential roles in seed traits. Subsequently, the Kyoto Encyclopedia of Genes and Genomes (KEGG)94 pathway database was used to investigate the biological pathways potentially involved in finger millet seed development. Significant GO terms and KEGG pathways were identified at an FDR-corrected p-value < 0.05. The GO and KEGG analyses were performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) version 6.8.
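The window extraction step can be sketched as below. This is an illustrative outline only: the function names, the 1-based inclusive coordinate convention and the toy sequence are assumptions, and the actual tooling used to build the BLAST queries is not described in the text.

```python
FLANK_BP = 1_000_000  # 1 Mb on each side of the SNP, as stated in the text


def flanking_window(
    position: int, chrom_length: int, flank: int = FLANK_BP
) -> tuple[int, int]:
    """1-based inclusive (start, end) around a SNP, clipped to the chromosome ends."""
    start = max(1, position - flank)
    end = min(chrom_length, position + flank)
    return start, end


def extract_region(chrom_seq: str, start: int, end: int) -> str:
    """Slice a chromosome string using 1-based inclusive coordinates."""
    return chrom_seq[start - 1:end]


def fasta_record(name: str, sequence: str, width: int = 60) -> str:
    """Format a sequence as a FASTA record wrapped at a fixed line width."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


# A SNP near the start of an 80.35 Mb chromosome: the window is clipped at position 1.
window = flanking_window(position=500_000, chrom_length=80_350_000)

# Toy sequence and a small flank, just to show the record layout.
toy_chrom = "ACGT" * 50  # 200 bp stand-in for a chromosome
toy_start, toy_end = flanking_window(position=100, chrom_length=len(toy_chrom), flank=35)
record = fasta_record(
    f"toy_snp:{toy_start}-{toy_end}",
    extract_region(toy_chrom, toy_start, toy_end),
)
```

Clipping at the chromosome ends matters for SNPs in telomeric regions, where a full 1 Mb flank may not exist on one side.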