Abstract
Proso millet is an important but under-researched and underutilized crop with the potential to become a future smart crop because of its climate-resilient features and high nutrient content. Assessing diversity and identifying marker-trait associations is essential to support the genomics-assisted improvement of proso millet. This study aimed to assess the population structure and diversity of a proso millet diversity panel and identify marker-trait associations for agronomic and grain nutrient traits. Genome-wide single nucleotide polymorphisms (SNPs) were identified by mapping raw genotyping-by-sequencing (GBS) data onto the proso millet genome, resulting in 5621 quality-filtered SNPs in 160 diverse accessions. The modified Roger's Distance assessment indicated an average distance of 0.268 among accessions, with the race miliaceum exhibiting the highest diversity and ovatum the lowest. Proso millet germplasm diversity was structured according to geographic centers of origin and domestication. Genome-wide association mapping identified 40 marker-trait associations (MTAs), including 34 MTAs for agronomic traits and 6 for grain nutrients; 20 of these MTAs were located within genes. Favourable alleles and phenotypic values were estimated for all MTAs. This study provides valuable insights into the population structure and diversity of proso millet, identifies marker-trait associations, and reports favourable alleles and their phenotypic values to support genomics-assisted improvement efforts in proso millet.
Introduction
Traditionally important, climate-resilient and nutrient-rich crops have a significant role to play in the near future to achieve food security and nutrition despite global climate change. Proso millet (Panicum miliaceum L.) is one of the oldest domesticated cereal crops in the world. The earliest records of proso millet occurrence were from China between 10,300 and 8700 cal BP1 and Eastern Europe at 7000 cal BP2. This pattern suggests independent domestication in Central Asia and Eastern Europe, or that they may have originated from domestication within China and then spread westward across the Eurasian Steppe3. Proso millet belongs to a group of small-seeded cereal crops known as small millets. It is also popularly known as broomcorn millet, common millet, panic millet, and hog millet in different parts of the world. Proso millet is grown in Asia, Australia, North America, Europe, and Africa4; however, it is a minor crop globally in terms of its contribution to global production5. Proso millet remains a locally important staple source of food security in semi-arid regions, where other cereals fail, whereas in developed countries it is used for feeding birds and livestock. Country-wise, proso millet is cultivated on about 0.82 m ha in Russia, 0.32 m ha in China6, 0.20 m ha in the USA7, 0.03 m ha in India8 and 0.002 m ha in Korea9. The USA is one of the top producers of proso millet and exports 15–20% of its annual production to over 70 countries7.
Proso millet is a C4 allotetraploid crop. Its important characteristic features include short duration (matures in 40–80 days), low water requirements, high drought tolerance, and good adaptability to different environmental conditions. Its grains are highly nutritious and gluten-free, and they contain higher contents of protein, dietary fiber, several minerals, vitamins, and antioxidants than most other cereals10,11. The protein content of proso millet (12.5%) is higher than that of rice (7.9%), maize (9.2%), wheat (11.6%), and other millets, and it is also significantly richer in essential amino acids (leucine, isoleucine, and methionine) than wheat10,12. These climate-resilient and nutrient-rich features of proso millet (and other minor but regionally important crops) have the potential to ensure food security and nutrition, and crop diversification.
Proso millet is an under-researched and underutilized crop compared with other major cereals. Globally, approximately 29,000 germplasm accessions have been conserved in genebanks, and high variability exists13. Based on panicle morphology and shape, the cultivated germplasm of proso millet can be grouped into five races: miliaceum, patentissimum, contractum, compactum and ovatum14. Germplasm diversity could potentially contribute to proso millet improvement, provided that these are subject to systematic evaluation, identification of trait-specific sources, and genomic investigation. Evaluation of germplasm for important traits such as productivity, biotic and abiotic stress tolerance, and grain nutrient traits resulted in the identification of promising accessions for crop improvement15. Genomics-assisted improvement in proso millet is very limited. However, the availability of draft genome sequences of proso millet16,17 provides an opportunity to investigate the diversity, structure, and identification of QTLs using next-generation sequencing approaches. Genome-wide association studies (GWAS) are an important approach for the genetic dissection of complex traits and for identifying marker-trait associations and have been applied in several cereal crops, including rice, wheat, sorghum, and foxtail millet, for many traits, including agronomic, quality, and adaptation traits18,19,20,21,22,23. In proso millet, only two GWAS reports are available for agronomic and seed traits24,25. The present study aimed to (1) assess the diversity and population structure of the global proso millet germplasm collection conserved at the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) genebank and (2) identify genomic regions associated with productivity and grain nutrients.
Results
Phenotypic variation
A proso millet diversity panel, consisting of 200 lines, was used in this study. These lines originated in 30 countries and represent all five races of proso millet (60.5% miliaceum, 12.5% compactum, 12.0% contractum, 8.5% patentissimum, and 6.5% ovatum). The phenotypic variability of this diversity panel (200 accessions) for agronomic and grain nutrient traits has been described in detail in our previous study26. In brief, the phenotypic evaluation indicated a significant genotypic variance and genotype × year variance for all traits except for basal tiller number, indicating the significant influence of genotype and environment and their interaction on the expression of traits. All but two traits showed high broad-sense heritability (> 0.60) in both years and when combined across years. The exceptions were basal tiller number and grain Fe content, which showed moderate heritability of 0.30–0.60 (ref. 26). In this study, 160 out of 200 accessions were included after filtering for high-quality SNPs. The frequency distribution of key agronomic and grain nutrient traits is presented in Fig. 1, and a complete list of traits investigated is presented in Supplementary Fig. 1.
Genome-wide SNP variation
SNP diversity and population structure
For the final dataset, after filtering, we retained 160 accessions with 5621 SNPs. The SNP counts varied from 204 on chromosome 10 to 476 on chromosome 3 (Table 1). The SNP distribution across the 18 chromosomes is shown in Fig. 2. Analysis of the position and distribution of each SNP locus at the whole-genome level showed that 68% of the SNPs were within 100 kbp of adjacent SNPs. Further, using SnpEff27, each SNP locus was annotated based on its genomic location to predict coding effects. It was found that 4.5% of the SNP loci were in the exon regions and 15.6% in intergenic regions, while 43.7% and 36.18% of SNPs were in the downstream and upstream regions of the gene, respectively.
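The spacing statistic reported above (the share of SNPs lying within 100 kbp of an adjacent SNP) can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes that "adjacent" means the nearest neighbouring SNP on the same chromosome, and the function name and the positions in the example are hypothetical.

```python
from typing import Dict, List


def fraction_near_neighbor(positions_by_chrom: Dict[str, List[int]],
                           max_gap: int = 100_000) -> float:
    """Share of SNPs whose nearest same-chromosome neighbour is within max_gap bp."""
    near = 0
    total = 0
    for positions in positions_by_chrom.values():
        pos = sorted(positions)
        for i, p in enumerate(pos):
            gaps = []
            if i > 0:
                gaps.append(p - pos[i - 1])      # distance to the previous SNP
            if i < len(pos) - 1:
                gaps.append(pos[i + 1] - p)      # distance to the next SNP
            total += 1
            if gaps and min(gaps) <= max_gap:
                near += 1
    return near / total if total else 0.0


# Hypothetical positions (bp) on two chromosomes, for illustration only.
example = {"1": [1_000, 50_000, 500_000], "2": [10_000, 2_000_000]}
print(fraction_near_neighbor(example))
```

In the toy input, only the first two SNPs on chromosome 1 have a neighbour within 100 kbp, so two of the five SNPs are counted.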
AMOVA (analysis of molecular variance)
The AMOVA indicated the highest contribution of variation within a race (71%) and within the region (66.9%). However, a low but significant contribution was observed between races and regions (4% and 8.4%, respectively) (Table 2). This implies that traditional race classifications based on morphology are weakly correlated with the underlying population genetics.
Genetic distance
The average Modified Roger’s Distance (MRD) of the entire set was 0.268 and ranged from 0.126 to 0.341. Among the races, accessions belonging to the miliaceum race had the highest average distance (0.274), whereas the lowest distance was observed among accessions within the ovatum race (0.201) (Supplementary Table 1). Figure 3 shows the minimum, maximum and median MRD of the entire set within and between the races. Among races, the lowest distance was observed between ovatum and contractum (0.248), followed by ovatum and compactum (0.250), whereas race patentissimum showed a higher distance from other races (0.260–0.272) (Supplementary Table 1).
Population structure
Principal component analysis and ADMIXTURE were used to infer the population structure of the collection, and clear subpopulation structures were observed (Fig. 4a). The first two principal components explained 28% of the total genetic variation and aided in visually differentiating the substructures within Asian accessions. PCA also showed the presence of a substructure within the collection, based on the regions and countries within the regions. A small group of 14 Asian accessions at the top center of the PCA biplot was distinct from the other Asian accessions; 12 of these 14 accessions were of Korean origin, while the remaining two accessions were from China and Germany. In addition, the biplot showed a diverse set of 20 Asian accessions clustered at the bottom-right. Most of these accessions were of Indian origin (14 accessions from India), one accession was from Sri Lanka, and the other accessions were from Mexico, Syria, and other countries.
Population structure assessment of proso millet collection: (a) PCA biplot of 160 accessions based on SNPs from GBS, (b) rate of change in cross-validation (CV) error between successive k-values (k values ranging from 1 to 10), and (c) model-based population structure in proso millet collection based on ADMIXTURE with K = 5 populations for the 160 accessions.
The hierarchical population structure, using the model-based ADMIXTURE program, was run assuming K = 1 to 10 populations without providing any prior information on population structure. The obtained CV values and the corresponding ΔCV, combined with a line graph using CV errors for each K, showed that the CV error decreased steadily up to K = 5 and increased afterwards. This suggested the presence of five natural subpopulations (K = 5) (Fig. 4b) within our proso millet collection, and a K value of 5 was considered an appropriate population structure. The five populations were named POP1–POP5 (Fig. 4c). The accessions that had population membership of < 0.6 in all five populations were considered admixtures (accessions with genomes of two or more populations).
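The two decision rules described above (choosing K at the lowest cross-validation error, and treating accessions with less than 0.6 membership in every population as admixed) can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, and the CV errors and ancestry proportions in the example are invented values, not results from this study.

```python
from typing import Dict, List, Optional


def best_k(cv_errors: Dict[int, float]) -> int:
    """Return the K with the lowest cross-validation error."""
    return min(cv_errors, key=cv_errors.get)


def assign_population(q_row: List[float], threshold: float = 0.6) -> Optional[int]:
    """Return the 1-based population index, or None for an admixed accession.

    q_row holds one accession's ancestry proportions, one value per population.
    """
    top = max(range(len(q_row)), key=lambda i: q_row[i])
    return top + 1 if q_row[top] >= threshold else None


# Hypothetical CV errors for K = 1..10 (illustrative, not from the study).
cv = {1: 0.62, 2: 0.58, 3: 0.55, 4: 0.53, 5: 0.51,
      6: 0.52, 7: 0.54, 8: 0.55, 9: 0.57, 10: 0.58}
print(best_k(cv))

# Hypothetical ancestry proportions for two accessions at K = 5.
print(assign_population([0.05, 0.92, 0.01, 0.01, 0.01]))  # clear member of one population
print(assign_population([0.40, 0.35, 0.10, 0.10, 0.05]))  # below 0.6 everywhere: admixed
```

Returning `None` for admixed accessions keeps them separate from the five named populations, which matches how the admixture counts are reported in the text.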
We assigned individuals to one of the five subpopulations according to their maximum proportion of membership. Accordingly, POP1, POP2, POP3, and POP5 represent most accessions from Asia, whereas POP4 represents most accessions from Europe (18 out of 25 accessions, including two unknown origins, one from America, and four from Asia). Accessions from Korea were grouped as POP2. Accessions of the race miliaceum dominated in all the populations, while the majority of accessions belonging to compactum were in POP1, ovatum in POP3, patentissimum in POP5, and contractum in POP1 and POP3. These distributions show that the proso millet accessions were not structured according to racial groups but according to regions and countries within regions (for example, Korea in POP2, Russia in POP4), as observed in PCA. Overall, 93 of the 160 accessions had a population membership of > 0.90, while 31 accessions had a population membership of < 0.60. Approximately 17% of accessions (12 accessions) from Asia were admixed (< 0.60), while 37% of accessions from Europe were admixed with different populations.
Hierarchical clustering based on the calculated MRD showed the presence of five major clusters (Fig. 5). The cluster dendrogram results also agreed with the admixture-based population structure in terms of both the number of populations and the presence of population structure based on regions. Cluster-5 was dominated by Korean accessions and cluster-4 was dominated by Indian accessions. Combining the cluster dendrogram and ADMIXTURE-based population membership, individuals belonging to clusters 1, 4, and 5 had well-defined allelic or membership proportions with fewer admixtures.
Cluster dendrogram of GBS-based SNPs, based on Modified Roger's Distances (innermost colors on the dendrogram represent clusters, shapes at the nodes of the dendrogram represent races, tiles surrounding the dendrogram represent the region, and colored outermost bars represent the ADMIXTURE proportions-based population structure).
Linkage disequilibrium (LD) decay
The whole-genome average maximum r2 value was 0.52 at 5 kbp, which dropped to half that between 50 kbp (0.37) and 75 kbp (0.18), and plateaued after ~ 200 kbp (0.10 at 225 kbp to 0.05 at 30 Mbp) (Fig. 6).
Genome-wide association study (GWAS) on agronomic and nutrient traits
GWAS for agronomic traits identified 121, 95, and 95 marker-trait associations (MTAs) for 2015, 2016, and combined datasets, respectively, using FarmCPU with a p-value cutoff of ≤ 0.0001. Furthermore, when we looked for common MTAs across the three datasets (2015, 2016, and combined), 34 SNPs were found to be significantly associated with agronomic traits in at least two of them. Among these 34 MTAs, four SNPs for inflorescence length (Proso.1_8346815, Proso.14_27820106, Proso.12_34047515, and Proso.12_41890075) (Fig. 7) and one for plant height (Proso.7_1535098) were significant across all three datasets. Eighteen of the 34 MTAs were located in genes (Table 3). GWAS on grain nutrient traits identified 24, 37, and 26 MTAs for the 2015, 2016 and combined datasets, respectively, but only six were found in at least two of the datasets. Of these, two SNPs (Proso.17_30948407 and Proso.17_5885921, associated with Zn and Fe, respectively) on chromosome 17 were located in genes PM17G09880 and TE311547.
Comparative genomics
Twenty MTAs that were significantly associated with various traits were located within the genes. Although these specific SNPs are probably not causal polymorphisms for these traits, the rapid LD decay in this population (Fig. 6) implies that the genes have a strong probability of being involved. The sequence information of these genes was retrieved from www.genomeevolution.com and compared with related species to check the similarity and gene function. More than 90% similarity was considered to report the genes and their functions in related species (Supplementary Table 2). For example, the SNP Proso.2_14901071 associated with basal tiller number is located in the gene PM02G15460, which showed over 90% similarity with genes in three species, Panicum hallii, Panicum virgatum, and Setaria italica, with gene functions of “putative leucine-rich repeat-containing protein, sporulation-specific protein 15-like, and girdin-like”. The SNP Proso.17_3253916, associated with inflorescence length, is located in the gene PM17G03120, which showed over 90% similarity with genes in the closely related species Panicum hallii, Panicum virgatum, and Setaria italica, with gene function of “filament-like plant protein”. The SNP Proso.14_27820106, which is associated with inflorescence length, is located in the gene PM14G15020 and showed over 90% similarity with Panicum hallii, Panicum virgatum, Setaria viridis, Setaria italica, and Zea mays genes with the gene function of “dihydrolipoyllysine-residue acetyltransferase component 4 of the pyruvate dehydrogenase complex, chloroplastic-like” (Supplementary Table 2).
Discussion
The proso millet diversity panel used in this study had an average MRD of 0.268, which varied from 0.126 to 0.341. Among the five races of proso millet, which are primarily classified on the basis of panicle morphology and shape14, accessions belonging to miliaceum had the highest average distance (0.274), whereas accessions of ovatum showed the lowest average distance (0.201). The lowest between-race distance was found between the ovatum with compactum (0.248) and the ovatum with contractum (0.250). Similar results were found when the same panel and the entire proso millet collection were assessed for phenotypic diversity13,26. The three races, namely contractum, compactum, and ovatum, look similar, except for panicle morphology: compact and drooping inflorescence in contractum, cylindrical and erect inflorescence in compactum, and compact and slightly curved inflorescence in ovatum13,14. These three races phenotypically differ from the other two races, miliaceum and patentissimum, which are often difficult to distinguish. Accessions belonging to the race miliaceum are characterized by a large open inflorescence with suberect branches that are sparingly subdivided, whereas those belonging to the patentissimum are characterized by slender and diffused panicle branches.
Understanding the diversity and population structure of germplasm resources is important for their use in crop improvement programs. Population structure analysis revealed the presence of five populations in the proso millet diversity panel, which did not correspond with these race designations. Instead, the populations corresponded well with geography. Four of the populations consisted almost entirely of Asian accessions, indicating greater genetic diversity, whereas almost all the European accessions clustered into a single population. These results support that Asia is the centre of origin and diversity of proso millet, followed by a spread westward across Europe3,13,26. In our previous study on the same subset, diversity and population structure were estimated using morpho-agronomic data, indicating that the accessions of proso millet were structured largely according to geographical region. Accessions originating in Asia and Europe were distinctly grouped, also accessions from Asia showed high diversity (average distance 0.268) relative to those from Europe (average distance 0.225), and high diversity was observed between accessions of Asia and Europe (average distance 0.301)26.
Advances in NGS technologies and the availability of a draft genome for proso millet can accelerate genomics-assisted crop improvement16,17. In proso millet, Rajput et al.28 reported QTLs for morpho-agronomic traits using bi-parental mapping, while there are only two reports available on GWAS for agronomic and seed traits24,25, and no report on grain nutrients. In this study, a diversity set of proso millets representing five races originating from 30 countries was genotyped using the GBS approach. After filtering, 160 accessions originating from 26 countries and 5,621 quality SNPs were used to perform GWAS on agronomic and grain nutrient traits. A total of 40 MTAs were identified: 34 for agronomic traits and six for grain nutrient traits. Nine MTAs (two for flag leaf blade length, five for inflorescence length, and one each for plant height and panicle exsertion) were associated with phenotypic variation in both years, and five of them showed significant associations in both years as well as when the years were combined. Long inflorescences and tall plants are among the important traits that are positively associated with higher grain yield in proso millet26. Of the seven SNPs that were associated with inflorescence length, four were associated in both years as well as in the combined data. Among these, four SNPs showed a positive effect (Proso.1_8346815, Proso.12_41890075, Proso.14_27820106, and Proso.17_3253916), whereas Proso.12_34047515 and Proso.9_39130372 had a negative effect on inflorescence length. The four SNPs, Proso.17_3253916, Proso.1_8346815, Proso.14_27820106, and Proso.12_34047515, are located in genes PM17G03120, PM01G10560, PM14G15020, and PM12G21200, respectively, indicating potential candidate genes for yield improvement in proso millet. For plant height, two SNPs were identified, located on chromosomes 7 and 9.
The SNP Proso.7_1535098 showed significant association in both years as well as in the combined data, and is located in the gene TE347748 with gene function of “putative pentatricopeptide repeat-containing protein” and “ninja-family protein 8-like” (Supplementary Table 2). Six SNPs were identified for grain nutrient content, of which two were located in the genes. For all the MTAs identified in this study, a box plot showing alleles and their phenotypic values was estimated, which is important for further use of these SNPs in the genomics-assisted improvement of traits (Fig. 8, Supplementary Fig. 2). Sequence similarity of the identified genes was compared with related species, and gene functions were reported, which will help in understanding the genetic basis of phenotypic variation of different traits.
In conclusion, proso millet is a potential crop for food security and nutrition and has various climate resilience and nutritional benefits. However, systematic breeding and genomics-assisted improvements in proso millet are very limited. In this study, the NGS-based genotypic characterization of proso millet revealed wide diversity within and among races, and proso millet germplasm diversity was structured according to the two geographical regions where proso millet was reported to originate and be domesticated. Genome-wide association mapping identified 40 marker–trait associations for agronomic traits (34) and grain nutrients (6), half of which (20) were located within genes. The information generated from this study on diversity and marker-trait associations can support the development of allele-specific markers for mining productivity and nutrient traits, and their utilization in genomics-assisted proso millet improvement.
Materials and methods
Phenotyping
Experimental materials and conditions
The experimental material consisted of 200 accessions, including the core collection (106 accessions)29. Because the number of accessions in the core collection is too small for GWAS, we followed the same approach by which the core collection was established to make a diversity subset of 200 accessions from the entire collection of 849 accessions conserved in the genebank (http://genebank.icrisat.org/). These accessions were planted during the 2015 and 2016 rainy seasons at ICRISAT (Patancheru, Telangana, India; 17° 30′ N latitude, 78° 15′ E longitude and altitude 545 m above mean sea level) in an alpha design with two replications on red soil, planted in the third week of July during both years. Sowings were performed on ridges 60 cm apart, and each accession occupied a single row of 4 m in length. Plant-to-plant spacing of approximately 10 cm was maintained by thinning the excess seedlings. Diammonium phosphate was applied at a rate of 100 kg/ha as a basal dose to supply nitrogen and phosphorus. In addition, 100 kg/ha of urea was applied as top dressing. Irrigation and hand-weeding were performed as needed.
Data collection
Data on 14 agronomic traits were recorded using the descriptors of Panicum miliaceum30. The agronomic traits of days to 50% flowering, days to maturity, and grain yield were recorded on a plot basis, while the other agronomic traits (plant height, basal tillers, flag leaf blade length, flag leaf blade width, flag leaf sheath length, peduncle length, panicle exsertion, inflorescence length, number of nodes, and inflorescence primary branch number) were recorded on the main culms of five representative plants in a plot. Bulked seeds of each accession were used to determine the 100-seed weight. The grain yield per plot was converted into grain yield (kg/ha). A random, well-cleaned grain sample (unhusked) from each accession was used to estimate the grain protein, calcium (Ca), iron (Fe), and zinc (Zn) content at the Charles Renard Analytical Laboratory, ICRISAT, Patancheru, India. Grain Ca, Fe, and Zn contents were assessed following the nitric acid–hydrogen peroxide digestion method, and Ca, Fe, and Zn in the digests were analyzed using inductively coupled plasma-optical emission spectrometry (ICP-OES)31. Protein content in grain samples was determined using the sulfuric acid–selenium digestion method. Total nitrogen (N) was estimated using a Skalar Autoanalyzer, and protein % was calculated as N% × 6.25 conversion factor32.
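The two unit conversions mentioned above can be written out as a short sketch. The protein calculation follows the stated N% × 6.25 factor. The yield conversion is an assumption: it takes the plot area to be one 4 m row multiplied by the 60 cm ridge spacing described for the field layout (2.4 m²), which may differ from the net plot area the authors actually used. Function names and input values are hypothetical.

```python
def protein_percent(nitrogen_percent: float, factor: float = 6.25) -> float:
    """Grain protein (%) from total nitrogen (%) using the N x 6.25 factor."""
    return nitrogen_percent * factor


def grain_yield_kg_per_ha(plot_yield_g: float,
                          row_length_m: float = 4.0,
                          row_spacing_m: float = 0.6) -> float:
    """Convert grain yield per plot (g) to kg/ha.

    Assumes the plot area is row length x row spacing (an assumption
    made for this sketch, not a detail confirmed by the text).
    """
    plot_area_m2 = row_length_m * row_spacing_m
    kg_per_m2 = (plot_yield_g / 1000.0) / plot_area_m2
    return kg_per_m2 * 10_000.0  # 1 ha = 10,000 m2


# Hypothetical inputs: 2.0% nitrogen and 240 g of grain from one plot.
print(protein_percent(2.0))
print(round(grain_yield_kg_per_ha(240.0), 1))
```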
Phenotypic data analysis
Data were analyzed for each rainy season separately and pooled following Residual Maximum Likelihood (REML)33 in GenStat, 17th edition (http://www.genstat.co.uk) considering genotypes as random and seasons as fixed effects. The significance of seasons was tested using Wald’s statistics34. The Best Linear Unbiased Predictors (BLUPs) were obtained for all the traits for each accession for individual seasons, pooled over two rainy seasons, and used for genome-wide association studies.
Genotyping and SNP calling
DNA extraction and SNP calling from genotyping-by-sequencing (GBS) data have been described in detail in our previous publication35. In brief, DNA was extracted from each accession following the modified CTAB method36, lyophilized, and shipped to the Genomic Diversity Facility at Cornell University for GBS37. GBS library preparation followed the standard method38 using a single PstI restriction enzyme. Samples were multiplexed into two lanes of 95 samples plus one blank for sequencing on an Illumina HiSeq 2500 with single-end 100 bp sequencing. The sequences were mapped to proso millet reference genome Panicum miliaceum (vPm_0390_v1) (https://andgenomevolution.org/coge/SearchResults.pl?s=52484&p=genome)17 using Bowtie v2.2.439. SNPs were called using the GBS v2 pipeline in TASSEL v4.3.6. Raw SNPs were filtered by removing any sites with greater than 20% missing, less than 0.1 proportion heterozygous, and a minor allele frequency of < 0.025. Accessions with more than 20% of their sites missing were filtered out. This resulted in 5621 high-confidence SNPs that were used for GWAS.
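The site and accession filters described above can be illustrated with a small sketch. It is not the TASSEL pipeline used in the study. The genotype coding (0, 1 or 2 copies of the alternate allele, `None` for missing) and all function names are assumptions, and the heterozygosity rule is read here as "drop sites whose heterozygous proportion exceeds 0.1", which is one interpretation of the wording in the text.

```python
from typing import List, Optional, Tuple

Call = Optional[int]  # 0, 1 or 2 alternate-allele copies; None = missing call


def site_stats(calls: List[Call]) -> Tuple[float, float, float]:
    """Return (missing rate, heterozygous proportion, minor allele frequency)."""
    observed = [c for c in calls if c is not None]
    missing = 1.0 - len(observed) / len(calls)
    if not observed:
        return missing, 0.0, 0.0
    het = sum(1 for c in observed if c == 1) / len(observed)
    alt_freq = sum(observed) / (2 * len(observed))
    return missing, het, min(alt_freq, 1.0 - alt_freq)


def keep_site(calls: List[Call], max_missing: float = 0.20,
              max_het: float = 0.10, min_maf: float = 0.025) -> bool:
    """Apply the three site-level thresholds (heterozygosity rule is assumed)."""
    missing, het, maf = site_stats(calls)
    return missing <= max_missing and het <= max_het and maf >= min_maf


def keep_accession(calls: List[Call], max_missing: float = 0.20) -> bool:
    """Keep an accession unless more than max_missing of its sites are missing."""
    n_missing = sum(1 for c in calls if c is None)
    return n_missing / len(calls) <= max_missing


# Hypothetical calls for four sites across ten accessions.
good_site = [0, 0, 2, 2, 0, None, 0, 2, 0, 0]
monomorphic = [0] * 10
too_het = [1, 1, 0, 2, 0, 0, 0, 0, 0, 0]
too_missing = [None, None, None, 0, 2, 0, 0, 0, 0, 0]
print([keep_site(s) for s in (good_site, monomorphic, too_het, too_missing)])
```

Only the first toy site passes all three thresholds; the others fail on minor allele frequency, heterozygosity, and missingness, respectively.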
Population structure and genetic distance
AMOVA was computed to determine the presence of significant variation in the collection and assess the contribution of different stratifications to diversity. Principal Component Analysis was used to summarize and obtain preliminary knowledge about the diversity within the collection. The hierarchical population structure was estimated using the ADMIXTURE program, which is a model-based estimation of ancestry in unrelated individuals using the maximum-likelihood method40. ADMIXTURE implements a cross-validation (CV) feature that allows, together with the number of iterations to converge, the determination of the number of subpopulations (k values) that best fit the data. After choosing the subpopulation level, individual accessions were assigned to the subpopulation if they had at least 60% membership in that respective population41. We calculated the Modified Roger’s Distances (MRD) between samples42,43 as
\[\mathrm{MRD}=\sqrt{\frac{1}{2m}\sum_{i=1}^{m}\sum_{j=1}^{{a}_{i}}{\left({p}_{ij}-{q}_{ij}\right)}^{2}}\]
where \({p}_{ij}\) and \({q}_{ij}\) are the frequencies of the jth allele at the ith marker in the two samples under consideration, \({a}_{i}\) is the number of alleles at the ith marker and \(m\) refers to the number of markers. Clustering of accessions based on the MRD distances was performed using Ward’s D2 hierarchical clustering algorithm44.
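A pairwise distance of this kind can be sketched in a few lines. The code below assumes the commonly used form of the Modified Rogers' distance, the square root of the summed squared allele-frequency differences divided by twice the number of markers. The helper that converts a biallelic genotype call into allele frequencies, and the genotype coding itself, are assumptions made for the example.

```python
import math
from typing import List, Sequence


def genotype_to_freqs(alt_copies: int) -> List[float]:
    """Convert a biallelic call (0, 1 or 2 alternate copies) to [ref, alt] frequencies."""
    alt = alt_copies / 2.0
    return [1.0 - alt, alt]


def modified_rogers_distance(p: Sequence[Sequence[float]],
                             q: Sequence[Sequence[float]]) -> float:
    """MRD between two samples; p[i][j] is the frequency of allele j at marker i."""
    if len(p) != len(q) or not p:
        raise ValueError("both samples need the same, non-zero number of markers")
    total = 0.0
    for p_i, q_i in zip(p, q):
        total += sum((a - b) ** 2 for a, b in zip(p_i, q_i))
    return math.sqrt(total / (2 * len(p)))


# Hypothetical genotypes for two accessions at three biallelic SNPs.
sample_a = [genotype_to_freqs(g) for g in (0, 2, 1)]
sample_b = [genotype_to_freqs(g) for g in (2, 2, 1)]
print(round(modified_rogers_distance(sample_a, sample_b), 4))
```

Under this form the distance is 0 for identical samples and 1 for samples that are fixed for different alleles at every marker, which makes values such as the panel average of 0.268 easy to interpret on a 0 to 1 scale.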
Linkage disequilibrium (LD) and genome-wide association mapping
TASSEL 4.0 was used to obtain the LD squared allele frequency correlation (r2) estimates for all pairwise comparisons between intra- and whole-genome SNPs, and visualized by plotting r2 values against physical distance. A non-linear regression curve was used to estimate LD decay45 using R46. The LD decay distance was estimated as the physical distance at which r2 was reduced to half the maximum LD value.
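The half-decay criterion can be illustrated with a simplified sketch. Instead of the non-linear regression used in the study, the code below swaps in plain linear interpolation between binned points, so it only approximates the fitted curve. The function name is hypothetical; the four (distance, r2) points are the whole-genome values quoted in the Results.

```python
from typing import List, Tuple


def half_decay_distance(curve: List[Tuple[float, float]]) -> float:
    """Distance at which r2 first falls to half of its maximum.

    curve: (distance, r2) points sorted by increasing distance.
    Uses linear interpolation between the two bracketing points.
    """
    target = max(r2 for _, r2 in curve) / 2.0
    for (d0, r0), (d1, r1) in zip(curve, curve[1:]):
        if r0 >= target >= r1 and r0 != r1:
            return d0 + (r0 - target) / (r0 - r1) * (d1 - d0)
    raise ValueError("r2 never drops to half of its maximum")


# (distance in kbp, r2) values quoted in the LD decay results.
points = [(5.0, 0.52), (50.0, 0.37), (75.0, 0.18), (225.0, 0.10)]
print(round(half_decay_distance(points), 1))
```

With these points the interpolated half-decay distance falls between 50 and 75 kbp, in line with the range given in the Results.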
Genome-wide association analysis was performed using a multi-locus model, FarmCPU47. FarmCPU iteratively used the fixed-effect and random-effect models, and significant marker-trait associations (P ≤ 0.0001) were identified. GWAS was performed using the BLUPs of each trait obtained from individual years separately and combined across two years. Markers that showed significant associations with the trait of interest in at least two out of the three datasets (2015, 2016, and combined) were considered for SNP annotation and candidate gene identification using the comparative genome database (https://genomevolution.org/coge/) and the ID52484 and vPm_0390_v1 of Panicum miliaceum17.
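The consistency rule described above (keep a marker only if it is significant at P ≤ 0.0001 in at least two of the three datasets) can be sketched as follows. This is an illustration of the selection logic for a single trait, not the FarmCPU analysis itself; the marker names and p-values are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def consistent_mtas(pvalues: Dict[str, Dict[str, float]],
                    alpha: float = 1e-4,
                    min_datasets: int = 2) -> List[str]:
    """Markers significant (p <= alpha) in at least min_datasets datasets.

    pvalues maps dataset name -> {marker name -> p-value} for one trait.
    """
    hits: Counter = Counter()
    for table in pvalues.values():
        for marker, p in table.items():
            if p <= alpha:
                hits[marker] += 1
    return sorted(m for m, n in hits.items() if n >= min_datasets)


# Hypothetical GWAS p-values for one trait in the three datasets.
results = {
    "2015": {"snp_a": 2e-6, "snp_b": 5e-5, "snp_c": 3e-2},
    "2016": {"snp_a": 8e-5, "snp_b": 4e-3, "snp_c": 9e-5},
    "combined": {"snp_a": 1e-7, "snp_b": 2e-3, "snp_c": 1e-5},
}
print(consistent_mtas(results))
```

In the toy input, `snp_a` passes in all three datasets and `snp_c` in two, while `snp_b` is significant only in 2015 and is dropped.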
Data availability
The raw sequence data are available in NCBI’s Sequence Read Archive under accession PRJNA494158. The filtered SNPs and phenotypic data were deposited in the Figshare repository, https://doi.org/10.6084/m9.figshare.26199836.v1. All other supporting data are provided in the Supplementary files.
References
Lu, H. et al. Earliest domestication of common millet (Panicum miliaceum) in East Asia extended to 10,000 years ago. Proc. Natl. Acad. Sci. U. S. A. 106, 7367–7372 (2009).
Hunt, H. V. et al. Millets across Eurasia: Chronology and context of early records of the genera Panicum and Setaria from archaeological sites in the Old World. Veg. Hist. Archaeobot. 17, S5–S18 (2008).
Hunt, H. V. et al. Genetic diversity and phylogeography of broomcorn millet (Panicum miliaceum L.) across Eurasia. Mol. Ecol. 20, 4756–4771 (2011).
Rajput, S. G., Plyler-harveson, T. & Santra, D. K. Development and characterization of SSR markers in proso millet based on switchgrass genomics. Am. J. Plant Sci. 5, 175–186 (2014).
Hunt, H. V. et al. Reticulate evolution in Panicum (Poaceae): The origin of tetraploid broomcorn millet, P. miliaceum. J. Exp. Bot. 65, 3165–3175 (2014).
Diao, X. Production and genetic improvement of minor cereals in China. Crop J. 5, 103–114 (2017).
Habiyaremye, C. et al. Proso millet (Panicum miliaceum L.) and its potential for cultivation in the Pacific Northwest, US: A review. Front. Plant Sci. 8, 1961 (2017).
Bhat, B. V., Tonapi, V. A., Rao, B. D., Singode, A. & Santra, D. Production and utilization of millets in India. In International Millet Symposium and The 3rd International Symposium on Broomcorn Millet (3rd ISBM) (eds. Santra, D. K. & Johnson, J. J.) 24–26 (2018).
Park, C. H. Production and utilization of broomcorn millet in Korea. In International Millet Symposium and The 3rd International Symposium on Broomcorn Millet (3rd ISBM) Program and Abstracts (eds. Santra, D. K. & Johnson, J. J.) 27 (2018).
Saleh, A. S. M., Zhang, Q., Chen, J. & Shen, Q. Millet grains: Nutritional quality, processing, and potential health benefits. Compr. Rev. Food Sci. Food Saf. 12, 281–295 (2013).
Santra, D. K., Khound, R. & Das, S. Proso Millet (Panicum miliaceum L.) breeding : Progress, challenges and opportunities. In Advances in Plant Breeding Strategies: Cereals (eds. Al-Khayri, J., Jain, S. M. & Johnson, D. V) 223–257 (Springer, 2019).
Kalinova, J. & Moudry, J. Content and quality of protein in proso millet (Panicum miliaceum L.) varieties. Plant Foods Hum. Nutr. 61, 45–49 (2006).
Vetriventhan, M., Azevedo, V. C. R., Upadhyaya, H. D. & Naresh, D. Variability in the global Proso millet (Panicum miliaceum L.) Germplasm collection conserved at the ICRISAT Genebank. Agriculture (Switzerland). 9, 112 (2019).
de Wet, J. M. J. Origin, evolution and systematics of minor cereals. In Small Millets in Global Agriculture (eds. Seetharam, A., Riley, K. W., Harinarayana, G.) 19–30 (Oxford & IBH Publishing Co. Pvt. Ltd., 1986).
Upadhyaya, H. D., Vetriventhan, M., Dwivedi, S. L., Pattanashetti, S. K. & Singh, S. K. Proso, barnyard, little and kodo millets. In Genetic and Genomic Resources for Grain Cereals Improvement, vol. 1 (eds. Singh, M. & Upadhyaya, H. D.) 321–343 (Academic Press, 2015).
Shi, J. et al. Chromosome conformation capture resolved near complete genome assembly of broomcorn millet. Nat. Commun. 10, 464 (2019).
Zou, C. et al. The genome of broomcorn millet. Nat. Commun. 10, 436 (2019).
Wang, C. et al. Genome-wide association study of blast resistance in indica rice. BMC Plant Biol. 14, 311 (2014).
Yates, S. et al. Precision phenotyping reveals novel loci for quantitative resistance to Septoria Tritici Blotch. Plant Phenom. 2019, 3285904 (2019).
Jaiswal, V. et al. Genome-wide association study (GWAS) delineates genomic loci for ten nutritional elements in foxtail millet (Setaria italica L.). J. Cereal Sci. 85, 48–55 (2019).
Agrama, H. A., Eizenga, G. C. & Yan, W. Association mapping of yield and its components in rice cultivars. Mol. Breed. 19, 341–356 (2007).
Tadesse, W. et al. Genome-wide association mapping of yield and grain quality traits in winter wheat genotypes. PLoS One 10, 1–18 (2015).
Morris, G. P. et al. Population genomic and genome-wide association studies of agroclimatic traits in sorghum. Proc. Natl. Acad. Sci. U. S. A. 110, 453–458 (2013).
Boukail, S. et al. Genome wide association study of agronomic and seed traits in a world collection of proso millet (Panicum miliaceum L.). BMC Plant Biol. 21, 330 (2021).
Khound, R., Rajput, S. G., Schnable, J. C., Vetriventhan, M. & Santra, D. K. Genome-wide association study reveals marker–trait associations for major agronomic traits in proso millet (Panicum miliaceum L.). Planta 260, (2024).
Vetriventhan, M. & Upadhyaya, H. D. Diversity and trait specific sources for productivity and nutritional traits in the global proso millet (Panicum miliaceum L.) germplasm collection. Crop J. 6, 451–463 (2018).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6, 80–92 (2012).
Rajput, S. G., Santra, D. K. & Schnable, J. Mapping QTLs for morpho-agronomic traits in proso millet (Panicum miliaceum L.). Mol. Breed. 36, 37 (2016).
Upadhyaya, H. D., Sharma, S., Gowda, C. L. L., Reddy, V. G. & Singh, S. Developing proso millet (Panicum miliaceum L.) core collection using geographic and morpho-agronomic data. Crop Pasture Sci. 62, 383–389 (2011).
IBPGR. Descriptors for Panicum miliaceum and P. sumatrense (IBPGR, 1985).
Wheal, M. S., Fowles, T. O. & Palmer, L. T. A cost-effective acid digestion method using closed polypropylene tubes for inductively coupled plasma optical emission spectrometry (ICP-OES) analysis of plant essential elements. Anal. Methods 3, 2854–2863 (2011).
Sahrawat, K. L., Kumar, G. R. & Murthy, K. V. S. Sulfuric acid–Selenium digestion for multi-element analysis in a single plant digest. Commun. Soil Sci. Plant Anal. 33, 3757–3765 (2002).
Patterson, H. D. & Thompson, R. Recovery of inter-block information when block sizes are unequal. Biometrika 58, 545–554 (1971).
Wald, A. Tests of statistical hypotheses concerning several parameters when the number of observations is large. Trans. Am. Math. Soc. 54, 426–482 (1943).
Johnson, M., Deshpande, S., Vetriventhan, M., Upadhyaya, H. D. & Wallace, J. G. Genome-wide population structure analyses of three minor millets: Kodo millet, little millet, and proso millet. Plant Genome 12, 190021 (2019).
Mace, E. S., Buhariwalla, H. K. & Crouch, J. H. A high-throughput DNA extraction protocol for tropical molecular breeding programs. Plant Mol. Biol. Rep. 21, 459a–459h (2003).
Elshire, R. J. et al. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS One 6, e19379 (2011).
Wallace, J. G. & Mitchell, S. E. Genotyping-by-sequencing. Curr. Protoc. Plant Biol. 2, 64–77 (2017).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
Wallace, J. G. et al. The genetic makeup of a global barnyard millet germplasm collection. Plant Genome 8, (2015).
Wright, S. Variability within and among natural populations. In Evolution and the Genetics of Populations (University of Chicago Press, 1978).
Goodman, M. M. & Stuber, C. W. Races of Maize. 6: Isozyme Variation Among Races of Maize in Bolivia (1983).
Murtagh, F. & Legendre, P. Ward’s hierarchical agglomerative clustering method: Which algorithms implement Ward’s criterion? J. Classif. 31, 274–295 (2014).
Hill, W. G. & Weir, B. S. Variances and covariances of squared linkage disequilibria in finite populations. Theor. Popul. Biol. 33, 54–78 (1988).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2018). https://www.R-project.org/.
Liu, X., Huang, M., Fan, B., Buckler, E. S. & Zhang, Z. Iterative usage of fixed and random effect models for powerful and efficient genome-wide association studies. PLoS Genet. 12, e1005767 (2016).
Acknowledgements
This work was supported by NSF grants DBI-0820619 and IOS-1238014, ICRISAT, and the USDA–ARS. This work has been undertaken as part of the Global Research Program—Accelerated Crop Improvement, and CGIAR Genebank Platform.
Author information
Contributions
M.V. and H.D.U. selected materials and established the diversity subset for this study; M.V. and D.N. generated and analyzed the phenotypic data; J.W., S.D., M.S.J. and M.V. generated the genomic data and performed SNP calling; A.V., M.V. and L.R. analyzed the genomic data; M.V. wrote the manuscript; K.S., H.D.U. and S.M. provided oversight and direction. All authors had responsibility for editing the manuscript for publication.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Vetriventhan, M., Upadhyaya, H.D., Deshpande, S. et al. Genome-wide assessment of population structure and association mapping for agronomic and grain nutritional traits in proso millet (Panicum miliaceum L.). Sci Rep 14, 21920 (2024). https://doi.org/10.1038/s41598-024-72319-w
DOI: https://doi.org/10.1038/s41598-024-72319-w