Deterministic and stochastic modelling of impacts from genomic selection and phenomics on genetic gain for perennial ryegrass dry matter yield

Jahufer, M. Z. Z.; Arojju, Sai Krishna; Faville, Marty J.; Ghamkhar, Kioumars; Luo, Dongwen; Arief, Vivi; Yang, Wen-Hsi; Sun, Mingzhu; DeLacy, Ian H.; Griffiths, Andrew G.; Eady, Colin; Clayton, Will; Stewart, Alan V.; George, Richard M.; Hoyos-Villegas, Valerio; Basford, Kaye E.; Barrett, Brent

doi:10.1038/s41598-021-92537-w

Download PDF

Article
Open access
Published: 24 June 2021

Deterministic and stochastic modelling of impacts from genomic selection and phenomics on genetic gain for perennial ryegrass dry matter yield

M. Z. Z. Jahufer¹,
Sai Krishna Arojju¹,
Marty J. Faville¹,
Kioumars Ghamkhar¹,
Dongwen Luo¹,
Vivi Arief²,
Wen-Hsi Yang²,
Mingzhu Sun⁶,
Ian H. DeLacy²,
Andrew G. Griffiths¹,
Colin Eady³,
Will Clayton³,
Alan V. Stewart⁴,
Richard M. George⁴,
Valerio Hoyos-Villegas⁵,
Kaye E. Basford² &
…
Brent Barrett¹

Scientific Reports volume 11, Article number: 13265 (2021) Cite this article

2439 Accesses
12 Citations
1 Altmetric
Metrics details

Subjects

Abstract

Increasing the efficiency of current forage breeding programs through adoption of new technologies, such as genomic selection (GS) and phenomics (Ph), is challenging without proof of concept demonstrating cost effective genetic gain (∆G). This paper uses decision support software DeltaGen (tactical tool) and QU-GENE (strategic tool), to model and assess relative efficiency of five breeding methods. The effect on ∆G and cost ($) of integrating GS and Ph into an among half-sib (HS) family phenotypic selection breeding strategy was investigated. Deterministic and stochastic modelling were conducted using mock data sets of 200 and 1000 perennial ryegrass HS families using year-by-season-by-___location dry matter (DM) yield data and in silico generated data, respectively. Results demonstrated short (deterministic)- and long-term (stochastic) impacts of breeding strategy and integration of key technologies, GS and Ph, on ∆G. These technologies offer substantial improvements in the rate of ∆G, and in some cases improved cost-efficiency. Applying 1% within HS family GS, predicted a 6.35 and 8.10% ∆G per cycle for DM yield from the 200 HS and 1000 HS, respectively. The application of GS in both among and within HS selection provided a significant boost to total annual ∆G, even at low GS accuracy r_A of 0.12. Despite some reduction in ∆G, using Ph to assess seasonal DM yield clearly demonstrated its impact by reducing cost per percentage ∆G relative to standard DM cuts. Open-source software tools, DeltaGen and QuLinePlus/QU-GENE, offer ways to model the impact of breeding methodology and technology integration under a range of breeding scenarios.

Genetic variation and heritability of agronomic traits in a native perennial forage species from drylands: breeding potential of Festuca pallescens

Article Open access 26 February 2025

How partial phenotyping to reduce generation intervals can help to increase annual genetic gain in selected honeybee populations

Article Open access 22 May 2025

A diallel study to detect genetic background variation for FHB resistance in winter wheat

Article Open access 26 February 2024

Introduction

Designing the best structure for a breeding program, within the resources available, are important decisions to be made at the onset of cultivar development^1,2. In forage breeding, there are a range of well proven and commonly used methods based on half-sib and full-sib family recurrent selection^3,4, which have often been adapted to suit specific breeding objectives. Today, genomic selection and marker-aided breeding approaches^5,6,7 together with high throughput, non-destructive phenotyping platforms^8,9,10,11 provide new opportunities for plant breeders to improve the efficiency and accuracy of conventional plant breeding strategies.

The cost and time required for phenotypic assessment limits the efficiency of screening traits in crops and forages. This is particularly the case for evaluating yield or stress related traits such as DM yield, increased tolerance to abiotic stress, or water use efficiency¹². Breeding focused on improving growth is often constrained by destructive methods of biomass measurement across multiple genotypes evaluated in field trials¹³. The application of powerful sensors and digital tools can help in accurate trait quantification¹⁴. In the past five years adoption of these phenomics based technologies and tools in crops and forages has increased markedly^{8,11,14,15,16,17}.

Questions associated with optimisation of breeding strategies are complex, compounded by cost and uncertainty of changing a system, even if the existing system may be delivering less genetic gain, be less nimble or address fewer traits than desired. Decision support software enables simulation of breeding strategy efficacy in response to factors including selection among and within genetic families, different combinations of year, season, site and replicate with associated costs per selection cycle. Such software help breeders design and implement more efficient and effective breeding programs based on predicted genetic gain. To date this has been focused on simulation for inbred species^18,19,20, leaving a gap in obligate outcrossing species such as forage grasses. Recent efforts to address this gap have led to the development of two decision support software packages for plant breeding: the tactical tool DeltaGen²¹ based on deterministic modelling; and the strategic tool QuLinePlus²² a component of QU-GENE¹⁸ based on stochastic modelling. QuLinePlus is a breeding module, specifically designed to simulate full and half sibling breeding programs for outcrossing species and QU-GENE is considered as an engine which generates the simulation inputs. Concurrently, new phenomic tools for electronic automation, acceleration and standardisation of data collection for key traits such as DM yield have been developed⁸, and are being deployed in field breeding programs²³. Genomic prediction models have recently been developed and assessed for complex traits, including DM yield²⁴ and nutritive value²⁵ in perennial ryegrass (Lolium perenne). Understanding the implications of these new technologies on genetic gain over time is pivotal to ensuring their use is optimized.

The availability of Ph and GS tools for forage plant breeding, and of decision support simulation tailored for outbred species, gives rise to the opportunity to examine the relative costs and impact of options for integrating these tools into breeding systems. This paper aims to investigate, using modelling and simulation, the relative efficiency of five half sib (HS) breeding methods based on predicted genetic gain and associated costs to improve DM yield of perennial ryegrass. Simulation was conducted applying tactical deterministic and strategic stochastic decision support modelling software (i.e. DeltaGen and QU-GENE) based on a “mock data” set of 1000 HS families, created by combining DM yield data from two sets of field trials. A key objective of this paper is also to demonstrate the potential application of the combination of DeltaGen/QU-GENE, to expand the scope of tools available to breeders in decision support for breeding program design.

The five HS family breeding methods simulated were: (a) standard phenotypic among half-sib family selection (A_p), (b) among half-sib family selection based on phenomics (A_Ph), (c) standard phenotypic among half-sib family selection and within family genomic selection (A_pW_gs), (d) among half-sib family selection based on phenomics and within family genomic selection (A_PhW_gs), (e) both among and within half-sib family selection based on genomic selection (A_gsW_gs).

Material and methods

Deterministic modelling for estimating genetic gain using HS family breeding strategies

The genetic gain simulation analyses were conducted using the open source software DeltaGen²¹ (available via https://deltagen.agresearch.co.nz/app/deltagen) for single selection cycles. The constructed mock data matrices of 200 HS and 1000 HS families used to generate starting points for the simulation of breeding methods, were compiled using perennial ryegrass DM yield and growth score measurements generated from two sources of multi-year-season-___location trials. The term starting points referred to in this paper, are a common set of HS family means, estimates of additive genetic/interaction variance components and narrow sense heritabilities, applied in the different breeding equations used for deterministic modelling and simulation of the five breeding methods using DeltaGen.

Construction of the 1000 and 200 HS family “mock data” matrices of perennial ryegrass

Prediction of differences in genetic gain ∆G among breeding strategies is based on the application of quantitative genetic models^3,4. These genetic analyses require estimates of population parameters; phenotypic variance (${\sigma }_{P}^{2}$), additive genetic variance (${\sigma }_{A}^{2}$), different components of genotype-by-environment (GE) interaction such as HS family interactions with; year, season and ___location, depending on breeding objectives, and narrow sense heritability (${h}_{n}^{2}$). In our investigation using deterministic simulation, a key criterion for modelling the impact of GS and Ph technologies on the relative efficiency of HS family breeding methods, was to use mock data sets constructed from actual HS family multi ___location- year-season field trials rather than using in silico generated data. These mock data matrices enabled “realistic” estimates of genetic parameters to be generated. It is important to note that these estimates were only used as starting points to conduct simulation of the breeding methods using DeltaGen. Two mock DM yield data matrices, of 1000 HS and 200 HS families, were created by combining data generated from multi-___location, year and season, HS family evaluation studies reported by Faville et al.²⁴ and Arojju et al.²⁶.

The 1000 HS family DM yield mock data set was based on field trials conducted across 2 locations over 2 years and 3 seasons per year. The mock data set was constructed so that each ___location had a 20 row-by-50 column design with 3 replicates. The herbage DM yield data matrix consisting of 200 HS families was constructed from a random sample of 200 families taken from the 1000 HS family matrix. The year ($n$=2), season ($n$=3) and ___location ($n$=2) data associated with each of the 200 randomly sampled HS families, were combined into a 3 replicate, row and column (10 rows- by-20 columns) by ___location, season and year design matrix. A detailed description of construction of the two mock data sets is provided in supplementary material 1 in File S1, and the distribution of the 200 and 1000 HS family DM yield data are presented in Figure S1.

It is important to note that a key assumption used for quantitative genetic analysis and modelling of the mock data, was that both the 200 HS and 1000 HS families were generated from parents randomly sampled from the same random mating population, cross pollinated under isolation. The coefficient of inbreeding ($F$) was assumed to be zero $(F=0)$ and therefore the among HS family (${\sigma }_{f}^{2}$) variation provided an estimate of ¼ ${\sigma }_{A}^{2}$ (additive genetic variation)²⁷.

Data analysis

The two HS family mock data matrices (provided as supplementary material 2) were analysed using Eqs. 1 and 2 below, to derive estimates of additive genetic variation, associated genotype-by-environment (GE) interaction and narrow sense heritability on a family mean basis. A linear mixed model (Eq. 1) using the Residual Maximum Likelihood (REML)^28,29,30 procedure in DeltaGen²¹ was used to estimate genetic parameters to be used for modelling based on breeding equations.

The linear mixed model used for analysis across locations, seasons and years:

$$Y_{{ijklmno}} = M + f_{i} + l_{j} + \left( {fl} \right)_{{ij}} + s_{k} + \left( {fs} \right)_{{ik}} + y_{l} + \left( {fy} \right)_{{il}} + b_{{jklm}} + ~r_{{jklmn}} + c_{{jklmno}} + \varepsilon _{{ijklmno}}$$

(1)

${Y}_{ijklmno}$ is the value of an attribute measured from HS family $i$ in column $o$ and row $n$ of replicate $m$ at ___location $j$ in season $k$ of year $l$, and $i=$ 1,…, ${n}_{f}$; $j=$ 1,…,n_l, $k=1$,…,${n}_{s}$; $l=1$,…, ${n}_{y}$; $m=$ 1,…, ${n}_{b}$; $n=1$,…, ${n}_{r}$; $o=1$,…,${n}_{c}$; where f, $l$, $s$, $y$, $b$, $r$ and $c$, are families, locations, seasons, years, replicates, rows and columns, respectively; $M$ is the overall mean; ${f}_{i}$ is the random effect of family $i$, $N(0, {\sigma }_{f}^{2});$ ${l}_{j}$ is the fixed effect of ___location $j$; ${\left(fl\right)}_{ij}$ is the effect of the interaction between family $i$ and ___location $j$, $N(0,{\sigma }_{fl}^{2})$; ${s}_{k}$ is the fixed effect of season $k$; ${\left(fs\right)}_{ik}$ is the effect of the interaction between family $i$ and season $k$, $N(0,{\sigma }_{fs}^{2})$; ${y}_{l}$ is the fixed effect of year $l$;${\left(fy\right)}_{il}$ is the effect of the interaction between family $i$ and year $l$, $N(0,{\sigma }_{fy}^{2})$; ${b}_{jklm}$ is the random effect of replicate $m$ within ___location $j$, within season $k$, within year $l$, $N(0,{\sigma }_{b}^{2})$; ${r}_{jklmn}$ is the random effect of row $n$ in replicate $m$ within ___location $j$, within season $k$, within year $l$, $N(0,{\sigma }_{r}^{2})$; ${c}_{jklmno}$ is the random effect of column $o$ in row $n$ in replicate $m$ within ___location $j$, within season $k$, within year $l$, $N(0,{\sigma }_{c}^{2})$; ${\varepsilon }_{ijklmno}$ is the residual effect for family $i$ in row $n$ and column $o$ in replicate $m$ in ___location $j$ in season $k$, during year $l$, $N(0,{\sigma }_{\varepsilon }^{2})$.

The estimates of variance components generated from Eq. 1, were used to generate an estimate of narrow sense heritability, ${h}_{n}^{2}$, for herbage DM yield on a family mean basis using Eq. 2.

$$h_{{n~}}^{2} = \frac{{\sigma _{f}^{2} }}{{\sigma _{f}^{2} + \frac{{\sigma _{{fl}}^{2} }}{{n_{l} }} + \frac{{\sigma _{{fs}}^{2} }}{{n_{s} }} + \frac{{\sigma _{{fy}}^{2} }}{{n_{y} }} + \frac{{\sigma _{\varepsilon }^{2} }}{{n_{s} n_{y} n_{r} }}}}$$

(2)

where $n$, number of and f, l, s, y and r, are families, locations, seasons, years and replicates, respectively.

DeltaGen generated estimates of narrow sense heritability (${h}_{n}^{2}$) on a HS family mean basis across locations, seasons and years³¹.

Breeding methods and associated prediction equations

The five breeding methods assessed in this study were:

(a)
Standard phenotypic among HS family selection (A_p): Elite HS families are selected based on phenotypic performance within or across environments. Equal numbers of remnant seed from each selected HS family are randomly sampled. The selected individuals become the parents of the next generation. The prediction equation for genetic gain⁴,
$$\Delta G={k}_{f}c\frac{\frac{1}{4}{\sigma }_{A}^{2}}{{\sigma }_{PF}}$$
(3)
where, ${k}_{f}$, is among family selection intensity; $c$, parental control factor; ${\sigma }_{A}^{2}$, additive genetic variance;${\sigma }_{PF}$, the phenotypic standard deviation among families. For HS family selection c = 0.5 as selection is on female gametes only²⁷.
(b)
Among HS family selection using LiDAR (light detection and ranging) based phenomics (A_Ph): as per (a), elite HS families are selected but based on phenotypic data collected using a LiDAR based phenomic tool⁸. The precision of genetic gain estimated using breeding Eq. (3) will depend on the accuracy of the phenomics method applied.
(c)
Standard phenotypic among HS family selection and within HS family genomic selection (A_pW_gs): This breeding method consists of the steps: (i) select elite HS families, using a predetermined selection pressure, based on standard phenotypic measurements as per (a); (ii) equal numbers of remnant seed from each selected HS family are randomly sampled, seedlings established and individually genotyped, individual genomic-estimated breeding values (GEBV’s) determined using genomic prediction, and the best seedlings within each family based on their GEBV’s are selected on a predetermined selection pressure; (iii) the selected individuals become the parents of the next generation. The genomic prediction model used in step (ii) is itself derived using standard phenotypic measurements from the HS field trial and DNA marker data from the maternal parents of the HS families^24,32,33. The equation for predicting genetic gain,
$${\mathrm{A}}_{p}{\mathrm{W}}_{gsY}={k}_{f}{c}_{f}\frac{\frac{1}{4}{\sigma }_{AY}^{2}}{{\sigma }_{PF}}+{k}_{w}{c}_{w}{h}_{X}{r}_{A-XY}\frac{\surd 3}{2}{\sigma }_{AY}$$
(4)
where, ${\mathrm{A}}_{p}{\mathrm{W}}_{gsY}$ is the predicted ∆G for trait Y using a combination of standard phenotypic among HS family selection and within HS family genomic selection; ${\sigma }_{AY}^{2}$, additive genetic variance for primary trait Y; ${\sigma }_{AY}$, standard deviation of additive genetic variance for primary trait $Y$; $\sqrt{\frac{3}{4}}=\frac{\sqrt{3}}{2}$; ${\sigma }_{PF}$, among HS family phenotypic standard deviation; ${k}_{f}$ and ${k}_{w}$, among and within family selection intensity, respectively; ${c}_{f}$ (= 0.5) and ${c}_{W}$ (= 0.5) among and within family parental controls, respectively; ${h}_{X}$ _, square root of heritability for secondary trait $X$; ${r}_{A-XY}$, genomic prediction accuracy – Pearson’s correlation between Phenotypic Estimated Breeding Values – BLUP’s $Y$ and GEBV’s $X$. In GS, the assumption is ${h}_{X}$ =1^34,35. Please note that the subscript $gsY$ in Eqs. 4 and 5 is for genomic selection ($gs$) for trait $Y$. This is based on correlated response where, trait $Y$ (the true breeding value of an individual) is indirectly selected based on selection for trait $X$ (the GEBV of that individual).
(d)
Among HS family selection using LiDAR based phenomics (Ph) and within family genomic selection (A_PhW_gs): as per (c) except that elite HS families are selected based on data collected using a LiDAR based Ph tool.
(e)
Both among and within HS family selection based on genomic selection (A_gsW_gs): This breeding method is implemented in cycle 2 using the HS families generated using the A_pW_gs method (c). In this scenario, following application of A_pW_gs, a second cycle of selection is completed using only genomic prediction. To achieve among family selection based on GEBV’s, a predetermined number of seeds is randomly sampled from each HS family and the seedlings genotyped. The marker information is used in the already-established genomic prediction model to generate GEBV estimates and a mean GEBV for each family is determined. Based on the required selection pressure, HS families are selected on their GEBV means. Within-family selection of the chosen HS families, using a predefined selection pressure, is applied to family members based on their individual GEBV estimates, as per (c). The selected individuals form the parents of the next generation. The equation used for predicting genetic gain²¹,
$${\mathrm{A}}_{gsY}{\mathrm{W}}_{gsY}={k}_{f}{c}_{f}{h}_{X}{r}_{A-XY}\frac{1}{2}{\sigma }_{AY}+{k}_{w}{c}_{w}{h}_{X}{r}_{A-XY}\frac{\surd 3}{2}{\sigma }_{AY}$$
(5)
where the terms in the above equation have been defined in equation 4.

Genomic selection

For the purposes of this exercise, the cost of conducting GS was defined as the cost per GEBV. The costing was based on a non-commercial research laboratory, the Forage Genetics GBS facility at AgResearch, and includes consumables and time. All steps from extracting DNA through genotyping-by-sequencing (GBS) to generate a GEBV are addressed in supplementary material 1, Table S1: DNA isolation, GBS library development, GBS library sequencing, bioinformatic processing and genotype calling from raw GBS data, and genomic prediction (GEBV determination). Seedling grow-out and tissue sampling were not included as it is expected tissue samples for DNA isolation would be provided. The GBS process, including data filtering steps, is largely as described by Faville, et al.²⁴ except that it is conducted at a 384-plex scale (376 samples plus four blanks and four positive controls per library) instead of 96-plex (94 samples plus one blank and one positive control).

Phenomics, sampling and field trial operational costs

The Ph referred to in this paper is a LiDAR based mobile platform for non‑invasive vegetative biomass and growth rate estimation in perennial ryegrass. The accuracy of 0.90 was determined from LiDAR based volumetric estimates compared against fresh weight and dry weight data across different ages of plants, seasons, stages of regrowth, sites, and row plot (two 2 m rows 15 cm apart) configurations²³. The costs (New Zealand $) per sample; based on perennial ryegrass herbage DM derived via harvest and via Ph were $7.50 and $0.87, respectively. These sampling costs were obtained from multi-year, season and ___location field trials, based on row plots, which generated the original data used to build the 1000 HS family data matrix. The cost information is provided in supplementary material 1, Table S2.

Stochastic modelling of multiple selection cycles using HS family breeding strategies

Stochastic modelling of the three breeding strategies A_p, A_pW_gs and A_gsW_gs were conducted using QuLinePlus²², available via https://sites.google.com/view/qu-gene, with modifications to the software to implement GS. The objective of this modelling was to generate trends of response to selection over multiple breeding cycles, based on populations of similar size used in deterministic modelling.

The breeding strategies; A_p, A_pW_gs and A_gsW_gs were simulated across ten selection cycles using QuLinePlus software. The three strategies were compared for percentage of genetic gain per year (%ΔG), cumulative genetic gain (ΣΔG), allele fixation rate, genetic variance and prediction accuracy across multiple selection cycles. Each generated value is an average 250 iterations.

Generation of the initial training population for simulation in QuLinePlus

Two sets of training populations consisting of 196 and 980 HS families, hereafter referred to as Sim200 and Sim1000, were simulated. Beginning with experimentally derived information on phenomic and associated genomic data from a commercial breeding population consisting of 98 plants, QuLinePlus software was used to generate 98 HS families. These HS families were used to simulate their performance for DM yield for three years across three locations. From each of the evaluated families two random individuals were drawn to generate a training population of 196 plants (Sim200) and ten random individuals were drawn to generate a second training population of 980 plants (Sim1000). It is important to note that the 200 and 1000 HS families used in deterministic modelling were related, the former a random sample from the latter.

Defining the genetic model and trait architecture

A recombination map with seven linkage groups was constructed based on a genetic map³⁶ using the genomic markers from the initial population (98 individuals). In total, 1807 segregating markers were assigned to the seven linkage groups, out of which 474 were QTLs with an additive gene model. The allelic effect of each QTL was estimated based on genome-wide association analysis for DM yield implemented using GAPIT³⁷.

Narrow sense heritabilities on a family mean basis, estimated from mixed model analysis in DeltaGen using data for mean DM yield of the simulated 200 HS and 1000 HS families, were converted to single plant-based heritabilities (considering 30 plants per plot).

Genomic prediction

A genomic prediction model was generated using marker and phenotypic data from the training populations, Sim200 and Sim1000. The GEBVs for each individual plant were estimated using a standard BLUP procedure using the R package rrBLUP³⁸. The heritability estimate of GEBVs was considered as 0.95, rather than 1.00, to account for genotyping error.

Breeding strategies

The breeding strategies A_p, A_pW_gs and A_gsW_gs were simulated in QuLinePlus using the modelling options available for each method.

(a)
Among half-sib family phenotypic selection (A_p): Using Sim200 and Sim1000 training populations, the A_p strategy was performed by randomly intermating all individuals to generate 196 and 980 HS families. These HS families were evaluated for DM yield in two environments for two years using three replicates and 30 plants per plot. From these trials, among family selection pressures of 20% and 2% were applied to select the best families. To restore the initial number of parents in the training population (196 and 980) for the next selection cycle, random samples of 5 (2.77%) and 50 (27.77%) individuals from the best 20% and 2% HS families were taken and used as parents to generate progeny for the next selection cycle. Each selection cycle was completed in three years (years 1 and 2—field evaluation and selection, and year 3—random mating of selected parents and half-sib family generation).
(b)
Among half-sib family phenotypic selection and within family genomic selection (A_pW_gs): In this breeding strategy, HS families were phenotypically evaluated as described in (a) and among family selection pressures of 20% and 2% were applied based on phenotype. However, within family selection was based on GEBVs estimated using the rrBLUP genomic prediction model. The GEBVs were ranked and the top 5 (2.77% within family selection pressure) or 50 (27.77% within family selection pressure) individuals from the best 20% and 2% families, respectively, were selected as parents for the next cycle. Each selection cycle was completed in three years (years one and two—field evaluation and selection, and year three—random mating of selected parents and half-sib family generation).
(c)
Among and within half-sib family selection based on genomic selection (A_gsW_gs): The A_gsW_gs strategy was similar to A_pW_gs, with the only difference at the stage of among family selection, which was based on GEBVs rather than phenotypic measurements. Among and within family selection pressures were the same as the A_pW_gs strategy, with 20% and 2% among family selection pressure and 2.77% and 27.77% within family selection pressure. It is important to note that this breeding strategy was conducted in one year.

Simulation output

Percentage genetic gain (%∆G) for all three breeding strategies was based on BLUP mean differences between selection cycles. The percentage genetic gain per year was calculated by dividing total predicted %∆G by the number of years per cycle. Cumulative genetic gain (Σ∆G) was calculated as the BLUP mean in each cycle relative to that in cycle zero. Allele fixation rate was computed within the QuLinePlus software to determine the percentage of fixed alleles in each cycle for the trait under selection. Genomic prediction accuracy was computed as the Pearson correlation coefficient between the true breeding value and their GEBVs. Percentage change of genetic variance across selection cycles was computed relative to cycle zero, considered as 100%.

Results

Deterministic modelling

Variance component analysis indicated significant (P < 0.05) additive genetic variation (${\sigma }_{A}^{2}$) for herbage dry matter (DM) yield among the HS families within each of the two mock data matrices, consisting of 200 and 1000 entries (Table 1). There were significant (P < 0.05) family-by-season (${\sigma }_{AS}^{2}$) and family-by-___location (${\sigma }_{AL}^{2}$) interactions estimated from the 200 HS matrix but family-by-year (${\sigma }_{AY}^{2}$) was not significant (P > 0.05) in this dataset. The estimates of family-by-season (${\sigma }_{AS}^{2}$), family-by-___location (${\sigma }_{AL}^{2}$) and family-by-year (${\sigma }_{AY}^{2}$) interactions were all significant (P < 0.05) for the 1000 HS family DM yield matrix. HS family narrow sense heritability (${h}_{n}^{2}$) estimates for DM yield, based on family mean performance across years, seasons and locations, were moderate for both data matrices, the 1000 HS family derived estimate being higher (Table 1). The estimates were within the range reported for perennial ryegrass^24,39,40. The estimated genetic parameters, from REML analysis, presented in Table 1 were used as starting points in the different breeding equations to predict genetic gain.

Table 1 Ranges, medians, means (based on BLUP estimates), variance components and associated standard error (± SE) and family mean narrow sense heritability for herbage DM yield estimated from the 200 and 1000 HS family data; evaluated across two sites over 2 years and 3 seasons per year.

Full size table

Predicted genetic gain based on the 200 HS family mock data matrix

Applying the among HS family phenotypic selection method (A_p), at a selection pressure of 20%, to the seasonal DM yield data, predicted a 1.43% increase in DM yield above the population mean of 200 HS families (Table 2). Using phenomics with an accuracy of 0.90 for estimating DM yield generated from A_Ph, a 1.29% increase in DM yield was predicted. The cost per %ΔGc using A_p was higher than that when A_Ph was applied. However, using A_Ph reduced the cost per %ΔGc by 25%. The final number of selected parents decreased from 40 at 20% selection pressure to 4 individuals at 2% selection pressure (Table 2).

Table 2 Simulated predictions of percentage genetic gain per cycle (3 years) of selection (%ΔGc), at different selection pressure (%), for perennial ryegrass dry matter (DM) yield and associated cost per percentage genetic gain ($ per % gain), from half-sib (HS) family phenotypic selection among 200 families evaluated across 2 years, 3 seasons and 2 locations.

Full size table

The results of predicted ∆G presented in Tables 3 and 4 were generated from two cycles of HS family selection for increasing herbage DM yield. Cycle 1 (C1) (Table 3) was based on the A_pW_gs method in which the same among and within family selection pressures of 20%/10% (among/within family), 10%/5%, 5%/1% and 2%/1%, were applied at each level of accuracy r_A (0.26, 0.36, 0.46). Cycle 1 consisted of 3 years. Cycle 2 (C2) (Table 4) used genomic selection for both the among and within family selection (A_gsW_gs) applying selection pressures of 20%/10% and 10%/5%. Cycle 2, a single year, was based on 100 HS families each taken from the 2000 HS and 500 HS families generated in C1 at selection pressures of 20%/10% and 10%/5%, respectively. In addition to r_A values of 0.26, 0.36 and 0.46, an additional level of accuracy, r_A of 0.12, was applied in C2 to model a scenario of declining accuracy due to decay of genetic relationships (Table 4).

Table 3 Simulation cycle 1 (C1) of selection for seasonal DM yield among 200 HS families of perennial ryegrass based on among HS family phenotypic selection, using data from a 2 year field trial, 3 seasons per year across 2 locations followed by the application of genomic selection (GS) within the selected HS families.

Full size table

Table 4 Cycle 2 (C2) is based on selection among and within HS family progeny generated from polycrossing the elite parents identified in cycle 1.

Full size table

In C1 (Table 3), the predicted %ΔG per cycle rose as r_A was increased. In this cycle, the combination of increasing among HS family phenotypic selection pressure (2%) and within family genomic selection pressure (1%), at high r_A (0.46), resulted in the highest ΔG of 6.35%, based on data from the A_p seasonal herbage DM yield sampling method. The cost per percentage gain was a low $27,397 compared to the high $111,819 at among and within HS family selection pressures of 20% and 10%, respectively, at r_A of 0.26. However, while the low selection pressures and r_A resulted in the selection of 400 parents, the high selection pressure and r_A lead to identifying only 4 parents (Table 3). As expected, the predicted ΔG per cycle, under the A_Ph strategy phenomics assessments of seasonal herbage DM in C1, was slightly lower than A_p, but there was a substantial increase in cost efficiency per % genetic gain (Table 3).

Cycle 2 (Table 4) was based on genomic selection among and within (A_gsW_gs) 100 HS families generated from selected parents from C1 at 20%/10% and 10%/5% selection pressure and at r_A values of 0.26, 0.36 and 0.46. In C2, ΔG was estimated relative to the new mean of HS progeny generated from C1, within each combination of % selection pressure and r_A. The predicted ΔG per cycle ranged from 2.05% at r_A 0.26 to 3.59% at r_A 0.46, at among and within family selection pressures of 20%/10%. Applying higher selection pressures (10%/5%) at r_A values of 0.26 and 0.46, increased the range of predicted ΔG per cycle to 2.44% and 4.27%, respectively. Predicted ΔG combined across selection C1 and C2 resulted in totals ranging from 4.93% to 7.58% at r_A 0.26 and 0.46, respectively, at among and within selection pressures of 20%/10% (Table 4). At the same r_A values of 0.26 and 0.46 at among and within HS family selection pressures of 10%/5%, the predicted ΔG combined across cycles C1 and C2 were 5.93% and 9.06%, respectively. Comparing the costs per percentage predicted ΔG, combined across cycles C1 and C2, the total of $148,418 at r_A 0.26 and 20%/10% among and within family selection pressures was over double the cost, $71,711 at r_A 0.46 at 10%/5% selection pressures. The number of elite parents selected on GEBV’s at the end of C2 were 200 and 50 based on the among and within family selection pressures of 20%/10% and 10%/5%, respectively (Table 4). These results assumed that the r_A values, 0.26, 0.36 and 0.46, did not change from C1 to C2. The effect of possible genomic accuracy decay on ΔG in C2 was assessed using a r_A of 0.12. This resulted in diminished ΔG and increased costs ($) per % ΔG. However, even at the reduced r_A of 0.12, the additional predicted ΔG in the single year of C2 provided an increase in total gain across cycles C1 and C2.

Predicted genetic gain based on the 1000 HS family mock data matrix

The trend of response to selection for DM yield based on the 1000 HS families was the same as that observed from the 200 HS family simulations. Using DM cut phenotypic data (A_p) the predicted ΔGc ranged from 1.90 at 20% among HS family phenotypic selection pressure to 3.62% at 1% selection pressure (Table 5). The large population of HS families resulted in higher numbers of parental half-sibs being selected, compared to the 200 HS family dataset. This ranged from 200 parental HS families at 20% selection pressure to 10 families at 1%. The application of phenomics based DM assessment (A_Ph) clearly indicated its impact on reducing costs ($) per %ΔGc by 58%.

Table 5 Simulated predictions of percentage genetic gain per cycle (3 years) of selection (%ΔGc) and cost per percentage of predicted genetic gain ($ per % gain), using among half-sib (A_p) family selection, for dry matter (DM) yield based on the 1000 HS family data (evaluated across years, seasons and locations), from DM cuts (A_p) and phenomics based phenotyping (A_Ph).

Full size table

Analyses involving 1000 HS families (Tables 6 and 7) generated results analogous to those from the 200 HS families (Tables 3 and 4). The application of the A_pW_gs breeding strategy to the 1000 HS families in C1 resulted in predicted ΔGc ranging from 3.69% to 8.10% at among/within selection pressures of 20%/10% and 2%/1% at r_A values of 0.26 and 0.46, respectively. While the predicted ΔGc from applying A_PhW_gs at similar combinations of selection pressures and r_A values was lower, there was a considerable reduction in cost ($) per %ΔG that ranged from 10 to 40%. The number of selected parental HS families in C1 at the among and within family selection pressure combinations of 5%/1% and 2%/1%, resulted in selection of less than 100 HS families, 50 and 20, respectively (Table 6).

Table 6 Simulation cycle 1 (C1) of selection for seasonal DM yield among 1000 HS families of perennial ryegrass based on among family phenotypic selection, using data from a field trial conducted across 2 locations, 2 years with 3 seasonal measurements per year.

Full size table

Table 7 Cycle 2 (C2) is based on genomic selection (GS) applied to among and within HS family progeny generated from polycrossing the elite parents identified in cycle 1.

Full size table

Genomic among and within family selection was conducted in C2, in a single year, on 100 HS families from C1 generated at 20%/10% and 10%/5% selection pressure at r_A values of 0.26, 0.36 0.46. Trends in ΔGc mirrored those observed with the 200 HS family dataset, but the level of gain was consistently higher. The A_gsW_gs applied resulted in predicted ΔGc ranging from 2.53% at r_A 0.26 to 4.41% at r_A 0.46 at among and within family selection pressures of 20%/10% (Table 7). At selection pressures of 10%/5% and r_A values of 0.26 and 0.46, predicted ΔGc cycle increased to 3.00% and 5.23%, respectively. Combining %ΔG across C1 and C2 resulted in total ΔG ranging from 6.22% to 9.48% at r_A 0.26 and 0.46, respectively, at selection pressures of 20%/10%. At the same r_A values of 0.26 and 0.46 and selection pressures of 10%/5%, the predicted %ΔG combined across selection C1 and C2 was 7.49% and 11.33%, respectively. The total combined costs per %ΔG across cycles 1 and 2, were $273,392 at r_A equal to 0.26 at 20%/10% among and within family selection pressures was over twice the cost, $113,901 at r_A 0.46 at 10%/5% selection pressures. The elite parents selected on GEBV’s at the end of cycle 2 were 200 individuals and 50 individuals based on the selection pressures of 20%/10% and 10%/5%, respectively (Table 7). As in the 200 HS family analysis, the effect of a possible scenario of r_A decay on ΔG in C2 was assessed using a r_A of 0.12, (Table 7). The outcomes were similar to those observed in the 200 HS analysis.

Predicted annual %ΔG for seasonal DM yield resulting from deterministic modelling of the three breeding strategies; A_p, A_pW_gs and A_gsW_gs, based on data from the 200 and 1000 HS families clearly indicate the lower response to selection resulting from the among HS family phenotypic selection (A_p) breeding strategy (Fig. 1A,B). There was an increase in annual %ΔG when GS was combined with the A_p breeding method. The predicted annual %ΔG improved with increasing selection pressure and r_A. The breeding strategy A_gsW_gs resulted in the highest predicted annual %ΔG at all among and within family selection pressures and r_A. It is also clear that when r_A decreased to 0.12 in C2 there was a reduction in predicted annual %ΔG, as would be expected. However, even at a r_A value of 0.12 in C2, the predicted annual %ΔG was higher than that predicted for HS in C1 at all selection pressures (Fig. 1A,B).

Stochastic modelling

Genetic gain

The percentage annual genetic gain (%ΔG) for DM yield in Sim200 and Sim1000 training populations were different for each breeding strategy (Fig. 2). The trends in genetic gain between two training populations (Sim200 and Sim1000) were similar, however, the variation in genetic gain across multiple iterations was higher in Sim200 compared to the Sim1000 training population. Among different breeding strategies, the highest annual genetic gain (%ΔG) was achieved for A_gsWS_gs, followed by A_pW_gs and A_p strategies under both 20% and 2% selection pressures. The differences in %ΔG between the breeding strategies were less evident after the second selection cycle (Fig. 2). When comparing among family selection pressures, the highest genetic gain for all three strategies was observed under 2% selection pressure compared to 20% selection pressure. While the variability in genetic gain across multiple iterations was consistently higher under 2% selection pressure compared to that at 20%, this was more evident for the A_gsW_gs strategy. The highest ΣΔG in each selection cycle was achieved in A_pW_gs strategy, followed by A_p and A_gsW_gs (Supplementary material 1, Figure S2).

Fixation rate of favourable alleles

The influence of selection pressure and training population size on allele fixation rates were observed, with the latter being more pronounced (Fig. 3). The fixation rate was highest in the Sim200 training population under 2% selection pressure and the lowest rate was observed in Sim1000 under 20% selection pressure. In both the training populations, under 20% among family selection pressure, the fixation rate steadily increased across the selection cycles. However, under 2% selection pressure the alleles were fixed more rapidly from selection cycle 3 for all three breeding strategies. Among three breeding strategies, the allele fixation rate followed similar patterns and the differences in fixation rate were more noticeable in the later selection cycles (Fig. 3).

Genetic variance and prediction accuracy

The percentage of genetic variance present within the Sim200 and Sim1000 training population decayed rapidly from cycle 0 to cycle 1 for all three breeding strategies. The patterns were similar under the two different selection pressures (Fig. 4). In Sim200 training population, among three breeding strategies, the lowest genetic variance was observed for A_gsW_gs followed by A_pW_gs and A_p, similar trends were observed in Sim1000 training population. The prediction accuracy at cycle 0 was 0.3 in Sim200 and 0.25 in Sim1000 and was reduced to 0.12 (60% decrease in prediction accuracy) and 0.07 (72% decrease in prediction accuracy) in selection cycle 1 (Fig. 5). There was no difference in the prediction accuracies between the two different selection pressures. The accuracy steadily decreased from cycle 2 and at the end of cycle 10 was a low 0.03 in Sim200 and 0.002 in Sim1000 training populations (Fig. 5).

Discussion

Deterministic modelling

Choosing and optimizing an appropriate breeding strategy to suit a specific cultivar development goal, will require the comparison of different breeding methods based on predicted genetic gains³. While a key criterion for determining the success of a breeding method is genetic gain, cost is also a major factor. Especially in commercial breeding programs, where cost efficiency of a breeding strategy, and access to commercial opportunities, will often determine its adoption.

With increasing pressure on plant breeders to develop new productive cultivars with adaptation to changing environments and meeting consumer demands, new selection and phenotyping technologies must be implemented to enhance genetic advance⁴¹. However, incorporation of these technologies into conventional field breeding programs are often challenged as being expensive to implement. Application of quantitative genetic deterministic²¹ and stochastic modelling²² can be used to evaluate the relative efficiency, genetic gain and associated cost, of conventional breeding strategies in combination with the integration of new selection technologies.

As mentioned in the introduction, HS and FS family breeding strategies are commonly used in forage grass cultivar development programs. In this paper, we present results from deterministic modelling using DeltaGen and stochastic modelling via QU-GENE/QuLinePlus, to predict ∆G and calculate cost per predicted %∆G for DM herbage yield of perennial ryegrass. Deterministic analysis was conducted using a common set of starting points based on estimates of quantitative genetic parameters, from analysis of the 200 HS and 1000 HS family mock data matrices generated using actual field trial DM yield (kg ha^-1). We also evaluated the effect of using phenomics (Ph) on the cost per cycle of selection. The narrow sense heritability values 0.31 and 0.36, estimated for herbage DM yield for the 200 HS and 1000 HS families, respectively, were within the range previously reported^24,42,43. For mean herbage DM yield across years, seasons and locations, and based on a three year selection cycle, the predicted annual %∆G per selection cycle from the A_p strategy varied from 0.48 to 0.83 (200 HS families) and 0.63 to 1.21 (1000 HS families), depending on selection pressure. These increases are in the general range reported by Humphreys⁴⁴ 0.38%, Easton, et al.⁴⁵ 0.4% to 0.5%, Wilkins and Humphreys⁴⁶ 0.5% to 0.6%, Woodfield⁴⁷ 0.25% to 0.73% and Harmer, et al.⁴⁸ 0.76%. However, at high selection pressures applying A_p in large HS populations, such as the 1000 family dataset, annual increases of 1.21% can be achieved. With every percentage increase in ∆G, cost efficiency improved. Although the predicted %∆G using data from Ph based DM assessments (at an accuracy of 0.90), was lower than predictions using DM yield from herbage cuts, there was a considerable decrease in the cost per %∆G. This was especially evident when comparing DM measurement costs between the A_pW_gs and A_PhW_gs breeding methods based on herbage cuts and LiDAR phenomic assessments, from the 1000 HS families. There was a considerable reduction in cost ($) per %ΔG that ranged from 10 to 40%.

Annual increases in %∆G of over 2% per year will be required in crops such as maize (Zea mays), rice (Oryza sativa), wheat (Triticum aestivum), and soybean (Glycine max L.), to meet future global food demands⁴⁹. For perennial ryegrass and other out crossing forage species, achieving a 2% annual gain in DM yield using A_p methods, which exploit only a quarter of the total available additive genetic variation²⁷, is an unrealistic objective. Accessing the ¾ additive variation within HS families will make a significant contribution to increasing annual %∆G^4,50. Low genetic gains in forage breeding programs are due, in part, to an inability to satisfactorily exploit within family genetic variation^50,51. Historically, application of within family selection in forage grasses could only be based on measurements conducted on random samples of individual, spaced plants, representing the selected HS families. Casler⁵⁰ presented a range of examples in forage grass breeding for and against using spaced plants. Hayward and Vivero⁵² and Lazenby and Rogers⁵³ indicated poor genetic correlation between individual spaced plant vigour and sward DM yield. Applying GS based on prediction models constructed using multi-year-season-___location phenotypic data from sown rows or plots makes within family selection for DM yield and other sward traits feasible. The benefits to ∆G of using GS typically focus on the promise of reducing generation interval. However, in addition, the application of genomic prediction to generate GEBV’s for large numbers of random individuals sampled from elite HS families, offers a means to also improve ∆G by increasing the efficiency of within family selection for sward DM yield. Deterministic simulation of the A_pW_gs breeding strategy using DeltaGen, based on the 200 HS and 1000 HS families, clearly indicated the advantage of applying GS for within family selection. For both HS populations, increasing within family selection pressure, from 10 to 1% or increasing r_A from 0.26 to 0.46 increased %∆G per cycle. For both the 200 HS and 1000 HS family populations, at 1% within family GS, %∆G per cycle for DM yield was 6.35 and 8.10, respectively. Based on the three-year selection cycle assumed in this study, these increases equate to annual %∆G of 2.11 and 2.70, comfortably above the 2% ∆G per annum target. As expected, with increasing genetic gain the cost %∆G decreased.

As alluded to above, in addition to enabling within family selection, a key advantage in using GS is reducing the length of a selection cycle and the cost %∆G^54,55. This was further demonstrated in our deterministic simulation by applying a wholly GS second selection cycle, A_gsW_gs, to the HS families generated following C1 in both the 200 HS and 1000 HS family scenarios. The fact that A_gsW_gs enabled selection of another generation of elite parental seedlings based on GEBV’s in the space of one year, following C1 (3 years), resulted in high annual %∆G in cycle 2. Annual %∆G resulting from A_gsW_gs was higher than both the A_p and A_pW_gs breeding strategies at all selection pressures and r_A’s. This additional year based on A_gsW_gs following the 3 years of the field-based strategy (C1) increased total %∆G.

It is important to note that the deterministic simulation results discussed assumed that the r_A values 0.26, 0.36 and 0.46 in C1 continued to hold in C2. There is always the probability of the correlation between GEBV’s and phenotype derived BLUP’s or true breeding values, decreasing due to decay in r_A, due principally to declining genetic relatedness between training and selection populations⁵⁶, following polycrossing of C1 selected parents. The use of a r_A of 0.12 was introduced into simulation in C2 to investigate this scenario. Simulation results indicated that even at the r_A value of 0.12 in C2, the predicted annual %ΔG using A_gsW_gs, was higher than that predicted for A_p in C1 at all selection pressures.

Stochastic modelling

While genetic gain is a key determinant of the merit of a breeding strategy, information on fixation rate of favourable alleles and the rate of decay of genetic variance and r_A are vital, especially when designing long-term recurrent selection breeding programs. For example, knowing at what stage in the recurrent selection process to recalibrate the prediction model when using GS is essential. Application of strategic decision support software tools such as QU-GENE, provide a platform for plant breeders to assess the progress, over multiple selection cycles, for ∆G and associated genetic parameters such as; additive genetic variance, fixation rates and r_A. In crop species, several studies have explored stochastic modelling to understand long-term trends of implementing GS in breeding programs^{19,57,58,59,60}. In forages, Lin, et al.⁵⁵ and Esfandyari, et al.⁶¹ used stochastic modelling to assess the impact of GS in commercial and FS breeding programs.

In the current study, stochastic modelling using QU-GENE/QuLinePlus provided graphical trends for; ∆G, allele fixation rate, genetic variance and r_A, over 10 cycles of selection in response to three breeding strategies; A_p, A_pW_gs and A_gsW_gs, when applied to either 200 HS or 1000 HS families of perennial ryegrass. In our simulations, the greatest %∆G observed for GS (A_pW_gs and A_gsW_gs) breeding strategies compared to A_p across two different selection pressures and populations sizes (Fig. 2). These results are due to precise selection of genotypes, for DM yield, within a family when implementing GS, compared to random selection of individuals from within selected families when using the A_p strategy. This further supports the findings of Esfandyari, et al.⁶¹ who in the context of full sib (FS) family breeding strategies, reported that accurate selection of single plants, based on GS, establishes a strong genetic correlation in terms of performance between single plants and plots, leading to increased ∆G.

The GS breeding strategies applied in C1 outperformed the A_p strategy in terms of predicted %∆G. However, in later cycles, these differences were less pronounced, due to a decline in r_A in the selection cycles post C1. These trends observed in our simulations were similar for both the Sim200 and Sim1000 HS families, mainly due to similar size and number of additive genetic effects used in the recombination map for simulating the breeding strategies. Generally, with larger training populations, r_A increases^6,62 or remains constant relative to smaller populations⁵⁸. However, in our simulations there was a difference in r_A following each selection cycle. In the Sim1000 HS families the observed r_A values were relatively lower compared to the Sim200 HS family simulations. While the r_A disparity between the populations is unclear, it should be noted that, Sim200 and Sim1000 were two independently simulated populations evaluated under different GE simulation systems, as described in “Material and Methods”. Populations evaluated under different GE systems, irrespective of the size of training population, produce different r_A^63,64. In forages, r_A is a result of genetic linkage and linkage disequilibrium (LD)^24,65,66. As selection cycles progress the genetic linkage between the training and selection population declines, resulting in poor r_A as observed in our simulations (Fig. 5). To maintain higher r_A across multiple selection cycles, training populations need to be updated with new elite material or by including the families from previous selection cycles^58,61. The implication of this strategy in HS breeding systems is yet to be investigated and would be considered in future studies.

Genetic variance within breeding populations provides the foundation for cultivar development⁶⁷. In our simulations, there was a rapid decline in the percentage of genetic variance from selection C0 to C1, particularly for the A_pW_gs and A_gsW_gs methods and in the later selection cycles there was a steady decline in all three breeding strategies (Fig. 4). The rapid decline from C0 to C1 in the GS strategies is the result of utilizing both among (1/4 σ²_A) and within (3/4 σ²_A) HS family additive genetic variation, and this was reflected in the %∆G (Fig. 2). In addition, the polygenic architecture of the DM yield trait may be an underlying factor. Muleta, et al.⁵⁸ reported similar trends, notably a rapid early decline in genetic variance, for a polygenic trait through their simulation of genomic assisted recurrent selection in sorghum (Sorghum bicolor). Cycles of continuous recurrent selection increase the frequency of favourable alleles and at the same time increase the probability of fixation of deleterious alleles. Selection pressure and population size have big impact on the allele fixation rates (Fig. 3). Our simulations demonstrated that smaller populations and high selection pressure will lead to higher allelic fixation rates earlier in the selection cycles. Allelic fixation rate is the direct measure of inbreeding in a population^27,68,69. With decreasing population size and higher selection pressures the potential for inbreeding depression increases^55,68,69. In cross pollinating species such as perennial ryegrass, which consists of heterozygous and heterogenous populations, inbreeding depression effects can be severe⁷⁰. Our results re-emphasize the importance of larger training populations for GS implementation in order to maintain optimum levels of inbreeding rates and the genetic variance in a recurrent selection program.

In addition to having information such as heritability of key traits, the possibility of predicting the rate of decrease of their genetic variances and associated allele fixation rates, over multiple selection cycles, as shown in our study for HS families, also simulation studies by Esfandyari, et al.⁶¹ for FS families and Lin, et al.⁵⁵ in a commercial breeding program, will enhance decisions on choice of breeding pool to achieve specific cultivar development goals, by deploying cost effective breeding strategies.

Conclusion

Optimizing breeding program inputs for rate and cost-efficiency of genetic gain can be informed by simulation. Using mock data matrices constructed from empirically derived data, we demonstrated short- and long-term impacts of breeding strategy and integration of key technologies including genomic selection and phenomics on rate of predicted genetic gain for dry matter yield, a key economic trait, in perennial ryegrass. Our findings indicate these technologies offer substantial improvements in the rate of gain, and in some cases improved cost-efficiency per unit gain. The value of GS in exploiting within family additive genetic variation to increase genetic gain was demonstrated using both deterministic and stochastic simulation.

The application of GS in both among and within HS family selection in C2, provided a significant boost to total annual genetic gain across both cycles (C1 = 3 years and C2 = 1 year), even at low GS accuracy r_A of 0.12. Despite some reduction in genetic gain, using phenomics (LiDAR based mobile platform) to assess seasonal DM yield clearly demonstrated its impact by reducing cost per percentage gain relative to standard DM cuts.

The open-source software tools, DeltaGen and QU-GENE, offer ways to query and model the impact of breeding methodology and technology integration under a range of breeding scenarios and inputs in out crossing species including pasture species. This software expands the scope of tools available to breeders in decision support for breeding program design. The analyses reported in this paper can also be extended to major crop species using the genetic modelling capability for self-pollinating species, developed in both DeltaGen and QU-GENE.

Data availability

The datasets generated in this study are included as supplementary information files.

Code availability

Software links to perform the deterministic and stochastic modelling were provided within this article.

References

Moll, R. H. & Stuber, C. W. In Advances in Agronomy Vol. 26 (ed. Brady, N. C.) 277–313 (Academic Press, 1974).
Google Scholar
Milligan, S. B., Gravois, K. A., Bischoff, K. P. & Martin, F. A. Crop effects on broad-sense heritabilities and genetic variances of sugarcane yield components. Crop Sci. https://doi.org/10.2135/cropsci1990.0011183X003000020020x (1990).
Article Google Scholar
Fehr, W. R. Principles of Cultivar Development: Crop Species (Macmillan Publishing Company, 1987).
Google Scholar
Casler, M. D. & Brummer, E. C. Theoretical expected genetic gains for among-and-within-family selection methods in perennial forage crops. Crop Sci. 48, 890–902. https://doi.org/10.2135/cropsci2007.09.0499 (2008).
Article Google Scholar
Collard, B. C., Jahufer, M., Brouwer, J. & Pang, E. An introduction to markers, quantitative trait loci (QTL) mapping and marker-assisted selection for crop improvement: the basic concepts. Euphytica 142, 169–196 (2005).
Article CAS Google Scholar
Jannink, J.-L., Lorenz, A. J. & Iwata, H. Genomic selection in plant breeding: From theory to practice. Brief. Funct. Genom. 9, 166–177. https://doi.org/10.1093/bfgp/elq001 (2010).
Article CAS Google Scholar
Meuwissen, T. H. E., Hayes, B. J. & Goddard, M. E. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157, 1819–1829 (2001).
Article CAS PubMed PubMed Central Google Scholar
Ghamkhar, K. et al. Real-time, non-destructive and in-field foliage yield and growth rate measurement in perennial ryegrass (Lolium perenne L.). Plant Methods 15, 72 (2019).
Article PubMed PubMed Central Google Scholar
Shorten, P. R., Leath, S. R., Schmidt, J. & Ghamkhar, K. Predicting the quality of ryegrass using hyperspectral imaging. Plant Methods 15, 63. https://doi.org/10.1186/s13007-019-0448-2 (2019).
Article PubMed PubMed Central Google Scholar
Gebremedhin, A. et al. Development and validation of a phenotyping computational workflow to predict the biomass yield of a large perennial ryegrass breeding field trial. Front. Plant Sci. 11, 689 (2020).
Article MathSciNet PubMed PubMed Central Google Scholar
Miao, C. et al. Semantic segmentation of sorghum using hyperspectral data identifies genetic associations. Plant Phenom. 2020, 4216373. https://doi.org/10.34133/2020/4216373 (2020).
Article Google Scholar
Araus, J. L. & Cairns, J. E. Field high-throughput phenotyping: the new crop breeding frontier. Trends Plant Sci. 19, 52–61 (2014).
Article CAS PubMed Google Scholar
Montes, J. M., Technow, F., Dhillon, B. S., Mauch, F. & Melchinger, A. E. High-throughput non-destructive biomass determination during early plant development in maize under field conditions. Field Crop Res. 121, 268–273. https://doi.org/10.1016/j.fcr.2010.12.017 (2011).
Article Google Scholar
Roitsch, T. et al. Review: New sensors and data-driven approaches—A path to next generation phenomics. Plant Sci. 282, 2–10. https://doi.org/10.1016/j.plantsci.2019.01.011 (2019).
Article CAS PubMed PubMed Central Google Scholar
Perez-Sanz, F., Navarro, P. J. & Egea-Cortines, M. Plant phenomics: an overview of image acquisition technologies and image data analysis algorithms. GigaScience https://doi.org/10.1093/gigascience/gix092 (2017).
Article PubMed PubMed Central Google Scholar
Tardieu, F., Cabrera-Bosquet, L., Pridmore, T. & Bennett, M. Plant phenomics, from sensors to knowledge. Curr. Biol. 27, R770–R783. https://doi.org/10.1016/j.cub.2017.05.055 (2017).
Article CAS PubMed Google Scholar
Zhao, C. et al. Crop phenomics: Current status and perspectives. Front. Plant Sci. https://doi.org/10.3389/fpls.2019.00714 (2019).
Article PubMed PubMed Central Google Scholar
Podlich, D. W. & Cooper, M. QU-GENE: A simulation platform for quantitative analysis of genetic models. Bioinformatics 14, 632–653. https://doi.org/10.1093/bioinformatics/14.7.632 (1998).
Article CAS PubMed Google Scholar
Iwata, H. & Jannink, J. L. Accuracy of genomic selection prediction in barley breeding programs: A simulation study based on the real single nucleotide polymorphism data of barley breeding lines. Crop Sci. 51, 1915–1927 (2011).
Article Google Scholar
Yabe, S., Iwata, H. & Jannink, J.-L. A simple package to script and simulate breeding schemes: The breeding scheme language. Crop Sci. 57, 1347–1354. https://doi.org/10.2135/cropsci2016.06.0538 (2017).
Article Google Scholar
Jahufer, M. & Luo, D. DeltaGen: A comprehensive decision support tool for plant breeders. Crop Sci. 58, 1118–1131 (2018).
Article Google Scholar
Hoyos-Villegas, V. et al. QuLinePlus: Extending plant breeding strategy and genetic model simulation to cross-pollinated populations—case studies in forage breeding. Heredity 122, 684–695. https://doi.org/10.1038/s41437-018-0156-0 (2019).
Article CAS PubMed Google Scholar
George, R., Barrett, B. & Ghamkhar, K. Evaluation of LiDAR scanning for measurement of yield in perennial ryegrass. J. N. Zeal. Grassl. 81, 55–60 (2019).
Article Google Scholar
Faville, M. J. et al. Predictive ability of genomic selection models in a multi-population perennial ryegrass training set using genotyping-by-sequencing. Theor. Appl. Genet. 131, 703–720 (2018).
Article CAS PubMed Google Scholar
Arojju, S. K., Cao, M., Jahufer, M. Z., Barrett, B. A. & Faville, M. J. Genomic predictive ability for foliar nutritive traits in perennial ryegrass. G3 Genes Genomes Genet. 10, 695–708 (2020).
CAS Google Scholar
Arojju, S. K. et al. Multi-trait genomic prediction improves predictive ability for dry matter yield and water-soluble carbohydrates in perennial ryegrass. Front. Plant Sci. https://doi.org/10.3389/fpls.2020.01197 (2020).
Article PubMed PubMed Central Google Scholar
Falconer, D. S., Mackay, T. F. & Frankham, R. Introduction to quantitative genetics (4th edn). Trends Genet. 12, 280 (1996).
Article Google Scholar
Harville, D. A. Maximum likelihood approaches to variance component estimation and to related problems. J. Am. Stat. Assoc. 72, 320–338 (1977).
Article MathSciNet Google Scholar
Patterson, H. D. & Thompson, N. R. Recovery of inter-block information when block sizes are unequal. Biometrika 58, 545–554. https://doi.org/10.1093/biomet/58.3.545 (1971).
Article MathSciNet MATH Google Scholar
Patterson, H. D. & Thompson, R. Maximum likelihood estimation of components of variance. In Proceedings of the 8th International Biometrical Conference. 197–207 (1975).
Nyquist, W. E. & Baker, R. Estimation of heritability and prediction of selection response in plant populations. Crit. Rev. Plant Sci. 10, 235–322 (1991).
Article Google Scholar
Annicchiarico, P. et al. Accuracy of genomic selection for alfalfa biomass yield in different reference populations. BMC Genom. 16, 1020 (2015).
Article Google Scholar
Grinberg, N. F. et al. Implementation of genomic prediction in Lolium perenne (L.) breeding populations. Front. Plant Sci. 7, 133 (2016).
Article PubMed PubMed Central Google Scholar
Dekkers, J. Marker-assisted selection for commercial crossbred performance. J. Anim. Sci. 85, 2104–2114 (2007).
Article CAS PubMed Google Scholar
Dekkers, J. Prediction of response to marker-assisted and genomic selection using selection index theory. J. Anim. Breed. Genet. 124, 331–341 (2007).
Article CAS PubMed Google Scholar
Byrne, S. L. et al. A synteny-based draft genome sequence of the forage grass Lolium perenne. Plant J. 84, 816–826 (2015).
Article CAS PubMed Google Scholar
Lipka, A. E. et al. GAPIT: genome association and prediction integrated tool. Bioinformatics 28, 2397–2399. https://doi.org/10.1093/bioinformatics/bts444 (2012).
Article CAS PubMed Google Scholar
Endelman, J. B. Ridge regression and other kernels for genomic selection with R package rrBLUP. Plant Genome 4, 250–255 (2011).
Article Google Scholar
Jafari, A., Connolly, V. & Walsh, E. Genetic analysis of yield and quality in full-sib families of perennial ryegrass (Lolium perenne L.) under two cutting managements. Irish J. Agric. Food Res. 2, 275–292 (2003).
Google Scholar
Oconnor, J. R., Jahufer, M. Z. Z. & Lyons, T. Examining perennial ryegrass (Lolium perenne L.) persistence through comparative genetic analyses of two cultivars after nine years in the field. Euphytica 216, 36. https://doi.org/10.1007/s10681-020-2568-1 (2020).
Article CAS Google Scholar
Cobb, J. N. et al. Enhancing the rate of genetic gain in public-sector plant breeding programs: Lessons from the breeder’s equation. Theor. Appl. Genet. 132, 627–645 (2019).
Article PubMed PubMed Central Google Scholar
Rhodes, I. The relationship between productivity and some components of canopy structure in ryegrass (Lolium spp.): III. Spaced plant characters, their heritabilities and relationship to sward yield. J. Agric. Sci. 80, 171–176. https://doi.org/10.1017/S002185960005718X (1973).
Article Google Scholar
Fè, D., Pedersen, M. G., Jensen, C. S. & Jensen, J. Genetic and environmental variation in a commercial breeding program of perennial ryegrass. Crop Sci. 55, 631–640 (2015).
Article Google Scholar
Humphreys, M. The contribution of conventional plant breeding to forage crop improvement. In Proceedings of the 18th International Grassland Congress’. Winnipeg and Saskatoon, Canada. 8–17. (1997)
Easton, S., Amyes, J., Cameron, N., Green, R., Norriss, M. & Stewart, A. Pasture plant breeding in New Zealand: where to from here?. In Proceedings of the conference-New Zealand Grassland Association, 173–180 (2002).
Wilkins, P. & Humphreys, M. Progress in breeding perennial forage grasses for temperate agriculture. J. Agric. Sci. 140, 129–150 (2003).
Article CAS Google Scholar
Woodfield, D. Genetic improvements in New Zealand forage cultivars. In Proceedings of the conference-New Zealand grassland association. 3–8. (1999)
Harmer, M., Stewart, A. & Woodfield, D. Genetic gain in perennial ryegrass forage yield in Australia and New Zealand. J. N. Zeal. Grassl. 78, 133–138 (2016).
Article Google Scholar
Ray, D. K., Mueller, N. D., West, P. C. & Foley, J. A. Yield trends are insufficient to double global crop production by 2050. PLoS ONE 8, e66428. https://doi.org/10.1371/journal.pone.0066428 (2013).
Article ADS CAS PubMed PubMed Central Google Scholar
Casler, M. Among-and-within-family selection in eight forage grass populations. Crop Sci. 48, 434–442 (2008).
Article Google Scholar
Simeão Resende, R. M., Casler, M. D. & de Resende, M. D. V. Genomic selection in forage breeding: Accuracy and methods. Crop Sci. 54, 143–156. https://doi.org/10.2135/cropsci2013.05.0353 (2014).
Article Google Scholar
Hayward, M. & Vivero, J. Selection for yield in Lolium perenne. II. Performance of spaced plant selections under competitive conditions. Euphytica 33, 787–800 (1984).
Article Google Scholar
Lazenby, A. & Rogers, H. Selection criteria in grass breeding. II. Effect, on Lolium perenne, of differences in population density, variety and available moisture. J. Agric. Sci. 62, 285–298 (1964).
Article Google Scholar
Lin, Z. et al. Genetic gain and inbreeding from genomic selection in a simulated commercial breeding program for perennial ryegrass. Plant Genome https://doi.org/10.3835/plantgenome2015.06.0046 (2016).
Article PubMed Google Scholar
Lin, Z. et al. Optimizing resource allocation in a genomic breeding program for perennial ryegrass to balance genetic gain, cost, and inbreeding. Crop Sci. 57, 243–252. https://doi.org/10.2135/cropsci2016.07.0577 (2017).
Article Google Scholar
Habier, D., Fernando, R. L. & Dekkers, J. C. M. The impact of genetic relationship information on genome-assisted breeding values. Genetics 177, 2389–2397. https://doi.org/10.1534/genetics.107.081190 (2007).
Article CAS PubMed PubMed Central Google Scholar
Daetwyler, H. D., Hayden, M. J., Spangenberg, G. C. & Hayes, B. J. Selection on optimal haploid value increases genetic gain and preserves more genetic diversity relative to genomic selection. Genetics 200, 1341–1348 (2015).
Article PubMed PubMed Central Google Scholar
Muleta, K. T., Pressoir, G. & Morris, G. P. Optimizing genomic selection for a sorghum breeding program in Haiti: A simulation study. G3 Genes Genomes Genet. 9, 391–401. https://doi.org/10.1534/g3.118.200932 (2019).
Article Google Scholar
Müller, D., Schopp, P. & Melchinger, A. E. Persistency of prediction accuracy and genetic gain in synthetic populations under recurrent genomic selection. G3 Genes Genomes Genet. 7, 801–811. https://doi.org/10.1534/g3.116.036582 (2017).
Article Google Scholar
Denis, M. & Bouvet, J.-M. Efficiency of genomic selection with models including dominance effect in the context of Eucalyptus breeding. Tree Genet. Genomes 9, 37–51. https://doi.org/10.1007/s11295-012-0528-1 (2013).
Article Google Scholar
Esfandyari, H., Fè, D., Tessema, B. B., Janss, L. L. & Jensen, J. Effects of different strategies for exploiting genomic selection in perennial ryegrass breeding programs. G3 Genes Genomes Genet. https://doi.org/10.1534/g3.120.401382 (2020).
Article Google Scholar
Lorenz, A., Smith, K. & Jannink, J. L. Potential and optimization of genomic selection for Fusarium head blight resistance in six-row barley. Crop Sci. 52, 1609–1621 (2012).
Article Google Scholar
Heslot, N., Akdemir, D., Sorrells, M. E. & Jannink, J.-L. Integrating environmental covariates and crop modeling into the genomic selection framework to predict genotype by environment interactions. Theor. Appl. Genet. 127, 463–480. https://doi.org/10.1007/s00122-013-2231-5 (2014).
Article PubMed Google Scholar
Sun, J. et al. High-throughput phenotyping platforms enhance genomic selection for wheat grain yield across populations and cycles in early stage. Theor. Appl. Genet. 132, 1705–1720 (2019).
Article CAS PubMed Google Scholar
Fè, D. et al. Genomic dissection and prediction of heading date in perennial ryegrass. BMC Genom. 16, 921 (2015).
Article Google Scholar
Byrne, S. L. et al. Using variable importance measures to identify a small set of SNPs to predict heading date in perennial ryegrass. Sci. Rep. 7, 1–10 (2017).
Article CAS Google Scholar
Bhandari, H., Nishant Bhanu, A., Srivastava, K., Singh, M. & Shreya, H. A. Assessment of genetic diversity in crop plants. An overview. Adv. Plants Agric. Res. 7, 00255 (2017).
Google Scholar
Busch, J. Inbreeding depression in self-incompatible and self-compatible populations of Leavenworthia alabamica. Heredity 94, 159–165 (2005).
Article CAS PubMed Google Scholar
Leimu, R., Mutikainen, P., Koricheva, J. & Fischer, M. How general are positive relationships between plant population size, fitness and genetic variation?. J. Ecol. 94, 942–952. https://doi.org/10.1111/j.1365-2745.2006.01150.x (2006).
Article Google Scholar
Bean, E. W. & Yok-Hwa, Chen. An analysis of the growth of inbred progeny of Lolium. J. Agric. Sci. 79, 147–153 (1972). https://doi.org/10.1017/S0021859600025478

Download references

Acknowledgements

The authors wish to acknowledge Dr Tony Conner, AgResearch, and Dr Ed Butler, Partnerships Manager, for their mentorship and support that ensured the success of Pastoral Genomics Plus. Authors like to acknowledge the valuable comments provided by Dr David Hume, AgResearch to improve the manuscript.

Funding

This work was funded by Pastoral Genomics Plus (PSTG1501), a joint venture co-funded by DairyNZ, Beef + Lamb New Zealand, Dairy Australia, AgResearch Ltd, New Zealand Agriseeds Ltd, Grasslands Innovation Ltd, DEEResearch, and the Ministry of Business, Innovation and Employment (New Zealand).

Author information

Authors and Affiliations

Grasslands Research Centre, AgResearch Ltd, Palmerston North, 4442, New Zealand
M. Z. Z. Jahufer, Sai Krishna Arojju, Marty J. Faville, Kioumars Ghamkhar, Dongwen Luo, Andrew G. Griffiths & Brent Barrett
School of Agriculture and Food Sciences, Faculty of Science, The University of Queensland, Brisbane, QLD, 4072, Australia
Vivi Arief, Wen-Hsi Yang, Ian H. DeLacy & Kaye E. Basford
Barenbrug NZ Ltd, Christchurch, 7571, New Zealand
Colin Eady & Will Clayton
Kimihia Research Centre, PGG Wrightson Seeds Ltd, Christchurch, 7676, New Zealand
Alan V. Stewart & Richard M. George
Department of Plant Science, McGill University, Montreal, QC, H3A 0G4, Canada
Valerio Hoyos-Villegas
The Centre for Complex Analytics and Visualisation, AutoStat Institute, Melbourne, VIC, 3000, Australia
Mingzhu Sun

Authors

M. Z. Z. Jahufer
View author publications
Search author on:PubMed Google Scholar
Sai Krishna Arojju
View author publications
Search author on:PubMed Google Scholar
Marty J. Faville
View author publications
Search author on:PubMed Google Scholar
Kioumars Ghamkhar
View author publications
Search author on:PubMed Google Scholar
Dongwen Luo
View author publications
Search author on:PubMed Google Scholar
Vivi Arief
View author publications
Search author on:PubMed Google Scholar
Wen-Hsi Yang
View author publications
Search author on:PubMed Google Scholar
Mingzhu Sun
View author publications
Search author on:PubMed Google Scholar
Ian H. DeLacy
View author publications
Search author on:PubMed Google Scholar
Andrew G. Griffiths
View author publications
Search author on:PubMed Google Scholar
Colin Eady
View author publications
Search author on:PubMed Google Scholar
Will Clayton
View author publications
Search author on:PubMed Google Scholar
Alan V. Stewart
View author publications
Search author on:PubMed Google Scholar
Richard M. George
View author publications
Search author on:PubMed Google Scholar
Valerio Hoyos-Villegas
View author publications
Search author on:PubMed Google Scholar
Kaye E. Basford
View author publications
Search author on:PubMed Google Scholar
Brent Barrett
View author publications
Search author on:PubMed Google Scholar

Contributions

M.Z.J., M.F., K.G., A.G. and B.B. conceived the study. M.Z.J. performed the deterministic modelling analysis and wrote the initial manuscript. S.K.A. performed the stochastic modelling analysis and S.K.A. and V.A. interpreted the results. M.Z.J., S.K.A., M.F., K.G., D.L., V.A., W.Y., M.S., I.D., A.G., C.E., W.C., A.S., R.G., V.H.V., K.B. and B.B. contributed to the interpretation of the results. M.Z.J., S.K.A., V.A.. and K.B. played a key role in finalizing the manuscript.

Corresponding author

Correspondence to M. Z. Z. Jahufer.

Ethics declarations

Competing interests

Authors MZJ, SKA, MF, KG, DL, AG, BB are employed by AgResearch, a New Zealand Crown Research Institute. Authors CE and WC are employed by Barenbrug Agriseeds Ltd and authors AS and RG are employed by PGG Wrightson Seeds Ltd. Neither funders nor the industry partners were involved in the design of this study.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Jahufer, M.Z.Z., Arojju, S.K., Faville, M.J. et al. Deterministic and stochastic modelling of impacts from genomic selection and phenomics on genetic gain for perennial ryegrass dry matter yield. Sci Rep 11, 13265 (2021). https://doi.org/10.1038/s41598-021-92537-w

Download citation

Received: 05 January 2021
Accepted: 11 June 2021
Published: 24 June 2021
DOI: https://doi.org/10.1038/s41598-021-92537-w

This article is cited by

The value of early root development traits in breeding programs for biomass yield in perennial ryegrass (Lolium perenne L.)
- M. Malinowska
- P. S. Kristensen
- T. Asp
Theoretical and Applied Genetics (2025)
Genomic selection shows improved expected genetic gain over phenotypic selection of agronomic traits in allotetraploid white clover
- O. Grace Ehoche
- Sai Krishna Arojju
- Andrew G. Griffiths
Theoretical and Applied Genetics (2025)
Machine learning after a decade: is it still a missing keystone in genomic-based plant breeding?
- Mohsen Yoosefzadeh-Najafabadi
- Alencar Xavier
- Mohsen Hesami
Artificial Intelligence Review (2025)

Subjects

Abstract

Similar content being viewed by others

Genetic variation and heritability of agronomic traits in a native perennial forage species from drylands: breeding potential of Festuca pallescens

How partial phenotyping to reduce generation intervals can help to increase annual genetic gain in selected honeybee populations

A diallel study to detect genetic background variation for FHB resistance in winter wheat

Introduction

Material and methods

Deterministic modelling for estimating genetic gain using HS family breeding strategies

Construction of the 1000 and 200 HS family “mock data” matrices of perennial ryegrass

Data analysis

Breeding methods and associated prediction equations

Genomic selection

Phenomics, sampling and field trial operational costs

Stochastic modelling of multiple selection cycles using HS family breeding strategies

Generation of the initial training population for simulation in QuLinePlus

Defining the genetic model and trait architecture

Genomic prediction

Breeding strategies

Simulation output

Results

Deterministic modelling

Predicted genetic gain based on the 200 HS family mock data matrix

Predicted genetic gain based on the 1000 HS family mock data matrix

Stochastic modelling

Genetic gain

Fixation rate of favourable alleles

Genetic variance and prediction accuracy

Discussion

Deterministic modelling

Stochastic modelling

Conclusion

Data availability

Code availability

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher's note

Supplementary Information

Supplementary Information.

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

The value of early root development traits in breeding programs for biomass yield in perennial ryegrass (Lolium perenne L.)

Genomic selection shows improved expected genetic gain over phenotypic selection of agronomic traits in allotetraploid white clover

Machine learning after a decade: is it still a missing keystone in genomic-based plant breeding?

Search

Quick links