Introduction

Amid the escalating challenges posed by global climate degradation and an ever-increasing population, ensuring food security is a primary concern for nations worldwide (Santos et al.20). Since commercial cultivation of genetically modified crops began in 1996, the area dedicated to them has increased steadily. From 1996 to 2016, the cultivation area grew from 1.7 million hectares to 190.4 million hectares, providing substantial economic, societal, and ecological benefits (James13). In recent years, the incorporation of superior exogenous genes into diverse crops through genetic engineering has proven to be an efficient method for breeding superior cultivars (Qaim18).

The introduction of genetically modified crops has invariably placed their safety under extensive scrutiny and discussion (Domingo et al.6). In response, numerous countries have implemented strict legal frameworks and established dedicated regulatory bodies to oversee the deployment of genetically modified crops (Falkner et al.7). Genetically modified varieties intended for commercial cultivation are chosen from among many transformation events through a rigorous selection process and undergo thorough scrutiny and safety assessment from production to regulatory approval (Bradford et al.3). In the screening and safety evaluation of genetically modified crops, molecular evidence that proves the integration of exogenous genes into the recipient genome is essential. Compliance with established standards requires the submission of comprehensive molecular information, including the sequences of the inserted exogenous genes, copy numbers, insertion sites, and adjacent sequences (Li et al.15).

During the integration of exogenous genes into the recipient genome, the insertion of plasmid DNA can cause mutations or breaks in the native genes of the recipient plant genome (Wilson et al.28). This may lead to gene silencing or activation, potentially forming new proteins or deactivating existing ones, resulting in unexpected changes that affect the quality and safety of genetically modified crops (Latham et al.14). These changes require the submission of detailed evidence to safety assessment authorities for thorough examination and approval (Cellini et al.5). Currently, traditional methods for characterizing exogenous gene integration sites rely mainly on PCR, including techniques such as Genome Walking (GW), Thermal Asymmetric Interlaced PCR (TAIL-PCR), Inverse PCR (I-PCR), and T-linker PCR (Shu et al.22). These methods have been used effectively to examine the molecular features of integration sites in various genetically modified plants such as Arabidopsis (Tan et al.25), maize (Spalinskas et al.24; Liu et al.16), rice (Fraiture et al.8), and tomato (Yang et al.32). While these techniques can identify the number of copies of inserted exogenous genes and the presence of plasmid backbones, they have limitations. Their accuracy can be affected by factors such as primer selection, particularly in cases of complex genome rearrangements at the insertion site (Holst-Jensen et al.12). Furthermore, when transgenic elements are similar to sequences in the recipient genome, PCR-based methods become less effective, making it difficult to identify insertion sites and nearby sequences (Yang et al.31).

In recent years, DNA sequencing technology has evolved rapidly, marked by significant increases in sequencing capacity along with a decrease in costs. This evolution has made sequencing technology more accessible and widespread (Satam et al.21). A significant advancement in this field is represented by third-generation sequencing technologies, such as PacBio’s SMRT and Oxford Nanopore Technologies’ nanopore single-molecule sequencing. The primary advantage of these technologies is their ability to produce long reads, which are crucial for the comprehensive mapping and characterization of complex genomic regions (Lu et al.17). These long reads not only enhance the depth and breadth of genome sequencing but also improve the accuracy of detecting and characterizing the insertion sites of exogenous genes in transgenic plants (Heather and Chain11). Studies show that deep sequencing can comprehensively cover complex genomes (Wang et al.27), laying the groundwork for precise whole-genome research (Ajay et al.1). Unlike Southern hybridization, PCR technology, and previous generations of sequencing, third-generation sequencing reads single molecules directly, removing the need for PCR amplification during sequencing (Van Dijk et al.26). This method offers high standardization, reliable repeatability, and superior accuracy in detailing the insertion of exogenous genes, changes in the recipient genome, and the presence of plasmid backbone sequences (Goodwin et al.10). However, because the maize genome is vast, complex, and highly repetitive, the use of this technology to detail the molecular characteristics of transgenic maize is still limited and remains challenging (Cade et al.4).

In a previous study, two independent transgenic maize transformation events, ND4401 and ND4403, were created using the nitrate transporter gene ZmNRT1.1A, which is recognized for conferring tolerance to low nitrogen conditions. The goal of the present work was to provide strong molecular evidence for these two events in support of developing new transgenic maize varieties that tolerate low nitrogen levels. To support the safety evaluation of these events, third-generation nanopore single-molecule sequencing technology was used to determine the insertion sites and flanking sequences of the exogenous genes in ND4401 and ND4403. From the sequences obtained near the exogenous genes, specific PCR primers were designed for each transformation event and used to detect these transgenic maize events. This study highlights the effectiveness of nanopore single-molecule sequencing technology in identifying molecular characteristics in transgenic plants.

Results

Genomic Southern blot analysis of transgenic maize events

In the genomic Southern blot analysis, single bands exceeding 7.5 kb were detected across three successive generations for transformation events ND4401 and ND4403, indicating the integration of a single copy of the transgene into the maize genome. No hybridization bands were observed in the negative control (y822), corroborating the specificity of the transgene insertion (Figs. 1 and 2).

Fig. 1

Schematic diagram of the probe and the cleavage site on the inserted fragment. 1: Target genes: maize ubi promoter, ZmNRT1.1A coding region, T-NOS terminator; 2: Screening marker genes: P35S promoter, bar gene coding region, T-35S terminator.

Fig. 2

Southern blot detection of the bar gene in transgenic maize events ND4401 and ND4403. (A) M represents Trans15K DNA Marker; P represents positive plasmid (pCAMBIA1301-ZmNRT1.1A-bar); 1 represents genomic DNA of T4 transgenic maize event ND4401 completely digested with Sac I; 2 represents genomic DNA of T5 transgenic maize event ND4401 completely digested with Sac I; 3 represents genomic DNA of T6 transgenic maize event ND4401 completely digested with Sac I; 4 represents WT. (B) M represents Trans15K DNA Marker; P represents positive plasmid (pCAMBIA1301-ZmNRT1.1A-bar); 1 represents genomic DNA of T4 transgenic maize event ND4403 completely digested with Xma I; 2 represents genomic DNA of T5 transgenic maize event ND4403 completely digested with Xma I; 3 represents genomic DNA of T6 transgenic maize event ND4403 completely digested with Xma I; 4 represents WT. Note: T4, T5, and T6 represent the fourth, fifth, and sixth generations, respectively, of self-pollinated transgenic plants. WT refers to the wild-type maize line y822.

Resequencing and integration site analysis of transgenic maize events

To delve deeper into the precise full-length DNA sequence of the T-DNA inserted into the maize genome and the flanking sequences at the integration sites, we utilized Oxford Nanopore Technologies’ third-generation sequencing. The sequencing depth was 10-fold with an average read length of 20 kb, peaking at 60 kb, and an accuracy rate of 85%. Through BLAST alignment of Nos gene sequences from the sequencing output, contigs of approximately 14 kb and 10 kb were identified containing the target vector sequences. This enabled us to pinpoint the genomic integration sites and their flanking sequences for both ND4401 and ND4403 events. The acquired sequences were then aligned against the publicly available maize genome database, confirming the single-copy insertion of the transgenes at chromosomes 5 and 3, respectively. The alignment results were consistent with the single-copy insertion deduced from the Southern blot analysis (Fig. 3).
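The reported depth and read length imply a rough data volume for each event. The following is a minimal sketch of that arithmetic in Python; the maize genome size of about 2.3 Gb is an assumption made for illustration and is not a figure from this study.

```python
# Back-of-the-envelope sequencing yield for a given depth.
# ASSUMED_MAIZE_GENOME_BP is an illustrative assumption, not a value from the study.
ASSUMED_MAIZE_GENOME_BP = 2_300_000_000
DEPTH = 10                 # 10-fold depth, as reported
MEAN_READ_LENGTH = 20_000  # 20 kb average read length, as reported


def required_bases(genome_size_bp: int, depth: int) -> int:
    """Total sequenced bases needed to reach the requested average depth."""
    return genome_size_bp * depth


def expected_reads(genome_size_bp: int, depth: int, mean_read_length: int) -> int:
    """Approximate number of reads of the given mean length for that depth."""
    return round(required_bases(genome_size_bp, depth) / mean_read_length)


total_bases = required_bases(ASSUMED_MAIZE_GENOME_BP, DEPTH)
read_count = expected_reads(ASSUMED_MAIZE_GENOME_BP, DEPTH, MEAN_READ_LENGTH)
print(f"{total_bases / 1e9:.0f} Gb of sequence, about {read_count:,} reads")
```

Under these assumptions, each junction of a single-copy insert would be spanned by roughly ten reads on average, which is why long reads that cross the whole T-DNA are valuable at this modest depth.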

Fig. 3

Primer binding sites for the T-DNA integration sites and flanking sequences of the two transgenic maize events. (A, B) Integration sites and primer binding sites for the flanking sequences of the T-DNA inserted into chromosome 5 of transgenic maize ND4401; (C, D) Integration sites and primer binding sites for the flanking sequences of the T-DNA inserted into chromosome 3 of transgenic maize ND4403; * denotes omitted bases; underlined regions represent genomic sequences on both sides; unmarked regions indicate T-DNA sequences; arrows indicate primer binding sites. In A and C, blue represents insert vector sequences, pink represents genomic sequences, and gray represents lost sequences; in B and D, matching colors mark the upstream and downstream primers of the same pair.

For ND4401 and ND4403, we aligned each contig with the target vector sequence to determine the insertion site and the flanking sequences at both ends of the insert. For ND4401, sequencing yielded 625 bp of genomic flanking sequence at the 5’ end of the insert and 132 bp at the 3’ end; for ND4403, it yielded 647 bp at the 5’ end and 356 bp at the 3’ end. These genomic flanking sequences were then subjected to BLAST nucleotide-nucleotide alignment against the publicly available maize genome database (http://ensembl.gramene.org/Zea_mays/Info/Annotation/#assembly). This analysis confirmed that the inserts of the two transformation events were integrated into maize chromosome 5 and chromosome 3, respectively. The alignment results are shown in Fig. 3.
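Once the vector alignment on a contig is known, separating the flanking sequences from the insert is a matter of slicing at the alignment boundaries. The sketch below illustrates this step in Python; the function name `split_contig`, the toy contig, and its coordinates are hypothetical and do not come from the study's data.

```python
def split_contig(contig: str, vector_start: int, vector_end: int):
    """Split a contig into its 5' flank, vector-derived insert, and 3' flank.

    vector_start and vector_end are 1-based, inclusive contig coordinates of
    the vector alignment, the convention used in BLAST tabular reports.
    """
    if not 1 <= vector_start <= vector_end <= len(contig):
        raise ValueError("vector alignment must lie inside the contig")
    five_prime_flank = contig[: vector_start - 1]
    insert = contig[vector_start - 1 : vector_end]
    three_prime_flank = contig[vector_end:]
    return five_prime_flank, insert, three_prime_flank


# Toy contig: lowercase marks genomic flank, uppercase marks vector-derived bases.
toy_contig = "acgtacgtac" + "GGGTTTAAACCC" + "ttgca"
left, insert, right = split_contig(toy_contig, 11, 22)
print(len(left), len(insert), len(right))  # 10 12 5
```

The two flanks returned this way are the sequences that would then be aligned against the reference genome to assign a chromosome and position.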

PCR amplification of flanking sequences adjacent to T-DNA integration sites

In light of the sequencing outcomes, primers were designed for the PCR detection of flanking sequences on both the left and right sides of the transformation events ND4401 and ND4403 (Table 1). These primers aimed to amplify products encompassing both maize genomic sequences and T-DNA sequences. The electropherograms clearly exhibit the target bands solely within the transformation events ND4401 and ND4403, conspicuously absent in the non-transgenic maize line y822 (Fig. 4).

Table 1 Primer sequences and their purposes in this study.

Fig. 4

PCR amplification of the T-DNA flanking sequences of transgenic events ND4401 and ND4403. LB represents the left border flanking sequence; RB represents the right border flanking sequence; M represents the DNA marker DL2000; BK represents the blank control; y822 represents the negative control.

Subsequent sequencing of the amplified fragments showed that the left flanking sequence of ND4401 is 1069 bp long (Fig. 3A). Starting from the 5’ end, positions 1–331 correspond to maize genomic sequence (Chr05: 38326912–38327242 bp), whereas positions 332–1069 are T-DNA sequence. The right flanking sequence of ND4401 is 689 bp long, with positions 1–585 representing T-DNA sequence and positions 586–689 representing maize genomic sequence (Chr05: 38327270–38327373 bp). The integration of the exogenous T-DNA caused a 27 bp deletion in the recipient maize genome (Chr05: 38327243–38327269 bp). BLAST nucleotide-nucleotide alignment of the 625 bp 5’ and 132 bp 3’ genomic flanking sequences against the public maize genome database confirmed that the insertion site lies in a non-coding region of chromosome 5. The insertion therefore does not disrupt any gene-coding region, and the deleted 27 bp are non-coding sequence, so the function of the maize genome is not expected to be affected.

Similarly, sequencing of the amplified fragments showed that the left flanking sequence of ND4403 is 897 bp long (Fig. 3B). Starting from the 5’ end, positions 1–507 correspond to maize genomic sequence (Chr03: 7773225–7773731 bp), while positions 508–897 are T-DNA sequence. The right flanking sequence of ND4403 is 804 bp long, with positions 1–458 representing T-DNA sequence and positions 459–804 representing maize genomic sequence (Chr03: 7773807–7774152 bp). The integration of the exogenous T-DNA caused a 75 bp deletion in the recipient maize genome (Chr03: 7773732–7773806 bp). BLAST nucleotide-nucleotide alignment of the 647 bp 5’ and 356 bp 3’ genomic flanking sequences against the public maize genome database confirmed that the insertion site lies in a non-coding region of chromosome 3. The insertion therefore does not affect any gene-coding region, and the deleted 75 bp are non-coding sequence, so the normal biological function of the maize genome is not expected to be affected.
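The deletion sizes follow directly from the reference coordinates of the last genomic base before the T-DNA and the first genomic base after it. A minimal Python sketch of this check, using the coordinates reported above (the `Junction` class is a hypothetical helper introduced only for illustration):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    event: str
    chrom: str
    left_flank_end: int     # last reference base retained before the T-DNA
    right_flank_start: int  # first reference base retained after the T-DNA

    def deleted_interval(self):
        """Return (start, end, length) of the reference bases lost at the site."""
        start = self.left_flank_end + 1
        end = self.right_flank_start - 1
        return start, end, max(0, end - start + 1)


junctions = [
    Junction("ND4401", "Chr05", 38327242, 38327270),
    Junction("ND4403", "Chr03", 7773731, 7773807),
]

for j in junctions:
    start, end, length = j.deleted_interval()
    print(f"{j.event}: {j.chrom}:{start}-{end} ({length} bp deleted)")
# ND4401: Chr05:38327243-38327269 (27 bp deleted)
# ND4403: Chr03:7773732-7773806 (75 bp deleted)
```

The same inclusive-coordinate arithmetic can be applied to the flanking segments themselves, for example 38327242 − 38326912 + 1 = 331 bp for the genomic part of the ND4401 left flank.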

Specific PCR detection of transgenic maize transformation events

Specific primers were designed for both the left and right integration sites of the T-DNA within the genomes of ND4401 and ND4403. A variety of control materials were selected for specificity testing, including maize varieties y822 and KN5585, soybean variety Williams 82, rice variety Jigeng 88, and cotton variety Lumianyan 15. DNA was extracted from five different tissues or organs of ND4401 and ND4403 (roots, stems, leaves, male spikes, and seeds), and PCR amplification was conducted using the two sets of event-specific primers. Distinct bands appeared only in the tissues of transgenic maize ND4401 and ND4403 and not in any of the control materials. This confirmed the high specificity of the primer pairs in identifying the transgenic maize events ND4401 and ND4403 (Fig. 5).

Fig. 5

Specific primer PCR detection of transgenic events ND4401 and ND4403. LB represents the left border flanking sequence; RB represents the right border flanking sequence; M represents the DNA marker DL2000; 1 represents transgenic maize roots; 2 represents transgenic maize stems; 3 represents transgenic maize leaves; 4 represents transgenic maize male spikes; 5 represents transgenic maize seeds; 6 represents maize variety y822; 7 represents maize variety KN5585; 8 represents cultivated soybean variety Williams 82; 9 represents rice variety Jigeng 88; 10 represents cotton variety Lumianyan 15.

Discussion

While creating genetically modified varieties with breeding value, it is paramount to ensure that the integration of exogenous genes neither compromises the original traits of recipient crops nor hinders the efficient expression of desired traits (Li et al.15). Thus, determining the integration sites and flanking sequences of exogenous genes in the recipient genome holds profound significance for the safety assessment of genetically modified crops. Since the advent of the first-generation DNA sequencing technology in 1977, substantial progress has occurred in gene sequencing technologies (Xu et al.29). Although the current landscape favors second-generation short-read sequencing technology, the rapid development and recent applications of third-generation sequencing technology underscore its potential across diverse fields. While genomic resequencing techniques have found application in various crops, the analysis of flanking sequences in transgenic events remains underexplored (Zeng et al.33).

In this study, we used Oxford Nanopore Technologies’ third-generation sequencing, a nanopore single-molecule sequencing technique, to determine the integration sites and flanking sequences of exogenous genes in two independent transgenic maize events. Such information is central to robust safety assessments of genetically modified crops and to effective tracking of transformation events. The insertion sites of transgenic maize events ND4401 and ND4403 were determined as Chr05: 38327236–38327270 bp and Chr03: 7866622–7866705 bp, respectively. Although short genomic sequences were deleted at both insertion sites, the T-DNA in both ND4401 and ND4403 was located in non-coding regions of the maize genome. Based on these sequencing results, we designed specific detection primers for these maize transformation events, enabling their rapid identification and discrimination.

As the area dedicated to genetically modified crops grows, the need for detailed research and the ongoing improvement of regulatory frameworks has become essential (Rozas et al.19). Safety assessments of transformation events, which result from the integration of exogenous genes, are now more thorough (Xu and Zhang30; Atherton2). It is important not only to confirm the effectiveness of the target genes in transgenic crops, but also to identify accurately where genes integrate into the genome and to gather comprehensive information about the inserted sequences (Singh et al.23). Consistent with findings from Giraldo et al. (9), our methodology also facilitates the detection of unexpected small insertions, a critical aspect of thorough safety assessments for genetically modified organisms. The methodology used in this study provides a new strategy for the molecular characterization of transgenic plants.

Materials and methods

Vector and plant materials

The recombinant expression vector pCAMBIA1301-ZmNRT1.1A-bar was constructed using the pCAMBIA1301 vector as a backbone. This construct was designed to express the ZmNRT1.1A gene under the regulation of the maize ubiquitin (pZmUbi) promoter, while the bar selection marker was under the control of the cauliflower mosaic virus 35S (pCaMV35S) promoter. Transformation events for maize ND4401 and ND4403 were developed and maintained by our research team. The maize variety y822, soybean variety Williams 82, and rice variety Jigeng 88 were provided by Hongxiang Seed Company, the Soybean Research Institute, and the Rice Research Institute of the Jilin Academy of Agricultural Sciences, respectively.

Plant DNA extraction

Cultivated maize, soybean, and rice were grown in controlled pot conditions. Upon full leaf expansion, approximately 100 mg of leaf tissue was harvested and pulverized under liquid nitrogen. Genomic DNA was extracted utilizing the cetyltrimethylammonium bromide (CTAB) method. Concentration and purity assessments of the isolated DNA were performed using a Nanodrop UV spectrophotometer, and the DNA integrity was verified through 1% agarose gel electrophoresis.

Southern blot assay

For Southern blotting, leaf samples from stable transgenic maize lines ND4401 and ND4403, alongside the non-transgenic control y822, were processed. Roughly 30 µg of genomic DNA was completely digested with the restriction endonuclease Sac I (ND4401) or Xma I (ND4403) (New England Biolabs Inc., USA). The DNA fragments were then blotted onto a Hybond-N+ nylon membrane (GE Amersham, RPN303B, USA). A PCR-amplified bar gene segment served as the probe, with primers detailed in Table 2. PCR reactions were carried out in a 25 µL mixture containing 2× PCR Mix (AS111, TransGen Biotech, Beijing), primers, genomic template, and nuclease-free water, followed by a standard amplification protocol. After PCR, the products were resolved on a 1% agarose gel, and the specific bands were excised and purified. Probe labeling and hybridization followed the kit’s protocol (Labeling and Detection Starter Kit I, Roche Applied Science, USA), with hybridization at 42 °C for 12–16 h. Post-hybridization washes were performed, and the membrane was developed using BCIP/NBT substrate until distinct bands were observed.

Table 2 Prediction of hybridization fragment size between transformant restriction endonuclease fragment and different probes.

Single-molecule nanopore sequencing

The resequencing of the maize transgenic events ND4401 and ND4403 was contracted to the Jiangsu Academy of Agricultural Sciences. Nanopore sequencing, a method developed by Oxford Nanopore Technologies (ONT), was used. This technology is based on monitoring changes in the ionic current as a single DNA strand passes through a nanoscale pore. The bases (A, C, G, T) perturb the current in characteristic ways owing to their different chemical properties, which allows the base sequence to be determined. The sequencing process includes three main steps: preparation of the DNA library, the sequencing itself, and data interpretation. The sequencing data were searched by BLAST against the Nos gene sequence, and the resulting contigs were aligned to the vector sequence to pinpoint the precise insertion loci and flanking boundary sequences.
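The screening step, finding reads that carry a vector-derived marker such as the Nos sequence, can be illustrated with a simple k-mer overlap filter. This is a simplified stand-in for BLAST written in Python, not the pipeline used in the study; the marker and reads below are toy sequences, and the threshold values are assumptions chosen for the example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> set:
    """All distinct k-mers of a sequence (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(read: str, marker: str, k: int = 8) -> float:
    """Fraction of marker k-mers found in the read on either strand."""
    marker_kmers = kmers(marker, k)
    if not marker_kmers:
        return 0.0
    read_kmers = kmers(read, k) | kmers(revcomp(read), k)
    return len(marker_kmers & read_kmers) / len(marker_kmers)


def screen_reads(reads: dict, marker: str, k: int = 8, min_fraction: float = 0.5):
    """Names of reads sharing at least min_fraction of the marker's k-mers."""
    return [name for name, seq in reads.items()
            if shared_kmer_fraction(seq, marker, k) >= min_fraction]


# Toy marker (a placeholder, not the real Nos sequence) and toy reads.
marker = "ACGTTGCATGCCTAGGATCCAAGT"
reads = {
    "noisy_forward": "TTTT" + marker[:-1] + "A" + "CCCC",  # one substitution
    "reverse_strand": revcomp("GG" + marker + "AA"),
    "unrelated": "TTTTTTTTCCCCCCCCAAAAAAAAGGGGGGGG",
}
print(screen_reads(reads, marker))  # ['noisy_forward', 'reverse_strand']
```

A partial-match threshold rather than an exact string search is used because single-molecule reads contain errors; a real analysis would rely on an alignment tool that also models insertions and deletions.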

Boundary sequence PCR validation

Primers flanking the transgene insertion sites were designed, producing PCR amplicons that included native maize genomic sequences and adjacent T-DNA elements. Sequencing services were procured from Kumei company, and the resulting sequences were aligned with both the inserted T-DNA and the reference maize genome.
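The logic of a junction-spanning primer pair can be sketched as an in-silico PCR: one primer sits in the maize genomic flank and the other in the T-DNA, so a product forms only when both are present on the same template. The Python example below uses exact matching on toy sequences; all sequences and names are hypothetical and are not the primers listed in Table 1.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return the predicted amplicon, or None if either primer has no exact site.

    forward matches the top strand; reverse is written 5'->3' and binds the
    bottom strand, so its reverse complement is searched on the top strand.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = revcomp(reverse)
    site = template.find(reverse_site, start + len(forward))
    if site == -1:
        return None
    return template[start:site + len(reverse_site)]


# Toy sequences standing in for a genomic flank, a T-DNA end, and the
# undisrupted genomic sequence of a non-transgenic line.
GENOMIC_LEFT = "ATGCCGTACGGATTCAGGCT"
TDNA = "TTGACAGGATATATTGGCGG"
GENOMIC_RIGHT = "GCTAGCTTAACGGTCA"

forward_primer = GENOMIC_LEFT[3:13]    # anneals in the maize flank
reverse_primer = revcomp(TDNA[8:18])   # anneals in the T-DNA

transgenic = GENOMIC_LEFT + TDNA
wild_type = GENOMIC_LEFT + GENOMIC_RIGHT

amplicon = in_silico_pcr(transgenic, forward_primer, reverse_primer)
print(len(amplicon))                                              # 35
print(in_silico_pcr(wild_type, forward_primer, reverse_primer))   # None
```

Because the amplicon contains the genome/T-DNA junction itself, sequencing it and aligning the result to both the vector and the reference genome confirms the boundary, while the absence of a product in the non-transgenic line reflects the missing T-DNA primer site.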

Specific PCR detection

Following PCR confirmation of the flanking sequences of the transformation events ND4401 and ND4403, event-specific PCR detection primers were designed for each event. The amplified products were analyzed by 1% agarose gel electrophoresis.