Abstract
The composition of natural substances varies with plant species and cultivation environment, which makes quality evaluation a complex problem. A total of 127 substances were identified by GC-MS analysis in agarwood essential oils (AEOs) extracted by hydro-distillation. Sesquiterpenes and small-molecule aromatic substances were the main components of the AEOs, and chromones were significantly fewer. The aromatic compound 4-phenyl-2-butanone was the only component common to all samples. The VIP values and S-plots generated by OPLS-DA models comparing regional groups or pairwise genotypes showed up to 26 potential markers at VIP > 1. The more common components of agarwood, such as the sesquiterpenes α-gurjunene, agarospirol, guaiol and γ-eudesmol and the chromone 2-phenylethyl-4H-chromen-4-one, contributed the most to the VIP value. Supervised OPLS-DA performed better than PLS-DA, providing a reference for the quality evaluation of AEOs. The approach shows that additional information can be obtained when appropriate multivariate modeling is combined with effective visualization of specific marker metabolites for identification.
Introduction
The indigenous Chen-Xiang of China is the resinous part of Aquilaria sinensis (Lour.) Gilg in the Thymelaeaceae family. It is mainly produced in Guangdong, Guangxi, Hainan, and Fujian provinces1. The genus Aquilaria consists of more than 21 species, and all species of Aquilaria and Gyrinops have been listed in Appendix II of the Convention on International Trade in Endangered Species of Wild Fauna and Flora since 2004 (Amendments to appendices I and II of CITES, 2004). Aquilaria malaccensis (A. agallocha Roxb.) is mainly produced in Malaysia, India, Laos, Cambodia, Thailand, and Taiwan island2. In Vietnam, Aquilaria crassna (Kỳ Nam, Trầm Hương, Dó Bầu) is the most important variety and is also widely distributed in Cambodia and Thailand3. Agarwood is harvested extensively to obtain aromatic oils through distillation. The oils have traditionally been used in perfumes in the Middle East and are widely used in advanced perfumes, toiletries, fragrance additives, and other biotechnology products4. The oleoresin component only exists in withered and dying Aquilaria trees. In recent years, the increasing demand for and commercial value of agarwood have intensified its trade, leading to the destruction of natural agarwood forests.
Essential oils are central to aromatherapy, and the extraction method has a significant impact on their components. According to the definition of the International Organization for Standardization (ISO) (ISO/DIS 9235.2), an essential oil is a product obtained from natural materials by water or steam distillation or by mechanical processing of citrus peel. Products obtained by supercritical CO2 extraction, subcritical fluid extraction or organic solvent extraction are also called agarwood extracts or agarwood oils5. The main components are sesquiterpenes, phenylethyl chromone derivatives, and aromatic compounds6,7. Chromone components can be used as important indicator components for the quality evaluation and identification of agarwood. It is generally believed that essential oils with a high content of chromone compounds have better quality8,9. Steam distillation is a common method of extracting AEOs. The extraction device is simple and the product is natural and pollution-free, but only low-boiling-point components can be extracted, and characteristic components such as chromones are lost, resulting in a low yield of essential oil10. Differences in extraction method, plant species, and production area lead to differences in the market price of agarwood oils, which encourages adulteration of the essential oils3. Research on AEOs mainly focuses on the identification of chemical components, but comparison of compounds alone is not enough to support the establishment of a quality evaluation system for agarwood11.
Metabolomic data usually include a wide range of dynamic changes in metabolite concentrations, due to the geographical phenology and processing methods of plant varieties and production areas. Although GC-MS data from widely sourced samples can be compared to find the common components of AEOs, these common components still cannot be used as statistically verified indicators. Our work focused on hydro-distilled AEOs from seven regions covering different habitats and species. Different chemical constituents were identified by GC-MS fingerprinting and multivariate statistical analysis, including partial least squares-discriminant analysis (PLS-DA), orthogonal partial least squares-discriminant analysis (OPLS-DA) and SPSS cluster analysis12,13. These multivariate statistical methods are used to handle the complex data generated by GC-MS. They help in identifying characteristic chemical markers and distinguishing samples from different regions and species14,15,16. Together, the methods provide a comprehensive approach to analyzing AEOs and support accurate identification and differentiation of their chemical components. The components were identified by NIST general database retrieval and literature review, providing a reference for the overall quality evaluation of AEOs.
Results and discussion
GC-MS analysis of volatile components and common chemical compounds
The volatile components of AEO samples from different habitats and 3 species were analyzed against the GC-MS standard mass spectrometry database NIST2020, as shown in Table 1. The 2-(2-phenylethyl) chromone compounds were determined based on the mass spectrometry characteristics and fragmentation patterns summarized in the literature17, combined with the characteristics of ion fragments in this study. A total of 127 compounds with more than 85% similarity were identified from the essential oils of 7 regions, accounting for 28.6–74.6% of the total volatile components (Table 1; Supplementary file, Fig. S1). According to their chemical structures, these volatile components were classified into sesquiterpenes (0.3–70.4%), aromatic compounds (0.6–24.9%), aliphatic compounds (0–8.4%), 2-(2-phenylethyl) chromones (0–12.0%) and others (0.3–7.6%) (Fig. 1). Sesquiterpenes were the most abundant class, with a total of 73 components. Aromatic compounds followed, with 12 components in total. Aliphatic compounds and 2-(2-phenylethyl) chromones were the least numerous, with 10 and 4 components, respectively.
In AEOs extracted by hydro-distillation, the 2-(2-phenylethyl) chromone components often do not appear, owing to the high temperature of the water vapor during extraction or the influence of the solvent used in the analysis11. However, AEOs obtained by supercritical fluid extraction and microwave-assisted extraction usually contain flindersiachromone, 6-methoxy-2-(2-phenylethyl) chromone, 4H-1-benzopyran-4-one chromone, and a few other semi-volatile 2-(2-phenylethyl) chromones11,17. In this study, more aromatic components could be extracted from hydro-distilled AEO samples by pretreatment with 95% ethanol for GC analysis, and chromone components could be clearly detected18.
Low-molecular-weight aromatic compounds are important components of AEOs and are frequently regarded as the primary source of their aroma. More aromatic compounds were detected in resinous agarwood; they were absent from the non-resinous parts and were confirmed as characteristic of the resinous parts19. AEOs contained abundant fatty acids, which may affect the complex process of resin accumulation, prolonging the accumulation time and lengthening the formation time required for agarwood oil yield4. Based on the relative aliphatic content (Fig. 1), the samples were ranked as S3 Taiwan (8.38%) > S2 Hainan (2.34%) > S7 Cambodia (1.71%) > S1 Guangxi (1.64%) > S6 Vietnam B (0.68%) > S5 Vietnam A (0.15%) > S4 Malaysia (0%). In practice, we also regard a lower fatty acid content as an important consideration when evaluating the quality of agarwood.
The phytocomplexity of the AEOs signifies the production of a multitude of plant–fungus mediated secondary metabolites as chemical signals for natural ecological communication. Table 1 shows an aromatic compound, 4-phenyl-2-butanone, as the only common component. A similar component, 3-phenyl-2-butanone, also appeared in hydro-distilled essential oil of A. malaccensis and A. subintegra from Malaysia, Thailand, and Cambodia20. The presence of this common 4-phenyl-2-butanone in A. malaccensis represents an important basis for the chemical analysis of plant–fungal metabolism in wild plants and in vitro plantlets7,21. We consider that the S2 sample probably contained little or no agarwood, as most of it was hydro-distilled directly from heartwood or might have had unknown chemical essences added. Therefore, six common components could be detected in the other 6 AEOs except the S2 sample, namely, 1,1,4,5,6-pentamethyl-2,3-dihydro-1H-indene (aromatic compound), viridiflorol (sesquiterpenoid compound), bis(2-ethylhexyl) phthalate (aromatic compound, plasticizer), in addition to 5-(2-methylpropyl)-nonane and 2,6,10-trimethyl-dodecane (others). One terpenoid of particular interest is viridiflorol, a known common fragrance molecule of agarwood22. Viridiflorol has shown moderate antibacterial activity against Mycobacterium tuberculosis, the causative agent of tuberculosis, in an in vitro assay. It is also produced by the endophytic root fungus Serendipita indica and exhibits antifungal activity against Colletotrichum truncatum23. It was particularly surprising that bis(2-ethylhexyl) phthalate, diethyl phthalate, and dibutyl phthalate were detected in this study (Table 1). These compounds are often used as plasticizers, condensing agents, anti-wear agents, and gas chromatographic stationary liquids for polyvinyl chloride resins. Plasticizers are sometimes mixed into food oils to reduce product costs and should not be regarded as effective components of AEOs. Studies have shown that excessive intake of these plasticizers can have adverse effects on human reproduction, development, and the cardiovascular system24. The total plasticizer content of the S4 sample was about 23.9%, indicating that the quality of the S4 essential oil was poor. Plasticizers were also detected at low levels in some other samples, which might be due to accumulation in the plants themselves or to contamination during GC analysis25.
Sesquiterpenoids are natural terpenoids containing 15 carbon atoms in a molecule composed of three isoprene units11. In addition to viridiflorol, by comparing the samples we identified the following 6 common components worth noting: elemol (sesquiterpene), γ-eudesmol (sesquiterpene), (-)-aristolene (sesquiterpene), agarospirol (sesquiterpene), 2(3H)-naphthalenone, 4,4a,5,6,7,8-hexahydro-4a,5-dimethyl-3-(1-methylidene)-, (4ar cis)- (sesquiterpene) and 2-phenyl-4H-chromen-4-one (chromone derivative) (Table 1). According to modern pharmacology, the sesquiterpene components of agarwood have good biological activity in the central nervous, respiratory, and digestive systems. Elemol is a natural sesquiterpenoid that serves as a fragrance and is associated with the modest antioxidant, anti-inflammatory, and antiproliferative activities of the essential oil of Cymbopogon nardus26,27. Plants with aromatic properties have multiple chemical components in their essential oils; for example, the main components of Blepharocalyx salicifolius essential oil are viridiflorol and eudesmane sesquiterpenes28,29. In the fungus-mediated fermentation of resinous agarwood, the most significant finding was the appearance of key agarwood sesquiterpenes such as agarospirol, γ-eudesmol, and (-)-aristolene28. The sesquiterpenoid 2(3H)-naphthalenone, 4,4a,5,6,7,8-hexahydro-4a,5-dimethyl-3-(1-methylidene)-, (4ar cis)- has not been reported in other agarwood literature. However, it stood out in the comparison of common components in this study, and its relative content was higher than that of the other components.
Chromone, one of the main active components of agarwood, has been isolated and found to have 240 different subunits. It has anti-inflammatory and anti-tumor properties, neuroprotective effects, and inhibitory effects on acetylcholinesterase, tyrosinase, and glucosidase30. It is worth noting that 2-(2-phenylethyl) chromones often do not appear in essential oil extracted by hydro-distillation. However, among the four chromones analyzed in this study, the common component 2-phenyl-4H-chromen-4-one was present in 5 regions, and this chromone component has not been reported elsewhere in the literature. There were significant differences in the relative content of each component, which might lead to differences in the characteristic flavors of agarwood from different habitats.
Multivariate statistical analysis of PLS-DA and OPLS-DA
PLS-DA is a variant of PLS used to establish classification models; it is a supervised discriminant analysis method suitable for cases with small intergroup differences14. It is applied to predictive and descriptive modeling, as well as to selecting discriminative variables, determining the chemical compositions of different genotypes and production regions, and automatically generating the more important principal components15,16. The PLS-DA model displayed clear separation among the 7 regions and 3 genotypes of AEOs (Fig. 2a). The software automatically generated R2X (cum) = 0.848, R2Y (cum) = 1, and Q2 (cum) = 0.854 for predictive ability. In a previous agarwood study, HPLC chromatograms were used in combination with multivariate statistical screening to establish identification methods for wild and cultivated agarwood, and a Fisher linear recognition model and a PLS-DA recognition model were established31. This study established a PLS-DA model based on GC-MS data (Fig. 2a), with Q2 > 0.5 indicating a strong predictive ability. The result showed that there were significant differences in the volatile components of AEOs from different habitats. In addition, the model discriminated the different genotypes to a certain extent. Permutation validation in SIMCA 14.1 software was used to verify the fit of the PLS-DA model (Fig. 2b). After 200 iterations of permutation testing, the Y-axis intercepts were all less than 0, indicating that the PLS-DA model was well fitted and reliable. In addition, the Hotelling’s T2 analysis verified that all samples were within the 95% confidence interval32,33, providing a further evaluation of model performance (Supplementary file, Fig. S2).
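For readers less familiar with the fit statistics quoted above, the block below gives the generic textbook definitions of the explained response variance and the cross-validated predictive ability for a single response. It is a sketch under stated assumptions: the symbols $y_i$, $\hat{y}_i$ and $\hat{y}_{(-i)}$ are introduced here for illustration, and the cumulative values reported by SIMCA are accumulated component by component, so the software's exact computation may differ in detail.

```latex
% y_i: class response of sample i;  \bar{y}: its mean over all samples
% \hat{y}_i: fitted value;  \hat{y}_{(-i)}: prediction for sample i from a
% model built with sample i left out during cross-validation
\begin{aligned}
R^2Y &= 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}, \\
Q^2  &= 1 - \frac{\mathrm{PRESS}}{\sum_i \left(y_i - \bar{y}\right)^2},
\qquad
\mathrm{PRESS} = \sum_i \left(y_i - \hat{y}_{(-i)}\right)^2 .
\end{aligned}
```

Under these definitions, $R^2Y$ measures how well the model fits the samples it was built on, whereas $Q^2$ measures how well it predicts held-out samples, which is why the text uses Q2 > 0.5 as the indicator of predictive ability.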
Fig. 2. Scatter plots obtained by PLS-DA: (a) classification of AEOs into three genotypes; (c) classification of CNA and OCA areas; (b, d) validation results of 200 permutations of the PLS-DA models. Scatter plot obtained by OPLS-DA: (e) classification of CNA and OCA areas; (f) validation results of 200 permutations of the OPLS-DA model. S: A. sinensis, M: A. malaccensis, C: A. crassna. CNA: AEOs from China (S1-S3 samples). OCA: AEOs from outside of China (S4-S7 samples). R2: coefficient of determination, Q2: cross-validated R2. The Hotelling’s T2 analysis showed that all samples were within the 95% confidence interval, so no significant outliers were found.
Supervised methods offer another approach to classification, enhancing the discrimination between specimens by minimizing variance34. In this study, PLS-DA was used to classify AEOs produced in China (CNA) and outside China (OCA). The model indicated that the OCA samples (S4-S7) were dispersed along both the first and the second principal components, whereas the AEOs produced in CNA (S1-S3) aggregated in one quadrant (Fig. 2c). The software automatically generated R2X (cum) = 0.84, R2Y (cum) = 1, Q2 (cum) = 0.981, suggesting that the difference in volatile components among AEOs in OCA was significantly higher than that in CNA. After 200 iterations of permutation testing, the Y-axis intercepts were all less than 0, indicating that the PLS-DA model was well fitted and reliable (Fig. 2d). However, the model was not effective for screening differential volatile markers (Fig. 2c), so we used OPLS-DA to identify these markers.
Supervised OPLS discriminant analysis (DA) was applied to identify the volatile markers for AEOs from different habitats. OPLS has excellent external prediction ability as well as a better visualization effect compared with PLS14. In the OPLS-DA scatter plot (Fig. 2e), the R2X, R2Y, and Q2 for the S1-S3 AEO samples from China and the S4-S7 samples from outside China were 0.84, 1, and 0.997, respectively. The samples were located on either side of zero along the first principal component (X-axis), indicating that the AEOs produced in CNA (S1-S3) could be effectively separated by their volatile components from the AEOs of other regions (S4-S7, except S5), and that the genotype and relative content of the AEOs were different. Two hundred permutation tests were conducted to verify the OPLS-DA model, and the Hotelling’s T2 analysis showed that all samples were within the 95% confidence interval.
In the multivariate statistical analysis, S5 appeared as an outlier, showing the largest differences in components compared with the other samples. The differential components of S5 could be used as volatile markers of A. crassna. Although PLS-DA discriminated the genotypes well, the regional characteristics were not obvious in the quadrants, and the SIMCA software could not present the variable influence on projection (VIP). OPLS-DA maximizes the difference between groups and weakens the difference within groups, which makes it more suitable for separating samples between groups35. Therefore, the VIP diagram was presented for further analysis.
OPLS-DA model screening the volatile markers of AEOs
The VIP value and S-plot evaluation method were employed to identify the key components contributing to the grouping of AEOs. The S-plot, a scatter plot combining the covariance and correlation loading profiles resulting from an OPLS-DA model, was utilized36. Variables with a VIP greater than 1 were deemed statistically significant and served as important markers of the model37,38. The VIP values (Fig. 3a) and S-plot (Fig. 3b) generated by the OPLS-DA model revealed 26 components with a VIP value > 1 (Table 2). Components far from the origin of the S-plot contributed strongly to the classification and were more reliable than components near the origin as potential markers to distinguish the AEOs from different producing regions. Statistical tests in SPSS were carried out on the significant variables to make the model acceptable. It is worth noting that sesquiterpenes and chromones are the index components of agarwood: the sesquiterpenes α-gurjunene (VIP = 4.86), agarospirol (VIP = 2.86), alloaromadendrene (VIP = 2.49) and (-)-aristolene (VIP = 2.37) and the chromone 2-phenylethyl-4H-chromen-4-one (VIP = 2.63) were the components with VIP > 2 between CNA and OCA. α-Gurjunene, which is especially associated with “Hui-An” agarwoods, exhibited the highest VIP value in the OPLS-DA analysis, but this component did not appear in the CNA (S1-S3) group (Table 2, No. 19). The main factor behind this VIP value was the 40.82% content detected in S5, and such an outstanding VIP value made the component easy to distinguish among complex plant metabolites. Moreover, most aromatic components carry distinct aromas associated with AEOs39. Bis(2-ethylhexyl) phthalate raises concerns due to its toxicity, and it remains uncertain whether it originated from the AEOs or from pyrolysis during extraction24,25. Because 16.4% bis(2-ethylhexyl) phthalate was detected in the S4 sample, it also affected the VIP values in the genotype comparison analysis. However, this also clearly highlighted the identification of this component.
AEOs from CNA contained more guaiol, and those from OCA (Malaysia, Vietnam, and Cambodia) contained more α-gurjunene. The total relative contents of differential components between the two producing regions were higher in OCA. Notably, two sesquiterpenes, α-gurjunene and agarospirol, stood out in the S-plot diagram, lying far from the origin and from the main compound groups (Fig. 3b). Specifically, α-gurjunene (VIP = 4.86) significantly influenced the grouping of AEO samples and was positively correlated with the grouping of AEOs. Prior studies have employed OPLS-DA models to discriminate between A. sinensis and its subspecies “Chi-Nan” and to identify potential distinguishing components. Notably, sesquiterpenes, particularly guaiane and eudesmane derivatives, were considered key markers contributing to their odoriferous properties37. Similarly, the sesquiterpenes in AEOs also exhibited significant differences, indicating their potential as characteristic components.
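The two axes of an S-plot can be written compactly. The block below is a sketch of the usual definitions from the chemometrics literature on OPLS-DA, not a transcript of the SIMCA implementation; the symbols $t_1$ (predictive score vector), $x_j$ (centered intensities of variable $j$), $N$ (number of samples) and $s$ (standard deviation) are notation introduced here for illustration.

```latex
% Horizontal axis: covariance between the predictive score and variable j
% (magnitude of the contribution, concentration variant)
p[1]_j = \operatorname{Cov}(t_1, x_j) = \frac{t_1^{\top} x_j}{N - 1}
% Vertical axis: correlation between the predictive score and variable j
% (reliability of the contribution, concentration invariant)
p(\mathrm{corr})[1]_j = \operatorname{Corr}(t_1, x_j)
  = \frac{\operatorname{Cov}(t_1, x_j)}{s_{t_1}\, s_{x_j}}
```

Read this way, a component such as α-gurjunene that lies far from the origin combines a large covariance with a high correlation, which is the sense in which such points are treated as the more reliable candidate markers.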
OPLS-DA model comparing the pairwise genotypes of AEOs
In this study, OPLS-DA effectively modeled two or more classes. In addition to the CNA and OCA analyses mentioned above, three agarwood genotypes, A. sinensis, A. malaccensis and A. crassna, were classified and compared as pairwise genotypes (Fig. 4). The OPLS-DA model analyzed the differential components among the producing regions. The model results were as follows: when comparing A. sinensis and A. malaccensis, R2X (cum) = 0.689, R2Y (cum) = 1, Q2 (cum) = 0.86; when comparing A. sinensis and A. crassna, R2X (cum) = 1, R2Y (cum) = 1, Q2 (cum) = 1; and when comparing A. malaccensis and A. crassna, R2X (cum) = 1, R2Y (cum) = 1, Q2 (cum) = 1. These values indicate that the models could describe most of the GC-MS data and possessed good predictive ability. The volatile components of AEOs exhibited certain similarities within the same genotype, but differences existed between genotypes.
VIP values and S-plot diagrams were used to screen the differential chemical components contributing the most to each pairwise genotype comparison (Fig. 5). The VIP results showed that there were 25 components with a VIP value > 1 between A. sinensis and A. malaccensis, with the sesquiterpene guaiol (VIP = 2.55) being the largest contributor between the two genotypes (excluding bis(2-ethylhexyl) phthalate) (Fig. 5a). Comparing A. sinensis and A. crassna, 25 components with a VIP value > 1 were identified, with α-gurjunene (VIP = 5.28) being the largest contributor (Fig. 5c). Between A. malaccensis and A. crassna, 22 components with a VIP value > 1 were found, with α-gurjunene (VIP = 5.03) being the largest contributor (Fig. 5e).
Fig. 5. VIP value diagrams (left) and S-plot diagrams (right) of the OPLS-DA models comparing the pairwise genotypes of the AEOs. (a and b) A. sinensis and A. malaccensis, (c and d) A. sinensis and A. crassna, (e and f) A. malaccensis and A. crassna. The arrows indicate components with significant differences in contribution between the two genotypes. Plasticizers should not be components of AEOs, so they should be excluded as contributing components.
The results of the differential component analysis revealed that AEOs from A. sinensis contained relatively more guaiol and 2-phenethyl-4H-chromen-4-one, whereas those from A. malaccensis contained more of the sesquiterpene 2-(4a,8-dimethyl-2,3,4,5,6,8a-hexahydro-1H-naphthalen-2-yl)propan-2-ol (Fig. 5a and b). Additionally, when comparing AEOs of A. sinensis and A. crassna, aside from the differences in guaiol, A. sinensis AEOs also contained more agarospirol and 2-phenethyl-4H-chromen-4-one, while the AEOs of A. crassna contained more of the sesquiterpenes α-gurjunene and alloaromadendrene (Fig. 5c and d). In the pairwise comparison of A. malaccensis and A. crassna, α-gurjunene exhibited significant differences in contribution (Fig. 5e and f), with AEOs of A. malaccensis containing more of the sesquiterpene 2-(4a,8-dimethyl-2,3,4,5,6,8a-hexahydro-1H-naphthalen-2-yl)propan-2-ol, while AEOs of A. crassna contained more of the sesquiterpene γ-eudesmol. These contributing components of the pairwise genotype comparisons could serve as potential markers to distinguish AEOs of different species. The results demonstrated that the production regions of AEOs could be better distinguished based on chemometrics.
Analysis of such multivariate data requires a methodology capable of handling both the contribution to the OPLS model (concentration variant) and the correlation to the OPLS model (concentration invariant)36. A statistical test in SPSS (P < 0.05) was carried out on the significant variables to make the model acceptable. Based on the above OPLS-DA results, the unique phytochemical characteristics of the various species may be related to the genetic information of the primitive plant germplasm or of endophytic fungi. The current strategy addresses this complex problem by obtaining additional information when appropriate multivariate modeling is combined with effective visualization of specific marker metabolites for identification.
Conclusion
AEOs produced by hydro-extraction have a significant international market, particularly in Muslim regions. For the first time, we utilized GC-MS to delineate the chemical fingerprints of AEOs from three primary genotypes, A. sinensis, A. malaccensis and A. crassna, and analyzed the differences in aroma components across various production regions. Metabolomics data typically encompass vast dynamic ranges in metabolite concentration. Here, we reveal distinctive differences in sesquiterpenes, chromone and its derivatives, and low-molecular-weight aromatic compounds. A total of 127 compounds were identified from the AEOs, with sesquiterpenes comprising the majority, totaling 73 components. The aromatic compound 4-phenyl-2-butanone was the sole common component among the seven samples. Additionally, there were 7 common components with a higher occurrence, mainly sesquiterpenes and a chromone: viridiflorol; elemol; γ-eudesmol; (-)-aristolene; agarospirol; 2(3H)-naphthalenone, 4,4a,5,6,7,8-hexahydro-4a,5-dimethyl-3-(1-methylidene)-, (4ar cis)-; and 2-phenyl-4H-chromen-4-one. It was particularly surprising that the plasticizers bis(2-ethylhexyl) phthalate, diethyl phthalate and dibutyl phthalate were detected in this study. The total plasticizer content of the S4 sample was about 23.9%, reflecting the poor quality of the S4 essential oil. Other samples exhibited low levels, likely due to contamination during GC analysis.
PLS-DA and OPLS-DA methods were employed for multivariate statistical analysis of the differential chemical components between different genotypes and habitats. The results demonstrated that the AEOs from different habitats could be effectively classified and identified based on GC-MS combined with chemometrics. In OPLS-DA, 26 differential markers, including 17 sesquiterpenes, 2 chromones and 3 aromatics, were identified according to VIP value. The VIP values and S-plot generated by the comparison of regional groups (CNA and OCA) in the OPLS-DA model showed a total of 26 potential markers at VIP > 1, and up to 25 potential markers were generated by each comparison of two genotypes. Components such as α-gurjunene, agarospirol, guaiol, γ-eudesmol and 2-phenethyl-4H-chromen-4-one, which are documented in the agarwood literature, contributed the most to the VIP value. The unique phytochemical characteristics of agarwood may be related to the interaction between the original plant germplasm and invasive microorganisms. The current strategy focuses on this complex issue. By using multivariate statistical analysis, the indicator components can be scientifically highlighted, even if additional chemicals such as plasticizers are added to reduce product costs. The strategy therefore shows that additional information can be obtained when appropriate multivariate modeling is combined with effective visualization of specific marker metabolites for identification.
Experimental section
Plant materials
AEOs were collected from seven regions: Guangxi, Hainan and Taiwan for the China areas, and the Vietnam, Cambodia and Malaysia producing areas for Southeast Asia. The essential oils had been obtained by water or steam distillation in the production regions and were purchased from local shops. Six samples were randomly selected from each planting region. See Table 3 for the source information.
GC–MS analyses of AEOs
A 30 mg portion of essential oil was accurately weighed into a 5 mL EP tube and dissolved in 2 mL of ethyl acetate (China National Pharmaceutical Group Chemical Reagent Co., Ltd., China). The solution was shaken well and left to stand for 2 h. Then 1 mL of the essential oil solution was filtered through a 0.45 μm filter membrane for gas chromatography-mass spectrometry analysis.
The compositions of the essential oils were analyzed by GCMS-QP2010 Plus (Shimadzu, Tokyo, Japan), equipped with an SH-Rxi-5Sil MS Cap. column (30 m × 0.25 mm i.d., 0.25 μm film thickness; Shimadzu, Japan). The temperature program was as follows: initial temperature 90 °C for 2 min, then increased by 2 °C min−1 to 150 °C and held for 5 min, and then increased by 2 °C min−1 to 280 °C and held for 5 min. The other parameters were as follows: injection temperature, 250 °C; ion source temperature, 230 °C; EI, 70 eV; carrier gas, He at 1 ml min−1; injection volume, 1 ml; split ratio, 1:20; solvent delay, 2.5 min; and mass range, m/z 50–550. Quantification was based on percentage peak areas from the gas chromatogram. Individual compounds were identified by searching the NIST2020 (National Institute of Standards and Technology, U.S. Department of Commerce) Registry of Mass Spectral Database for matching reference compounds. Chromatographic results expressed as area percentages were calculated with a response factor of 1.0.
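As a quick arithmetic check on the temperature program above, the total oven time can be worked out from the stated holds and ramps. The following Python sketch is illustrative only; the function name `run_time_min` and the tuple layout are assumptions made for this example and are not part of the instrument software.

```python
def run_time_min(initial_temp, initial_hold, ramps):
    """Total oven-program time in minutes.

    ramps: list of (rate in deg C per min, target temp in deg C, hold in min).
    """
    total, temp = float(initial_hold), initial_temp
    for rate, target, hold in ramps:
        # Time spent ramping to the target, plus the hold at that temperature.
        total += (target - temp) / rate + hold
        temp = target
    return total


# Program from the text: 90 deg C for 2 min, 2 deg C/min to 150 deg C (hold 5 min),
# then 2 deg C/min to 280 deg C (hold 5 min).
program_min = run_time_min(90, 2, [(2, 150, 5), (2, 280, 5)])
print(program_min)  # 107.0
```

Under these settings each injection occupies the oven for 2 + 30 + 5 + 65 + 5 = 107 min, which is worth keeping in mind when planning the six-fold replicate runs described in the methodological tests.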
Methodological examination
Precision test
One of the S1–S7 regions of AEOs from different sources (Table 3), such as S1, was randomly selected. Equal amounts were drawn from the six samples of that region and thoroughly mixed to form one sample. The test solution was prepared according to the preprocessing description above, and GC-MS analysis was conducted under the above chromatographic and mass spectrometric conditions. Following the same process, the analysis was repeated six times on the mixed S1 sample. The six data points were compared using the Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine (Version 2012) (Chinese Pharmacopoeia Commission, China), and a similarity of no less than 0.99 indicated good precision of the instrument.
Repeatability test
For the repeatability test, samples of AEOs from the same source (such as S1) were used. Six samples were prepared according to the steps described above, and each sample was weighed precisely. GC-MS analysis was conducted as described above. The six data points were compared using the Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine, and a similarity of no less than 0.99 indicated good repeatability of the method.
Stability test
Any sample solution from S1-S7 was randomly selected, and the selected sample (such as S1) was dissolved into a tube of solution following the preprocessing steps. The solution was stored for different times: 2, 4, 6, 8, 12, and 24 h for GC-MS analysis. The six data points were compared using the Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine, and a similarity of no less than 0.99 indicated that the test solution was stable within 24 h.
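The 0.99 acceptance criterion used in the precision, repeatability and stability tests can be illustrated with a small Python sketch. The source does not state which similarity measure the Similarity Evaluation System applies, so the cosine (vector-angle) similarity below is an assumption made for illustration, as are the function names and the peak-area values. The reference fingerprint is taken as the per-peak median, in line with the median method mentioned under Data processing.

```python
import math
import statistics


def cosine_similarity(a, b):
    """Cosine of the angle between two aligned peak-area vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def check_similarity(runs, threshold=0.99):
    """Compare each run with the median reference fingerprint."""
    n_peaks = len(runs[0])
    # Reference fingerprint: median area of each aligned peak across runs.
    reference = [statistics.median(run[k] for run in runs) for k in range(n_peaks)]
    sims = [cosine_similarity(run, reference) for run in runs]
    return sims, all(s >= threshold for s in sims)


# Hypothetical aligned peak areas for three replicate injections.
runs = [
    [120.0, 45.0, 300.0, 15.0],
    [118.0, 46.0, 305.0, 14.0],
    [121.0, 44.0, 298.0, 16.0],
]
sims, ok = check_similarity(runs)
print([round(s, 4) for s in sims], ok)
```

Because cosine similarity depends only on the relative pattern of peak areas, it is insensitive to a uniform change in injected amount, which is one reason such measures are commonly used for fingerprint comparison.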
Data processing
Each experiment was repeated three times. The volatile components of the samples were qualitatively analyzed by mass spectrometry against the NIST2020 database. Peak area normalization was used to calculate the relative percentage content. Substances with a spectral match similarity greater than 85% were identified as potential chemical components of AEOs. Using the Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine (Version 2012) with a time window of 0.2, peaks were matched automatically through multi-point correction using the median method. The similarity and common peaks between each sample and the reference map were calculated, and a GC-MS fingerprint map was constructed.
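The peak area normalization and the 85% match cut-off described above can be illustrated with a minimal Python sketch. The function name, the tuple layout, and the example peak table are hypothetical; normalizing over all integrated peaks before applying the cut-off is also an assumption, since the text does not state the order of the two steps.

```python
def relative_contents(peaks, min_match=85.0):
    """Return (name, area %) for peaks whose library match exceeds min_match.

    peaks: list of (name, peak_area, match_percent) tuples.
    Area percentages are computed over all integrated peaks, which is
    equivalent to a response factor of 1.0 for every compound.
    """
    total_area = sum(area for _, area, _ in peaks)
    if total_area <= 0:
        raise ValueError("total peak area must be positive")
    return [
        (name, 100.0 * area / total_area)
        for name, area, match in peaks
        if match > min_match
    ]


# Made-up peak table for illustration (not data from the study).
example_peaks = [
    ("compound_A", 600.0, 92.0),
    ("compound_B", 300.0, 88.0),
    ("compound_C", 100.0, 70.0),  # match not above 85 %, so not reported
]

for name, percent in relative_contents(example_peaks):
    print(f"{name}: {percent:.1f} %")
```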
Multivariate analysis
Multivariate data analysis was performed with SIMCA 14.1 software (Umetrics Co., Sweden). The compound data were normalized and then analyzed with the PLS-DA and OPLS-DA modules, which were used for discrimination and for deriving potential markers (VIP score > 1)36. Univariate statistical analysis was used to confirm the differentially expressed features (p < 0.05). Finally, cluster analysis was carried out with SPSS 27.0 data processing software, using between-groups linkage with the Euclidean distance as the sample measure, to determine the differences between the producing regions and species of AEOs.
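To make the VIP > 1 criterion concrete, the following Python sketch computes variable importance in projection (VIP) scores from a two-class PLS model fitted with the NIPALS PLS1 algorithm. It is an illustration under stated assumptions, not SIMCA's implementation and not OPLS-DA: the class label is coded as a 0/1 dummy variable, the data are mean-centred only, and the example matrix contains made-up values rather than data from the study. All function and variable names are hypothetical.

```python
import math


def column_center(X):
    """Subtract the column mean from every variable."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


def pls1_vip(X, y, n_components=2):
    """VIP scores from a NIPALS PLS1 fit (X: samples x variables, y: 0/1 labels)."""
    Xc = column_center(X)
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    n_vars = len(Xc[0])
    weights, ssy = [], []
    for _ in range(n_components):
        # Weight vector: covariance of each variable with y, scaled to unit length.
        w = [sum(row[j] * yi for row, yi in zip(Xc, yc)) for j in range(n_vars)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in X that co-varies with y
        w = [v / norm for v in w]
        t = [sum(x * wj for x, wj in zip(row, w)) for row in Xc]  # scores
        tt = sum(v * v for v in t)
        q = sum(ti * yi for ti, yi in zip(t, yc)) / tt            # y loading
        load = [sum(row[j] * ti for row, ti in zip(Xc, t)) / tt
                for j in range(n_vars)]                           # X loadings
        # Deflate X and y before extracting the next component.
        Xc = [[row[j] - ti * load[j] for j in range(n_vars)]
              for row, ti in zip(Xc, t)]
        yc = [yi - q * ti for yi, ti in zip(yc, t)]
        weights.append(w)
        ssy.append(q * q * tt)  # y sum of squares explained by this component
    total = sum(ssy)
    if total == 0:
        raise ValueError("no component explains any variation in y")
    return [
        math.sqrt(n_vars * sum(s * w[j] ** 2 for s, w in zip(ssy, weights)) / total)
        for j in range(n_vars)
    ]


# Made-up relative contents: six samples x three compounds, two classes.
X = [
    [10.0, 5.0, 1.0],
    [11.0, 5.2, 0.8],
    [9.5, 4.9, 1.1],
    [20.0, 5.1, 0.9],
    [21.0, 5.0, 1.2],
    [19.5, 4.8, 1.0],
]
y = [0, 0, 0, 1, 1, 1]

vip = pls1_vip(X, y)
markers = [j for j, v in enumerate(vip) if v > 1.0]
print([round(v, 3) for v in vip], markers)
```

Because the weight vectors have unit length, the squared VIP scores average to 1 across variables, which is why VIP > 1 is commonly read as an above-average contribution to the class separation.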
Data availability
All data generated or analyzed during this study are included in this published article and its supplementary information file.
References
Hu, Z. K., Chen, Y., Yan, T. T. & Li, G. Y. Differences between Qi-Nan agarwood and ordinary agarwood in secretion properties and antibacterial activity. Sci. Sil Sin. 59, 136–144. https://doi.org/10.11707/j.1001-7488.LYKX20210536 (2023).
Persoon, G. A. & Beek, H. H. Growing ‘The wood of the gods’: agarwood production in Southeast Asia. In: (eds Snelder, D. J. & Lasco, R. D.) Smallholder Tree Growing for Rural Development and Environmental Services: Lessons from Asia. Springer, Netherlands, Dordrecht. 245–262. (2008).
Nguyen, H. T. et al. Multi-platform metabolomics and a genetic approach support the authentication of agarwood produced by Aquilaria crassna and Aquilaria malaccensis. J. Pharmaceut Biomed. 142, 136–144. https://doi.org/10.1016/j.jpba.2017.04.047 (2017).
Chen, H. Q. et al. Comparison of compositions and antimicrobial activities of essential oils from chemically stimulated agarwood, wild agarwood and healthy Aquilaria sinensis (Lour.) Gilg trees. Molecules 16, 4884–4896. https://doi.org/10.3390/molecules16064884 (2011).
Agarwood extract. Forestry Industry Standard of the People’s Republic of China (LY/T 3361 – 2023). National Forestry and Grassland Administration. China Standards Press. (2023). https://www.spc.net.cn
Li, W. et al. Five new eudesmane-type sesquiterpenoids from Chinese agarwood induced by artificial holing. Fitoterapia 100, 44–49. https://doi.org/10.1016/j.fitote.2014.11.010 (2015).
Wong, Y. F., Chin, S. T., Perlmutter, P. & Marriott, P. J. Evaluation of comprehensive two-dimensional gas chromatography with accurate mass time-of-flight mass spectrometry for the metabolic profiling of plant–fungus interaction in Aquilaria malaccensis. J. Chromatogr. A. 1387, 104–115. https://doi.org/10.1016/j.chroma.2015.01.096 (2015).
Yang, D. L. et al. Fragrant agarofuran and eremophilane sesquiterpenes in agarwood ‘Qi-Nan’ from Aquilaria sinensis. Phytochem Lett. 8, 121–125. https://doi.org/10.1016/j.phytol.2014.03.003 (2014).
Yan, T. T., Chen, Y., Wang, Q., Shang, L. L. & Li, G. Y. Quality analyses of domestic agarwood (Aquilaria sinensis) in China by LC-MS combining with Chemometrics. China Wood Ind. 33, 14–18. https://doi.org/10.19455/j.mcgy.20190604 (2019).
Geng, T. Y., Luo, L. Y., Zeng, L., Wang, X. H. & Duan, Y. Z. On effect of different extraction methods on preparation of agarwood essential oil. J. Southwest. China Normal Univ. (Natural Sci. Edition). 45, 59–67. https://doi.org/10.13718/j.cnki.xsxb.2020.02.009 (2020).
Chen, X. Q. et al. Chemical composition and potential properties in mental illness (anxiety, depression and insomnia) of agarwood essential oil: a review. Molecules 27, 4528–4545. https://doi.org/10.3390/molecules27144528 (2022).
Nguyen, H. T. et al. UPLC-QTOFMS based metabolomics followed by stepwise partial least square-discriminant analysis (PLS-DA) explore the possible relation between the variations in secondary metabolites and the phylogenetic divergences of the genus Panax. J. Chromatogr. B. 1012–1013. https://doi.org/10.1016/j.jchromb.2016.01.002 (2016).
Guo, D. Y. et al. Chemical composition analysis and discrimination of essential oils of Artemisia Argyi Folium from different germplasm resources based on electronic nose and GC/MS combined with chemometrics. Chem. Biodivers. 20, e202200991. https://doi.org/10.1002/cbdv.202200991 (2023).
Bylesjo, M., Rantalainen, M., Cloarec, O. & Nicholson, J. K. OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification. J. Chemometr. 20, 341–351. https://doi.org/10.1002/cem.1006 (2006).
Lee, L. C., Liong, C. Y. & Jemain, A. A. Partial least squares-discriminant analysis (PLS-DA) for classification of high-dimensional (HD) data: a review of contemporary practice strategies and knowledge gaps. Analyst 143, 3526–3539. https://doi.org/10.1039/C8AN00599K (2018).
Huang, Y. M. et al. Comparison of a novel PLS1-DA, traditional PLS2-DA and assigned PLS1-DA for classification by molecular spectroscopy. Chemometr Intell. Lab. 209, 104225. https://doi.org/10.1016/j.chemolab.2020.104225 (2021).
Mei, W. L. et al. Characterization and determination of 2-(2-phenylethyl)chromones in agarwood by GC-MS. Molecules 18, 12324–12345. https://doi.org/10.3390/molecules181012324 (2013).
Zhang, Y. et al. Temporal characteristics of agarwood formation in Aquilaria sinensis after applying whole-tree agarwood-inducing technique. Chin. Herb. Med. 15, 37–44. https://doi.org/10.1016/j.chmed.2022.07.003 (2023).
Takamatsu, S. & Ito, M. Correction to: Agarotetrol: a source compound for low molecular weight aromatic compounds from agarwood heating. J. Nat. Med. 73, 685. https://doi.org/10.1007/s11418-019-01310-x (2019).
Hashim, Y. Z. H. Y., Ismail, N. I. & Abbas, P. Analysis of chemical compounds of agarwood oil from different species by gas chromatography mass spectrometry (GCMS). IIUM Eng. J. 15, 55–60. https://doi.org/10.31436/iiumej.v15i1.469 (2014).
Chung, J. P. & Chen, K. S. Developing an in vitro quasi-symbiotic culture system of Aquilaria malaccensis with potentially beneficial microbes on agarwood resinous formation. Plant. Cell. Tiss Org. Cult. 133, 193–202. https://doi.org/10.1007/s11240-017-1372-7 (2018).
Yan, T. T., Yang, S., Chen, Y., Wang, Q. & Li, G. Y. Chemical profiles of cultivated agarwood induced by different techniques. Molecules 24, 1990. https://doi.org/10.3390/molecules24101990 (2019).
Ntana, F. et al. A sesquiterpene synthase from the endophytic fungus Serendipita indica catalyzes formation of viridiflorol. Biomolecules 11, 898–914. https://doi.org/10.3390/biom11060898 (2021).
Luo, Q. et al. Global review of phthalates in edible oil: an emerging and nonnegligible exposure source to human. Sci. Total Environ. 704, 135369. https://doi.org/10.1016/j.scitotenv.2019.135369 (2019).
Kdimy, A. et al. Isolation of two plasticizers, bis(2-ethylhexyl)terephthalate and bis(2-ethylhexyl) phthalate, from Capparis spinosa L. Leaves. Chem. Biodivers. 20, e202300903. https://doi.org/10.1002/cbdv.202300903 (2023).
Pellegrini, M. C., Alonso-Salces, R. M., Umpierrez, M. L., Rossini, C. & Fuselli, S. R. Chemical composition, antimicrobial activity, and mode of action of essential oils against Paenibacillus larvae, etiological agent of American foulbrood on Apis mellifera. Chem. Biodivers. 14, e1600382. https://doi.org/10.1002/cbdv.201600382 (2017).
Bayala, B. et al. Chemical composition, antioxidant, anti-inflammatory and antiproliferative activities of the essential oil of Cymbopogon nardus, a plant used in traditional medicine. Biomol. Concepts. 11, 86–96. https://doi.org/10.1515/bmc-2020-0007 (2020).
Sen, S., Dehingia, M., Talukdar, N. C. & Khan, M. Chemometric analysis reveals links in the formation of fragrant biomolecules during agarwood (Aquilaria malaccensis) and fungal interactions. Sci. Rep. 7, 44406. https://doi.org/10.1038/srep44406 (2017).
Furtado, F. B. et al. Chemical composition and bioactivity of essential oil from Blepharocalyx Salicifolius. Int. J. Mol. Sci. 19, 33. https://doi.org/10.3390/ijms19010033 (2018).
Mi, C. N. et al. 2-(2-phenylethyl)chromone derivatives: promising α-glucosidase inhibitors in agarwood from Aquilaria filaria. Phytochem 181, 112578. https://doi.org/10.1016/j.phytochem.2020.112578 (2021).
Shang, L. L., Chen, Y., Yan, T. T., Wang, Q. & Li, G. Y. Identification of wild agarwood and cultivated agarwood by HPLC coupled with multivariate data analysis. Chem. Ind. For. Prod. 38, 33–41. https://doi.org/10.3969/j.issn.0253-2417.2018.06.005 (2018).
Hotelling, H. The generalization of Student’s ratio. Ann. Math. Stat. 2, 360–378 (1931).
Srivastava, M. S. & Awan, H. M. On the robustness of Hotelling’s T2-test and distribution of linear and quadratic forms in sampling from a mixture of two multivariate normal populations. Commun. Stat-Theor. M. 11, 81–107. https://doi.org/10.1080/03610928208828219 (1982).
Farag, M. A., Dokalahy, E. U., Eissa, T. F., Kamal, I. M. & Zayed, A. Chemometrics-based aroma discrimination of 14 Egyptian mango fruits of different cultivars and origins, and their response to probiotics analyzed via SPME coupled to GC–MS. ACS Omega 7, 2377–2390 (2022). http://pubs.acs.org/journal/acsodf
Trygg, J. & Wold, S. Orthogonal projections to latent structures (OPLS). J. Chemometr. 16, 119–128. https://doi.org/10.1002/cem.695 (2002).
Wiklund, S. et al. Visualization of GC/TOF-MS-based metabolomics data for identification of biochemically interesting compounds using OPLS class models. Anal. Chem. 80, 115–122. https://doi.org/10.1021/ac0713510 (2008).
Yu, M. et al. Remarkable phytochemical characteristics of Chi-Nan agarwood induced from new-found Chi-Nan germplasm of Aquilaria sinensis compared with ordinary agarwood. Int. J. Anal. Chem. Article ID 5593730, 10 pages. https://doi.org/10.1155/2021/5593730 (2021).
Wang, W., Jin, S. & Guo, Y. L. Exploration of a method of distinguishing different Nongxiang Tieguanyin tea grades based on aroma determined by GC-MS combined with chemometrics. Molecules 24, 1707. https://doi.org/10.3390/molecules24091707 (2019).
Chen, H. Q. et al. Chemical constituents of agarwood originating from the endemic genus Aquilaria plants. Chem. Biodivers. 9, 236–250. https://doi.org/10.1002/cbdv.201100077 (2012).
Acknowledgements
This research received financial support from Fujian Provincial Department of Science and Technology, China under Contract No. 2022N5004 and 2023S0036.
Author information
Contributions
SZ Qian, YM Jiang and WX Zhang conceived the research. SZ Qian and QL Yan performed the research. JP Chung wrote the paper. WX Zhang and JP Chung revised the paper. DH Wu provided the materials and resources for the research. WX Zhang and JP Chung took responsibility for the integrity of the work.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Qian, SZ., Jiang, YM., Yan, QL. et al. Visualization OPLS class models of GC-MS-based metabolomics data for identifying agarwood essential oil extracted by hydro-distillation. Sci Rep 15, 5421 (2025). https://doi.org/10.1038/s41598-025-85976-2