Introduction

The liver, often described as the body’s metabolic hub and detoxification powerhouse, is essential for maintaining overall health and vitality. However, in today’s world, liver health is compromised with myriads of challenges, ranging from toxins, unhealthy lifestyle to exposure of agents that harm liver cells and impair its function. This backdrop has contributed to a global surge in liver diseases including fatty liver, cirrhosis, viral hepatitis and liver cancer, causing more than two million deaths each year1. It is reported that one out of every 25 deaths are due to liver diseases. Today the third biggest cause of mortality is liver cancer, which is the sixth most prevalent cancer in the world. In terms of prevalence, it ranks ninth among cancers in women and fifth among males. Hepatocellular carcinomas (HCC), makes up around 75% of liver cancer cases and is the most prevalent kind of liver cancer. Amongst those suffering from chronic liver diseases, HCC is one of the major reasons of death. Even more concerning is the significant burden cirrhosis places on global healthcare systems, as it ranks as the 15th leading cause of disability-adjusted life-years (DALYs) worldwide. Once cirrhosis develops, mortality risk increases five–tenfold2. Hepatitis B, hepatitis C virus (HCV), Nonalcoholic fatty liver disease (NAFLD) and alcoholic liver disease are the main causes of cirrhosis and chronic liver disorders. NAFLD has emerged as one of the main causes of HCC in current years3. If NAFLD aggravates, it can result into a state known as nonalcoholic steatohepatitis, or NASH. NASH can lead to liver cancer, fibrosis (liver scarring), cirrhosis (everlasting scarring and hardening of the liver), and finally liver failure. One in five people with NAFLD, according to experts, will eventually develop NASH. The persistent inflammatory atmosphere within and around the liver may ultimately end up in liver cancer if NAFLD is not treated. Thus, one of the major risk factors for liver cancer is NAFLD. However, majority of people have limited access to costly HCC treatments such as targeted therapy, immunotherapy, radiation therapy, and surgery4,5. Addressing this rising prevalence of liver diseases necessitates not only improved diagnostic modalities but also novel preventive and therapeutic strategies aimed at preserving liver function and mitigating damage6,7,8.

In recent years, the concept of hepatoprotection has gained prominence as researchers and clinicians recognize the importance of preserving liver health and function9,10. The quest for newer hepatoprotective compounds may include exploration of traditional herbal remedies to synthetic pharmaceuticals11,12,13,14,15. Currently, there has been a growing focus on the study of phytochemicals as promising hepatoprotective agents. The global market has approximately 600 commercially available herbal formulations acknowledged for their hepatoprotective properties16,17. These phytochemicals offer a diverse array of mechanisms that contribute to their hepatoprotective properties. Acting as antioxidants, anti-inflammatory, and stimulators of liver regeneration; phytochemicals mitigate oxidative stress, inflammation, and damage within the liver18,19,20. They enhance detoxification processes, shield the liver from hepatotoxic agents, and inhibit hepatic stellate cell activation, crucial in fibrosis development21.

One of the best examples is Silymarin, a composite of flavonolignans (silybin, isosilybin, silydianin, and silychristin) from the dried seeds of Silybum marianum (milk thistle), well-known for its hepatoprotective effects through modulation of liver biochemical markers, induction of Nrf2 (nuclear factor-erythroid 2-related factor 2) expression, and anti-inflammatory properties22,23. Another prominent example is phyllanthin and hypophyllanthin, potent hepatoprotective lignans found in Phyllanthus niruri Linn., demonstrate significant liver-protective properties against various toxins, including carbon tetrachloride and alcohol, through mechanisms involving the inhibition of superoxide and hydroxyl radicals and lipid peroxidation24. Andrographis paniculata, known as the "king of bitters," contains active compounds such as andrographolide and neoandrographolide, which exhibit hepatoprotective effects through antioxidant and anti-inflammatory mechanisms25. Likewise, numerous herbs are documented for their hepatoprotective properties. However, it is noteworthy that only a limited fraction of hepatoprotective plants and formulations within traditional medicine have undergone rigorous pharmacological assessments26. The active constituents behind the hepatoprotective effect of plants such as Phyllanthus amarus, Boerhavia diffusa, Eclipta alba, Picrorhiza kurroa, Solanum nigrum, Tephrosia purpurea, Cichorium intybus, Embelia ribes, Aegle marmelos, Tinospora cordifolia, Terminalia arjuna, Tamarix gallica, Cressa cretica, Clerodendrum infortunatum, and Fumaria officinalis are not well known. Several phytoconstituents, including berberine, quercetin, kaempferol, resveratrol, ellagic acid, rutin, genistein, lycopene, epigallocatechin gallate (EGCG), allicin, gingerol, boswellic acid, and ginkgolides, have not had their mechanisms well studied despite their potential hepatoprotective effects27. Thus, to address these gaps, current research strategies should prioritize the scientific exploration of herbal-based medicines to uncover their hepatoprotective mechanisms, safety profiles, and efficacy. By delving into the mysteries hidden within herbal plants, we can unlock their full potential for liver health and develop novel therapeutic interventions. With the aim of identifying novel compounds for hepatoprotection, this study employed a comprehensive approach integrating advanced computational techniques, including network pharmacology, structure-based (molecular docking), ligand-based (pharmacophore modeling) approaches, and virtual screening, with knowledge of traditional herbal medicine (Fig. 1).

Fig. 1
figure 1

Research strategy for the current study.

Material and methods

Selection of protein targets and construction of protein–protein interaction (PPI) network

Through an extensive review of existing literature, multiple proteins implicated in the development and progression of liver disorders were identified. Protein–protein interaction studies were conducted using the STRING 11.0 (https://string-db.org/) database28 to construct the protein network. By specifying ‘Multiple Proteins’ as the selection criteria, the species was designated as ‘Homo sapiens’, though all other factors were maintained at default settings. Subsequently, the resulting protein–protein interaction network was imported into Cytoscape V 3.7.2 (https://cytoscape.org/)29 software for further analysis. This network was examined based on a confidence score threshold of > 0.7 to ensure high reliability. Utilizing the Network Analyzer tool within Cytoscape, the network’s topology was evaluated based on degree. Finally, core targets linked to liver diseases were identified through the generated network.

Gene ontology (GO) and Kyoto encyclopedia of genes and genomes (KEGG) pathway enrichment analysis

To explore the underlying biological processes, the major targets were subjected to GO and KEGG pathway enrichment analysis utilizing the Database for Annotation, Visualization, and Integrated Discovery (DAVID, https://david.ncifcrf.gov/)30. The signaling pathways recognized through GO and KEGG enrichment analysis (www.kegg.jp/kegg/kegg1.html)31,32,33 were rigorously assessed, with selection criteria based on a false discovery rate (FDR) threshold of less than 0.05. As the FDR value decreases, the correlation between several targets within these signaling pathways and the pharmacological actions of the associated drug constituents becomes more robust. The plots were generated using SRplot for visualization34.

Selection of phytocompounds

Hepatoprotective plants used in commercially available polyherbal formulations were chosen for investigation. Phytochemical constituents of these plants, known for their hepatoprotective properties, were identified through an exhaustive review of existing literature across multiple electronic databases such as PubMed, Embase, Web of Science, ScienceDirect, Google Scholar, etc. Our quest involved use of various keywords, including “Hepatoprotection,” “Herbal Plants,” “Liver Disease,” “Phytoconstituents,” “Liver Health,” “Active Constituent,” and “Hepatoprotective Activity” Emphasis was focused on identifying phytoconstituents with reported IC50 values, and compounds tested on HepG2 cell lines were included to ensure data robustness.

Molecular docking studies

Docking studies were conducted utilizing Molsoft ICM-Pro V.3.9.2a (https://www.molsoft.com/icm_pro.html)35 to investigate the interactions between hepatoprotective phytoconstituents and identified proteins.

Ligand preparation

The 2D structures of the selected phytoconstituents were initially generated using the ICM molecular editor and compiled for docking. These structures were subsequently converted into their 3D formats by retaining hydrogen atoms, correcting the orientations of amide bonds, and performing energy optimization using the software built-in tools. Furthermore, 3D conformers of the compounds were generated and used for docking to ensure accurate interpretation of their interactions with the target proteins.

Protein preparation

X-ray diffraction crystal structures of proteins were obtained from the RCSB (https://www.rcsb.org/) protein database36 and converted into ICM format. For the proteins with co-crystalized ligands, AKT1 (PDB ID: 7NH5), TNF (PDB ID: 2AZ5), and mTOR (PDB ID: 4JT5); co-crystalized ligands were removed for optimization and saved as separate objects. Docking was performed with the grid centered around the co-crystalized ligands, keeping the target structures rigid while keeping the ligands flexible. For the proteins lacking co-crystalized ligands, NF-kβ1 (PDB ID: 1SVC) and INFAR1 (PDB ID: 3S98); the ICM Pocket Finder module was utilized for the identification of the ligand binding sites. This module employs a grid potential map based on Van der Waals interactions between the receptor and ligand surface to locate potential binding pockets. Notably, the identification process prioritized physical interactions over geometric criteria. The default settings of the ICM pocket finder were maintained without alterations during the analysis.

Molecular docking

Molecular docking of the selected proteins with the selected ligands was conducted using Molsoft ICM-Pro V.3.9.2a. The docking parameters were set with a thoroughness level of 3 and the number of conformers set to 10. The optimal conformer was chosen based on docking results, and intermolecular interactions were analyzed. Docking studies were validated through comparison of RMSD values37,38.

Pharmacophore modeling

The ligand-based pharmacophore was constructed employing the common feature pharmacophore generation protocol within Molsoft ICM-Pro V.3.9.2a (https://www.molsoft.com/icm_pro.html) software wherein top five compounds identified from each docking experiments were chosen. The 3D structures of selected compounds were flexibly superimposed, and a consensus Ph4 (Pharmacophore) model was subsequently generated utilizing the APF (Atomic Property Field) tool. This approach facilitated the identification of shared structural characteristics within the selected compounds, elucidating the presence and spatial arrangement of key functional groups such as hydrogen bond donors, acceptors, lipophilic regions, and aromatic moieties39.

Virtual screening

Dataset generation is a crucial step in the screening process to identify optimal lead molecules. The ZINC database (https://zinc.docking.org/)40 serves as a meticulously curated repository of commercially available chemical compounds, offering comprehensive information on molecular weight, chemical structure, and physicochemical properties relevant to interactions with biological macromolecules. With over 230 million purchasable entities available in 3D format on its user-friendly website, the ZINC database is a valuable resource for virtual screening studies. To construct a database for pharmacophore-based virtual screening, initially identified hepatoprotective phytoconstituents were submitted to the ZINC database. With a Tanimoto similarity threshold set at 40%, hits were retrieved from the “ZINC natural products” and “ZINC natural derivatives” subsets within the database. Subsequently, a total of 1089 compounds meeting the specified criteria were selected for further screening. The 3D conformers of these retrieved compounds were generated utilizing Molsoft ICM-Pro V.3.9.2a software, enabling their subsequent screening against the generated pharmacophore model30,41.

Molecular redocking of top hits with selected protein

The top 10 hits identified through pharmacophore-based virtual screening were subsequently redocked against five selected protein targets using Molsoft ICM-Pro V.3.9.2a (https://www.molsoft.com/icm_pro.html). This step aimed to validate the interactions between the identified phytoconstituents and the protein targets. To ensure consistency and reliability of the results, the same rigorous protocol used in the initial molecular docking studies was employed for the redocking process. This included setting the thoroughness parameter to 3, generating 10 conformers for each ligand, and using the same grid to dock the molecules, thereby allowing a comprehensive assessment of the binding interactions and enhancing the reliability of the docking results35,36,37.

Molecular dynamics simulation

MD simulations on the docked complexes of selected proteins with identified hits were conducted using Desmond 2020.1 from Schrödinger, LLC. The simulations employed the OPLS-2005 force field42,43,44 and an explicit solvent model with SPC (Simple Point Charge) water model in a periodic boundary solvation box of 10 Å × 10 Å × 10 Å45. Na+ ions were inserted to neutralize the system charge, and a 0.15 M NaCl solution was included to simulate physiological conditions. The system was initially balanced using an NVT ensemble for 10 ns to stabilize the protein–ligand complexes. Subsequently, a short equilibration and minimization phase using an NPT ensemble for 12 ns was followed. The NPT ensemble employed the Nose–Hoover chain coupling scheme with a relaxation time of 1.0 ps, maintaining a temperature of 300 K and a pressure of 1 bar46. A time step of 2 fs was applied. Pressure control was managed using the Martyna-Tuckerman–Klein chain coupling scheme47 with an additional time of 2 ps. Long-range electrostatic interactions were calculated with the particle mesh Ewald method48, with a Coulomb interaction radius set at 9 Å. The ultimate production run was performed for 500 ns. To monitor the stability of the MD simulations, root mean square deviation (RMSD), radius of gyration (Rg), root mean square fluctuation (RMSF), and the number of hydrogen bonds (H-bonds) were computed.

Binding free energy analysis

The Molecular Mechanics Generalized Born Surface Area (MM-GBSA) technique was utilized to estimate the binding free energies of the ligand–protein complexes. The Prime MM-GBSA binding free energy was calculated using the Python script thermal_mmgbsa.py on the simulation trajectory for the last 50 frames, with a 1-step sampling size. The binding free energy (kcal/mol) was assessed utilizing the principle of additivity, where distinct energy components such as coulombic, covalent, hydrogen bond, van der Waals, self-contact, lipophilic, and solvation energies of both the protein and ligand were summed. The equation used to calculate ΔGBind is as follows:

$$\Delta G_{bind} = \Delta G_{MM} + \Delta G_{Solv} - \Delta G_{SA}$$

where ΔGBind specifies the binding free energy. ΔGMM specifies the difference between the free energies of ligand–protein complexes and the total energies of protein and ligand in isolated form. ΔGSolv specifies the difference in the GSA solvation energies of the ligand-receptor complex and the sum of the solvation energies of the receptor and the ligand in the unbound state. ΔGSA specifies the difference in the surface area energies for the protein and the ligand49.

ADMET and drug likeliness properties prediction

Pharmacokinetic and drug-likeliness properties of selected hits were identified through the SwissADME web tool (http://www.swissadme.ch/)50. Toxicity prediction was performed using ADMETlab version 3.0 (https://admetlab3.scbdd.com/)51.

Results

Protein–protein interaction (PPI) network

The protein–protein interaction (PPI) network was built using the String database, comprising 12 nodes and 30 edges (Fig. 2a), to encompass multiple proteins implicated in the development and progression of liver disorders. The network view depicts the predicted connections for a specific group of proteins, where nodes denote proteins and edges characterize predicted functional correlations. In Fig. 2a, the network is displayed in evidence mode, with edges represented by seven distinctly colored lines: red, green, blue, purple, yellow, light blue, and black. These colors represent the occurrence of fusion, neighborhood, co-occurrence, experimental, text mining, database, and co-expression evidence, respectively. The generated PPI network shows multiple green and purple lines, indicating strong neighborhood and experimental evidence of association. Additionally, the presence of blue, light blue, and black lines indicates co-occurrence, database, and co-expression evidence, respectively28.

Fig. 2
figure 2

(a) Protein–protein interaction network of various targets, with color-coded nodes representing individual proteins and colored lines denoting interactions between them; (b) Original input targets in the PPI network visualized using Cytoscape; (c) Sub-filtered targets in the PPI network depending on the confidence score (0.7–0.99); (d) Final network screened by degree centrality (4–9).

The resulting PPI network was analyzed using Cytoscape software to generate an interaction network diagram (Fig. 2b). In Cytoscape, the network is visualized in confidence mode, where the width of the lines represents the degree of confidence in the predicted interactions. A lower confidence score indicates enhanced interactions but also a higher likelihood of false positives. Interactions with a confidence score of 0.7 or higher were considered highly accurate29. By selecting interactions with confidence scores between 0.7 and 0.9, a network diagram comprising 11 nodes and 13 edges was obtained (Fig. 2c). Subsequently, the topological parameter, degree, was assessed for all nodes within the PPI network. The degree centrality of a node is defined by its degree, which is the number of edges linked to it. The presence of a higher degree indicates a more central node within the network. Subsequent screening led to the identification of a core interaction network comprising 8 nodes and 10 edges (Fig. 2d). The core targets identified were AKT1, TNF, NF-kβ1, mTOR, INFAR1, RAF1, PIK3CG, and STAT2, as outlined in Table 1.

Table 1 Top proteins in network ranked by Degree method.

Gene ontology (GO) and Kyoto encyclopedia of genes and genomes (KEGG) pathway enrichment analysis

To gain insights into the biological roles of the major targets identified, a total of eight key proteins were analyzed by GO and KEGG enrichment analysis using the DAVID database. The GO analysis revealed that these targets are primarily localized in the cytoplasm, cytosol, and plasma membrane, indicating their involvement in cellular signaling and structural functions. The biological processes (Fig. 3a) in which these targets are involved include inflammatory response, positive regulation of I-kappa B phosphorylation, positive regulation of lipid and nitric oxide biosynthetic processes, positive regulation of nitrogen compound metabolic processes and peptidyl-serine phosphorylation, and positive regulation of protein kinase B (AKT) signaling. Additionally, these targets are implicated in the positive regulation of protein metabolic processes, protein phosphorylation, and transcription from RNA polymerase II promoters. They are also involved in the negative regulation of apoptotic processes and extrinsic apoptotic signaling pathways, as well as in gene expression and macroautophagy. The molecular functions of these targets are focused on identical protein binding, protein kinase activity, protein serine/threonine kinase activity, and ATP binding (Fig. 3b). These functions are essential for enzymatic activity, signal transduction, and energy transfer within cells52.

Fig. 3
figure 3

GO and KEGG enrichment analysis of major targets. (a) The GO enrichment analysis categorized the major targets into biological processes (BP); (b) The GO enrichment analysis categorized the major targets into Cellular components (CC), and molecular functions (MF); (c) The KEGG enrichment analysis identified key pathways involving the selected targets.

The KEGG enrichment analysis highlighted a comprehensive network of 25 signaling pathways associated with liver disease and other related conditions, as depicted in Fig. 3c. Some of these pathways are crucial in the context of cancer and liver disease progression. Notable pathways include those involved in acute myeloid leukemia, the Toll-like receptor signaling pathway, and the JAK-STAT signaling pathway, all of which play crucial roles in immune response and inflammation. The phosphoinositide-3-kinase (PI3K)-Akt signaling pathway and the PD-L1 expression and PD-1 checkpoint pathway in cancer are critical for regulating cell survival, proliferation, and immune evasion mechanisms, particularly in oncogenesis. In addition to these, the analysis identified pathways related to specific infections and chronic diseases, such as Kaposi sarcoma-associated herpesvirus infection and the development of hepatitis B and C.

Selection of phytoconstituents

A curated selection of hepatoprotective herbs used in marketed polyherbal formulations was made, and phytocompounds present in these herbs (40 molecules) with IC50 values against HepG2 cell lines were chosen to construct a pharmacophore model for further analysis (Table 2).

Table 2 Hepatoprotective phytocompounds with their IC50 values.

Molecular docking studies

Molecular docking studies were conducted to investigate the interactions between the set of selected 40 phytocompounds and the selected protein targets. Among the eight core targets identified through protein–protein interaction studies, the top five protein targets were selected for docking studies, including AKT1 (PDB ID: 7NH5), TNF (PDB ID: 2AZ5), NF-kβ1 (PDB ID: 1SVC), mTOR (PDB ID: 4JT5), and INFAR1 (PDB ID: 3S98).

Docking with AKT1

The co-crystalized ligand (~ {N}-methyl-6-[4-[[4-[2-oxidanylidene-6-(propanoylamino)-3 ~ {H}-benzimidazol-1-yl]piperidin-1-yl]methyl]phenyl]-5-phenyl-pyridine-3-carboxamide) was redocked within the binding pocket to validate the molecular docking analysis. The redocking yielded a docking score of -49.8 kcal/mol with an RMSD value of 1.295, demonstrating high accuracy of the docking protocol (Fig. 4). Among the selected phytocompounds, Quercetin, Apigenin, Curcumin, Acteoside, and Samarone B exhibited the highest binding affinities for the target AKT1, with docking scores of -36.71, -31.92, -31.46, -30.03, and -29.66 kcal/mole, respectively. Analysis of the binding interactions revealed common interaction types between the ligands and AKT1, predominantly involving hydrogen bonding and hydrophobic interactions (Table 3). Quercetin, for instance, formed hydrogen bonds with Thr211 and Ser205 while engaging in hydrophobic interactions with Leu210, Leu264, Lys268, Tyr272, and Asp292. Apigenin demonstrated hydrophobic interactions with Trp80, Leu210, Leu264, Lys268, Val270, Tyr272, and Asp292, indicating a robust engagement with the hydrophobic pockets of AKT1. The lack of hydrogen bonds suggests that the binding stability of Apigenin is mainly derived through its extensive hydrophobic contacts. Curcumin exhibited both hydrogen bonding with Ala58 and Asn204, along with hydrophobic interactions with several residues, including Leu210, Leu264, Lys268, and Asp292. Similarly, Acteoside formed hydrogen bonds with Asn53, Trp80, Asn204, Thr211, and Tyr272, while interacting hydrophobically with Gln79, Thr82, Leu210, Leu264, Lys268, Val270, and Asp292. Samarone B also engaged in hydrogen bonding with Ser205 and hydrophobic interactions with Gln79, Leu210, Leu264, Lys268, Tyr272, Arg273, Asp274, and Asp292.

Fig. 4
figure 4

Docking interaction of co-crystalized ligand and top hit with AKT1.

Table 3 Docking score and interacting residues between AKT1 and selected ligands.

Overall, the analysis revealed that Quercetin, Curcumin, and Acteoside exhibit both hydrogen bonding and hydrophobic interactions, contributing to their high binding affinity with AKT1. In comparison, Apigenin and Samarone B rely more on hydrophobic interactions, which are crucial for their binding stability. The consistent involvement of Leu210, Leu264, Lys268, Tyr272, and Asp292 residues across multiple ligands underscores their importance in ligand binding and stabilization within the AKT1 active site.

Docking with TNF

The accuracy of the docking approach in predicting ligand–protein interactions was confirmed through validation experiments involving redocking of the co-crystalized ligand, (6,7-dimethyl-3-[(methyl{2-[methyl({1-[3-(trifluoromethyl)phenyl]-1H-indol-3-yl}methyl)amino]ethyl}amino)methyl]-4H-chromen-4-one) with the target TNF. This procedure produced a docking score of − 19.5 kcal/mole with an RMSD value of 0.972, demonstrating a close alignment between the expected and experimental binding poses (Fig. 5). Among the tested compounds, Sanguiin H-4, Curcumin, Rosmarinic Acid, Jambones F, and Ellipticine revealed the highest binding affinities to the target TNF, with docking scores of − 24.12, − 23.88, − 23.45, − 21.72, and − 21.67 kcal/mole, respectively (Table 4). These scores indicate a strong potential for these compounds to interact effectively with TNF, suggesting their potential as therapeutic agents for modulating TNF-related pathways in hepatic disorders. The molecular docking analysis of the top five phytoconstituents and co-crystalized ligand with TNF revealed common types of interactions that contribute to their binding affinity. All these compounds demonstrated significant hydrophobic interactions with key amino acid residues in the TNF binding site, for example, Leu57, Tyr59, Tyr119, Leu120, Gly121, Gly122, and Tyr151. These interactions display a crucial role in stabilizing the compounds within the binding pocket of TNF by maximizing van der Waals forces. Additionally, hydrogen bonding was also observed in some compounds, contributing further to binding affinity and specificity. For instance, Sanguiin H-4, Rosmarinic Acid, Jambones F, and Ellipticine all formed hydrogen bonds with Gly121, enhancing their interaction with TNF. These consistent interaction patterns across different compounds highlight the critical role of hydrophobic interactions supplemented by hydrogen bonds in achieving strong binding affinity with TNF, thereby supporting their potential as effective hepatoprotective agents.

Fig. 5
figure 5

Docking interaction of co-crystalized ligand and top hit with TNF.

Table 4 Docking score and interacting residues between TNF and selected ligands.

Docking with NF-kβ1

To identify the ligand binding sites on the NF-kβ1 protein (PDB ID: 1SVC), the ICM Pocket Finder module was employed. This module utilizes computational algorithms to predict potential binding pockets within the protein structures. Subsequently, the top-ranked pocket exhibiting the highest drug-like density and volume was chosen as the active site for molecular docking studies. Among the selected phytocompounds, Ellagic acid, Caffeic acid, Rosmarinic acid, Quercetin and Gallic acid exhibited the highest binding affinity, as evidenced by their dock scores of − 29.14, − 24.98, − 24.39, − 22.76, and − 21.54 kcal/mole, respectively. Docking scores, binding interactions, and the nature of interactions between these molecules and NF-kβ1 have been presented in Table 5 and Fig. 6 respectively. The binding analysis reveals that both hydrogen bonding and hydrophobic interactions play crucial role by which these phytocompounds engage in interactions with NF-kβ1. Hydrogen bonds with Gly68, Arg57, and Ile142 residues provide stability, while hydrophobic interactions with residues such as Phe56, Arg59, Val115, Gly141, and Ile142 enhance binding affinity.

Table 5 Docking score and interacting residues between NF-kβ1 and selected ligands.
Fig. 6
figure 6

Docking interaction of top two ligands with NF-kβ1.

Docking with mTOR

To ensure the accuracy of docking results, the co-crystalized ligand, (2-[4-amino-1-(propan-2-yl)-1H-pyrazolo[3,4-d]pyrimidin-3-yl]-1H-indol-5-ol), was redocked with mTOR revealing a favorable docking score of − 40.6 kcal/mole. The close alignment between the predicted and experimental ligand binding poses, with an RMSD value of 0.571, confirmed the accuracy of the docking results in predicting ligand–protein interactions. The molecular docking analysis with mTOR identified Quercetin, Apigenin, Sanguiin H-4, Caffeic Acid, and Ferulic acid as the top-performing phytoconstituents in terms of their binding affinity to the target, with respective dock scores of − 34.75, − 32.93, − 32.39, − 31.45, and − 29.12 kcal/mole. Detailed information on docking scores, binding interactions, and the nature of interactions between these phytoconstituents and mTOR is provided in Table 6 and Fig. 7 respectively. Analyzing the interaction profiles, hydrogen bonds were frequently formed with key residues such as Gly2238 and Val2240, facilitating stable ligand–protein complex formation. Additionally, hydrophobic interactions were observed with residues like Leu2185, Lys2187, Tyr2225, Ile2237, Met2345, and Ile2356, contributing to the overall binding affinity of the compounds to mTOR.

Table 6 Docking score and interacting residues between mTOR and selected ligands.
Fig. 7
figure 7

Docking interaction of co-crystalized ligand and top hit with mTOR.

Docking with IFNAR1

The ligand binding sites of IFNAR1 protein (PDB ID: 3S98) were identified using the ICM Pocket Finder module, which employs computational algorithms to predict potential binding pockets within protein structures. After rigorous analysis, the top-ranked pocket, characterized by its notable drug-like density and volume (441.7⁰A), was designated as the active site for subsequent molecular docking experiments. Notably, among the selected phytocompounds, including Quercetin, Apigenin, Ferulic acid, Curcumin, and Coumaric acid, the molecular docking results revealed remarkable binding affinities (Table 7). Specifically, Quercetin exhibited the highest docking score of − 27.21 kcal/mole, followed by Apigenin with − 23.25 kcal/mole, Ferulic acid with − 22.26 kcal/mole, Curcumin with − 21.97 kcal/mole, and Coumaric acid with − 21.61 kcal/mole (Fig. 8). The analysis highlights the importance of hydrogen bonding and hydrophobic interactions in mediating the binding between these phytoconstituents and IFNAR1. Remarkably, Lys161 and Tyr163 residues appear to play pivotal roles in facilitating hydrogen bonding interactions across multiple compounds, while residues like Tyr157 and His160 contribute significantly to hydrophobic interactions. This detailed examination sheds light on the diverse interactions through which these phytocompounds interact with IFNAR1, offering valuable information for further exploration in hepatoprotective applications.

Table 7 Docking score and interacting residues between IFNAR1 and selected ligands.
Fig. 8
figure 8

Docking interaction of top two ligands with IFNAR1.

Pharmacophore modeling

Using Molsoft ICM-Pro V.3.9.2a software, we constructed a ligand-based pharmacophore utilizing the common feature pharmacophore generation protocol. To achieve this, we selected the top five molecules from each docking study, including Quercetin, Apigenin, Curcumin, Acteoside, Samarone B, Sanguiin H-4, Rosmarinic Acid, Ellipticine, Jambone F, Caffeic Acid, Gallic Acid, Ellagic Acid, Ferulic Acid and Coumaric Acid. Their 3D structures were flexibly superimposed, allowing for the identification of shared structural characteristics among the selected compounds. Subsequently, a consensus PH4 (pharmacophore) model was generated utilizing the APF tool, facilitating the elucidation of the presence and spatial arrangement of key functional groups, including hydrogen bond donors (volume: 288°A), indicated in blue; acceptors (volume: 1422°A), in red; lipophilic regions (volume: 646°A), in yellow; and aromatic moieties (volume: 4376°A in grey. Figure 9 illustrates the spatial arrangement of these features within the consensus PH4 model, offering a visual representation of the ligand-binding landscape.

Fig. 9
figure 9

Key pharmacophoric features of the developed model.

Virtual screening

Virtual screening was executed using the Molsoft ICM-Pro V.3.9.2a software against the ZINC natural products and ZINC natural derivatives subsets within the ZINC database. A ligand-based pharmacophore approach was employed to screen a total of 1089 compounds for their hepatoprotective activity. Out of the screened compounds, 10 were identified as potential hits based on their predicted scores. Pharmacophore mapping revealed that the hit compounds aligned well with the generated pharmacophore features, indicating their potential to interact with key molecular targets involved in hepatoprotection. Hit compounds exhibited diverse chemical structures, with common structural motifs including aromatic rings, hydrogen bond acceptors, hydrogen bond donors, and lipophilic portions. The detailed structures, predicted scores, and ZINC IDs of the hit compounds are provided in Table 8 and Fig. 10.

Table 8 Chemical structures, predicted scores, and ZINC IDs of the hit compounds.
Fig. 10
figure 10

Superposition of identified molecules on key pharmacophoric features of the developed model.

Molecular redocking of top hits with selected proteins

The results of molecular redocking studies against the selected protein targets revealed promising interactions between the identified phytocompounds and the protein binding sites. All the compounds displayed comparable binding scores comparable to the set of molecules used to build the pharmacophore model, validating the capability of the pharmacophore model to identify hits with high binding affinities across multiple protein targets (Table 9).

Table 9 Dock score between selected ligands with proteins.

Among the identified hits, the compound (2S,5E)-2-(3,4-Dihydroxybenzyl)-6-(3,4-dihydroxyphenyl)-4-oxo-5-hexenoic acid (ZINC70454608) demonstrated strong binding affinities with docking scores of − 29.51 kcal/mole for AKT1, − 28.25 kcal/mole for NF-kβ1, and − 27.81 kcal/mole for mTOR (Fig. 11). Similarly, the molecule ZINC14436469 [5′-Hydroxymorin] showed significant binding affinities with docking scores of − 35.98 kcal/mole for AKT1, − 37.79 kcal/mole for NF-kβ1, and − 32.00 kcal/mole for mTOR (Fig. 12).

Fig. 11
figure 11

Binding interactions of (2S,5E)-2-(3,4-Dihydroxybenzyl)-6-(3,4-dihydroxyphenyl)-4-oxo-5-hexenoic acid [ZINC70454608] with selected proteins.

Fig. 12
figure 12

Binding interactions of ZINC14436469 [5'-Hydroxymorin] with selected proteins.

Both compounds exhibited hydrogen bonds and hydrophobic interactions with specific amino acids in each target, mirroring the interactions of the molecules used to generate the pharmacophore model (Table 10). These consistent interactions across different proteins highlight the potential of these compounds to modulate key pathways implicated in hepatic disorders, suggesting their potential as multi-targeted therapeutic candidates for liver diseases.

Table 10 Docking score and interacting residues between selected ligands and proteins.

Molecular dynamics simulation

Following a thorough analysis of docking results, the complexes, hit 2 (ZINC70454608) and 9 (ZINC14436469) with the highest predictive binding energy with the proteins AKT1 (PDB ID: 7NH5), TNF (PDB ID: 2AZ5), NF-kβ1 (PDB ID: 1SVC), mTOR (PDB ID: 4JT5), and IFNAR1 (PDB ID: 3S98) were selected for Molecular Dynamics (MD) simulations. The overall stability of these complexes was assessed over a 500 ns simulation period by analyzing root-mean-square deviation (RMSD), root-mean-square fluctuations (RMSF), radius of gyration (ROG), and intramolecular hydrogen bonding, which is sufficient to observe the stable configurations of backbone atoms in complex with the lead compound.

Root mean square deviation (RMSD)

In MD simulations, RMSD indicates the stability of a protein by measuring the deviation in the structure of the protein or protein–ligand complex compared to the initial docking structure. The results of MD simulations of hit 2 and 9 with the different proteins revealed that all complexes exhibited stable behavior, as indicated by the RMSD values (Supplementary Fig. S1). Among the five selected proteins, hit 2 exhibited stable behavior with mTOR (average RMSD ~ 3.99 Å) when compared to NF-kβ1 (average RMSD ~ 6.4 Å). The RMSD of mTOR-hit 2 complex was stable around 3.99 Å during the simulation time of 500 ns, signifying that the complex did not experience large conformational fluctuations (Fig. 13a). Furthermore, ligand RMSD stabilized around 3.2 Å up to 350 ns, which was suddenly raised to 5.6 Å and again after dropped to 4.0 Å at the end of the simulation, suggesting that the ligand is stably bound to the protein binding site and has not diffused away from the bound position (average RMSD ~ 3.51 Å). The RMSD of mTOR- co-crystalized ligand complex was stable around 4.4 Å during the simulation time and ligand RMSD stabilized around 2.8 Å, validating stability of docked complex (Supplementary Fig. S2).

Fig. 13
figure 13

MD simulation data of mTOR-hit 2 complex: (a) RMSD plot; (b) RMSF plot; (c) RoG plot; (d) Protein–ligand interaction histogram plot; (e) Schematic presentation of protein–ligand interactions.

The MD simulation findings for hit 9 with the five studied proteins showed that the NF-kβ1-Hit 9 complex demonstrated the least stability, while the AKT1-hit 9 complex demonstrated the most stable behavior among all the complexes (Fig. 14a). For AKT1 in the presence of hit 9, the RMSD fluctuated from 1.8 to 2.6 Å over the first 100 ns, then stabilized around 2.8 Å until 400 ns, after which it increased slightly to 3.2 Å but remained stable till the end of simulation. The RMSD of the ligand in complex with AKT1 was stable around 3.0 Å for initial 200 ns, then stabilized around 2.5 Å until end of simulation indicating the overlapping of ligand and protein backbone with the formation of a stable complex. The RMSD of the AKT1–co-crystallized ligand complex remained stable at approximately 3.0 Å throughout the simulation, while the ligand RMSD stabilized around 2.4 Å, confirming the stability of the complex and the reliability of the simulation process (Supplementary Fig. S3).

Fig. 14
figure 14

MD simulation data of AKT1-hit 9 complex: (a) RMSD plot; (b) RMSF plot; (c) RoG plot; (d) Protein–ligand interaction histogram plot; (e) Schematic presentation of protein–ligand interactions.

Overall, the mTOR-hit 2 and AKT1-hit 9 complexes were the most stable among the studied protein–ligand pairs, as reflected by their stable RMSD values throughout the simulation period, indicating minimal conformational changes and strong binding interactions, which can be correlated with potential high biological activity. These findings suggest that hit 2 and hit 9 are promising candidates for further development as inhibitors of mTOR and AKT1, respectively, offering potential therapeutic applications.

Root mean square fluctuation (RMSF)

The RMSF value of a protein is typically measured to assess variations in the side chains of the protein due to ligand binding. The RMSF plot represents structural regions of the protein with vertical bars: brown for α-helices, teal for β-sheets, and white for loop regions. Additionally, green lines perpendicular to the X-axis indicate the amino acid residues interacting with the ligand. Peaks in the plot denote areas of the protein that exhibit the most fluctuation during the simulation. Generally, the N-terminal and C-terminal tails fluctuate more than other parts of the protein. The RMSF plots of the contact between ligands and the protein residues are provided in Supplementary Fig. S4. The RMSF values of the protein backbone residues of the mTOR-hit 2 complex over a 500 ns period ranged from 5.9 to 1.5 Å, with some residues exhibiting higher flexibility (Fig. 13b). Notably, significant fluctuations were observed in residues Ser1558, Leu1559, Ala1560, Arg1611, Lys1867, Ser2091, and Asn2093 with RMSF values of 8.12 Å, 7.78 Å, 6.743 Å, 6.236 Å, 6.231 Å, 6.791 Å, and 6.717 Å, respectively, indicating higher flexibility in these regions. Additionally, the RMSF plot shows that the green lines, which represent the interacting amino acid residues, differ from the highly fluctuating residues, supporting the validity of the results. The average RMSF value for AKT1 upon binding of Hit 9 is 1.22 Å, with all binding cavity residues fluctuating within a range of 0.4–8.06 Å. This indicates fluctuation with relative secondary conformational stability of the protein upon binding of the lead compound. The RMSF plot of the AKT1-Hit 9 complex, displayed in Fig. 14b, shows the highest fluctuations in the residues Gln43, Asp44, Lys111, Asn204, Phe442, and Thr443, which did not interact with the ligand.

Radius of gyration (ROG)

With the aim of evaluating the stability of the ligands within the binding pockets of proteins over a 500 ns simulation period, the radius of gyration (ROG) was also investigated (Supplementary Fig. S5). The ROG value indicates how stretched a ligand is, which corresponds to its primary moment of inertia. Hit 2 and Hit 9 showed ROG values ranging from 4.439 to 5.407 Å and 3.83–4.022 Å, respectively, in the complexes with mTOR and AKT1 (Figs. 13c and 14c). Despite minor variations, the ROG values remained within these particular ranges, suggesting structural stability. This suggests that the proteins maintained their integrity while undergoing conformational changes, supporting the potential effectiveness of these ligand–protein interactions.

Hydrogen bonding

Protein–ligand interaction analysis of hit 2 and hit 9 with the selected proteins revealed that hydrogen bonding was the primary stabilizing force in the complexes (Supplementary Fig. S6). For the mTOR-hit 2 complex, the total number of contacts between mTOR and Hit 2 varied between 1 and 7 throughout the simulation. Key amino acid residues involved in hydrogen bonding with hit 2 included Lys2171, Trp2239, Val2240, and Asp2357, which formed bonds for 100% of the simulation time, and Lys2187, which formed bonds for 75% of the simulation time (Fig. 13d). Additionally, Leu2185, Asp2195, and Ile2356 were involved in hydrophobic interactions with hit 2 for 75% of the simulation time (Fig. 13e). The AKT1-hit 9 complex, the number of contacts between the protein and ligand ranged from 1 to 7 (Fig. 14d). Hydrogen bonds were formed with amino acid residues Gln79, Thr211, and Ile290. Hydrophobic interactions were observed with Trp80 for 100% and Leu210, Val270 for 40% of the simulation time (Fig. 14e). The dynamic nature of the varying number of hydrogen bonds highlights the flexibility of the interactions, showcasing alternating periods of stability.

Solvent accessible surface area (SASA)

SASA are calculated to monitor the stability of the MD simulations. The SASA plots reveal the extent of solvent exposure for the unbound and bound receptor states. The reduction in SASA upon ligand binding confirms the structural stabilization of the receptor ligand complex. The SASA analysis highlight that AKT1-co crystalized ligand complex (7nh5_UCB502) shows the largest SASA reduction, indicating most stable complex (Fig. 15).

Fig. 15
figure 15

MD simulation analysis of 1000 Frame work of Solvent accessible surface area: (a) mTOR-co crystalized ligand complex (4JT5_P2X); (b) AKT1-co crystalized ligand complex (7nh5_UCB502); (c) AKT1-hit 9 complex (7nh5_AKT1); (d) mTOR-hit 2 complex (4JT5_WmTOR).

Molecular mechanics generalized born surface area (MM-GBSA) calculations

Using the Molecular Mechanics Generalized Born Surface Area (MM-GBSA) approach, the binding free energies of the protein–ligand complexes were computed in order to provide understanding into the strength and stability of the interactions. The MMGBSA results for the complexes of hit 2 and hit 9 with the five target proteins: AKT1, TNF, NF-kβ1, mTOR, and IFNAR1 are presented in Supplementary Table T1.

The results in Table 11 indicate that complexes 4JT5_P2X, 7nh5_UCB502, 7nh5_AKT1 and 4JT5_WmTOR exhibited ΔGbind of − 56.45 kcal/mol, − 76.89 kcal/mol, − 32.41 kcal/mol and − 61.69 kcal/mol respectively. The electrostatic interactions (ΔGbindCoulomb) show the strongest contribution for 7nh5_AKT1 (− 99.90 kcal/mol), whereas 4JT5_P2X (− 20.24 kcal/mol) (− 61.55 kcal/mol) and 4JT5_WmTOR has a moderate contribution. The complex 7nh5_UCB502 (− 7.58 kcal/mol) show least electrostatic interactions. Covalent interactions contribute minimally to binding, and hydrogen bonding is relatively weak. Lipophilic interactions favor 7nh5_UCB502 the most while 7nh5_AKT1 exhibits the least contribution. Vander Waals interactions play significant role, particularly in 7nh5_UCB502 and 4JT5_P2X. Solvation energy destabilizes complexes. Overall, 7nh5_UCB502 has the most stable binding due to strong Vander Waals and lipophilic interactions, coupled with relatively moderate solvation penalties. On other hand, 7nh5_AKT1 shows weak binding, due to desolvation penalty, despites favorable electrostatic interactions.

Table 11 Post dynamic-MMGBSA based Binding free energy for the protein ligand complexes. 4JT5_P2X, 7nh5_UCB502, 7nh5_AKT1 and 4JT5_WmTOR calculated by MM-GBSA.

ADMET and drug likeliness properties prediction

The ADMET and drug-likeliness properties of the hits 2 and 9 were listed in Table 12. Based on the predicted results, both compounds met the standards for drug-like properties and molecular characteristics, with only minor violations in some criteria, which are considered acceptable.

Table 12 ADMET and drug likeliness properties of the hit 2 and hit 9.

The drug-likeness properties of hit 2 (Table 12) indicate compliance with Lipinski, Ghose, and Veber’s rules, with a single violation of lead likeliness due to its molecular weight, which exceeds the accepted value of 350. Pharmacokinetics data indicated good absorption, plasma protein binding (92.29%), and a significant volume of distribution (0.388). The compound does not inhibit cytochrome P450, indicating that it is unlikely to interfere with the metabolic processes mediated by these enzymes. However, the compound is identified as a substrate for P-gp and a non-inhibitor of P-gp, indicating it does not interfere with the P-glycoprotein-mediated transport of other drugs. In the case of hit 9, it obeyed Ghose rule, with one exception in Lipinski and Veber’s rules. Hit 9 shows hydrogen bond donor capacity of 6, exceeding the accepted count of 5. Hit 9 displayed a metabolic profile similar to that of hit 2. With a few exceptions, both compounds exhibited bioavailability that generally met drug-likeness criteria and fell within the medium range (T1/2 ≤ 3). In toxicity prediction, both compounds demonstrated non-toxic characteristics in several organs, including the kidneys, liver, skin, and eyes. They do not pose immediate acute toxicity risks at typical exposure levels. The low score for carcinogenicity indicates a reduced risk of cancer, while the low score for non-biodegradability suggests minimal concerns about environmental toxicity and supports its utility for sustainable development.

Discussion

The integration of artificial intelligence (AI) into the field of natural product research has opened new avenues for understanding the mechanistic pathways of bioactive constituents present in traditional medicines68. Moreover, modern analytical tools such as, Flash Chromatography, Liquid Chromatography-Mass Spectroscopy (LC–MS), High Performance Thin Layer Chromatography (HPTLC) are being widely applied in research to help identify the bioactives that are responsible for biological efficacy69. In addition to these tools, the growing use of network pharmacology and computational studies is rapidly advancing research paradigms to a newer level70. Notable studies, such as those by Sarkar et al.71 investigated the pharmacological mechanisms of herbal plants for treating hepatic disorders using network pharmacological analysis. Similarly, Ouyang et al.72 used network pharmacology to study the pharmacological mechanisms of Wuzhuyu decoction, which comprises of four herbs, for hepatocellular carcinoma. These studies highlight the power of computational approaches in elucidating the multi-targeted and multi-pathway mediated activities of herbs. Although current research is increasingly focused on exploring the mechanistic aspects of hepatoprotective herbs, it is worth noting that the majority of hepatoprotective plants and formulations within traditional medicine have not been extensively studied or investigated in detail till date. This gap highlights the need for extensive and systematic scientific work to validate and expand the understanding of their therapeutic potential. In response to this need, we integrated cutting-edge computational techniques with traditional herbal knowledge to uncover potential hepatoprotective compounds.

The protein–protein interaction network is a key to understanding the working of protein in a coordinated manner within cells to perform various functions. Analysis of PPI network enables the interpretation of complicated processes involved in the occurrence and progression of diseases, facilitating targeted therapies. In this study, PPI network analysis identified AKT1, TNF, NF-kβ1, mTOR, and INFAR1 as the top five targets implicated in the development and progression of liver disorders. These targets are primarily involved in oxidative and inflammatory pathways, collectively underscoring the multifaceted interplay of inflammatory, proliferative, and antiviral mechanisms in the pathogenesis of liver disorders. Chronic inflammation driven by TNF and NF-kβ1 not only directly damages liver cells but also creates an environment favorable to fibrosis and cancer. The PI3K/AKT/mTOR and RAS/RAF/MEK/ERK pathways supply proliferative signals that, when dysregulated, lead to uncontrolled cell growth and cancer. Simultaneously, impairments in antiviral defenses via INFAR1 and STAT2 pathways facilitate chronic infections, further fueling inflammation and fibrosis73,74,75,76,77,78,79,80,81. Moreover, research findings indicate that the primary hepatoprotective mechanisms of the herbal plants often involve the scavenging of free radicals and modulation of inflammatory pathways27.

The GO analysis of these targets highlights their multifaceted roles in critical cellular processes related to liver health. Their involvement in inflammatory response and regulation of I-kappa B phosphorylation connects them to immune and inflammatory processes. The regulation of lipid and nitric oxide biosynthesis further emphasizes their role in maintaining cellular homeostasis. Their participation in protein phosphorylation, AKT signaling, and apoptotic pathways underscores their potential in regulating cell survival and preventing unwanted cell death. Additionally, the roles in gene expression, protein metabolism, and macroautophagy suggest that these proteins may contribute to cellular maintenance and stress responses, making them promising therapeutic targets for liver diseases. The molecular functions related to kinase activity and ATP binding indicate their involvement in critical enzymatic functions and signal transduction, further validating their significance in liver biology. The KEGG enrichment analysis highlighted the signaling pathways involved in the intricate interplay between viral infections and liver pathology, contributing to chronic liver diseases and increasing the chance of HCC. The study also encompassed metabolic and cancer-related pathways, including those implicated in non-alcoholic fatty liver disease, alcoholic liver disease, pancreatic cancer, prostate cancer, and insulin resistance. These pathways collectively underscore the multifactorial nature of liver disease progression, integrating inflammatory, metabolic, and oncogenic processes.

In line with the growing emphasis on computational approaches to uncover the pharmacological potential of bioactive compounds, this study meticulously selected the bioactive components of well-studied hepatoprotective plants, already present in marketed polyherbal formulations, for molecular docking studies. With an emphasis on phytoconstituents with reported IC50 values and compounds tested on HepG2 cell lines, a set of 40 phytocompounds was included to ensure data robustness. Molecular docking studies were conducted to investigate the interactions between the set of 40 phytocompounds and five selected protein targets. Overall, the comprehensive molecular docking studies revealed significant interactions between selected phytoconstituents and key protein targets implicated in liver disorders. Notably, quercetin, curcumin, apigenin, sanguiin H-4, rosmarinic acid, caffeic acid, and ferulic acid consistently demonstrated high binding affinities across multiple protein targets. Their diverse interaction profiles across different protein targets highlight their versatility and efficacy in modulating key pathways implicated in hepatic disorders, thereby suggesting their potential as promising candidates for developing multi-target therapeutic strategies against liver diseases.

Based on the insightful findings from the molecular docking studies, we then proceeded with development of a novel pharmacophore model, which can play a vital role in further screening of compounds. For this purpose, we selected the top five molecules from each docking experiment, including Quercetin, Apigenin, Curcumin, Acteoside, Samarone B, Sanguiin H-4, Rosmarinic Acid, Ellipticine, Jambone F, Caffeic Acid, Gallic Acid, Ellagic Acid, Ferulic Acid and Coumaric Acid. Structural superposition of these 14 molecules identified hydrogen bond donors, acceptors, lipophilic regions and aromatic moieties as key pharmacophoric features believed to be essential for activity. Each of these features plays a pivotal role in ligand-receptor interactions, influencing the binding affinity and specificity of pharmacological agents. Hydrogen bond donors and acceptors facilitate the formation of critical hydrogen bonds between the ligand and receptor, contributing to molecular recognition and stabilization of the complex. Lipophilic regions, on the other hand, provide a hydrophobic environment beneficial to favorable interactions with nonpolar regions of the receptor. Additionally, aromatic moieties contribute to π-π stacking interactions, enhancing ligand binding and potency. The elucidation of these pharmacophoric features is beneficial in rational drug design and optimization, enabling the identification of structurally diverse compounds with enhanced pharmacological properties. By aligning ligand molecules with the complementary features of the receptor, the PH4 model guides the development of novel therapeutics with improved efficacy and selectivity. Moreover, insights gained from pharmacophore modeling facilitate the exploration of structure–activity relationships, guiding iterative modifications to optimize drug candidates.

The resulting pharmacophore model was then screened against the ZINC natural database of 1089 compounds, facilitating the identification of 10 novel compounds with potential hepatoprotective activity. Among the identified hit compounds, 3-O-Methylquercetin, Quercetagetin, and Demethoxycurcumin have been previously studied for their hepatoprotective activity, further validating the robustness of the model. 3-O-Methylquercetin (ZINC5998596), a structural derivative of quercetin, is found in various plant species, including Artemisia incanescens and Halimodendron halodendron82. In the 2016 study reported by Kumar Naveen et al., 3-O-methyl quercetin and kaempferol were isolated from the stem bark of Semecarpus anacardium. These compounds demonstrated protective effects on normal lung and liver cells against H₂O₂-induced cytotoxicity83. Quercetagetin (ZINC5784821), a hexahydroxyflavone, is functionally related to quercetin found in various plant species, including Citrus reticulata (Mandarin orange) and Cupressus sempervirens (Mediterranean cypress). It acts as an antioxidant, antiviral agent, and plant metabolite84. In 2023, Wu et al. reported that quercetagetin effectively alleviated ZEN-induced oxidative damage and liver injury in rabbits by modulating the Keap1-Nrf2-antioxidant response element (ARE) signaling pathway85. Demethoxycurcumin (ZINC1903857764) is a beta-diketone derivative of curcumin, naturally occurring in various plant species, including Curcuma zedoaria, Curcuma xanthorrhiza, Curcuma kwangsiensis, and Etlingera elatior. Demethoxycurcumin acts as an antineoplastic agent and an anti-inflammatory agent86. Cheon et al. (2007) investigated the hepatoprotective effects of curcumin, demethoxycurcumin, and bisdemethoxycurcumin, isolated from Curcuma longa Linn, on hepatocyte injury induced by carbon tetrachloride. Their study revealed that demethoxycurcumin significantly decreased the levels of aspartate aminotransferase, indicating its protective effect on liver cells87.

The compounds with ZINC IDs, including ZINC59587610, ZINC70454608, ZINC40940113, ZINC95909973, ZINC4731234, ZINC35859102, and ZINC14436469, have not yet been explored to date for their hepatoprotective activity in existing studies, representing potential novel candidates for further investigation due to their high predicted scores and distinct chemical structures. Among these, ZINC70454608 and ZINC14436469 emerge as promising candidates for further investigation as potential hepatoprotective agents. ZINC70454608, chemically identified as (2S,5E)-2-(3,4-Dihydroxybenzyl)-6-(3,4-dihydroxyphenyl)-4-oxo-5-hexenoic acid, is a polyphenolic compound structurally related to curcumin. This compound is derived from Curcuma longa (turmeric) and Curcuma xanthorrhiza. Studies have demonstrated its ability to protect cells from β-amyloid insult, suggesting its potential therapeutic utility in the treatment of Alzheimer’s disease88. Similarly, 5′-Hydroxymorin (ZINC14436469), also known as 3,5,7-trihydroxy-2-(2,4,5-trihydroxyphenyl)-4H-chromen-4-one, is a naturally occurring compound. It is found in plants such as Distemonanthus benthamianus and Limonium gmelinii. It belongs to the class of flavonoids, characterized by its chromenone structure and multiple hydroxyl groups. Flavonoids are widely recognized for their varied pharmacological activities, comprising antioxidant, anti-inflammatory, and anticancer properties89. For other compounds (ZINC59587610, ZINC40940113, ZINC95909973, ZINC4731234, and ZINC35859102), specific details regarding their natural sources and biological activities are currently unavailable. Further research is required to explicate their sources, chemical properties, and potential biological activities. Investigating these compounds could provide valuable insights into their pharmacological significance.

To validate the interactions between identified compounds and the protein targets, the top 10 hits were subsequently redocked against five selected protein targets. Among the identified hits, the compound (2S,5E)-2-(3,4-Dihydroxybenzyl)-6-(3,4-dihydroxyphenyl)-4-oxo-5-hexenoic acid (ZINC70454608) and ZINC14436469 (5'-Hydroxymorin) showed significant binding affinities with the specific amino acids in each target, highlighting the potential of these compounds to modulate key pathways implicated in hepatic disorders. Docking provided a fixed view of the binding pose of a compound in the active site of a protein by using the rigid crystal structure of the protein; therefore, molecular dynamics simulations were executed to compute the movements of atoms over time, allowing for the assessment of the dynamic performance and stability of protein–ligand complexes. The hit 2 (ZINC70454608) and 9 (ZINC14436469) complexes with the selected five proteins exhibiting the highest predictive binding energy analyzed through docking were selected for MD simulations. According to MD simulation results, the mTOR-hit 2 complex and AKT1-hit 9 showed the most stable behavior with a binding free energy of − 57.63 ± 3.85 & − 30.59 ± 4.10 kcal/mol, respectively, indicating a strong and favorable interaction. The stability characteristics demonstrated that both the complexes remained stable for the whole MD simulation, hence confirming the accuracy of the molecular docking studies.

A good drug must have the acceptable pharmacokinetic criteria, including absorption, distribution, metabolism, and excretion (ADME), with minimum toxicity. It should be rapidly and efficiently absorbed from the gastrointestinal tract, particularly circulated to its target site, metabolized in a way that supports its pharmacological activity, and excreted completely without causing any harm to the body. The ADMET evaluation of hit 2 and hit 9 highlights their potential as viable drug candidates, with both compounds demonstrating acceptable drug-likeness and pharmacokinetic properties. Although hit 2 exceeded the molecular weight limit for lead-likeness, its good absorption, high plasma protein binding, and ability to cross the blood–brain barrier suggest strong potential for therapeutic use. The lack of cytochrome P450 inhibition is favorable, as it reduces the risk of drug-drug interactions, while its identification as a P-gp substrate indicates its potential for multi-drug resistance involvement, a point to consider in further development. Similarly, hit 9, despite its slight violation of Lipinski’s and Veber’s rules due to an extra hydrogen bond donor, also showed a promising pharmacokinetic profile similar to hit 2. The bioavailability within the medium range for both compounds ensures that they can sustain therapeutic concentrations over a reasonable period. The favorable toxicity profiles, particularly the low carcinogenicity and non-biodegradability scores, further enhance their prospects for development. Overall, these compounds meet key drug-likeness criteria, making them strong candidates for further study and potential drug development.

Overall, the results indicated that the discovery of these potential hit compounds, with predicted hepatoprotective activity, underscores the efficacy of ligand-based pharmacophore screening in the realm of drug discovery. The diverse chemical structures of these compounds and their alignment with pharmacophore features suggest a range of potential molecular mechanisms underlying their hepatoprotective effects. However, further experimental validation, including rigorous in vitro and in vivo studies, is necessary to approve the efficacy and safety profiles of these hit compounds. Specifically, compounds (2S,5E)-2-(3,4-Dihydroxybenzyl)-6-(3,4-dihydroxyphenyl)-4-oxo-5-hexenoic acid and 5′-Hydroxymorin need focused investigation to elucidate their hepatoprotective activity and potential therapeutic significance.

Conclusion

In conclusion, this study employed a multifaceted approach integrating network pharmacology, molecular docking, pharmacophore modeling, virtual screening, and molecular dynamic simulation with traditional herbal knowledge to identify novel compounds for hepatoprotection. Notably, compounds (2S,5E)-2-(3,4-Dihydroxybenzyl)-6-(3,4-dihydroxyphenyl)-4-oxo-5-hexenoic acid (ZINC70454608) and 5′-Hydroxymorin (ZINC14436469) represent particularly interesting candidates due to their high predicted scores and distinct structural motifs. The unique structural features of these compounds can be used as a scaffold for the development of new drugs. Medicinal chemists and researchers around the globe can modify these scaffolds to improve their pharmacokinetic and pharmacodynamic properties, leading to optimization of lead compounds with better efficacy and safety profiles. Further experimental validation through in vitro and in vivo studies is necessary to evaluate the efficacy, safety, and mechanisms of action of these lead compounds, ultimately paving the way for the translation of these findings into clinical applications for the benefit of patients worldwide.