Introduction

Diabetic retinopathy (DR) is the most common microvascular complication of diabetes, and the risk of blindness in diabetic patients is 25 times that of non-diabetic patients1,2. Since the development of diabetes mellitus (DM) is progressive. It causes severe visual impairment, resulting in treatment difficulty if DR is not promptly diagnosed and treated3. Therefore, how to diagnose DR at an early stage and control its progression has become the focus of attention4. However, the current diagnosis of DR based on comprehensive ophthalmic examination is not only costly but also not conducive to rapid diagnosis5. Therefore, it is particularly important to find biomarkers that are more specific for DR and can provide a more rapid diagnosis for elucidating the molecular mechanism of DR occurrence and assisting DR treatment.

Competitive endogenous RNA (ceRNA) networks with the components of ncRNA and mRNA are novel mechanisms with a regulatory role at the transcriptional level. Studies have shown that the interaction between ncRNA and mRNA can regulate target gene expression and regulate disease progression in several diseases, including DR6,7,8. And with the maturation of biotechnology, differentially expressed gene identification supported by gene sequencing technology is increasingly used to explore the pathogenesis of diseases, including DR9.

Intensive studies on diabetes in the last decade have revealed that immune inflammation is one of the major causes of type 2 diabetes and is driven by multiple types of immune cells10,11,12,13. For example, the accumulation of macrophages in the adipose tissue of diabetic patients alters the polarization of macrophages, leading to an increase in M1-type pro-inflammatory macrophages14; antigen-specific cytotoxic T lymphocytes are able to destroy pericytes, leading to vascular collapse in diabetic retinopathy15; intravitreal injections of anti-IL-17 A monoclonal antibodies are effective in improving Müller cell dysfunction, vascular leukocyte arrest and vascular leakage in the retina16. All of these indicate that immune cells play an important role in the development and progression of DR. Therefore, from the perspective of the immune system, evaluating the infiltration of immune cells in the DR microenvironment and determining the differences in the composition of infiltrating immune cells and immune-related genes (IRGs) are of great value for elucidating the molecular mechanism.

Therefore, in this study, we collected three datasets from the GEO database, which contains mRNA, miRNA, and lncRNA transcript expression matrix data from 55 DR samples and 23 healthy control samples. We aimed to analyse the predominantly infiltrating immune cells in DR by combining gene sequencing data with bioinformatics analysis and to obtain the central immune-related genes and their ceRNA networks with potential regulatory roles in DR. The results will provide new diagnostic markers and potential therapeutic targets for diabetic retinopathy patients. The overall flow chart of this study is shown in Fig. 1.

Fig. 1
figure 1

The overall workflow of this study. The figure illustrates the research flow of this paper. Among them, the blue text box represents the main research steps and results, and the orange font represents the databases and datasets used in the research process. There is no difference between the real and imaginary arrows in the figure. DE-mRNAs differentially expressed mRNAs,GO Gene Ontology, KEGG Kyoto Encyclopedia of Genes and Genomes, TSEGs tissue-specific expressed genes, IRGs immune-Related genes, IRHGs immune-related hub genes.

Materials and methods

Access to ncRNA and mRNA expression profiles of DR patients

From the National Institutes of Health (NIH)-National Center for Biotechnology Information (NCBI)-Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/), we retrieved and downloaded three DR datasets: GSE160306, GSE160308, and GSE102485. The data in datasets GSE160306 and GSE160308 are from the same samples but represent different transcript data, where dataset GSE160306 contains mRNA and lncRNA transcript data, and dataset GSE160308 contains miRNA data from retinal tissue. The datasets (GSE160306 and GSE160308) contained a total of 79 samples, from which we selected retinal tissue samples from 36 DRs and 20 healthy controls for the bioinformatics analyses in this study. Dataset GSE102485 collected neovascular proliferating membranes from 19 patients with type I and type II diabetes and 3 controls for transcriptome analysis.

Therefore, we used dataset GSE160306-mRNA as a test dataset to screen DE-mRNAs in the DR and for subsequent ceRNA network construction; dataset GSE102485, GSE160308, and GSE160306-lncRNA were used as a validation dataset to verify the results of the ceRNA network. The details of these GEO datasets are shown in Table 1. It is worth noting that our approach is reasonable. For example, as a common practice, we17,18,19 and others20 often studied gene expression in pathophysiological conditions.

Table 1 Information of selected datasets.

Immune cell infiltrate analyses

ssGSEA enrichment analysis is a method for assessing immune cell infiltration in samples in a dataset based on the expression levels of immune cell-specific marker genes. ImmuCellAI (ImmuCellAI, http://bioinfo.life.hust.edu.cn/ImmuCellAI) is an online tool that uses ssGSEA enrichment analysis to study immune cell infiltration in different disease expression matrices, including DR21. Upload the gene expression matrix data to ImmuCellAI to obtain the immune cell infiltration matrix. The correlation heatmap shows the correlation of genes in 24 types of immune cell infiltration, and the violin plot shows the difference in immune cell infiltration.

Immune and stromal cells analyses for the datasets

Recently published xCell (xCell, http://xCell.ucsf.edu/) is a new gene signature-based analysis method for inferring the characteristic values of genes in 64 immune and stromal cell species. It has been extensively analyzed in silico and compared to cellular immunophenotypes22. By applying the gene expression matrix data to xCell, the proportion of immune and stromal cell types for each retinal sample gene can be obtained by the GSEA analysis method. The cut-off values for the cell analyses were P < 0.05. Cell types were categorized into lymphoid, myeloid, stromal, stem cells and others. Violin plots are used to visualize the expression of different cell types in the xCell analysis results.

Functional enrichment analyses

The biological functions of identified DE-mRNAs of interest were assessed using the Metascape bioinformatics resources (Metascape, a gene annotation & analysis resource, http://metascape.org, version 3.5)23,24. Briefly, screened DE-mRNAs were imported into Metascape, and GO and KEGG enrichment analyses were then conducted. For GO analyses, enriched biological processes (BPs), molecular functions (MFs), and cellular components (CCs) were assessed, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses. A minimum overlap value of 3, P-value cutoff of 0.05 and minimum enrichment of 1.5 were set as thresholds. Bubble charts are used for the visualisation of the results of the enrichment analysis.

Gene Set Enrichment Analysis (GSEA) is used to assess the distribution trend of genes of a predefined geneset in a table of genes sorted by phenotypic relevance to determine their contribution to the phenotype25. We obtained GSEA software (version 3.0) from the GSEA (http://software.broadinstitute.org/gsea/index.jsp) website, divided the samples into two groups according to disease type, and obtained the data from the Molecular Signatures Database (http://www.gsea-msigdb.org/gsea/downloads.jsp) and downloaded the c2.cp.kegg.v7.4.symbols.gmt subset for assessing related pathways and molecular mechanisms based on gene expression profiles and phenotypic grouping, setting the minimum gene set to 5 and the maximum to 5000, with one thousand resamplings and a P-value < 0.05 was considered statistically significant.

Protein-protein interaction analysis

Protein-protein interaction (PPI) networks were generated from the STRING database (STRING, https://string-db.org/). A score of 0.4 (medium confidence) was set as a threshold. Next, to access hub genes, the visualized regulatory network was constructed using the Cytoscape software (Cytoscape, https://cytoscape.org/)26. CytoHubba was used to identify significant genes in this network as hub genes27. The Maximal Clique Centrality (MCC) algorithm is used to obtain the clustered network and analyze the importance of the network by analyzing the correlation of the degree value of each node with the neighboring nodes, which can then result in the hub gene in the network with the highest value. Based on this algorithm we came to calculate the top 10 hub genes of the PPI network28,29.

Access to immune-related genes

To understand whether immune-related genes (IRGs) are involved in regulation in DR patients, we obtained IRGs from immune-related databases for subsequent studies. The ImmPort database (https://www.immport.org/) and InnateDB database (https://www.innatedb.com) are key databases in immunological studies, covering immune related genes in mammals. The list of immune genes was downloaded from the ImmPort and InnateDB databases separately. The downloaded data were then merged and duplicates removed, resulting in a final list of immune-related genes from the 2 databases. These genes were used for subsequent identification of DR-related immune genes.

Prediction of target miRNAs

The online database-miRcode (http://www.mircode.org/)30 was used to predict the interacting miRNAs with hub genes. Searched for hub genes in miRcode separately and then obtained a list of target miRNAs for each hub gene. The predicted miRNAs were then further validated using the dataset GSE160308 miRNA. According to the screening criteria of adj. P-value < 0.05 and |LogFC|>1, the miRNAs that met the requirements were used as candidate miRNAs.

Construction of ceRNA networks

DIANA-LncBase31 (https://diana.e-ce.uth.gr/lncbasev3, version 3.0) is used to predict lncRNAs that interacted with the selected miRNAs32. The predicted lncRNA was validated in the GSE160306-lncRNA database to obtain target lncRNAs with statistical significance in DR (adj.P-value < 0.05 and |LogFC |>1 as screening criteria). Then the ceRNA network based on the interaction between mRNAs, miRNAs and lncRNAs was constructed using Cytoscape.

Statistical analysis of gene expression matrix

The series matrix files for each of the three datasets were annotated with the official gene symbols and generated read count gene expression matrix files for each dataset. The datasets GSE160306 and GSE160308 were of high-throughput sequencing count data, so the R package DESeq2 (version 1.32.0) was used for the difference analysis of the 2 datasets to obtain the DE-mRNAs, DE-miRNAs, and DE-lncRNAs between the DR group and the control group33. Specifically, we obtained the expression profiling dataset by removing the genes with expression value of 0 greater than 50%, constructing the input matrix using the DESeqDataSetFromMatrix function, further normalising the data using the DESeq function, and finally performing the difference analysis using the results function to obtain the significance of the differences of each gene.

For the dataset GSE102485, the GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r/) platform was used in this study to identify differentially expressed mRNAs. GEO2R is an online application based on the R language34. After user-defined grouping, GEO2R directly reads the raw sequence matrix files and platform annotation files uploaded by the submitters and plots the box plots of the distribution of expression values of the selected samples via ‘boxplot’ and ‘limma’ R language scripts and gene table (including P-value, fold change, gene annotation, etc.). Adj.P-value < 0.05 and |LogFC |>1 were set as thresholds for screening differentially expressed genes.

Results

Identification of DE-RNAs in the human retina

The dataset GSE160306-mRNA was included in our study. As shown in the heatmap (Fig. 2A) and volcano plot (Fig. 2B), a total of 27 DE-mRNAs (12 up-regulated and 15 down-regulated) were identified in the GSE160306-mRNA dataset based on the screening criteria of adj.P-value < 0.05 and |LogFC |>1. The heatmap shows the relative differential expression of the 27 DE-mRNAs between DR retinal tissue and normal retinal tissue (Fig. 2A).

Fig. 2
figure 2

Differential expression of mRNAs and lncRNAs in the human retina dataset GSE160306. (A) Volcano plot of mRNA expression in dataset GSE160306-mRNA. The red triangles represent upregulated genes, the grey triangles represent nonsignificant genes, and the blue triangles represent downregulated genes. (B) Heatmap of DE-mRNAs expression in dataset GSE160306-mRNA. Vertical coordinates represent the 27 DE-mRNAs in the dataset GSE160306-mRNAs with the largest changes in expression. Each square in the graph represents the relative expression after Z-score transformation in the corresponding sample: red squares indicate genes with high expression, and the darker the red, the higher the gene expression level; blue squares indicate genes with low expression, and the darker the blue, the lower the gene expression level. The greater the color difference between the DR group and the control group, the greater the fluctuation in gene expression level. (C) Volcano plot of lncRNA expression in dataset GSE160306-lncRNA. The pink triangles represent upregulated genes, the grey triangles represent nonsignificant genes, and the green triangles represent downregulated genes. (D) Heatmap of DE-lncRNAs expression in dataset GSE160306-lncRNA. Vertical coordinates represent the 10 DE-lncRNAs in the dataset GSE160306-lncRNAs with the largest changes in expression. Pink squares indicate genes with high expression; green squares indicate genes with low expression. The greater the color difference between the DR group and the control group, the greater the fluctuation in gene expression level. DE-mRNAs differentially expressed mRNAs, DE-lncRNAs differentially expressed lncRNAs.

For the dataset GSE160306-lncRNA, we also identified DE-lncRNAs using the screening criteria of “adj.P-value < 0.05 and |LogFC|>1”. The results showed that 10 lncRNAs were differentially expressed in the dataset GSE160306-lncRNA, of which 3 were up-regulated and 7 were down-regulated. (Fig. 2C). The heatmap showed that these 10 DE-lncRNAs were differentially expressed between DR retinal tissues and normal retinal tissues (Fig. 2D).

Macrophages are immunomodulatory cells with major infiltration in DR

Data from dataset GSE160306-mRNA was uploaded to ImmucellAI for analysis of the degree of immune cell infiltration in DR retinal samples. The correlation heatmap of 24 immune cells showed a significant positive correlation between natural killer T (NKT) cells and exhausted T cells (Fig. 3A-B); and a significant negative correlation between NKT cells and mucosal-associated invariant T cells (MAIT) and macrophages (Fig. 3A and C-D). Violin plots of immune cell infiltration results showed significant macrophage infiltration in DR retinal tissues compared with normal retinas, while infiltration of natural killer T cells, exhausted T cells, and MAIT cells were reduced in DR retinal tissues (Fig. 3E).

Fig. 3
figure 3

Evaluation and visualization of immune cell infiltration. (A) Correlation heatmap of 24 types of immune cells. The size of the colored squares represents the strength of the correlation; blue represents a negative correlation, and red represents a positive correlation. The darker the color, the stronger the correlation. (B) Scatter plot of the correlation between natural killer T (NKT) cells and exhausted T cells. The blue bars represent the expression of NKT cells in each sample and the yellow bars represent the expression of exhausted T cells in each sample. (C) Scatter plot of the correlation between NKT cells and MAIT cells. The blue bars represent the expression of NKT cells in each sample and the yellow bars represent the expression of MAIT cells in each sample. (D) Scatter plot of the correlation between NKT cells and macrophages. The blue bars represent the expression of NKT cells in each sample and the yellow bars represent the expression of macrophages in each sample. (E) Violin diagram of the proportion of 24 types of immune cells. The red marks represent the difference in infiltration between the two groups.

Then, to further identify the cell types involved in the DR process, we used xCell to convert gene expression data into cell type enrichment scores. The xCell cell type results for the GSE160306 dataset are shown in Fig. 4A-E. Among the 64 types of immune cells, a total of 41 types of immune cells were present in DR. Compared to the control group, there were 5 types of immune cells with statistically significant enrichment in DR, including memory B-cells, eosinophils, lymphatic endothelial cells, plasmacytoid dendritic cells and M2 macrophages. Notably, we found that macrophages were present in the results of immune infiltration analysis by both ImmucellAI and xCell methods, further illustrating the important role of macrophages in the progression of DR (Fig. 4F).

Fig. 4
figure 4

xCell scores of 64 cell types between DR samples and control samples in the GSE160306 dataset. (AE) Violin charts of lymphoid cells, myeloid cells, stem cells, stromal cells and other cells, respectively. (F) Venn diagram of immune cell intersection of ImmucellAI and xCell results.

CSMNs class-switched memory B-cells, CLP common lymphoid progenitor, HSCs hematopoietic stem cells, MEP megakaryocyte-erythroid progenitor, Ecs endothelial cells, SMCs smooth muscle cells, MSCs mesenchymal stem cells.

GO and GSEA functional enrichment analysis

The consequences of GO analysis indicated that these DE-mRNAs were associated with diverse biological processes, molecular functions and cellular components. The top 3 biological processes those DE-mRNAs were involved in were: negative regulation of adaptive immune response based on somatic recombination of immune receptors built from immunoglobulin superfamily domains, negative regulation of adaptive immune response, and inflammatory response; the top 3 significant cellular components related to those DE-mRNAs were: extracellular matrix, external encapsulating structure, and collagen-containing extracellular matrix; the top 2 significant molecular functions in which the DE-mRNAs participated were: immune receptor activity, and protein homodimerization activity (Fig. 5A).

Fig. 5
figure 5

GO/KEGG and GSEA enrichment analysis of DE-mRNA in the GSE160306 dataset. (A) Bubble plot of GO enrichment of DE-mRNAs in dataset GSE160306-mRNA. The most significant BP involved in negative regulation of adaptive immune response based on somatic recombination of immune receptors built from immunoglobulin superfamily domains, negative regulation of adaptive immune response, and inflammatory response; CC involved in extracellular matrix, external encapsulating structure, and collagen-containing extracellular matrix; and MF involved in immune receptor activity, and protein homodimerization activity. (BH) GSEA enrichment analysis of DE-mRNAs from dataset GSE160306-mRNA. DE-mRNAs differentially expressed mRNAs, GO gene ontology, BP biological process, CC cellular components, MF molecular function, GSEA gene set enrichment analysis.

To evaluate the possible signaling pathways and molecular mechanisms in the gene expression matrix of dataset GSE160306, we performed GSEA analysis. The results showed that 7 gene sets were enriched from the c2.kegg subcollection, which are as follows: limonene and pinene degradation, dorso ventral axis formation, glycosphingolipid biosynthesis lacto and neolacto series, other glycan degradation, glycosaminoglycan biosynthesis chondroitin sulfate, ascorbate and aldarate metabolism and long term potentiation (Fig. 5B–H).

PPI network analysis and hub gene sreening

The interaction network between proteins coded by DE-mRNAs, which was comprised of 27 nodes and 6 edges, was constructed by STRING (Fig. 6A). Next, to obtain genes with significant positions in DR, we used the CytoHubba plugin to identify hub genes. Through the MCC algorithm, we identified the top 10 hub genes in the PPI network and visualized them by Cytoscape (Fig. 6B)27. Figure 6C shows the ranking and score of these hub genes under the MCC algorithm, and Table 2 summarises the expression of hub genes in the dataset GSE160306-mRNA.These genes have high values in the PPI network, which means that they may play an essential role in the pathogenesis of DR.

Fig. 6
figure 6

PPI network and hub genes of DE-mRNAs. (A) Interaction network between proteins encoded by DE-mRNA. The network consists of 27 nodes and 6 edges. Each node represents a protein and each edge represents a protein-protein association. (B) CytoHubba identified the top 10 hub gene networks. Red colour represents the gene scoring 2 and green colour represents the gene scoring 1. (C) CytoHubba ranking of top 10 hub genes.

Table 2 Top 10 hub genes identified by MCC algorithms of cytoHubba.

Screening 5 Immune-related hub genes from the Immune Geneset

As the most common autoimmune disease, a better understanding of immune-related mechanisms in DR is an important part of current research. The discovery of differentially expressed immune genes in DR retina may contribute to the specific diagnosis and treatment of DR. Therefore, we collected and organized 5232 IRGs (Supplementary materials-Table S1) from the ImmPort and InnateDB databases, and intersected 10 hub genes of CytoHubba with 5232 IRGs (Fig. 7A). Eventually, we obtained five immune-related hub genes (four up-regulated and one down-regulated): SLPI (adj.P-value = 0.014, LogFC = 2.33) (Fig. 7B), ACAN (adj.P-value < 0.001, LogFC = 1.89) (Fig. 7C), FCGR2B (adj.P-value = 0.008, LogFC = 1.19) (Fig. 7D), C4A (adj.P-value = 0.041, LogFC = 1.08) (Fig. 7E) and CR2 (adj.P-value = 0.018, LogFC = -1.65) (Fig. 7F).

Fig. 7
figure 7

5 immune-related hub genes and their expression in the dataset GSE160306. (A) Violin diagram of immune-related genes and hub genes. (DF) Violin plots of the expression of the 5 hub genes in dataset GSE160306. Pink represents the DR group and blue represents the control group.

3 Candidate miRNAs were identified

The miRCode online database was used to predict upstream miRNAs for SLPI, ACAN, FCGR2B, C4A and CR2, and we obtained 253 effector miRNAs for five hub genes (Supplementary materials-Table S2). The dataset GSE160308 miRNA contains 39 DR samples from humans and 20 healthy control samples. Then, differential expression analysis was performed on the GSE160308-miRNA dataset using the DESeq2 algorithm, and 17 DE-miRNAs were obtained (based on the screening criteria of adj.P value < 0.05 and |LogFC |>1). The volcano plot shows the miRNA distribution and expression levels in the dataset GSE160308, where red triangles represent up-expressed miRNAs and green triangles represent down-expressed miRNAs (Fig. 8A). The heatmap demonstrates the relative expression of the 17 DE-miRNAs between the DR group and the control group (Fig. 8B). Then, 253 predicted miRNAs were taken to intersect with the 17 DE miRNAs, and the validated candidate miRNAs were obtained: miR-10a-5p (adj.P-value < 0.001, LogFC = 2.43) (Fig. 9A), miR-507 (adj.P-value = 0.045, LogFC=-1.22) (Fig. 9B) and miR-23c (adj.P-value < 0.001, LogFC=-1.75) (Fig. 9C). These three shared miRNAs are considered regulatory miRNAs of hub genes.

Fig. 8
figure 8

Differential expression of miRNAs in dataset GSE160308. (A) Volcano plot of miRNA expression in dataset GSE160308. The red triangles represent upregulated genes, the black triangles represent nonsignificant genes, and the green triangles represent downregulated genes. (B) Heatmap of 17 DE-miRNAs expression in dataset GSE160308. Vertical coordinates represent the 17 DE-miRNAs in the dataset GSE160308 with the largest changes in expression. Red squares indicate genes with high expression; green squares indicate genes with low expression. The greater the color difference between the DR group and the control group, the greater the fluctuation in gene expression level.

Fig. 9
figure 9

Related gene expression in ceRNA networks. (A) Violin plot of miR-10a-5p expression in dataset GSE160308. (B) Violin plot of miR-507 expression in dataset GSE160308. (C) Violin plot of miR-23c expression in dataset GSE160308. (D) Venn diagram of the intersection of LncBaes-lncRNA with the dataset GSE160306-lncRNA. (E) Histogram of DDN-AS1 expression in dataset GSE160306. (F) Histogram of LINC01515 expressed in dataset GSE160306. (G) CeRNA network. The ceRNA network consists of 2 lncRNAs, 3 miRNAs and 5 immune-related hub genes. (H) FCGR2B expression in dataset GSE102485. (I) C4A expression in dataset GSE102485.

Identified 2 regulatory lncRNAs and constructed a ceRNA network

Similar to the prediction of miRNAs, we used the LncBase online database to predict lncRNAs that interacted with 3 candidate miRNAs. From the LncBase database, we predicted a total of 723 lncRNAs (Supplementary materials-Table S3). This implies that these lncRNAs have potential roles in regulating miR-10a-5p, miR-507, and miR-23c. However, it is still unclear whether these lncRNAs are present in the DR and whether these lncRNAs are expressed at statistically different levels in the DR. Therefore, we used the dataset GSE160306-lncRNA to validate the predicted lncRNAs in the DR. Of the 723 predicted lncRNAs, only 2 lncRNAs were found in the 10 DE-lncRNAs of the dataset GSE160306-lncRNAs, namely: DDN-AS1 (adj.P-value = 0.008, LogFC=-1.214) (Fig. 9E) and LINC01515 (adj.P-value = 0.013, LogFC = 1.383) (Fig. 9F). This suggests that DDN-AS1 and LINC01515 may regulate the expression of hub genes by interacting with candidate miRNAs in DR.

To this extent, through bioinformatics analysis, we have explored and identified 2 lncRNAs, 3 miRNAs, and 5 mRNAs. This suggests that both coding and non-coding RNAs may play a regulatory role in the progression of DR. We then used Cytoscape to concatenate these genes into a network and obtained a network of ceRNAs targeting 5 hub genes (Fig. 9G).

Validated and obtained the immune-related hub gene-FCGR2B and its ceRNA network

Then, to further validate the stability of the 5 candidate IRHGs in DR, we selected dataset GSE102485, which also derived from the human retina, to validated it.From Fig. 6-G, we found that SLPI lacked upstream miRNA and lncRNA molecules, while CR2 and ACAN did not have a complete ceRNA regulatory pathway, so we further verified the reliability of C4A and FCGR2B. The results showed that among C4A and FCGR2B, only FCGR2B was validated with adj.P-value < 0.001, LogFC = 1.827 (Fig. 9H-I). To this end, we obtained complete and validated ceRNA networks: the DDN-AS1/miR-10a-5p/FCGR2B and LINC01515/miR-10a-5p/FCGR2B networks. (Validation of SLPI, CR2 and ACAN in dataset GSE102485 is described in the Supplementary materials - Figure S1A-C.)

Discussion

As the most common microvascular complication of diabetes, patients with DR have a 25-fold higher risk of blindness than those without diabetes35. Due to the lack of detailed understanding of DR, patients with DR often lose the best opportunity for treatment, resulting in a poor prognosis. Therefore, the search for key regulatory processes during the progression of DR is of profound importance to improving the prognosis of DR patients. The regulation of gene expression is an important process of biological function. The abnormal expression of genes is often closely associated with a range of pathological conditions, including DR.With the rapid development of science and technology, bioinformatics technology can provide us with transcript information of plasma, atrial fluid, vitreous, and even neonatal retinal tissues, which provides a powerful strategy to study the specific pathogenesis of DR.

In this study, we performed bioinformatics analysis of DR datasets from the GEO database. The results showed that a total of 27 DE-mRNAs were identified in the dataset GSE160306, and GO/KEGG and GSEA enrichment analyses indicated that these DE-mRNAs were associated with cellular defence responses and immune responses. This study then used the ImmuCellAI and xCell platforms to analyse immune cell infiltration in DR retinal tissues, and the results of the 2 platforms showed that macrophages have a significant weight in DR disease. Subsequently, based on the STRING online database, we established a PPI network of DE-mRNAs from which we screened the top 10 hub genes. Whereas the list of immune-related genes (IRGs) (5235 IRGs) downloaded from ImmPort and InnateDB databases revealed that only five of the 10 hub genes, SLPI, ACAN, FCGR2B, C4A and CR2, were associated with immune regulation. This suggests that these 5 immune-related hub genes (IRHGs) may be critical nodes in DR and have the potential to determine the progression of DR. Finally, by predicting the upstream miRNAs and lncRNAs of IRHGs, 2 complete immune-associated ceRNA networks were obtained: DDN-AS1/miR-10a-5p/FCGR2B and LINC01515/miR-10a-5p/FCGR2B, and changes in the expression level of IRHG-FCGR2B were also validated in the dataset GSE102485 was validated. This further suggests that the immune-associated ceRNA network targeting FCGR2B may have a potential regulatory role in DR.

The immune response characterized by activation of classical immune cells is considered one of the most important mechanisms in DR, and currently there is still a lack of research on immune cell infiltration in DR. Therefore, it is important to clarify the types and functions of immune cells involved in DR for the treatment of DR. In this study, we used the ImmuCellAI and xCell platforms to analyse immune cell infiltration in DR and demonstrated that macrophages were present and statistically significant in the results of both analyses. Abu et al. found that the levels of CD163+M2 macrophages were positively correlated with the levels of VEGF in the epiretinal fibrovascular membrane tissue of patients with proliferative diabetic retinopathy (PDR), suggesting that M2 macrophages may be involved in the angiogenesis of PDR36; Van Hove et al. analyzed the transcriptome of single cells in mouse retina and revealed the existence of macrophages in DR37. Thus, our studies and those of the above scholars have demonstrated that macrophages take an important role in the immunomodulatory processes of DR, especially in pro-angiogenesis and fibrosis.

As well as macrophages, other types of immune cells have also been studied in diabetes. For example, a previous study showed a significant increase in vascular permeability of retinal tissue after artificial depletion of NKT cells in DR model mice, suggesting that the degree of NKT cell infiltration was positively correlated with the severity of DR38. MAIT is an important mucosal immune effector cell. Studies have shown that MAIT cells can reduce the positive proportion of diabetes-related autoantibodies39; additional to this, the diabetic state also induces Müller neuroglia cells to secrete vascular endothelial growth factor (VEGF) and TNF-α, which further aggravates the progression of DR40. These demonstrate that immune cells are important regulators of DR disease transformation and progression and cannot be ignored in DR progression.

GO functional enrichment analysis of DE-mRNAs showed that these DE-mRNAs are mainly involved in the adaptive immune, inflammatory response process in DR. Existing studies suggest that an imbalance (mainly a decrease) in adaptive immune Treg cells may be the main cause of bursting inflammation in DR41,42,43; in terms of inflammation, macrophage-restricted protein tyrosine phosphatase 1B (PTP1B) is a key regulator of inflammation in insulin-resistant metabolic syndrome, and dysregulation of PTP1B may underlie retinal microvascular disease44; Chondroitin sulphate (CS) is a naturally occurring sticky polysaccharide of animal cartilage origin.Qi et al. revealed that CS reduced serum levels of inflammatory cytokines (TNF-α, IL-1, IL-6 and MCP-1) and attenuated inflammation in diabetic rats45. ascorbic acid is used as an antioxidant, and it has been found that low levels of ascorbic acid may exacerbate DR lesions by affecting the function of transporter proteins46. The above studies provide support for the results of GSEA enrichment analysis in this paper and suggest us that DE-mRNAs affect DR progression mainly through inflammation-related pathways. Therefore, the study of DR inflammation and immune response is crucial for the treatment of DR.

In this study, we identified the immune-related hub gene FCGR2B in DR. The presence of IRHGs implies that these genes have significant regulatory roles in the immune mechanisms of DR. FCGR2B (Fc gamma receptor Iib, FCGR2B) is an inhibitory receptor gene encoding a protein that is a low-affinity receptor for the Fc region of the immunoglobulin γ complex47. The encoded protein is involved in the phagocytosis of immune complexes and in the regulation of antibody production by B-cells. Previous studies have shown that FCGR2B plays a regulatory role in various inflammatory diseases. For example, the deletion of FCGR2B reduces the T cell tolerance of type II collagen in BQ mice, allowing autoreactive T cells to be activated and develop arthritis48; FCGR2B may also inhibit allergic lung inflammation by attenuating allergen-specific IgE responses49. Unfortunately, the study of FCGR2B in diabetes retinopathy is relatively scarce. Only Wang et al. proposed that the expression of transcriptome FCGR2B significantly increases with the severity of DR, which is related to the inflammation and immune response of DR, and suggested that FCGR2B is a key regulatory factor of DR50. However, the above research still suggests that FCGR2B, as an important inflammatory regulator, may play a critical role in DR, which requires further exploration in the future.

miR-10a-5p also has abundant regulatory functions in the organism. Existing studies have demonstrated its regulatory role in a variety of diseases such as ovarian cancer51, osteoarthritis52, myocardial infarction53, and diabetes. Strycharz showed that miR-10a-5p expression is elevated in female visceral adipose tissue (VAT) and promotes oxidative stress, chronic inflammation and insulin signalling in T2DM54; Karam-Palos et al. found that miR-10a-5p was differentially expressed in tear fluid and blood of T2DM patients with DR, and further regulated the progression of DR by modulating the levels of apoptosis function genes BCL2L2L and TP5355.

DDN-AS1 and LINC01515 are relatively new lncRNAs, and current research on them has focused on cancer. For example, Liu’s study found that DDN-AS1 promotes cancer cell proliferation, migration and invasion through miR-15a/16-TCF3 feedback loop56; LINC01515 not only promotes nasopharyngeal carcinoma progression by acting as a sponge for miR-32557, but also regulates oxidative stress in airway epithelial cells58. Although there is a temporary lack of studies on DDN-AS1 and LINC01515 in DR, previous studies can still show that they have the ability to bind to miRNAs, which also provides indirect evidence for this study.

Conclusions

The results and discussion of this study suggest that immune-related hub genes and signaling pathways, including ceRNA networks, have important roles in the regulation of DR. Ultimately, we proposed new ceRNA networks that may be regulators of DR progression by bioinformatics analysis of gene expression matrices: DDN-AS1/miR-10a-5p/FCGR2B and LINC01515/miR-10a-5p/FCGR2B. However, some scholars have previously explored the variation of transcript levels in diabetic retinopathy. For example, De Silva K et al. explored miRNA biomarkers associated with type 2 diabetes by integrating data from five miRNA sequences59; Becker et al. explored miRNA/mRNA mechanisms in DR by genetic sequencing of retinal tissue from DR patients60. All these studies illustrate that transcriptomic analysis is an important tool to reveal the mechanism of DR progression. The present study on DR mechanism not only explored the regulatory mechanism of DR from the immune perspective by bioinformatics but also constructed a complete ceRNA network at the transcriptional level. The results of this study will further improve the understanding of the mechanism of DR progression and provide potential targets for clinical treatment of DR.

Although this study used the GEO dataset from human retinal tissue to explore immune-related hub genes in DR and identify their ceRNA networks, there are still some limitations to be addressed. First, the sample size used for analytical validation in this study was relatively small. In the future, it is necessary to continue to increase the sample size and conduct prospective cohort studies to further confirm our point of view. Second, the ssGSEA assay was used in this study to calculate the proportion of immune cells in DR tissues; therefore, the use of tissue-based flow cytometry to validate the assay results enhances the confidence of our findings. Finally, due to the lack of data including disease phenotype and disease activity status, the association between immune cells and related genes and disease severity requires further estimation.