MTFAP: a comprehensive platform for predicting and analyzing master transcription factors

Zhou, Jianyuan; Yu, Haojie; Lou, Chunhui; Yang, Min; Li, Yanshang; Yang, Qian; Li, Shuhan; Ji, Chunwang; Li, Song; Wang, Shuang; Cao, Haotian; Li, Xuecang; Liu, Lian

doi:10.1038/s41598-024-83686-9

Article
Open access
Published: 30 December 2024

Scientific Reports volume 14, Article number: 32012 (2024)

Abstract

Master transcription factors (MTFs) activate gene expression in pluripotent embryonic stem cells (ESCs) by binding to enhancers and super-enhancers, which precisely control ESC fate. Compelling evidence reveals a strong correlation between the activity of MTFs and the initiation and progression of cancer. However, identifying MTFs remains difficult for many researchers. We therefore developed MTFAP, a web resource for MTF prediction and analysis. MTFAP is a comprehensive web tool designed to predict and analyze MTFs from different data types. To help users explore MTFs of interest, MTFAP offers search and browse functions. We have also developed a Docker image that lets users run analyses locally. In addition, MTFAP supports further analysis and data visualization for the MTFs identified by Coltron and CRCmapper. The platform is freely available at http://www.xiejjlab.bio/MTFAP/

Introduction

Master transcription factors (MTFs) activate the gene expression program in pluripotent embryonic stem cells (ESCs) by binding to enhancers and recruiting the mediator complex. At critical loci regulating pluripotency, MTFs assemble into enhancer elements, thereby meticulously orchestrating the fate of embryonic stem cells¹. Positioned at the apex of gene regulatory networks, MTFs are often associated with nearby enhancers², which can be found in distal regions, intergenic areas, or near promoters, and these enhancers typically regulate the activity of MTFs³. MTFs are regulated by enhancers and can form a complete regulatory loop with other MTFs, controlling the identity of cells according to their fate and differentiation³. Key transcription factors in stem cells, such as NANOG, OCT4, and SOX2, interact to form the core regulatory circuit (CRC), which plays a critical role in maintaining stem cell pluripotency⁴. This CRC loop facilitates positive regulation among these factors while also engaging with other transcription factors and signaling pathways, effectively balancing self-renewal and differentiation. This interdependent CRC loop is one of the key regulatory mechanisms for stem cell reprogramming and differentiation⁴.

Moreover, MTFs significantly contribute to the initiation and progression of certain pathological processes, including tumors². In esophageal squamous cell carcinoma (ESCC), MTFs such as TP63, SOX2, and KLF5 form a regulatory loop that influences the survival and metastasis of cancer cells by modulating the expression of numerous genes⁵. Recent studies have identified four esophageal adenocarcinoma-specific MTFs (ELF3, KLF5, GATA6, and EHF), which work synergistically to enhance PPARG transcription by directly binding to its promoter and enhancer regions⁶. Furthermore, MTFs play a crucial role in the reprogramming of fibroblasts into myofibroblasts, which increases the tumorigenicity and invasiveness of cancer⁷. Accurately identifying MTFs is therefore important for developing anti-cancer drugs and finding cancer treatment targets.

Current methods for identifying MTFs mainly rely on ChIP-seq data. However, researchers often struggle to obtain the raw data required by software tools such as CRCmapper and Coltron. To address this gap, Reddy et al. developed the CaCTS (Cancer Core Transcription Factor Specificity) algorithm to predict specific MTFs⁸, but using CaCTS requires good programming skills. There is therefore an urgent need for an accessible analysis platform to facilitate MTF prediction. In response, we developed MTFAP, an online tool designed to analyze and predict MTFs from bulk RNA-seq and single-cell RNA-seq data. MTFAP also enables comprehensive downstream analyses, visualizes the results, and provides access to information on transcription factors within feedback loops (Table 1). Researchers can easily search for relevant information on the MTFAP platform, making it a user-friendly solution for MTF prediction with different types of data (Fig. 1).

Table 1 Detail data sources.

Results

MTFAP main function

MTFs Single-cell analysis

In MTFs single-cell analysis (Fig. 2A), users can identify MTFs associated with specific cancer types in single-cell transcriptome data. The process is as follows. First, the user uploads two files: the expression profile file (*.txt) and the cell identity file (*.txt). Both files must follow the format of the sample files provided, and the sample names in the cell identity (group) file must exactly match those in the expression profile file. Once the files are uploaded, the user clicks the ‘Submit’ button, selects the cancer type to analyze, clicks ‘Analysis’, and waits for the results. MTFAP then returns a table of ranked transcription factors (Fig. 2D). Users can click a transcription factor of interest to open the TF analysis page for further exploration.

MTFs bulk analysis

In the bulk analysis of MTFs (Fig. 2A), users can identify MTFs associated with specific cancer types in bulk transcriptome data. The user uploads two files: the expression profile file (*.txt) and the sample group file (*.txt). Both files must follow the format of the sample files provided. Once the files are uploaded, the user clicks the ‘Submit’ button, selects the cancer type to analyze, and clicks ‘Analysis’. MTFAP then returns a table of ranked transcription factors (Fig. 2D). Users can click a transcription factor of interest to open the TF analysis page for further exploration.
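The requirement that sample names agree between the two uploaded files can be checked locally before submission. The sketch below is illustrative only: the article does not specify the file layout, so it assumes a tab-delimited expression file whose header row lists sample names after a gene column, and a tab-delimited group file with a header row and sample names in the first column. All function names are hypothetical and are not part of MTFAP.

```python
import csv
import io


def read_sample_names(expression_text):
    """Return sample names from the header row, skipping the gene column."""
    header = next(csv.reader(io.StringIO(expression_text), delimiter="\t"))
    return header[1:]


def read_group_samples(group_text):
    """Return the sample names listed in the first column of the group file."""
    rows = csv.reader(io.StringIO(group_text), delimiter="\t")
    next(rows)  # skip the header row (assumed layout)
    return [row[0] for row in rows if row]


def check_sample_names(expression_text, group_text):
    """Report names that appear in only one of the two files."""
    expr = set(read_sample_names(expression_text))
    group = set(read_group_samples(group_text))
    return {
        "missing_from_group": sorted(expr - group),
        "missing_from_expression": sorted(group - expr),
    }


# Made-up miniature inputs, for illustration only.
expression_text = "gene\tS1\tS2\tS3\nTF_A\t5.1\t0.2\t4.8\nTF_B\t3.3\t0.1\t2.9\n"
group_text = "sample\tgroup\nS1\ttumor\nS2\tnormal\nS4\ttumor\n"

report = check_sample_names(expression_text, group_text)
print(report)
```

An empty list under both keys would indicate that the two files refer to the same set of samples.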

CRC analysis

Users can upload master transcription factor score files (*.txt) obtained from the output of CRCmapper³ (https://github.com/younglab/CRCmapper) or Coltron⁹ (https://pypi.org/project/coltron/) (Fig. 2A). MTFAP visualizes the core transcription factor network and analyzes the important transcription factors in the loop. Researchers can also query the genes co-regulated by transcription factors in the CRC loop.

Search

MTFAP offers two search methods to retrieve information on transcription factors (Fig. 2B):

(1) Search by TF: search for a transcription factor by gene symbol.

(2) Search by TF Family: search for transcription factors by family classification.

Browse

MTFAP provides queries for MTFs based on TCGA data¹⁰ (Fig. 2C). These master transcription factor data were downloaded from the supplementary files of Reddy et al.⁸ Providing precomputed data makes it easier for researchers to query the MTFs of various cancers, and it also reduces redundant calculations and server load. Users can browse MTF information for different cancer types and subtypes through the selection bar on the left side of the page. Clicking a core transcription factor opens its detail page, which provides further information.

Transcription factor detailed information

TF overview

The left side of the panel shows basic information about the transcription factor (TF name, Ensembl ID, TF Family, Protein ID, Entrez ID, and Species), and the right side shows its sequence logo (Fig. 2E). The sequence logo data come from the JASPAR database¹¹, and the TF data come from AnimalTFDB¹² (http://bioinfo.life.hust.edu.cn/AnimalTFDB4/#/). JASPAR is an open-access database that provides curated, non-redundant transcription factor (TF) binding profiles stored as position frequency matrices (PFMs) and TF flexible models (TFFMs) for TFs across multiple species in six taxonomic groups. AnimalTFDB is a comprehensive database that includes the classification and annotation of genome-wide TFs and transcription cofactors across 183 animal genomes.

TF essential

Users can select different cancers for comparison to determine which cancer cell lines are most sensitive to the target core transcription factor (see Fig. 2E), and thereby decide which transcription factors to validate in subsequent experiments. The Cancer Cell Line Encyclopedia (CCLE) is a comprehensive compilation of gene expression, chromosomal copy number, and massively parallel sequencing data from 947 human cancer cell lines. This collection, when combined with pharmacological profiles for 24 anticancer drugs across 479 of the cell lines, enables the identification of genetic, lineage, and gene-expression-based predictors of drug sensitivity. The transcription factor data shown here mainly come from CCLE knockout and knockdown experiments¹³. We also provide visualization and comparison of data from two different algorithms, “Gene Effect” and “Gene Dependency”.

Gene effect

Outcomes from DEMETER2 or CERES. DEMETER2¹⁴ is a model for estimating the roles of genes in cell survival and growth from large-scale RNAi knockdown screens. It improves and expands upon the original DEMETER model, aiming to reduce data noise and enhance the accuracy and reliability of gene dependency scores. CERES is a computational tool designed for analyzing CRISPR-Cas9 gene knockout screening data to determine gene essentiality. A lower score means that a cell line is more likely to depend on the gene¹⁴. A score of 0 corresponds to a non-essential gene, whereas a score of −1 corresponds to the median of all common essential genes (Fig. 2E).

Gene dependency

A genetic dependency is a gene that is necessary for the proliferation and survival of a given cell population¹³. A cell line is considered dependent on a gene if its probability of dependency is greater than 0.5. Users can choose among the options above according to their needs and click the “Plot” button to query and visualize the results across various cancers.
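The dependency rule above (probability greater than 0.5) is simple enough to sketch in a few lines. The cell line names and probabilities below are made up for illustration, and the function name is hypothetical; this is not MTFAP's own code.

```python
def dependent_cell_lines(probabilities, threshold=0.5):
    """Return cell lines whose dependency probability exceeds the threshold."""
    return sorted(
        name for name, prob in probabilities.items() if prob > threshold
    )


# Hypothetical dependency probabilities for one transcription factor.
example_probabilities = {
    "LINE_A": 0.92,
    "LINE_B": 0.50,  # exactly 0.5 is not "greater than 0.5"
    "LINE_C": 0.08,
    "LINE_D": 0.67,
}

dependent = dependent_cell_lines(example_probabilities)
print(dependent)
```

Note that the comparison is strict, so a cell line with a probability of exactly 0.5 is not counted as dependent under this reading of the rule.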

CRC model

Core Transcription Regulatory Circuitry (CRC) is an interaction network of key transcription factors that regulate gene expression³. It plays an important role in cells, controlling the transcription and expression of genes. The CRC model offers a query on the CRC loops identified in ChIP-seq data. Users can query the CRC loop data associated with the target core transcription factor in different samples (Fig. 2E). The CRC loop data are derived from CRCdb¹⁵, a specialized database designed to catalog and analyze core transcriptional regulatory circuits (CRCs) across various human cell and tissue types.

Target genes

Users can query the genes regulated by MTFs in cancer. These data are derived from TCGA ATAC-seq data¹⁶ and were analyzed using FIMO¹⁷ (fimo --motif <PWM matrix> <sequence file>). The PWM matrix comes from JASPAR¹¹, and the sequence file is obtained by extracting TCGA open chromatin regions with bedtools. Results can be sorted by p-value and preference score. Users can click on “Cancer Type” to switch between different cancer types (Fig. 2E).

Target pathway

Users can query the biological functions and pathways regulated by MTFs in different cancer types. MTFAP provides GO¹⁸ enrichment and KEGG¹⁹ pathway enrichment. Using the hypergeometric test, MTFAP presents the top 20 GO terms or KEGG pathways regulated by MTFs. Users can click Cancer Type to switch between different cancer types and click GO enrichment or KEGG pathway enrichment to obtain target gene function annotation. MTFAP returns the pathways associated with the target transcription factor and visualizes the results as bar graphs (Fig. 2E).

TF-associated immune microenvironment

The relationship between MTFs and the immune microenvironment is displayed with different colors representing different R values. Users can hover the mouse pointer over the target area to check whether the correlation is significant (Fig. 2E). The R value indicates the relationship between the transcription level of the transcription factor and the abundance of various immune cells in the tumor microenvironment. All of these data were sourced from the TIMER2 database, an extensive repository enabling systematic analysis of immune infiltrates across a broad spectrum of cancer types²⁰,²¹.

TF networks

Users can select different types of cancer at the top of the panel and search for the MTFs with the highest correlation at the RNA transcription level, in order to infer the formation of CRC loops³,⁹ (Fig. 2E).

Survival analysis

Users can select cancer types of interest and query the survival significance of MTFs. MTF survival analysis is implemented through the API provided by GEPIA2 (Gene Expression Profiling Interactive Analysis 2)²² and covers 33 cancer types. GEPIA2 is a comprehensive resource for analyzing the RNA sequencing expression data of tumors and normal samples from the TCGA and GTEx projects.

RNA differential expression

Users can select cancer types of interest to run differential expression analysis and query the difference in MTF expression between cancer and adjacent noncancerous tissues. MTF differential expression analysis is implemented through the API provided by GEPIA2 and covers 33 cancer types from TCGA.

Interact TF

Users can query transcription factors and transcription co-factors that interact with MTFs to facilitate subsequent experimental research (Fig. 2E). The interaction data were obtained from the TcoF-DB v2 database²³, a specialized repository dedicated to the study of transcription co-factors.

MTFAP docker

Because of network transmission limits, we restrict file uploads to 20 MB, which makes it difficult to upload very large expression profile and sample grouping files to the server. To address this, we developed MTFAP Docker to support local analysis. Researchers can use the Docker image to deploy the analysis pipeline on a local server for transcription factor analysis. We have uploaded the entire MTFAP image to Docker Hub (https://www.docker.com/products/docker-hub/) so that researchers can search for and download it.

Benchmarking

To test the functionality of MTFAP, we downloaded single-cell data for head and neck cancer (GSE103322) from the GEO database and bulk TCGA expression profiles from XENA²⁴. The downloaded expression data were processed, split into files of 500 KB, 1 MB, 2 MB, 5 MB, 10 MB, and 20 MB, and submitted to MTFAP to assess server performance. Processing times on the server (16 cores, 32 GB RAM) were recorded and analyzed (Figure S2).

Case study

We analyzed transcriptome data from the TCGA dataset using MTFAP. After uploading the TCGA expression profile and grouping data, we found that TP63 expression was significantly elevated in esophageal cancer samples, with a high CaCTS score and ranking (Figure S1A). Querying TP63 in MTFAP returned detailed information about this factor (Figure S1B). On the TP63 detail page, the CRC model query identified a regulatory network involving TP63, KLF5, and SOX2 (Figure S1C). Subsequent transcription factor network analysis supported the view that these three factors form a regulatory circuit in esophageal cancer (Figure S1D). Jiang et al.⁵ validated the regulatory relationships among TP63, KLF5, and SOX2 through siRNA experiments in esophageal cancer cell lines, highlighting the significant impact of this regulatory circuit on the onset and progression of esophageal cancer.

Methods

Target genes

We acquired tumor ATAC-seq data from TCGA¹⁶ and used FIMO¹⁷ to analyze the accessible chromatin regions. Only results that satisfied the threshold (p ≤ 0.05 and q ≤ 0.05) were retained for further examination. The PWM matrix comes from JASPAR, and the sequence file is obtained by extracting TCGA open chromatin regions with bedtools. Finally, we used HOMER²⁵ to annotate the TF regulatory regions. FIMO (‘Find Individual Motif Occurrences’) is a component of the MEME Suite; it searches a set of sequences for occurrences of known motifs, treating each motif independently. HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools designed for motif discovery.
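The thresholding step (p ≤ 0.05 and q ≤ 0.05) can be sketched as a filter over FIMO's tab-separated output. This is a minimal illustration, not the authors' script: it assumes the column names `p-value` and `q-value` used in FIMO's TSV output, and the motif ID, peak names, and values in the example are placeholders.

```python
import csv
import io


def filter_fimo_hits(fimo_tsv_text, max_p=0.05, max_q=0.05):
    """Keep motif hits with p-value <= max_p and q-value <= max_q."""
    # Skip blank lines and '#' comment lines that may follow the table.
    lines = [
        line for line in io.StringIO(fimo_tsv_text)
        if line.strip() and not line.startswith("#")
    ]
    kept = []
    for row in csv.DictReader(lines, delimiter="\t"):
        if not row["q-value"]:
            continue  # no q-value reported; treat the hit as failing
        if float(row["p-value"]) <= max_p and float(row["q-value"]) <= max_q:
            kept.append(row)
    return kept


# Placeholder rows for illustration only (not real FIMO results).
rows = [
    ["motif_id", "motif_alt_id", "sequence_name", "start", "stop",
     "strand", "score", "p-value", "q-value", "matched_sequence"],
    ["MA0000.1", "TF_A", "peak_1", "10", "19", "+", "15.2",
     "1.0e-06", "0.003", "ACGTACGTAC"],
    ["MA0000.1", "TF_A", "peak_2", "40", "49", "-", "8.1",
     "2.0e-03", "0.21", "TTGTACGTAA"],
]
example_tsv = "\n".join("\t".join(row) for row in rows)

hits = filter_fimo_hits(example_tsv)
print([hit["sequence_name"] for hit in hits])
```

Hits that pass both thresholds would then be passed on to annotation, which the article performs with HOMER.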

Target pathway

After obtaining the target genes of the core transcription factor from the open chromatin data, we used the clusterProfiler²⁶ R package to perform a hypergeometric test on these target genes (p ≤ 0.05 and q ≤ 0.05). We then retained only the top 20 pathways, ranked by p-value.
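To make the enrichment step concrete, the sketch below computes an over-representation p-value with the hypergeometric distribution and keeps the best-ranked terms. It stands in for, and is much simpler than, the clusterProfiler analysis used by the authors: it omits the q-value (multiple-testing) filter, and the gene sets, term names, and background size are hypothetical.

```python
from math import comb


def hypergeom_pvalue(population, successes, draws, observed):
    """P(X >= observed) for a hypergeometric draw without replacement."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return tail / total


def top_enriched_terms(target_genes, term_to_genes, background_size,
                       top_n=20, max_p=0.05):
    """Rank terms by p-value and keep at most top_n significant ones."""
    targets = set(target_genes)
    results = []
    for term, genes in term_to_genes.items():
        overlap = len(targets & set(genes))
        p = hypergeom_pvalue(background_size, len(genes), len(targets), overlap)
        if p <= max_p:
            results.append((term, overlap, p))
    results.sort(key=lambda item: item[2])
    return results[:top_n]


# Hypothetical target genes and term annotations, for illustration only.
targets = ["g1", "g2", "g3", "g4", "g5"]
terms = {
    "term_A": ["g1", "g2", "g3", "g4", "x1"],
    "term_B": ["g5"] + [f"y{i}" for i in range(1, 10)],
}

ranked = top_enriched_terms(targets, terms, background_size=100)
print(ranked)
```

In this toy input, four of the five targets fall in `term_A`, so it passes the p ≤ 0.05 filter, while `term_B`, with a single overlapping gene, does not.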

TF networks

We acquired data from TCGA and subsequently employed R for the removal of missing values. Thereafter, we extracted transcription factors from the expression profiles and performed a Pearson correlation analysis.
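The two steps described here, dropping missing values and computing Pearson correlations between transcription factors, can be sketched as follows. The authors used R; this Python version is only an illustration, and the factor names and expression values are made up.

```python
from math import sqrt


def drop_incomplete_samples(expression):
    """Remove sample positions where any gene has a missing value (None)."""
    n_samples = len(next(iter(expression.values())))
    keep = [
        i for i in range(n_samples)
        if all(values[i] is not None for values in expression.values())
    ]
    return {gene: [values[i] for i in keep] for gene, values in expression.items()}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def most_correlated(target, expression, top_n=3):
    """Rank other factors by their correlation with the target factor."""
    clean = drop_incomplete_samples(expression)
    scores = [
        (gene, pearson(clean[target], values))
        for gene, values in clean.items() if gene != target
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_n]


# Hypothetical expression values across five samples; None marks a missing value.
expression = {
    "TF_A": [1.0, 2.0, 3.0, 4.0, None],
    "TF_B": [2.0, 4.0, 6.0, 8.5, 1.0],
    "TF_C": [1.0, 3.0, 2.0, 5.0, 2.0],
    "TF_D": [4.0, 3.0, 2.0, 1.0, 0.0],
}

ranking = most_correlated("TF_A", expression)
print(ranking)
```

Factors at the top of such a ranking are the ones the TF networks view would suggest as candidate partners in a CRC loop.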

TF essential

Data pertaining to gene dependency and gene effects across multiple cancer cell lines were retrieved from the CCLE database¹³,¹⁴. We then used R to extract and organize the information related to transcription factors so that it could be presented in a structured format.

MTFAP Single-cell analysis and bulk analysis

MTFAP predicts MTFs using Cancer Core Transcription Factor Specificity (CaCTS), which prioritizes candidate MTFs from pan-cancer RNA sequencing data. We applied this method to both single-cell and bulk sequencing data. CaCTS uses the Jensen-Shannon divergence (JSD) to score each transcription factor by comparing its expression distribution across sample groups with an idealized group-specific pattern.

$$\mathrm{score}_{i,j} = -\log_{10} \mathrm{JSD}(\widehat{x}_{i}, \widehat{u}_{j})$$

where $\widehat{x}_{i} = \frac{x_{i}}{\left|x_{i}\right|}$, and $x_{i} = (x_{i,k})$ is the ordered vector of normalized expression of gene $i$, with $k \in \{1, \ldots, n\}$ ($n$ is the number of cancer types). $\widehat{u}_{j} = (u_{j,k})$ is the idealized cell type-specific gene expression for cell type $j$, expressed as a unit vector of length $n$. The JSD quantifies how similar two probability distributions are (smaller values indicate greater similarity) and is used here to compare the two unit vectors $\widehat{x}_{i}$ and $\widehat{u}_{j}$. The list of candidate MTFs for a specific cell type is the intersection of the 5% most highly expressed transcription factors in that cell type and the top 5% of transcription factors ranked by CaCTS score.
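The scoring formula and the candidate-selection rule can be illustrated with a small sketch. It is not the CaCTS R package: it assumes that $\left|x_{i}\right|$ denotes the sum of the entries (so that $\widehat{x}_{i}$ is a probability distribution), that $\widehat{u}_{j}$ is 1 at position $j$ and 0 elsewhere, and that the JSD uses base-2 logarithms. All names and values are hypothetical.

```python
from math import ceil, inf, log2, log10


def entropy(dist):
    """Shannon entropy in bits, ignoring zero entries."""
    return -sum(v * log2(v) for v in dist if v > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(mix) - (entropy(p) + entropy(q)) / 2


def specificity_score(expression, j):
    """-log10 JSD between normalized expression and the ideal pattern for type j."""
    total = sum(expression)
    if total <= 0:
        raise ValueError("expression vector must have a positive sum")
    x_hat = [v / total for v in expression]
    u_hat = [1.0 if k == j else 0.0 for k in range(len(expression))]
    divergence = jsd(x_hat, u_hat)
    return inf if divergence <= 0 else -log10(divergence)


def top_fraction(values, fraction=0.05):
    """Names ranked in the top fraction by value (at least one name)."""
    keep = max(1, ceil(len(values) * fraction))
    ranked = sorted(values, key=values.get, reverse=True)
    return set(ranked[:keep])


# A factor concentrated in type 0 scores higher than a broadly expressed one.
specific = specificity_score([90.0, 5.0, 5.0], j=0)
broad = specificity_score([30.0, 35.0, 35.0], j=0)
print(specific, broad)

# Candidate selection as an intersection; fraction=0.5 only because the
# toy example has four factors (the article uses the top 5%).
expression_in_type = {"TF_A": 9.0, "TF_B": 8.0, "TF_C": 1.0, "TF_D": 0.0}
score_in_type = {"TF_A": 1.2, "TF_B": 0.1, "TF_C": 1.5, "TF_D": 0.0}
candidates = top_fraction(expression_in_type, 0.5) & top_fraction(score_in_type, 0.5)
print(candidates)
```

Because the score is the negative logarithm of the divergence, an expression pattern that is closer to the idealized type-specific pattern receives a higher score.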

Implementation and graphical representation of the web service

MTFAP was developed with Java, Vue.js, R 3.5.2, and MySQL 5.7.16. All services are encapsulated in Docker containers. The TF interaction and CRC model views were built on the open-source visualization library ECharts 5.4.3. The TF essential, target pathway, TF-associated immune microenvironment, and TF networks functions are computed with R scripts that we developed, and the results are visualized with JavaScript. TF survival analysis and pan-cancer gene expression are implemented through the API provided by GEPIA2. A minimum browser resolution of 1440 × 900 is recommended.

Discussion

MTFs are a crucial type of transcription factor that regulates downstream genes through the formation of transcriptional regulatory loops. These downstream genes control cell identity and fate, and this classical model has been supported by numerous studies. Current approaches to identifying MTFs rely on ChIP-seq data, which are unavailable for many cancers. Reddy et al.⁸ therefore developed the CaCTS algorithm to prioritize candidate MTFs using pan-cancer RNA sequencing data. CaCTS can identify candidate MTFs across 34 tumor types and 140 subtypes, and the method has gained widespread use.

Despite the significant potential of this method, its implementation has been limited by the requisite coding proficiency. To address this challenge, we have developed a user-friendly online analytics platform named MTFAP. The MTFAP interface simplifies the prediction and analysis of MTFs for researchers. By uploading their own expression profile and grouping files, researchers can easily achieve MTF prediction and analysis with bulk RNA-seq or single-cell RNA-seq data. Furthermore, MTFAP provides comprehensive downstream analysis of MTFs, including the analysis of core transcription factor regulatory loops, sensitivity analysis in various cancer cell lines, an exploration of the influence of transcription factors on the tumor immune microenvironment, and investigation of transcription factor mutual regulation and regulatory pathways. These analyses are vital for cancer research.

MTFAP also supports the analysis of results from Coltron and CRCmapper, introducing novel functionalities that enable the visualization of principal transcription factor regulatory networks and the querying of co-regulated genes mediated by transcription factors. These features are unique to network tools. We believe that MTFAP holds significant value and potential in cancer research. Nevertheless, MTFAP has limitations. Because of resource constraints, it cannot currently perform online analysis of ChIP-seq data, among other things. MTFAP will continue to update its data and software versions in the coming years. A future version, MTFAP 2.0, will include features such as visualization of transcription factor tracks and online analysis of ChIP-seq data to identify core transcription factors. We will continue to extend the platform with accessible data and to improve its functionality.

Data availability

Publicly available datasets were analyzed in this study. This data can be found here: http://www.xiejjlab.bio/MTFAP/ . The code related to this paper is hosted at https://github.com/chunquanli/MTFAP/tree/main.

Code availability

The code related to this paper is hosted at https://github.com/chunquanli/MTFAP/tree/main

References

1. Whyte, W. A. et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319. https://doi.org/10.1016/j.cell.2013.03.035 (2013).
2. Chen, Y., Xu, L., Lin, R.Y.-T., Müschen, M. & Koeffler, H. P. Core transcriptional regulatory circuitries in cancer. Oncogene 39, 6633–6646. https://doi.org/10.1038/s41388-020-01459-w (2020).
3. Saint-André, V. et al. Models of human core transcriptional regulatory circuitries. Genome Res. 26, 385–396. https://doi.org/10.1101/gr.197590.115 (2016).
4. Boyer, L. A. et al. Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122, 947–956. https://doi.org/10.1016/j.cell.2005.08.020 (2005).
5. Jiang, Y.-Y. et al. TP63, SOX2, and KLF5 establish a core regulatory circuitry that controls epigenetic and transcription patterns in esophageal squamous cell carcinoma cell lines. Gastroenterol. 159, 1311-1327.e1319. https://doi.org/10.1053/j.gastro.2020.06.050 (2020).
6. Chen, L. et al. Master transcription factors form interconnected circuitry and orchestrate transcriptional networks in oesophageal adenocarcinoma. Gut. 69, 630–640. https://doi.org/10.1136/gutjnl-2019-318325 (2020).
7. Lee, K. W. et al. PRRX1 is a master transcription factor of stromal fibroblasts for myofibroblastic lineage progression. Nat. Commun. 13, 2793. https://doi.org/10.1038/s41467-022-30484-4 (2022).
8. Reddy, J. et al. Predicting master transcription factors from pan-cancer expression data. Sci. Adv. 7, 6123. https://doi.org/10.1126/sciadv.abf6123 (2021).
9. Ott, C. J. et al. Enhancer architecture and essential core regulatory circuitry of chronic lymphocytic leukemia. Cancer Cell 34, 982-995.e987. https://doi.org/10.1016/j.ccell.2018.11.001 (2018).
10. Blum, A., Wang, P. & Zenklusen, J. C. SnapShot: TCGA-analyzed tumors. Cell https://doi.org/10.1016/j.cell.2018.03.059 (2018).
11. Castro-Mondragon, J. A. et al. JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 50, D165–D173. https://doi.org/10.1093/nar/gkab1113 (2022).
12. Shen, W. K. et al. AnimalTFDB 4.0: a comprehensive animal transcription factor database updated with variation and expression annotations. Nucleic Acids Res. 51, D39–D45. https://doi.org/10.1093/nar/gkac907 (2023).
13. Li, H. et al. The landscape of cancer cell line metabolism. Nat. Med. 25, 850–860. https://doi.org/10.1038/s41591-019-0404-8 (2019).
14. McFarland, J. M. et al. Improved estimation of cancer dependencies from large-scale RNAi screens using model-based normalization and data integration. Nat. Commun. 9, 4610. https://doi.org/10.1038/s41467-018-06916-5 (2018).
15. Feng, C. et al. Landscape and significance of human super enhancer-driven core transcription regulatory circuitry. Molecular Therapy – Nucleic Acids 32, 385–401. https://doi.org/10.1016/j.omtn.2023.03.014 (2023).
16. Corces, M. R. et al. The chromatin accessibility landscape of primary human cancers. Science https://doi.org/10.1126/science.aav1898 (2018).
17. Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208. https://doi.org/10.1093/nar/gkp335 (2009).
18. Aleksander, S. A. et al. The gene ontology knowledgebase in 2023. Genetics https://doi.org/10.1093/genetics/iyad031 (2023).
19. Okuda, S. et al. KEGG Atlas mapping for global analysis of metabolic pathways. Nucleic Acids Res. 36, W423–W426. https://doi.org/10.1093/nar/gkn282 (2008).
20. Newman, A. M. et al. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods 12, 453–457. https://doi.org/10.1038/nmeth.3337 (2015).
21. Liu, X. S. et al. TIMER2.0 for analysis of tumor-infiltrating immune cells. Nucleic Acids Res. 48, W509–W514. https://doi.org/10.1093/nar/gkaa407 (2020).
22. Tang, Z. et al. GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Res. 45, W98–W102. https://doi.org/10.1093/nar/gkx247 (2017).
23. Schmeier, S., Alam, T., Essack, M. & Bajic, V. B. TcoF-DB v2: update of the database of human and mouse transcription co-factors and transcription factor interactions. Nucleic Acids Res. 45, D145–D150. https://doi.org/10.1093/nar/gkw1007 (2017).
24. Caicedo, H. H., Hashimoto, D. A., Caicedo, J. C., Pentland, A. & Pisano, G. P. Overcoming barriers to early disease intervention. Nat. Biotechnol. 38, 669–673. https://doi.org/10.1038/s41587-020-0550-z (2020).
25. Duttke, S. H., Chang, M. W., Heinz, S. & Benner, C. Identification and dynamic quantification of regulatory elements using total RNA. Genome Res. 29, 1836–1846. https://doi.org/10.1101/gr.253492.119 (2019).
26. Wu, T. et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation. 2(3), 100141. https://doi.org/10.1016/j.xinn.2021.100141 (2021).

Acknowledgements

We thank TCGA for sharing their cancer chromatin accessibility data. We thank Kate Lawrenson and colleagues for sharing the CaCTS R package. We thank Zemin Zhang and colleagues for sharing the GEPIA Python package.

Funding

This work was supported by grants from Youth Project of Guangdong Basic and Applied Basic Research Foundation, 2021A1515110828, Study on the mechanism of JAG1 inhibition of angiogenesis in Oral squamous cell carcinoma regulated by ZFP36L2 driven by super enhancer, 2021.10–2024.9, in progress.

Author information

Authors and Affiliations

Department of Medical Oncology, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
Jianyuan Zhou, Qian Yang, Shuhan Li, Chunwang Ji, Song Li, Shuang Wang & Lian Liu
School of Medical Informatics, Harbin Medical University, Daqing Campus, Daqing, 163319, China
Haojie Yu, Chunhui Lou & Xuecang Li
Department of Oral & Maxillofacial Surgery, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, 510120, China
Haotian Cao
Department of Biochemistry and Molecular Biology, Medical College of Shantou University, Shantou, China
Min Yang & Yanshang Li

Authors

Jianyuan Zhou
View author publications
Search author on:PubMed Google Scholar
Haojie Yu
View author publications
Search author on:PubMed Google Scholar
Chunhui Lou
View author publications
Search author on:PubMed Google Scholar
Min Yang
View author publications
Search author on:PubMed Google Scholar
Yanshang Li
View author publications
Search author on:PubMed Google Scholar
Qian Yang
View author publications
Search author on:PubMed Google Scholar
Shuhan Li
View author publications
Search author on:PubMed Google Scholar
Chunwang Ji
View author publications
Search author on:PubMed Google Scholar
Song Li
Shuang Wang
Haotian Cao
Xuecang Li
Lian Liu

Contributions

JYZ, HTC, and YSL: study concepts. JYZ, HTC, and LL: study design. CHL, HJY, MY, YSL, and JYZ: data acquisition. JYZ, CWJ, and CHL: quality control of data and algorithms. QY and JYZ: data analysis and interpretation. MY, SHL, and SL: statistical analysis. SW, JYZ, LL, and XCL: manuscript preparation. All authors reviewed and edited the manuscript.

Corresponding authors

Correspondence to Haotian Cao, Xuecang Li or Lian Liu.

Ethics declarations

Competing interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information 1.

Supplementary Information 2.

Supplementary Information 3.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Zhou, J., Yu, H., Lou, C. et al. MTFAP: a comprehensive platform for predicting and analyzing master transcription factors. Sci Rep 14, 32012 (2024). https://doi.org/10.1038/s41598-024-83686-9

Received: 16 August 2024
Accepted: 16 December 2024
Published: 30 December 2024
DOI: https://doi.org/10.1038/s41598-024-83686-9
