Abstract
Master transcription factors (MTFs) activate gene expression in pluripotent embryonic stem cells (ESCs) by binding to enhancers and super-enhancers, which precisely control ESC fate. Compelling evidence reveals a strong correlation between the activity of MTFs and the initiation and progression of cancer. Nevertheless, identifying MTFs remains a barrier for researchers. We therefore developed MTFAP, a master transcription factor prediction and analysis web resource. MTFAP is a comprehensive web tool designed to predict and analyze MTFs from different data types. To help users explore MTFs of interest, MTFAP offers search and browse functionalities. Furthermore, we provide a Docker image that allows users to run analyses locally. Additionally, MTFAP supports further analysis and data visualization for MTFs identified by Coltron and CRCmapper. The platform is freely available at http://www.xiejjlab.bio/MTFAP/
Introduction
Master transcription factors (MTFs) activate the gene expression program in pluripotent embryonic stem cells (ESCs) by binding to enhancers and recruiting the mediator complex. At critical loci regulating pluripotency, MTFs assemble into enhancer elements, thereby meticulously orchestrating the fate of embryonic stem cells1. Positioned at the apex of gene regulatory networks, MTFs are often associated with nearby enhancers2, which can be found in distal regions, intergenic areas, or near promoters, and these enhancers typically regulate the activity of MTFs3. MTFs are regulated by enhancers and can form a complete regulatory loop with other MTFs, controlling the identity of cells according to their fate and differentiation3. Key transcription factors in stem cells, such as NANOG, OCT4, and SOX2, interact to form the core regulatory circuit (CRC), which plays a critical role in maintaining stem cell pluripotency4. This CRC loop facilitates positive regulation among these factors while also engaging with other transcription factors and signaling pathways, effectively balancing self-renewal and differentiation. This interdependent CRC loop is one of the key regulatory mechanisms for stem cell reprogramming and differentiation4.
Moreover, MTFs significantly contribute to the initiation and progression of certain pathological processes, including tumors2. In esophageal squamous cell carcinoma (ESCC), MTFs such as TP63, SOX2, and KLF5 form a regulatory loop that influences the survival and metastasis of cancer cells by modulating the expression of numerous genes5. Recent studies have identified four esophageal adenocarcinoma-specific MTFs (ELF3, KLF5, GATA6, and EHF), which work synergistically to enhance PPARG transcription by directly binding to its promoter and enhancer regions6. Furthermore, MTFs play a crucial role in the reprogramming of fibroblasts into myofibroblasts, which increases the tumorigenicity and invasiveness of cancer7. Accurate identification of MTFs is therefore important for developing anti-cancer drugs and finding cancer treatment targets.
Current methods for identifying MTFs mainly rely on ChIP-seq data. However, researchers often have difficulty obtaining the raw data required by software tools such as CRCmapper and Coltron. To address this gap, Reddy et al. developed the CaCTS (Cancer Core Transcription Factor Specificity) algorithm to predict specific MTFs8, but using CaCTS requires good programming skills. There is therefore a need for an accessible analysis platform to facilitate MTF prediction. In response, we developed MTFAP, an online tool designed to analyze and predict MTFs from bulk RNA-seq and single-cell RNA-seq data. MTFAP also enables comprehensive downstream analyses, visualizes results, and provides access to information on transcription factors within feedback loops (Table 1). Researchers can easily search for relevant information on the MTFAP platform, making it a user-friendly solution for MTF prediction with different types of data (Fig. 1).
Results
MTFAP main function
MTFs Single-cell analysis
In MTFs single-cell analysis (Fig. 2A), users can identify MTFs within single-cell transcriptome data that are associated with specific cancer types. The process is as follows. First, the user uploads two files: the expression profile file (*.txt) and the cell identity file (*.txt). The uploaded files must follow the same format as the sample files, and the sample names in the cell identity (group) file must exactly match those in the expression profile file. Once the files are uploaded, the user clicks the ‘Submit’ button, selects the cancer type to analyze, clicks the ‘Analysis’ option, and waits for the results. MTFAP then returns a table of ranked transcription factors (Fig. 2D). Users can click on a transcription factor of interest to open the TF analysis page for further exploration.
A. MTFAP provides three analysis methods (MTFs Single-cell analysis, MTFs Bulk analysis, and CRC analysis) to help users analyze and predict MTF. B. MTFAP supports two search methods (Search by TF, Search by TF Family) to retrieve transcription factors that users are interested in. C. MTFAP provides MTF query based on TCGA data. D. The predicted results returned by MTFAP can be further analyzed by users by clicking on MTF according to their specific needs. E. The MTFAP provides MTF detail analysis, including: TF over review: MTFAP provides basic information about transcription factors. TF essential: Users can query the sensitivity of transcription factors in each cell line. Target genes: MTFAP allows the query for genes regulated by MTFs in cancer. CRC model: MTFAP can provide CRC (Core Transcription Regulatory Circuitry) loop data related to target core transcription factors in different samples. Target pathway: MTFAP provides the query for the biological function and pathways regulated by MTFs in different cancer types. Interact TF: MTFAP supports the query for transcription factors and transcription co-factors that interact with MTFs, facilitating subsequent experimental research. TF networks: MTFAP supports the analysis of different types of cancer to find the MTF with the highest correlation at the RNA transcription level, inferring the formation of CRC loops. TF-associated immune microenvironment: MTFAP supports the query for the relationship between MTFs and the immune microenvironment, where different colors in the heatmap represent different R values.
MTFs bulk analysis
In the bulk analysis of MTFs (Fig. 2A), users can identify MTFs within bulk transcriptome data for specific cancer types. This process requires two files: the expression profile file (*.txt) and the sample group file (*.txt). The uploaded files must follow the same format as the sample files to ensure a smooth analysis. Once the files are uploaded, users click the ‘Submit’ button, select the cancer type they wish to analyze, and then click the ‘Analysis’ option. MTFAP then returns a table of ranked transcription factors (Fig. 2D). Users can click on a transcription factor of interest to open the TF analysis page for further exploration.
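Both upload workflows depend on well-formed input, and the single-cell workflow explicitly requires the sample names in the group file to match those in the expression profile. A minimal Python sketch of such a pre-upload check is given here. It assumes tab-delimited files, sample names in the header row of the expression file, and a group file with a header line and sample names in its first column. These layout details are assumptions rather than the documented MTFAP format, so the sample files on the MTFAP site remain the authoritative reference, and all function names are hypothetical.

```python
import csv
import io


def read_expression_samples(handle):
    """Return sample names from the header row of a tab-delimited expression file.

    Assumption: the first column holds gene symbols, the remaining columns are samples.
    """
    header = next(csv.reader(handle, delimiter="\t"))
    return header[1:]


def read_group_samples(handle):
    """Return sample names from the first column of a tab-delimited group file.

    Assumption: the file starts with one header line.
    """
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header line
    return [row[0] for row in reader if row]


def check_sample_names(expr_samples, group_samples):
    """Report sample names that appear in only one of the two files."""
    expr, group = set(expr_samples), set(group_samples)
    return {
        "missing_in_group": sorted(expr - group),
        "missing_in_expression": sorted(group - expr),
    }


# Toy inputs (hypothetical values) standing in for the two uploaded files.
expression_txt = "gene\tS1\tS2\tS3\nTP63\t5.1\t0.2\t4.8\nSOX2\t3.3\t0.1\t2.9\n"
group_txt = "sample\tgroup\nS1\tESCA\nS2\tSTAD\nS4\tESCA\n"

report = check_sample_names(
    read_expression_samples(io.StringIO(expression_txt)),
    read_group_samples(io.StringIO(group_txt)),
)
print(report)
```

An empty list under both keys indicates that the two files refer to the same set of samples. In the toy input above, S3 lacks a group label and S4 lacks expression values.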
CRC analysis
Users can upload master transcription factor score files (*.txt), which are obtained from the output results of CRCmapper3 (https://github.com/younglab/CRCmapper) and Coltron9 (https://pypi.org/project/coltron/) (Fig. 2A). MTFAP can provide visualization for the core transcription factor network and analyze the important transcription factor in the loop. Additionally, researchers can query the genes co-regulated by transcription factors in the CRC loop.
Search
MTFAP offers two search methods to retrieve information on transcription factors (Fig. 2B):
(1) Search by TF: searches for a transcription factor by gene symbol.
(2) Search by TF Family: searches for transcription factors according to their family classification.
Browse
MTFAP provides queries for MTFs based on TCGA data10 (Fig. 2C). These master transcription factor data were downloaded from the supplementary files of Reddy et al.8. Providing these precomputed data helps researchers query the MTFs of various cancers while reducing redundant calculations and server load. Users can browse MTF information for different cancer types and subtypes through the selection bar on the left side of the page. Clicking on a core transcription factor opens its detail page, which provides more detailed information.
Transcription factor detailed information
TF over review
The left side of the page shows the basic information of the transcription factor (TF name, Ensembl ID, TF Family, Protein ID, Entrez ID, and Species), and the right side shows its sequence logo (Fig. 2E). The sequence logo data come from the JASPAR database11, and the TF data are sourced from AnimalTFDB12 (http://bioinfo.life.hust.edu.cn/AnimalTFDB4/#/). JASPAR is an open-access database that provides curated, non-redundant transcription factor (TF) binding profiles stored as position frequency matrices (PFMs) and TF flexible models (TFFMs) for TFs across multiple species in six taxonomic groups. AnimalTFDB is a comprehensive database that includes the classification and annotation of genome-wide TFs and transcription cofactors across 183 animal genomes.
TF essential
Users can select different cancers for comparison to determine the cancer cell line most sensitive to the target core transcription factor (see Fig. 2E), thereby deciding which transcription factors to validate in subsequent experiments. The Cancer Cell Line Encyclopedia (CCLE) is a comprehensive compilation of gene expression, chromosomal copy number, and massively parallel sequencing data from 947 human cancer cell lines. This collection, when combined with pharmacological profiles for 24 anticancer drugs across 479 of the cell lines, enables the identification of genetic, lineage, and gene-expression-based predictors of drug sensitivity. The transcription factor data come mainly from CCLE’s knockout and knockdown experimental results13. We also provide visualization and comparison capabilities for data from two different measures, namely “Gene Effect” and “Gene Dependency”.
Gene effect
Gene effect scores are the outcomes of DEMETER2 or CERES. DEMETER2 is a gene function analysis tool14 that evaluates the roles of genes in cell survival and growth based on large-scale RNAi knockdown screening data. It improves and expands upon the original DEMETER model, aiming to reduce data noise and enhance the accuracy and reliability of gene dependency scores. CERES is a computational tool designed for analyzing CRISPR-Cas9 gene knockout screening data to determine gene essentiality. A lower score means that a cell line is more likely to depend on the gene14. A score of 0 corresponds to a gene that is not essential, whereas a score of −1 corresponds to the median of all common essential genes (Fig. 2E).
Gene dependency
A genetic dependency is a gene that is necessary for the proliferation and survival of a given cell population13. A cell line is considered dependent on a gene if its probability of dependency is greater than 0.5. Users can choose between these options according to their needs and, by clicking the “Plot” button, search and visualize the results across various cancers.
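The two measures can be read together, as in the short Python sketch below. It applies the thresholds stated above (dependency probability greater than 0.5, and lower gene effect meaning stronger dependence) to hypothetical cell lines. The cell line names and values are invented for illustration and are not taken from CCLE, and the function names are hypothetical.

```python
def is_dependent(probability, threshold=0.5):
    """A cell line is called dependent when its dependency probability exceeds 0.5."""
    return probability > threshold


def most_sensitive(gene_effect_by_line):
    """Return the cell line with the lowest (most negative) gene effect score."""
    return min(gene_effect_by_line, key=gene_effect_by_line.get)


# Hypothetical values for one transcription factor across three cell lines.
dependency_probability = {"LINE_A": 0.92, "LINE_B": 0.50, "LINE_C": 0.08}
gene_effect = {"LINE_A": -1.10, "LINE_B": -0.35, "LINE_C": 0.02}

dependent_lines = sorted(
    name for name, p in dependency_probability.items() if is_dependent(p)
)
print(dependent_lines)              # cell lines passing the 0.5 cut-off
print(most_sensitive(gene_effect))  # cell line with the strongest gene effect
```

Note that a probability of exactly 0.5 does not count as dependent under the "greater than 0.5" rule.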
CRC model
Core Transcription Regulatory Circuitry (CRC) is an interaction network of key transcription factors that regulate gene expression3. It plays an important role in cells, controlling the transcription and expression of genes. The CRC model offers a query on the CRC loops identified in ChIP-seq data. Users can inquire about CRC loop data associated with the target core transcription factor in different samples (Fig. 2E). We offer CRC loop data derived from CRCdb15, a specialized database designed to catalog and analyze core transcriptional regulatory circuits (CRCs) across various human cell and tissue types.
Target genes
Users can query the genes regulated by MTFs in cancer. These data were derived from TCGA ATAC-seq data16 and analyzed using FIMO17 with a command of the form: fimo --motif <PWM matrix> <sequence file>. The PWM matrix comes from JASPAR11, and the sequence file is obtained by extracting TCGA open chromatin data using bedtools. Results can be sorted by p-value and preference score. Users can click on “Cancer Type” to switch between different cancer types (Fig. 2E).
Target pathway
Users can inquire about the biological functions and pathways regulated by MTFs in different cancer types. MTFAP provides GO18 enrichment and KEGG19 pathway enrichment. Using the hypergeometric test, the top 20 GO terms or KEGG pathways regulated by MTFs are presented. Users can click “Cancer Type” to switch between different cancer types and click “GO enrichment” or “KEGG pathway enrichment” to obtain the functional annotation of target genes. MTFAP returns the pathways associated with the target transcription factor and visualizes the results as bar graphs (Fig. 2E).
TF-associated immune microenvironment
This module reports the relationship between MTFs and the immune microenvironment as a heatmap in which different colors represent different R values. Users can hover over a region of interest to check whether the correlation is significant (Fig. 2E). The R value indicates the relationship between the transcription level of the transcription factor and the abundance of various immune cells in the tumor microenvironment. All data used in this module were sourced from the TIMER2 database, an extensive repository enabling systematic analysis of immune infiltrates across a broad spectrum of cancer types20,21.
TF networks
Users can select different types of cancer at the top of the column to conduct an analysis, seeking the MTFs with the highest correlation at the RNA transcription level to infer the formation of CRC-loop3,9 (Fig. 2E).
Survival analysis
Users can select the cancer types they are interested in for survival analysis, and query the survival significance of MTFs. The MTFs survival analysis is implemented using the API provided by GEPIA 2 (Gene Expression Profiling Interactive Analysis 2)22 in 33 cancer types. GEPIA2 is a comprehensive resource for analyzing the RNA sequencing expression data of tumors and normal samples from the TCGA and GTEx projects.
RNA different expression
Users can select the cancer types they are interested in to conduct differential gene expression analysis and query the difference in MTF expression between cancer and adjacent noncancerous tissues. The differential expression analysis of MTFs is implemented using the API provided by GEPIA 2 in 33 cancer types from TCGA.
Interact TF
Users can query transcription factors and transcription co-factors that interact with MTFs to facilitate subsequent experimental research (Fig. 2E). The interaction data were obtained from the TcoF-DB v2 database23, a specialized repository dedicated to the study of transcription co-factors.
MTFAP docker
Considering the limitations of network transmission, we restrict the file upload size to 20 MB. As a result, very large expression profile and sample grouping files cannot easily be uploaded to the server. To address this, we developed MTFAP Docker to support local analysis. Researchers can use it to deploy the analysis pipeline on local servers for transcription factor analysis. We have uploaded the entire MTFAP image to Docker Hub (https://www.docker.com/products/docker-hub/) so that researchers can search for and download it.
Benchmarking
To test the functionality of MTFAP, we downloaded single-cell data for head and neck cancer (GSE103322) from the GEO database and obtained bulk TCGA expression profiles from XENA24. The downloaded expression data were processed, divided into files of 500 KB, 1 MB, 2 MB, 5 MB, 10 MB, and 20 MB, and submitted to MTFAP to assess server performance. Processing times on the server (16 cores, 32 GB RAM) were recorded and analyzed (Figure S2).
Case study
We performed an in-depth analysis of the transcriptome data from the TCGA dataset employing the MTFAP analysis tool. By uploading TCGA’s expression profile data along with grouping data into MTFAP, we found that the expression of the TP63 gene was significantly elevated in esophageal cancer samples, with a high CaCTS score and ranking (Figure S1A). Through querying TP63, we obtained a wealth of detailed information about TP63 in MTFAP (Figure S1B). On the detailed information page for TP63, we identified a complex regulatory network involving TP63, KLF5, and SOX2 by querying the CRC model (Figure S1C). Subsequent transcription factor network analysis confirmed that these three factors indeed form a regulatory circuit in esophageal cancer (Figure S1D). Jiang et al. validated the regulatory relationships among TP63, KLF5, and SOX2 through siRNA experiments in esophageal cancer cell lines, highlighting the significant impact of this regulatory circuit on the onset and progression of esophageal cancer.
Methods
Target genes
We acquired tumor ATAC-seq data from TCGA16 and used FIMO17 to scan the accessible chromatin regions for motif occurrences. Only hits that satisfied the threshold (p ≤ 0.05 and q ≤ 0.05) were retained for further examination. The PWM matrix comes from JASPAR, while the sequence file is derived by extracting TCGA open chromatin data with bedtools. Lastly, we used HOMER25 to annotate the TF regulatory regions. FIMO, which stands for ‘Find Individual Motif Occurrences’, is a component of the MEME Suite. It searches a set of sequences for occurrences of known motifs, treating each motif independently. HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools designed for motif discovery.
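The thresholding step can be sketched in Python as follows. The sketch assumes FIMO's tab-separated output with the column names `sequence_name`, `p-value`, and `q-value`, as written by recent MEME Suite releases. The motif identifier, peak names, and values in the example are hypothetical, and the function is an illustration, not the script used by MTFAP.

```python
import csv
import io


def filter_fimo_hits(handle, max_p=0.05, max_q=0.05):
    """Keep FIMO hits whose p-value and q-value are both at or below the cut-offs."""
    # Skip blank lines and '#' comment lines that may follow the data rows.
    rows = (line for line in handle if line.strip() and not line.startswith("#"))
    hits = []
    for row in csv.DictReader(rows, delimiter="\t"):
        if float(row["p-value"]) <= max_p and float(row["q-value"]) <= max_q:
            hits.append(row)
    return hits


# Hypothetical FIMO-style table with two motif occurrences.
fimo_tsv = (
    "motif_id\tmotif_alt_id\tsequence_name\tstart\tstop\tstrand\tscore\t"
    "p-value\tq-value\tmatched_sequence\n"
    "MOTIF_X\tTF_A\tpeak_1\t10\t19\t+\t15.2\t1e-6\t0.003\tACGTACGTAC\n"
    "MOTIF_X\tTF_A\tpeak_2\t40\t49\t-\t8.1\t0.0004\t0.21\tTTGTACGAAC\n"
    "\n"
    "# example trailing comment line\n"
)

hits = filter_fimo_hits(io.StringIO(fimo_tsv))
print([hit["sequence_name"] for hit in hits])
```

In this toy table only the first occurrence passes both cut-offs, because the second has a q-value above 0.05.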
Target pathway
After obtaining the target genes of the core transcription factor by analyzing open chromatin data, we used the clusterProfiler26 R package to perform a hypergeometric test on these target genes (p ≤ 0.05 and q ≤ 0.05). We then retained only the top 20 pathways, ranked by P value.
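For readers unfamiliar with the test, the following Python sketch shows the over-representation calculation in its simplest form: the probability of observing at least the seen overlap between the target genes and a pathway when genes are drawn at random from the background. It is a stand-alone illustration with hypothetical gene and pathway names, not the clusterProfiler implementation, and it omits the multiple-testing correction that produces q values.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric variable.

    population: number of background genes
    successes:  number of background genes in the pathway
    draws:      number of target genes
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def rank_pathways(target_genes, pathways, background_size, top_n=20):
    """Return (pathway, overlap, p_value) tuples sorted by p value, best first.

    Assumption: every target gene belongs to the background gene set.
    """
    targets = set(target_genes)
    results = []
    for name, members in pathways.items():
        overlap = len(targets & set(members))
        p_value = hypergeom_upper_tail(
            overlap, background_size, len(members), len(targets)
        )
        results.append((name, overlap, p_value))
    results.sort(key=lambda item: item[2])
    return results[:top_n]


# Hypothetical example: 20 background genes, 4 target genes, 2 pathways.
pathways = {"P1": {"A", "B", "C", "D", "E"}, "P2": {"F", "G"}}
ranked = rank_pathways({"A", "B", "F", "X"}, pathways, background_size=20)
for name, overlap, p_value in ranked:
    print(name, overlap, round(p_value, 4))
```

The `top_n=20` default mirrors the top 20 pathways retained above.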
TF networks
We acquired data from TCGA and subsequently employed R for the removal of missing values. Thereafter, we extracted transcription factors from the expression profiles and performed a Pearson correlation analysis.
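The correlation step can be illustrated with a short Python sketch. It computes the Pearson coefficient between two expression vectors after dropping samples with a missing value in either vector, then picks the partner with the highest coefficient. The pairwise handling of missing values, the transcription factor names, and the expression values are assumptions made for the example, and the original analysis was carried out in R.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length vectors, skipping missing (None) pairs."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two complete observations are required")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / sqrt(sxx * syy)


def most_correlated(target, expression):
    """Return the (name, r) of the factor most positively correlated with the target."""
    scores = {
        name: pearson(expression[target], values)
        for name, values in expression.items()
        if name != target
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical expression values across four samples; None marks a missing value.
expression = {
    "TF_A": [1.0, 2.0, None, 4.0],
    "TF_B": [2.0, 4.0, 6.0, 8.0],
    "TF_C": [4.0, 3.0, 2.0, 1.0],
}
partner, r = most_correlated("TF_A", expression)
print(partner, round(r, 3))
```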
TF essential
Data pertaining to gene dependency and gene effects across multiple cancer cell lines were retrieved from the CCLE database13,14. Subsequently, we employed R language for statistical computing to extract and organize the necessary information related to transcription factors, facilitating its presentation in a structured format.
MTFAP Single-cell analysis and bulk analysis
MTFAP predicts MTFs based on Cancer Core Transcription Factor Specificity (CaCTS), which can prioritize candidate MTFs using pan-cancer RNA sequencing data. We applied this method to single-cell and bulk RNA sequencing data to predict MTFs. CaCTS uses the Jensen-Shannon divergence (JSD) to score transcription factors by comparing their expression distribution across samples with an idealized specific pattern.
In the CaCTS score, \(\widehat{x_i}=\frac{x_i}{\left|x_i\right|}\), where \(x_i = (x_{i,k})\) is the ordered vector of normalized expression of gene i and \(k \in \{1, \ldots, n\}\) (n is the number of cancer types). \(\widehat{u_j} = (u_{j,k})\) is the idealized cell type-specific gene expression for cell type j, expressed as a unit vector of length n. The JSD quantifies the similarity between two probability distributions and is used here to measure the similarity between the two unit vectors \(\widehat{x_i}\) and \(\widehat{u_j}\). The list of candidate MTFs for a specific cell type is the intersection of the 5% most highly expressed transcription factors in that cell type and the top 5% of transcription factors ranked by CaCTS score.
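As background, the standard Jensen-Shannon divergence between two distributions P and Q is \(\mathrm{JSD}(P, Q) = H\left(\frac{P+Q}{2}\right) - \frac{1}{2}\left[H(P) + H(Q)\right]\), where H is the Shannon entropy. The Python sketch below applies this textbook definition to an expression vector and an idealized vector. It makes three assumptions that are not stated in the text: the expression vector is normalized by the sum of its entries, the idealized vector is one-hot (all expression in cell type j), and the 5% cut-offs are rounded up. How CaCTS turns the divergence into its final score is not reproduced, and all names and values are hypothetical.

```python
from math import ceil, log2


def normalize(vector):
    """Scale a non-negative vector so its entries sum to 1 (assumed normalization)."""
    total = sum(vector)
    if total <= 0:
        raise ValueError("expression vector must have a positive sum")
    return [value / total for value in vector]


def entropy(p):
    """Shannon entropy in bits; zero entries contribute nothing."""
    return -sum(value * log2(value) for value in p if value > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two probability vectors."""
    midpoint = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(midpoint) - (entropy(p) + entropy(q)) / 2


def divergence_from_ideal(expression, j):
    """JSD between normalized expression and a one-hot vector for cell type j."""
    ideal = [1.0 if k == j else 0.0 for k in range(len(expression))]
    return jsd(normalize(expression), ideal)


def top_fraction(scores, fraction=0.05, highest=True):
    """Names in the top fraction of a score dictionary (at least one name)."""
    count = max(1, ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=highest)
    return set(ranked[:count])


specific = divergence_from_ideal([10.0, 0.0, 0.0], j=0)  # expressed in one type only
broad = divergence_from_ideal([1.0, 1.0, 1.0], j=0)      # expressed uniformly
print(specific, round(broad, 4))

# Candidate selection on four hypothetical factors, using 50% instead of 5%
# so that the toy example returns a non-trivial intersection.
expression_level = {"TF_A": 9.0, "TF_B": 7.0, "TF_C": 2.0, "TF_D": 1.0}
divergence = {"TF_A": 0.05, "TF_B": 0.40, "TF_C": 0.10, "TF_D": 0.45}
candidates = top_fraction(expression_level, 0.5) & top_fraction(
    divergence, 0.5, highest=False
)
print(candidates)
```

A lower divergence means the expression pattern is closer to the idealized cell type-specific pattern, so the sketch ranks factors by ascending divergence before taking the intersection.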
Implementation and graphical representation of the web service
MTFAP was developed using Java with Vue.js, R 3.5.2, and MySQL 5.7.16. All services are encapsulated in Docker containers. The TF interaction and CRC model views were built on the open-source visualization library ECharts 5.4.3. Functionalities such as TF essential, target pathway, TF-associated immune microenvironment, and TF networks are computed by in-house R scripts, and the results are visualized using JavaScript. The TF survival analysis and pan-cancer gene expression are implemented through the API provided by GEPIA 2. A minimum browser resolution of 1440 × 900 is recommended.
Discussion
MTFs are a crucial type of transcription factor that regulates downstream genes through the formation of transcriptional regulatory loops. These downstream genes control cell identity and fate, and this model has been supported by numerous studies. Current approaches to identifying MTFs rely on ChIP-seq data, which are unavailable for many cancers. Therefore, Reddy et al.8 developed the CaCTS algorithm to prioritize candidate MTFs using pan-cancer RNA sequencing data. CaCTS can identify candidate MTFs across 34 tumor types and 140 subtypes, and this method has gained widespread use.
Despite the significant potential of this method, its implementation has been limited by the requisite coding proficiency. To address this challenge, we have developed a user-friendly online analytics platform named MTFAP. The MTFAP interface simplifies the prediction and analysis of MTFs for researchers. By uploading their own expression profile and grouping files, researchers can easily achieve MTF prediction and analysis with bulk RNA-seq or single-cell RNA-seq data. Furthermore, MTFAP provides comprehensive downstream analysis of MTFs, including the analysis of core transcription factor regulatory loops, sensitivity analysis in various cancer cell lines, an exploration of the influence of transcription factors on the tumor immune microenvironment, and investigation of transcription factor mutual regulation and regulatory pathways. These analyses are vital for cancer research.
MTFAP also supports the analysis of results from Coltron and CRCmapper, introducing functionalities that enable the visualization of core transcription factor regulatory networks and the querying of genes co-regulated by transcription factors. We believe that MTFAP holds significant value and potential in cancer research. Nevertheless, MTFAP has limitations. Due to resource constraints, it cannot yet perform online analysis of ChIP-seq data, among other things. MTFAP will continue to update its data and software versions in the coming years. In future versions, MTFAP 2.0 will include features such as visualizing transcription factor tracks and online analysis of ChIP-seq data to identify core transcription factors. We will continue to enrich the platform with accessible data and to improve its functionality.
Data availability
Publicly available datasets were analyzed in this study. These data can be found here: http://www.xiejjlab.bio/MTFAP/ .
Code availability
The code related to this paper is hosted at https://github.com/chunquanli/MTFAP/tree/main
References
Whyte, W. A. et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319. https://doi.org/10.1016/j.cell.2013.03.035 (2013).
Chen, Y., Xu, L., Lin, R.Y.-T., Müschen, M. & Koeffler, H. P. Core transcriptional regulatory circuitries in cancer. Oncogene 39, 6633–6646. https://doi.org/10.1038/s41388-020-01459-w (2020).
Saint-André, V. et al. Models of human core transcriptional regulatory circuitries. Genome Res. 26, 385–396. https://doi.org/10.1101/gr.197590.115 (2016).
Boyer, L. A. et al. Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122, 947–956. https://doi.org/10.1016/j.cell.2005.08.020 (2005).
Jiang, Y.-Y. et al. TP63, SOX2, and KLF5 establish a core regulatory circuitry that controls epigenetic and transcription patterns in esophageal squamous cell carcinoma cell lines. Gastroenterol. 159, 1311-1327.e1319. https://doi.org/10.1053/j.gastro.2020.06.050 (2020).
Chen, L. et al. Master transcription factors form interconnected circuitry and orchestrate transcriptional networks in oesophageal adenocarcinoma. Gut. 69, 630–640. https://doi.org/10.1136/gutjnl-2019-318325 (2020).
Lee, K. W. et al. PRRX1 is a master transcription factor of stromal fibroblasts for myofibroblastic lineage progression. Nat. Commun. 13, 2793. https://doi.org/10.1038/s41467-022-30484-4 (2022).
Reddy, J. et al. Predicting master transcription factors from pan-cancer expression data. Sci. Adv. 7, 6123. https://doi.org/10.1126/sciadv.abf6123 (2021).
Ott, C. J. et al. Enhancer architecture and essential core regulatory circuitry of chronic lymphocytic leukemia. Cancer Cell 34, 982-995.e987. https://doi.org/10.1016/j.ccell.2018.11.001 (2018).
Blum, A., Wang, P. & Zenklusen, J. C. SnapShot: TCGA-analyzed tumors. Cell https://doi.org/10.1016/j.cell.2018.03.059 (2018).
Castro-Mondragon, J. A. et al. JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 50, D165–D173. https://doi.org/10.1093/nar/gkab1113 (2022).
Shen, W. K. et al. AnimalTFDB 4.0: a comprehensive animal transcription factor database updated with variation and expression annotations. Nucleic Acids Res. 51, D39–D45. https://doi.org/10.1093/nar/gkac907 (2023).
Li, H. et al. The landscape of cancer cell line metabolism. Nat. Med. 25, 850–860. https://doi.org/10.1038/s41591-019-0404-8 (2019).
McFarland, J. M. et al. Improved estimation of cancer dependencies from large-scale RNAi screens using model-based normalization and data integration. Nat. Commun. 9, 4610. https://doi.org/10.1038/s41467-018-06916-5 (2018).
Feng, C. et al. Landscape and significance of human super enhancer-driven core transcription regulatory circuitry. Mol. Ther. Nucleic Acids 32, 385–401. https://doi.org/10.1016/j.omtn.2023.03.014 (2023).
Corces, M. R. et al. The chromatin accessibility landscape of primary human cancers. Science https://doi.org/10.1126/science.aav1898 (2018).
Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208. https://doi.org/10.1093/nar/gkp335 (2009).
Aleksander, S. A. et al. The gene ontology knowledgebase in 2023. Genetics https://doi.org/10.1093/genetics/iyad031 (2023).
Okuda, S. et al. KEGG Atlas mapping for global analysis of metabolic pathways. Nucleic Acids Res. 36, W423–W426. https://doi.org/10.1093/nar/gkn282 (2008).
Newman, A. M. et al. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods 12, 453–457. https://doi.org/10.1038/nmeth.3337 (2015).
Liu, X. S. et al. TIMER2.0 for analysis of tumor-infiltrating immune cells. Nucleic Acids Res. 48, W509–W514. https://doi.org/10.1093/nar/gkaa407 (2020).
Tang, Z. et al. GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Res. 45, W98–W102. https://doi.org/10.1093/nar/gkx247 (2017).
Schmeier, S., Alam, T., Essack, M. & Bajic, V. B. TcoF-DB v2: update of the database of human and mouse transcription co-factors and transcription factor interactions. Nucleic Acids Res. 45, D145–D150. https://doi.org/10.1093/nar/gkw1007 (2017).
Caicedo, H. H., Hashimoto, D. A., Caicedo, J. C., Pentland, A. & Pisano, G. P. Overcoming barriers to early disease intervention. Nat. Biotechnol. 38, 669–673. https://doi.org/10.1038/s41587-020-0550-z (2020).
Duttke, S. H., Chang, M. W., Heinz, S. & Benner, C. Identification and dynamic quantification of regulatory elements using total RNA. Genome Res. 29, 1836–1846. https://doi.org/10.1101/gr.253492.119 (2019).
Wu, T. et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation. 2(3), 100141. https://doi.org/10.1016/j.xinn.2021.100141 (2021).
Acknowledgements
We thank TCGA for sharing their cancer chromatin accessibility data, Kate Lawrenson and colleagues for sharing the CaCTS R package, and Zemin Zhang and colleagues for sharing the GEPIA Python package.
Funding
This work was supported by grants from Youth Project of Guangdong Basic and Applied Basic Research Foundation, 2021A1515110828, Study on the mechanism of JAG1 inhibition of angiogenesis in Oral squamous cell carcinoma regulated by ZFP36L2 driven by super enhancer, 2021.10–2024.9, in progress.
Author information
Authors and Affiliations
Contributions
JYZ, HTC and YSL: study concepts. JYZ, HTC, and LL: study design. CHL, HJY, MY, YSL, and JYZ: data acquisition. JYZ, CWJ, and CHL: quality control of data and algorithms. QY and JYZ: data analysis and interpretation. MY, SHL, and SL: statistical analysis. SW, JYZ, LL and XCL: manuscript preparation. All authors have reviewed and edited the manuscript.
Ethics declarations
Competing interests
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhou, J., Yu, H., Lou, C. et al. MTFAP: a comprehensive platform for predicting and analyzing master transcription factors. Sci Rep 14, 32012 (2024). https://doi.org/10.1038/s41598-024-83686-9
DOI: https://doi.org/10.1038/s41598-024-83686-9