Fig. 1: Machine learning allows all RefSeq genes to be ranked based on their similarity to genes known to cause APVR. | European Journal of Human Genetics

From: Clinical exome sequencing efficacy and phenotypic expansions involving anomalous pulmonary venous return

A The machine learning algorithm was trained using 35 genes known to cause APVR in humans and the human homologs of genes that cause APVR in mice. Receiver operating characteristic (ROC)-style curves were generated from a leave-one-out validation analysis performed for each knowledge source: Gene Ontology (GO), Mouse Genome Database (MGI), Protein Interaction Network Analysis (PINA), GeneAtlas expression distribution (Exp), and transcription factor binding (TF) and epigenetic histone modification (Epi) data from the NIH Roadmap Epigenomics Mapping Consortium. The black curve represents an omnibus score; its positive deviation from the diagonal line (chance) indicates that the algorithm identifies genes in the training set more effectively than chance. After validation, APVR-specific pathogenicity scores were calculated for all RefSeq genes. B Box plot showing the algorithmically generated APVR-specific pathogenicity scores for APVR training genes.
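The leave-one-out validation behind a ROC-style curve can be illustrated with a minimal sketch. This is not the authors' algorithm: the scoring rule (negative squared distance to the centroid of the remaining training genes), the synthetic feature vectors, and all function names are assumptions made only for illustration. Each training gene is held out in turn, ranked against the non-training genes, and the curve reports the fraction of held-out genes recovered within the top fraction of the ranking.

```python
import random


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def similarity(vec, center):
    """Negative squared Euclidean distance: higher means more similar (assumed rule)."""
    return -sum((a - b) ** 2 for a, b in zip(vec, center))


def leave_one_out_percentiles(features, training_genes):
    """Hold out each training gene, score it against the remaining training
    genes, and return its rank percentile among candidates (0.0 = ranked first)."""
    percentiles = []
    for held_out in training_genes:
        rest = [g for g in training_genes if g != held_out]
        center = centroid([features[g] for g in rest])
        # Candidates: every gene not used for training in this round.
        candidates = [g for g in features if g not in rest]
        scores = {g: similarity(features[g], center) for g in candidates}
        better = sum(1 for g in candidates if scores[g] > scores[held_out])
        percentiles.append(better / (len(candidates) - 1))
    return percentiles


def roc_style_curve(percentiles, steps=10):
    """Fraction of held-out genes recovered within the top x of the ranking."""
    curve = []
    for i in range(steps + 1):
        x = i / steps
        recovered = sum(1 for p in percentiles if p <= x) / len(percentiles)
        curve.append((x, recovered))
    return curve


def area_under_curve(curve):
    """Trapezoidal area; 0.5 corresponds to the chance diagonal."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(curve, curve[1:]))


# Synthetic demo data (hypothetical): 200 background genes and 10 training
# genes whose features are shifted so they resemble one another.
rng = random.Random(0)
features = {f"bg{i}": [rng.gauss(0, 1) for _ in range(3)] for i in range(200)}
training = [f"train{i}" for i in range(10)]
for g in training:
    features[g] = [rng.gauss(2, 1) for _ in range(3)]

curve = roc_style_curve(leave_one_out_percentiles(features, training))
auc = area_under_curve(curve)
print(curve)
print(auc)
```

A curve that rises above the diagonal, with an area above 0.5, corresponds to the positive deviation described in the caption. In this sketch a single similarity rule stands in for one knowledge source; how the published method combines sources into an omnibus score is not specified here.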
