Table 3 Training and testing accuracy of matching proteome to genome for SomaScan 5K data.

From: Large scale proteomic studies create novel privacy considerations

Training cohort: COPDGene (N = 2646 genomes). EA = European American (N = 1877 proteomes); AA = African American (N = 769 proteomes).

| # Proteins | EA: Top 1 (%) | EA: In top 3 (%) | EA: In top 1% (%) | AA: Top 1 (%) | AA: In top 3 (%) | AA: In top 1% (%) |
|---|---|---|---|---|---|---|
| 20 | 85.56 | 93.61 | 99.15 | 60.73 | 76.20 | 96.62 |
| 40 | 99.04 | 99.63 | 99.89 | 94.93 | 97.66 | 99.48 |
| 60 | 99.52 | 99.79 | 99.89 | 97.92 | 98.83 | 99.48 |
| 100 | 99.79 | 99.89 | 99.89 | 98.83 | 99.09 | 99.48 |
| 150 | 99.84 | 99.89 | 99.89 | 99.09 | 99.22 | 99.48 |
| All | 96.27 | 96.86 | 98.61 | 98.83 | 99.22 | 99.61 |

Testing cohort: COPDGene (N = 9970 genomes). EA = European American (N = 1870 proteomes); AA = African American (N = 776 proteomes).

| # Proteins | EA: Top 1 (%) | EA: In top 3 (%) | EA: In top 1% (%) | AA: Top 1 (%) | AA: In top 3 (%) | AA: In top 1% (%) |
|---|---|---|---|---|---|---|
| 20 | 83.90 | 92.09 | 98.66 | 60.05 | 77.32 | 97.55 |
| 40 | 97.97 | 98.93 | 99.63 | 94.59 | 97.68 | 99.74 |
| 60 | 98.72 | 99.30 | 99.63 | 97.29 | 98.84 | 99.74 |
| 100 | 99.36 | 99.52 | 99.63 | 98.45 | 99.23 | 99.74 |
| 150 | 99.47 | 99.63 | 99.63 | 98.84 | 99.48 | 99.87 |
| All | 97.97 | 98.93 | 99.63 | 94.59 | 97.68 | 99.74 |

Testing cohort: ARIC (N = 12,219 genomes). EA = European American (N = 8987 proteomes); AA = African American (N = 2774 proteomes).

| # Proteins | EA: Top 1 (%) | EA: In top 3 (%) | EA: In top 1% (%) | AA: Top 1 (%) | AA: In top 3 (%) | AA: In top 1% (%) |
|---|---|---|---|---|---|---|
| 20 | 52.77 | 70.54 | 96.44 | 35.63 | 52.34 | 80.52 |
| 40 | 94.08 | 97.28 | 99.71 | 86.87 | 94.41 | 99.24 |
| 60 | 97.36 | 98.88 | 99.78 | 94.27 | 97.56 | 99.76 |
| 100 | 98.83 | 99.49 | 99.80 | 96.75 | 98.81 | 99.95 |
| 150 | 99.05 | 99.53 | 99.80 | 97.61 | 98.90 | 99.86 |
| All | 99.02 | 99.63 | 99.80 | 97.13 | 98.71 | 99.81 |
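The "Top 1 (%)", "In top 3 (%)", and "In top 1% (%)" columns read as rank-based accuracies. The sketch below shows one way such figures could be computed. It assumes each proteome is scored against every candidate genome, that a higher score means a better match, and that the rank of the true genome is then compared with each cutoff. The table does not describe the scoring or tie-breaking, so the function names, the score matrix, and the tie rule here are hypothetical illustrations and not the authors' method.

```python
import math


def rank_of_true_match(scores, true_index):
    """1-based rank of the true genome among all candidates.

    Assumption: higher score = better match. Ties are resolved in favor
    of the true match (an optimistic, illustrative choice).
    """
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def matching_accuracy(score_matrix, true_indices):
    """Percent of proteomes whose true genome is ranked 1st, within the
    top 3, or within the top 1% of all candidate genomes."""
    n_genomes = len(score_matrix[0])
    # The top-1% cutoff is at least one genome, so tiny panels still work.
    top_pct_cutoff = max(1, math.ceil(0.01 * n_genomes))
    ranks = [
        rank_of_true_match(row, true_idx)
        for row, true_idx in zip(score_matrix, true_indices)
    ]
    n = len(ranks)
    return {
        "top1": 100.0 * sum(r == 1 for r in ranks) / n,
        "top3": 100.0 * sum(r <= 3 for r in ranks) / n,
        "top1pct": 100.0 * sum(r <= top_pct_cutoff for r in ranks) / n,
    }


# Toy data (made up): 4 proteomes scored against 5 candidate genomes.
toy_scores = [
    [0.9, 0.1, 0.2, 0.3, 0.0],  # true genome 0 -> rank 1
    [0.5, 0.4, 0.6, 0.1, 0.0],  # true genome 1 -> rank 3
    [0.1, 0.2, 0.3, 0.4, 0.5],  # true genome 1 -> rank 4
    [0.0, 0.0, 0.0, 0.8, 0.1],  # true genome 3 -> rank 1
]
toy_truth = [0, 1, 1, 3]
result = matching_accuracy(toy_scores, toy_truth)
print(result)
```

With only five candidate genomes the top-1% cutoff collapses to a single genome, so "top1pct" equals "top1" in this toy case. With thousands of candidates, as in the cohorts above, the top-1% window is far wider than the top 3, which fits the pattern of the "In top 1% (%)" column being the highest of the three in every row.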