Table 4 Cas1 proteins with lengths of < 250 aa predicted by our models.
From: A new strategy for Cas protein recognition based on graph neural networks and SMILES encoding
No. | Candidate Cas1 ID | Sequence length (< 250 aa) | Predicted score | Number of Cas1 BLAST hits | Region name in NCBI | Cas1 domain matched in SMART (start–end; E-value) |
---|---|---|---|---|---|---|
1 | KPU62407 | 61 | 1.08 | 331 | Cas1_I–II–III |  |
2 | KKG18251 | 153 | 1.05 | 66 | Cas1_I–II–III | 8–70; 3.80e–19 |
3 | SNY00500 | 100 | 1.37 | 33 | – | – |
4 | OBZ34426 | 89 | 0.96 | 32 | – | – |
5 | KGK98611 | 106 | 0.96 | 31 | – | – |
6 | ATU08116 | 73 | 0.65 | 30 | – | – |
7 | APH39473 | 91 | 1.01 | 30 | – | – |
8 | ATU08599 | 60 | 0.97 | 28 | – | – |
9 | EFC93806 | 56 | 1.06 | 28 | Cas1_I–II–III | – |
10 | SNY20592 | 58 | 0.99 | 27 | – | – |
11 | AFV22607 | 57 | 1 | 13 | – | – |
12 | OYT29111 | 237 | 1.32 | 1 | – | – |
13 | EQB71572 | 30 | 1.11 | 1 | – | – |
14 | AIC16667 | 84 | 1.21 | 1 | – | – |
15 | KKF97957 | 172 | 1.8 | 1 | – | – |