Figure 8

Comparison for one randomly selected speech sample from the NISQA_TEST_LIVETALK30 real test set. (a) denotes the real noisy speech, (b) denotes the output of the TF-domain two-way speech enhancement structure, (c) denotes the output of the T-domain residual noise estimation structure, and (d) denotes the final enhanced speech generated by mixing (b) and (c).