Sound Demo for Voiced Speech Segregation
First column: mixtures from a corpus of 10 voiced utterances mixed with 10 intrusions, collected by Martin Cooke. v3 and v8 are utterances of the same sentence, "Why were you all weary", from different male speakers. The 10 intrusions are: n0 - 1 kHz pure tone, n1 - white noise, n2 - noise bursts, n3 - "cocktail party" noise, n4 - rock music, n5 - siren, n6 - trill telephone, n7 - female speech, n8 - male speech, and n9 - female speech.
Second column: target speech segregated from the 10 mixtures in the first column using the Wang-Brown 1999 model. For more details, see: D. L. Wang and G. J. Brown (1999): Separation of speech from interfering sounds based on oscillatory correlation, IEEE Trans. Neural Networks, vol. 10, pp. 684-697.
Third column: target speech segregated from the 10 mixtures in the first column using the Hu-Wang 2004 model. For more details, see: G. Hu and D. L. Wang (2004): Monaural speech segregation based on pitch tracking and amplitude modulation, IEEE Trans. Neural Networks, vol. 15, pp. 1135-1150.
Mixture | Wang-Brown Model | Hu-Wang Model |