Publications in Reverse Chronological Order
Journal Articles
- Wang H., Pandey A., and Wang D.L. (2025):
A systematic study of DNN based speech enhancement in reverberant and reverberant-noisy environments.
Computer Speech & Language, vol. 89, article 101677, 12 pages.
- Kalkhorani V.A. and Wang D.L. (2024):
TF-CrossNet: Leveraging global, cross-band, narrow-band, and positional encoding for single- and multi-channel speaker separation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 4999-5009.
- Zhang Y., Wang H., and Wang D.L. (2024):
Leveraging laryngograph data for robust voicing detection in speech.
Journal of the Acoustical Society of America, vol. 156, pp. 3502-3513.
- Taherian H. and Wang D.L. (2024):
Multi-channel conversational speaker separation via neural diarization.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 2467-2476.
- Zhang Y., Wang H., and Wang D.L. (2023):
F0 estimation and voicing detection with cascade architecture in noisy speech.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 3760-3770.
- Healy E.W., Johnson E.M., Pandey A., and Wang D.L. (2023):
Progress made in the efficacy and viability of deep-learning-based noise reduction.
Journal of the Acoustical Society of America, vol. 153, pp. 2751-2768.
- Pandey A. and Wang D.L. (2023):
Attentive training: A new training framework for speech enhancement.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 1360-1370.
- Zhang H., Pandey A., and Wang D.L. (2023):
Low-latency active noise control using attentive recurrent network.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 1114-1123.
- Zhang H. and Wang D.L. (2023):
Deep MCANC: A deep learning approach to multi-channel active noise control.
Neural Networks, vol. 158, pp. 318-327.
- Wang H., Zhang X., and Wang D.L. (2022):
Fusing bone-conduction and air-conduction sensors for complex-domain speech enhancement.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 3134-3143.
- Taherian H., Tan K., and Wang D.L. (2022):
Multi-channel talker-independent speaker separation through location-based training.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 2791-2800.
- Zhang H. and Wang D.L. (2022):
Neural cascade architecture for multi-channel acoustic echo suppression.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 2326-2336.
- Pandey A. and Wang D.L. (2022):
Self-attending RNN for speech enhancement to improve cross-corpus generalization.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 1374-1385.
- Wang H. and Wang D.L. (2022):
Neural cascade architecture with triple-domain loss for speech enhancement.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 734-743.
- Tan K., Wang Z.-Q., and Wang D.L. (2022):
Neural spectrospatial filtering.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 605-621.
- Healy E.W., Taherian H., Johnson E.M., and Wang D.L. (2021):
A causal and talker-independent speaker separation/dereverberation deep learning algorithm: Cost associated
with conversion to real-time capable operation.
Journal of the Acoustical Society of America, vol. 150, pp. 3976-3986.
- Healy E.W., Johnson E.M., Delfarah M., Krishnagiri D.S., Sevich V.A., Taherian H., and Wang D.L. (2021):
Deep learning based speaker separation and dereverberation can generalize across different languages to improve intelligibility.
Journal of the Acoustical Society of America, vol. 150, pp. 2526-2538.
- Li H., Wang D.L., Zhang X., and Gao G. (2021):
Recurrent neural networks and acoustic features for frame-level signal-to-noise ratio estimation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 2878-2887.
- Wang H. and Wang D.L. (2021):
Towards robust speech super-resolution.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 2058-2066.
- Wang Z.-Q., Wang P., and Wang D.L. (2021):
Multi-microphone complex spectral mapping for utterance-wise and continuous speech separation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 2001-2014.
- Zhang H. and Wang D.L. (2021):
Deep ANC: A deep learning approach to active noise control.
Neural Networks, vol. 141, pp. 1-10.
- Healy E.W., Tan K., Johnson E.M., and Wang D.L. (2021):
An effectively causal deep learning algorithm to increase intelligibility in untrained noises for hearing-impaired listeners.
Journal of the Acoustical Society of America, vol. 149, pp. 3943–3953.
- Tan K., Zhang X., and Wang D.L. (2021):
Deep learning based real-time speech enhancement for dual-microphone mobile phones.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 1853-1863.
- Tan K. and Wang D.L. (2021):
Towards model compression for deep learning based speech enhancement.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 1785-1794.
- Pandey A. and Wang D.L. (2021):
Dense CNN with self-attention for time-domain speech enhancement.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 1270-1279.
(Related Sound Demo.)
- Wang P., Chen Z., Wang D.L., Li J., and Gong Y. (2021):
Speaker separation using speaker inventories and estimated speech.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 537-546.
- Pandey A. and Wang D.L. (2020):
On cross-corpus generalization of deep learning based speech enhancement.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 2489-2499.
- Delfarah M., Liu Y., and Wang D.L. (2020):
A two-stage deep learning algorithm for talker-independent speaker separation in reverberant conditions.
Journal of the Acoustical Society of America, vol. 148, pp. 1157-1168.
- Liu Y. and Wang D.L. (2020):
Causal deep CASA for monaural talker-independent speaker separation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 2109-2118.
- Wang Z.-Q., Wang P., and Wang D.L. (2020):
Complex spectral mapping for single- and multi-channel speech enhancement and robust ASR.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 1778-1787.
- Zhao Y., Wang D.L., Xu B., and Zhang T. (2020):
Monaural speech dereverberation using temporal convolutional networks with self attention.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 1598-1607.
- Healy E.W., Johnson E.M., Delfarah M., and Wang D.L. (2020):
A talker-independent deep learning algorithm to increase intelligibility for hearing-impaired listeners in reverberant competing talker conditions.
Journal of the Acoustical Society of America, vol. 147, pp. 4106-4118.
- Taherian H., Wang Z.-Q., Chang J., and Wang D.L. (2020):
Robust speaker recognition based on single-channel and multi-channel speech enhancement.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 1293-1302.
- Wang Z.-Q. and Wang D.L. (2020):
Deep learning based target cancellation for speech dereverberation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 941-950.
- Tan K. and Wang D.L. (2020):
Learning complex spectral mapping with gated convolutional recurrent networks for monaural speech enhancement.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 380-390.
(Related Source Code.)
- Wang P., Tan K., and Wang D.L. (2020):
Bridging the gap between monaural speech enhancement and recognition with distortion-independent acoustic modeling.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 39-48.
- Liu Y. and Wang D.L. (2019):
Divide and conquer: A deep CASA approach to talker-independent monaural speaker separation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, pp. 2092-2102.
(Tensorflow Code and Description in GitHub.)
- Delfarah M. and Wang D.L. (2019):
Deep learning for talker-dependent reverberant speaker separation: An empirical study.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, pp. 1839-1848.
- Healy E.W., Vasco J.L., and Wang D.L. (2019):
The optimal threshold for removing noise from speech is similar across normal and impaired hearing — a time-frequency masking study.
Journal of the Acoustical Society of America Express Letters, vol. 145, pp. EL581-586.
- Pandey A. and Wang D.L. (2019):
A new framework for CNN-based speech enhancement in the time domain.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, pp. 1179-1188 (Recipient of the 2022 Young Author Best Paper Award from the IEEE Signal Processing Society, awarded to Pandey).
- Healy E.W., Delfarah M., Johnson E.M., and Wang D.L. (2019):
A deep learning algorithm to increase intelligibility for hearing-impaired listeners in the presence of a competing talker and reverberation.
Journal of the Acoustical Society of America, vol. 145, pp. 1378-1388.
- Wang Z.-Q. and Wang D.L. (2019):
Combining spectral and spatial features for deep learning based blind speaker separation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, pp. 457-468.
- Tan K., Chen J., and Wang D.L. (2019):
Gated residual networks with dilated convolutions for monaural speech enhancement.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, pp. 189-198.
- Wang Z.-Q., Zhang X., and Wang D.L. (2019):
Robust speaker localization guided by deep learning-based time-frequency masking.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, pp. 178-188.
- Zhao Y., Wang Z.-Q., and Wang D.L. (2019):
Two-stage deep learning for noisy-reverberant speech enhancement.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, pp. 53-62.
- Zhao Y., Wang D.L., Johnson E.M., and Healy E.W. (2018):
A deep learning based segregation algorithm to increase speech intelligibility for hearing-impaired listeners in reverberant-noisy conditions.
Journal of the Acoustical Society of America, vol. 144, pp. 1627-1637.
- Wang D.L. and Chen J. (2018):
Supervised speech separation based on deep learning: An overview.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, pp. 1702-1726.
- Williamson D.S. and Wang D.L. (2017):
Time-frequency masking in the complex domain for speech dereverberation and denoising.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, pp. 1492-1501.
- Chen J. and Wang D.L. (2017):
Long short-term memory for speaker generalization in supervised speech separation.
Journal of the Acoustical Society of America, vol. 141, pp. 4705-4714.
- Mayer F., Williamson D.S., Mowlaee P., and Wang D.L. (2017):
Impact of phase estimation on single-channel speech separation based on time-frequency masking.
Journal of the Acoustical Society of America, vol. 141, pp. 4668-4679.
- Healy E.W., Delfarah M., Vasko J.L., Carter B.L., and Wang D.L. (2017):
An algorithm to increase intelligibility for hearing-impaired listeners in the presence of a competing talker.
Journal of the Acoustical Society of America, vol. 141, pp. 4230-4239.
- Delfarah M. and Wang D.L. (2017):
Features for masking-based monaural speech separation in reverberant conditions.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, pp. 1085-1094.
- Zhang X. and Wang D.L. (2017):
Deep learning based binaural speech separation in reverberant environments.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, pp. 1075-1084.
- Wang D.L. (2017):
Deep learning reinvents the hearing aid.
IEEE Spectrum, March Issue, pp. 32-37 (Cover Story).
- Liu Y. and Wang D.L. (2017):
Speaker-dependent multipitch tracking using deep neural networks.
Journal of the Acoustical Society of America, vol. 141, pp. 710-721.
- Chen J., Wang Y., Yoho S.E., Wang D.L., and Healy E.W. (2016):
Large-scale training to increase speech intelligibility for hearing-impaired listeners in novel noises.
Journal of the Acoustical Society of America, vol. 139, pp. 2604-2612.
- Zhang X.-L. and Wang D.L. (2016):
A deep ensemble learning method for monaural speech separation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, pp. 967-977.
- Wang Z.-Q. and Wang D.L. (2016):
A joint training framework for robust automatic speech recognition.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, pp. 796-806.
- Williamson D.S., Wang Y., and Wang D.L. (2016):
Complex ratio masking for monaural speech separation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, pp. 483-492.
- Chen J., Wang Y., and Wang D.L. (2016):
Noise perturbation for supervised speech separation.
Speech Communication, vol. 78, pp. 1-10.
- Zhang X.-L. and Wang D.L. (2016):
Boosting contextual information for deep neural network based voice activity detection.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, pp. 252-264.
- Healy E.W., Yoho S.E., Chen J., Wang Y., and Wang D.L. (2015):
An algorithm to increase speech intelligibility for hearing-impaired listeners in novel segments
of the same noise type.
Journal of the Acoustical Society of America, vol. 138, pp. 1660-1669.
- Williamson D.S., Wang Y., and Wang D.L. (2015):
Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality.
Journal of the Acoustical Society of America, vol. 138, pp. 1399-1407.
- Zhao X., Wang Y., and Wang D.L. (2015):
Cochannel speaker identification in anechoic and reverberant conditions.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, pp. 1727-1736.
- Yuan J., Wang D.L., and Cheriyadat A.M. (2015):
Factorization-based texture segmentation.
IEEE Transactions on Image Processing,
vol. 24, pp. 3488-3497.
(Description with Matlab Code.)
- Han K., Wang Y., Wang D.L., Woods W.S., Merks I., and Zhang T. (2015):
Learning spectral mapping for speech dereverberation and denoising.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, pp. 982-992.
- Narayanan A. and Wang D.L. (2015):
Improving robustness of deep neural network acoustic models via speech separation and joint adaptive training.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, pp. 92-101.
- Healy E.W., Yoho S.E., Wang Y., Apoux F., and Wang D.L. (2014):
Speech cue transmission by an algorithm to increase consonant recognition in noise for hearing-impaired listeners.
Journal of the Acoustical Society of America, vol. 136, pp. 3325-3336. (Supplemental Confusion Matrices.)
- Han K. and Wang D.L. (2014):
Neural network based pitch tracking in very noisy speech.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, pp. 2158-2168.
- Jiang Y., Wang D.L., Liu R.S., and Feng Z.M. (2014):
Binaural classification for reverberant speech segregation using deep neural networks.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, pp. 2112-2121.
- Chen J., Wang Y., and Wang D.L. (2014):
A feature study for classification-based speech separation at low signal-to-noise ratios.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, pp. 1993-2002. (Related Source Code.)
- Wang Y., Narayanan A. and Wang D.L. (2014):
On training targets for supervised speech separation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, pp. 1849-1858
(Recipient of the 2019 Best Paper Award from the IEEE Signal Processing Society).
- Williamson D.S., Wang Y., and Wang D.L. (2014):
Reconstruction techniques for improving the perceptual quality of binary masked speech.
Journal of the Acoustical Society of America, vol. 136, pp. 892-902.
- Zhao X., Wang Y., and Wang D.L. (2014):
Robust speaker identification in noisy and reverberant conditions.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, pp. 836-845.
- Narayanan A. and Wang D.L. (2014):
Investigation of speech separation as a front-end for noise robust speech recognition.
IEEE/ACM Transactions on Audio, Speech, and Language Processing,
vol. 22, pp. 826-835.
- Yuan J., Wang D.L., and Li R. (2014):
Remote sensing image segmentation by combining spectral and texture features.
IEEE Transactions on Geoscience and Remote Sensing,
vol. 52, pp. 16-24.
- Hartmann W., Narayanan A., Fosler-Lussier E., and Wang D.L. (2013):
A direct masking approach to robust ASR.
IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, pp. 1993-2005.
- Healy E.W., Yoho S.E., Wang Y., and Wang D.L. (2013):
An algorithm to improve speech recognition in noise for hearing-impaired listeners.
Journal of the Acoustical Society of America, vol. 134, pp. 3029-3038.
(Related
Press Release, YouTube Demo, and Test Data.)
- Hu K. and Wang D.L. (2013):
An iterative model-based approach to cochannel speech separation.
EURASIP Journal on Audio, Speech, and Music Processing,
vol. 2013, Article ID 2013-14, 11 pages. (Related Source Code.)
- Narayanan A. and Wang D.L. (2013):
The role of binary mask patterns in automatic speech recognition in background noise.
Journal of the Acoustical Society of America, vol. 133, pp. 3083-3093.
- Wang Y. and Wang D.L. (2013):
Towards scaling up classification-based speech separation.
IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, pp. 1381-1390.
- Woodruff J. and Wang D.L. (2013):
Binaural detection, localization, and segregation in reverberant environments based on
joint pitch and azimuth cues.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 21, pp. 806-815.
- Wang Y., Han K., and Wang D.L. (2013):
Exploring monaural features for classification-based speech segregation.
IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, pp. 270-279.
- Han K. and Wang D.L. (2013):
Towards generalizing classification based speech separation.
IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, pp. 166-175.
- Hu K. and Wang D.L. (2013):
An unsupervised approach to cochannel speech separation.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 21, pp. 120-129. (Related Source Code.)
- Han K. and Wang D.L. (2012):
A classification based approach to speech segregation.
Journal of the Acoustical Society of America, vol. 132, pp. 3475-3483.
- Narayanan A. and Wang D.L. (2012):
A CASA based system for long-term SNR estimation.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 20, pp. 2518-2527. (Related Source Code.)
- Zhao X., Shao Y., and Wang D.L. (2012):
CASA-based robust speaker identification.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 20, pp. 1608-1616. (Related Source Code.)
- Woodruff J. and Wang D.L. (2012):
Binaural localization of multiple sources in reverberant and noisy environments.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 20, pp. 1503-1512.
- Hsu C.-L., Wang D.L., Jang J.-S.R., and Hu K. (2012):
A tandem algorithm for singing pitch extraction and voice separation from
music accompaniment.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 20, pp. 1482-1491.
- Yuan J., Wang D.L., and Li R. (2012):
Image segmentation using local spectral histograms and linear regression.
Pattern Recognition Letters,
vol. 33, pp. 615-622.
- Yuan J., Wang D.L., Wu B., Yan L., and Li R. (2011):
LEGION-based automatic road extraction from satellite imagery.
IEEE Transactions on Geoscience and Remote Sensing,
vol. 49, pp. 4528-4538.
- Jin Z. and Wang D.L. (2011):
Reverberant speech segregation based on multipitch tracking and classification.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 19, pp. 2328-2337. (Related Source Code.)
- Hu K. and Wang D.L. (2011):
Unvoiced speech segregation from nonspeech interference via CASA and spectral subtraction.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 19, pp. 1600-1609. (Related Source Code.)
- Jin Z. and Wang D.L. (2011):
HMM-based multipitch tracking for noisy and reverberant speech.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 19, pp. 1091-1102. (Related Source Code.)
- Jan T., Wang W., and Wang D.L. (2011):
A multistage approach to blind separation of convolutive speech mixtures.
Speech Communication, vol. 53, pp. 524-539.
- Quiles M.G., Wang D.L., Zhao L., Romero R.A.F., and Huang D.-S. (2011):
Selecting salient objects in real scenes: An oscillatory correlation model.
Neural Networks,
vol. 24, pp. 54-64.
- Hu G. and Wang D.L. (2010):
A tandem algorithm for pitch estimation and voiced speech segregation.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 18, pp. 2067-2079. (Related
Sound Demo and Source Code.)
- Narayanan A. and Wang D.L. (2010):
Robust speech recognition from binary masks.
Journal of the Acoustical Society of America Express Letters,
vol. 128, pp. EL217-222.
- Woodruff J. and Wang D.L. (2010):
Sequential organization of speech in reverberant environments by integrating
monaural grouping and binaural localization.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 18, pp. 1856-1866.
- Srinivasan S. and Wang D.L. (2010):
Robust speech recognition by integrating speech separation and hypothesis testing.
Speech Communication, vol. 52, pp. 72-81.
- Shao Y., Srinivasan S., Jin Z., and Wang D.L. (2010):
A computational auditory scene analysis system for speech segregation and
robust speech recognition.
Computer Speech and Language, vol. 24, pp. 77-93.
- Kjems U., Boldt J.B., Pedersen M.S., Lunner T., and Wang D.L. (2009):
Role of mask pattern in intelligibility of ideal binary-masked noisy speech.
Journal of the Acoustical Society of America, vol. 126, pp. 1415-1426.
- Li Y., Woodruff J., and Wang D.L. (2009):
Monaural musical sound separation based on pitch and common amplitude modulation.
IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, pp. 1361-1371. (Related
Sound Demo and Source Code.)
- Li Y. and Wang D.L. (2009):
Musical sound separation based on binary time-frequency masking.
EURASIP Journal on Audio, Speech, and Music Processing,
vol. 2009, Article ID 130567, 10 pages.
(Related
Sound Demo.)
- Shao Y. and Wang D.L. (2009):
Sequential organization of speech in computational auditory scene analysis.
Speech Communication, vol. 51, pp. 657-667.
- Brungart D.S., Chang P.S., Simpson B.D., and Wang D.L. (2009):
Multitalker speech perception with ideal time-frequency segregation: Effects of voice characteristics and number of talkers.
Journal of the Acoustical Society of America, vol. 125, pp. 4006-4022.
- Wang D.L., Kjems U., Pedersen M.S., Boldt J.B., and Lunner T. (2009):
Speech intelligibility in background noise with ideal binary time-frequency masking.
Journal of the Acoustical Society of America, vol. 125, pp. 2336-2347.
- Jin Z. and Wang D.L. (2009):
A supervised learning approach to monaural segregation of reverberant speech.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 17, pp. 625-638.
- Li Y. and Wang D.L. (2009):
On the optimality of ideal binary time-frequency masks.
Speech Communication, vol. 51, pp. 230-239.
- Wang D.L. (2008):
Time-frequency masking for speech separation and its potential for hearing aid design.
Trends in Amplification, vol. 12, pp. 332-353.
- Srinivasan S. and Wang D.L. (2008):
A model for multitalker speech perception.
Journal of the Acoustical Society of America, vol. 124, pp. 3213-3224.
- Wang D.L., Kjems U., Pedersen M.S., Boldt J.B., and Lunner T. (2008):
Speech perception of noise with binary gains.
Journal of the Acoustical Society of America, vol. 124, pp. 2303-2307.
- Hu G. and Wang D.L. (2008):
Segregation of unvoiced speech from nonspeech interference.
Journal of the Acoustical Society of America, vol. 124, pp. 1306-1319.
- Roman N. and Wang D.L. (2008):
Binaural tracking of multiple moving sources.
IEEE Transactions on Audio, Speech, and Language Processing,
vol. 16, pp. 728-739.
- Wang D.L. and Chang P.S. (2008):
An oscillatory correlation model of auditory streaming.
Cognitive Neurodynamics, vol. 2, pp. 7-19.
- Pedersen M.S., Wang D.L., Larsen J., and Kjems U. (2008):
Two-microphone separation of speech mixtures.
IEEE Transactions on Neural Networks, vol. 19, pp. 475-492.
(Related Sound Demo and Source Code.)
- Srinivasan S. and Wang D.L. (2007):
Transforming binary uncertainties for robust
speech recognition.
IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, pp. 2130-2140.
- Li Y. and Wang D.L. (2007):
Separation of singing voice from music accompaniment for monaural
recordings.
IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, pp. 1475-1487. (Related
Sound Demo and Source Code.)
- Hu G. and Wang D.L. (2007):
Auditory segmentation based on onset and offset analysis.
IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, pp. 396-405. (Related Source Code.)
- Roman N., Srinivasan S., and Wang D.L. (2006):
Binaural segregation in multisource reverberant environments.
Journal of the Acoustical Society of America, vol. 120, pp. 4040-4051.
- Brungart D.S., Chang P.S., Simpson B.D., and Wang D.L. (2006):
Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation.
Journal of the Acoustical Society of America, vol. 120, pp. 4007-4018.
- Srinivasan S., Roman N., and Wang D.L. (2006):
Binary and ratio time-frequency masks for robust speech recognition.
Speech Communication, vol. 48, pp. 1486-1501.
- Liu X. and Wang D.L. (2006):
Image and texture segmentation using local spectral histograms.
IEEE Transactions on Image Processing, vol. 15, pp. 3066-3077.
- Roman N. and Wang D.L. (2006):
Pitch-based monaural segregation of reverberant speech.
Journal of the Acoustical Society of America, vol. 120, pp.
458-469.
- Wu M. and Wang D.L. (2006):
A two-stage algorithm for one-microphone reverberant speech enhancement.
IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, pp. 774-784.
(Related Sound Demo and Source Code.)
- Wu M. and Wang D.L. (2006):
A pitch-based method for the estimation of short reverberation time.
Acta Acustica united with Acustica, vol. 92, pp. 337-339.
- Shao Y. and Wang D.L. (2006):
Model-based sequential organization in cochannel speech.
IEEE Transactions on Audio, Speech, and Language Processing
(formerly IEEE Transactions on Speech and Audio Processing),
vol. 14, pp. 289-298.
- Wang D.L. (2005):
The time dimension for scene analysis.
IEEE Transactions on Neural Networks, vol. 16, pp. 1401-1426
(Recipient of the 2007 Outstanding Paper Award from the IEEE Computational Intelligence Society).
- Wang D.L., Kristjansson A., and Nakayama K. (2005):
Efficient visual search without top-down or bottom-up guidance.
Perception & Psychophysics, vol. 67, pp. 239-253.
- Srinivasan S. and Wang D.L. (2005):
A schema-based model for phonemic restoration.
Speech Communication, vol. 45, pp. 63-87.
- Palomaki K.J., Brown G.J., and Wang D.L. (2004):
A binaural processor for missing data speech recognition in the presence
of noise and small-room reverberation.
Speech Communication, vol. 43, pp. 361-378.
- Hu G. and Wang D.L. (2004):
Monaural speech segregation based on pitch tracking and amplitude
modulation. IEEE Transactions on Neural Networks,
vol. 15, pp. 1135-1150.
(Related
Sound Demo and Source Code.)
- Campbell S.R., Wang D.L., and Jayaprakash C. (2004):
Synchronization rates in classes of relaxation oscillators.
IEEE Transactions on Neural Networks, vol. 15,
pp. 1027-1038.
- Wang D.L., Freeman W.J., Kozma R., Lozowski A., and Minai A. (2004):
Guest Editorial for Special Issue on Temporal Coding for Neural
Information Processing.
IEEE Transactions on Neural Networks, vol. 15,
pp. 953-956.
- Roman N., Wang D.L., and Brown G.J. (2003):
Speech segregation based on sound localization.
Journal of the Acoustical Society of America, vol. 114, Pt. 1,
pp. 2236-2252.
(Related
Sound Demo and Source Code.)
- Liu X., Srivastava A., and Wang D.L. (2003):
Intrinsic generalization analysis of low dimensional representations.
Neural Networks, vol. 16, pp. 537-545.
- Liu X. and Wang D.L. (2003):
Texture classification using spectral histograms.
IEEE Transactions on Image Processing, vol. 12, pp. 661-670.
- Wu M., Wang D.L., and Brown G.J. (2003):
A multipitch tracking algorithm for noisy speech.
IEEE Transactions on Speech and Audio Processing, vol. 11, pp. 229-241.
(Related Source Code.)
- Cesmeli E., Lindsey D.T., and Wang D.L. (2002):
An oscillatory correlation model of visual motion analysis.
Perception & Psychophysics, vol. 64, 1191-1217.
- Liu X. and Wang D.L. (2002):
A spectral histogram model for texton modeling and texture
discrimination.
Vision Research, vol. 42, 2617-2634.
- Kristjansson A., Wang D.L., and Nakayama K. (2002):
The role of priming in conjunctive visual search.
Cognition, vol. 85, 37-52.
- Chen K. and Wang D.L. (2002):
A dynamically coupled neural oscillator network for image segmentation.
Neural Networks, vol. 15, 423-439.
- Wang D.L. and Liu X. (2002):
Scene analysis by integrating primitive segmentation and associative memory.
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 32, 254-268.
- Chen K. and Wang D.L. (2001):
Perceiving geometric patterns: from spirals to inside/outside relations.
IEEE Transactions on Neural Networks, vol. 12, 1084-1102.
- Wang D.L. (September 2001):
Book Review: Unsupervised Learning - Foundations of Neural
Computation. Edited by G. Hinton and T.J. Sejnowski, MIT Press, 1999.
AI Magazine, vol. 22, 101-102.
- Fox J.J., Jayaprakash C., Wang D.L., and Campbell S.R. (2001):
Synchronization in relaxation oscillator networks with conduction
delays.
Neural Computation, vol. 13, 1003-1021.
- Cesmeli E. and Wang D.L. (2001):
Texture segmentation using Gaussian-Markov random fields and
neural oscillator networks.
IEEE Transactions on Neural Networks, vol. 12, 394-404.
- Liu X., Chen K., and Wang D.L. (2001):
Extraction of hydrographic regions
from remote sensing images using an oscillator network with weight
adaptation.
IEEE Transactions on Geoscience and Remote Sensing,
vol. 39, 207-211. For a more extensive version see
Technical Report OSU-CISRC-4/99-TR12, 1999,
Department of Computer and Information Science,
The Ohio State University, Columbus, Ohio, USA.
- van der Kouwe A.J.W., Wang D.L., and Brown G.J. (2001):
A comparison of auditory
and blind separation techniques for speech segregation.
IEEE Transactions on Speech and Audio Processing, vol. 9, 189-195.
- Chen K., Wang D.L., and Liu X. (2000):
Weight adaptation
and oscillatory correlation for image segmentation.
IEEE Transactions on Neural Networks, vol. 11, 1106-1123.
- Cesmeli E. and Wang D.L. (2000):
Motion
segmentation based on motion/brightness
integration and oscillatory correlation. IEEE Transactions
on Neural Networks, vol. 11, 935-947.
- Wang D.L. (2000):
On connectedness: a
solution based on oscillatory correlation.
Neural Computation, vol. 12, 131-139.
- Liu X., Wang D.L., and Ramirez J.R. (2000):
Boundary detection by contextual nonlinear smoothing.
Pattern Recognition, vol. 33, 263-280.
- Campbell S.R., Wang D.L., and Jayaprakash C. (1999):
Synchrony and desynchrony in
integrate-and-fire oscillators. Neural Computation,
vol. 11, 1595-1619.
- Wang D.L. and Brown G.J. (1999):
Separation of speech from interfering sounds based on oscillatory
correlation.
IEEE Transactions on Neural Networks, vol. 10, 684-697.
(Related Source Code.)
- Liu X. and Wang D.L. (1999):
Range image segmentation using a LEGION network.
IEEE Transactions on Neural Networks, vol. 10, 564-573.
- Wang D.L. (1999):
Object selection based on oscillatory correlation.
Neural Networks, vol. 12, 579-592.
- Shareef N., Wang D.L., and Yagel R. (1999):
Segmentation of medical images
using LEGION. IEEE Transactions on Medical Imaging,
vol. 18, 74-91.
An HTML version, slightly different from the printed version, is also available.
- Linsay P.S. and Wang D.L. (1998):
Fast numerical integration of relaxation oscillator networks based on
singular limit solutions.
IEEE Transactions on Neural Networks, vol. 9, 523-532.
(Related Source Code.)
- Campbell S.R. and Wang D.L. (1998):
Relaxation oscillators with time
delay coupling.
Physica D, vol. 111, 151-178.
- Wang D.L. (1997):
On the computational basis of synchronized codes.
Behavioral and Brain Sciences, vol. 20, 700-701.
- Wang D.L. and Terman D. (1997):
Image segmentation based on oscillatory correlation.
Neural Computation, vol. 9, 805-836
(For errata see Neural Computation, vol. 9, 1623-1626, 1997). Also, for
an earlier but extended version with detailed analysis see
Image segmentation based on oscillatory correlation.
Technical Report 19, Center for Cognitive Science,
The Ohio State University, Columbus, Ohio, USA.
(Related Source Code.)
- Brown G.J. and Wang D.L. (1997):
Modelling the perceptual segregation of
double vowels with a network of neural oscillators.
Neural Networks, vol. 10, 1547-1558.
- Wang D.L. and Yuwono B. (1996):
Incremental learning of complex temporal patterns.
IEEE Transactions on Neural Networks, vol. 7, 1465-1481.
- Wang D.L. (1996):
Primitive auditory segregation based on oscillatory correlation.
Cognitive Science, vol. 20, 409-456.
- Campbell S.R. and Wang D.L. (1996):
Synchronization and desynchronization in a network of locally coupled
Wilson-Cowan oscillators.
IEEE Transactions on Neural Networks, vol. 7(3), 541-554.
- Wang D.L., Liu X.M., and Ahalt S.C. (1996):
On temporal generalization of simple recurrent networks.
Neural Networks, vol. 9, 1099-1118.
- Wang D.L. and Yuwono B. (1995):
Anticipation-based temporal pattern generation.
IEEE Transactions on Systems, Man, and Cybernetics,
vol. 25, 615-628.
- Wang D.L. (1995):
Emergent synchrony in locally coupled neural oscillators.
IEEE Transactions on Neural Networks, vol. 6, 941-948.
- Terman D. and Wang D.L. (1995):
Global competition and local cooperation
in a network of neural oscillators.
Physica D, vol. 81, 148-176.
(Related Source Code.)
- Wang D.L. and Terman D. (1995):
Locally excitatory globally inhibitory oscillator networks.
IEEE Transactions on Neural Networks, vol. 6(1), 283-286.
(Related Source Code.)
- Wang D.L. (1994):
Modeling neural mechanisms of vertebrate habituation:
Locus specificity and pattern discrimination.
Journal of Computational Neuroscience, vol. 1(4), 285-299.
- Wang D.L. (1993):
A neural model of synaptic plasticity underlying
short-term and long-term habituation. Adaptive Behavior,
vol. 2, 111-129.
- Wang D.L. and Arbib M.A. (1993):
Timing and chunking in processing temporal order.
IEEE Transactions on Systems, Man, and
Cybernetics, vol. 23, 993-1009.
- Wang D.L. (August 1993):
Pattern recognition: Neural networks in perspective.
IEEE Expert, vol. 8, 52-60.
- Wang D.L. and Arbib M.A. (1992):
The dishabituation hierarchy in toads: the role of the primordial hippocampus.
Biological Cybernetics, vol. 67, 535-544.
- Wang D.L. and Ewert J.-P. (1992):
Configurational pattern discrimination responsible for dishabituation in common toads
Bufo bufo (L.): behavioral tests of the predictions of a neural model.
Journal of Comparative Physiology A, vol. 170, 317-325.
- Wang D.L. and Arbib M.A. (1991):
How does the toad's visual system discriminate different worm-like stimuli?
Biological Cybernetics, vol. 64, 251-261.
- Wang D.L., Buhmann J., and von der Malsburg C. (1990):
Pattern segmentation in associative memory.
Neural Computation,
vol. 2, 94-106. Reprinted as a book chapter in 1991.
- Wang D.L. and Hsu C.C. (1990):
SLONN: A simulation language for modeling of neural networks.
Simulation, vol. 55, 69-83.
- Wang D.L. and Arbib M.A. (1990):
Complex temporal sequence learning
based on short-term memory.
Proceedings of the IEEE, vol. 78, 1536-1543.
- Wang D.L. and Hsu C.C. (1988):
A neuron model for computer simulation of neural networks.
Acta Automatica Sinica (both in Chinese and in English), vol. 14, 424-430.
- Hsu C.C. and Wang D.L. (1988):
SLONN: A simulation language for neural networks and its implementation.
Journal of Computers (in Chinese), vol. 11, 741-749.
- Wang D.L. (1986):
A neuron model based on information processing of the nervous system.
Cognitive Science (in Chinese), January
1986, 81-88.
Book Chapters
- Narayanan A. and Wang D.L. (2013):
Computational auditory scene analysis and automatic speech recognition. In Virtanen T., Singh R., and Raj B. (ed.),
Techniques for Noise Robustness in Automatic Speech Recognition, Wiley & Sons, pp. 433-462.
- Wang D.L. (2007):
Computational scene analysis. In Duch W. and Mandziuk J. (ed.),
Challenges for Computational Intelligence, Springer,
Berlin, pp. 163-191.
- Hu G. and Wang D.L. (2006):
An auditory scene analysis approach to monaural speech segregation. In Hansler E. and Schmidt G. (ed.),
Topics in Acoustic Echo and Noise Control, Springer,
Heidelberg, pp. 485-515.
- Brown G.J. and Wang D.L. (2006):
Timing is of the essence: Neural
oscillator models of auditory grouping. In Greenberg S. and Ainsworth W. (ed.),
Listening to Speech: An Auditory Perspective, Lawrence Erlbaum,
Mahwah NJ.
- Brown G.J. and Wang D.L. (2005):
Separation of speech by computational auditory scene analysis.
In Benesty J., Makino S., and Chen J. (ed.),
Speech Enhancement, Springer, New York, pp. 371-402.
- Wang D.L. (2005):
On ideal binary mask as the computational goal of auditory scene analysis. In Divenyi P. (ed.),
Speech Separation by Humans and Machines, pp. 181-197, Kluwer Academic, Norwell MA.
- Wang D.L. (2003):
Temporal pattern processing. In: Arbib M.A. (ed.),
The Handbook of Brain Theory and Neural Networks, 2nd Ed.,
pp. 1163-1167, MIT Press, Cambridge MA.
- Wang D.L. (2003):
Visual scene segmentation. In: Arbib M.A. (ed.),
The Handbook of Brain Theory and Neural Networks, 2nd Ed.,
pp. 1215-1219, MIT Press, Cambridge MA.
- Wang D.L. (2000):
Anticipation model for sequential learning of complex sequences.
In Sun R. and Giles C.L. (eds.),
Sequence Learning, LNAI 1828, pp. 53-79,
Springer-Verlag, Berlin Heidelberg.
- Wang D.L. (1999):
Relaxation oscillators and networks. In Webster J. (ed.),
Wiley Encyclopedia of Electrical and Electronics Engineering,
vol. 18, pp. 396-405, Wiley & Sons.
- Wang D.L. (1998):
Stream segregation based on oscillatory correlation. In Rosenthal D. and
Okuno H.G. (eds.), Computational Auditory Scene Analysis,
pp. 71-86, Lawrence Erlbaum, Mahwah NJ.
- Wang D.L. (1996):
Synchronous oscillations based on lateral connections.
In Sirosh J., Miikkulainen R., and Choe Y. (eds.),
Lateral Connections in the Cortex: Structure and Function.
- Wang D.L. (1995):
An oscillatory correlation theory of temporal pattern segmentation.
In Covey E., Hawkins H., and Port R.F. (eds), Neural Representation of
Temporal Patterns, pp. 53-75, Plenum, New York.
- Wang D.L. (1995):
Temporal pattern processing. In: Arbib M.A. (ed.),
The Handbook of Brain Theory and Neural Networks,
pp. 967-971, MIT Press, Cambridge MA.
- Wang D.L. (1995):
Habituation. In: Arbib M.A. (ed.), The
Handbook of Brain Theory and Neural Networks, pp. 441-444,
MIT Press.
- Arbib M.A. and Wang D.L. (1991):
Computational models of visual pattern
discrimination in common toads. In: Ewert J.-P. and Werner H. (eds),
Models of Brain Function and Artificial Neuronal Nets,
pp. 67-97, GhK University Edition, Kassel, Germany.
- Wang D.L., Arbib M.A., and Ewert J.-P. (1991):
Dishabituation
hierarchies for visual pattern discrimination in toads: A dialog between
modeling and experimentation. In: Arbib M.A. and Ewert J.-P. (eds),
Visual Structures and Integrated Functions, Research Notes in
Neural Computing, pp. 427-441, Springer-Verlag, Berlin.
- Wang D.L., Buhmann J., and von der Malsburg C. (1991):
Pattern segmentation in associative memory. In: David J.L. and Eichenbaum H.
(eds), Olfaction: A Model System for Computational
Neuroscience, pp. 213-224, MIT Press, Cambridge, MA. Reprint of
the 1990 Neural Computation article.
- Wang D.L. (1986):
Computer simulation. In: Mao Y.S. (ed), Handbook
for Modern Engineers (in Chinese), pp. 615-617, Beijing Press,
Beijing, China. This book won the 1986 National Award of Excellent and
Best-selling Book of Science and Technology.
Conference Papers
The following abbreviations are used:
CNS: Annual Meeting of Computation and Neural Systems
ICASSP: IEEE International Conference on Acoustics, Speech, and Signal Processing
ICNN: IEEE International Conference on Neural Networks
ICSLP: International Conference on Spoken Language Processing
IJCNN: International Joint Conference on Neural Networks
ISMIR: International Conference on Music Information Retrieval
NIPS: Annual Conference on Neural Information Processing Systems
- Taherian H., Ahmadi Kalkhorani V., Pandey A., Wong D., Xu B., and Wang D.L. (2024):
Towards explainable monaural speaker separation with auditory-based training.
Proceedings of INTERSPEECH-24, pp. 572-576.
- Taherian H., Pandey A., Wong D., Xu B., and Wang D.L. (2024):
Leveraging sound localization to improve continuous speaker separation.
Proceedings of ICASSP-24, pp. 621-625.
- Ahmadi Kalkhorani V., Kumar A., Tan K., Xu B., and Wang D.L. (2024):
Audiovisual speaker separation with full- and sub-band modeling in the time-frequency domain.
Proceedings of ICASSP-24, pp. 12001-12005.
- Zhang Y., Yu M., Zhang H., Yu D., and Wang D.L. (2023):
NeuralKalman: A learnable Kalman filter for acoustic echo cancellation.
Proceedings of ASRU-23, 7 pages.
- Yang Y., Pandey A., and Wang D.L. (2023):
Time-domain speech enhancement for robust automatic speech recognition.
Proceedings of INTERSPEECH-23, pp. 4913-4917.
- Taherian H., Pandey A., Wong D., Xu B., and Wang D.L. (2023):
Multi-input multi-output complex spectral mapping for speaker separation.
Proceedings of INTERSPEECH-23, pp. 1070-1074.
- Ahmadi Kalkhorani V., Kumar A., Tan K., Xu B., and Wang D.L. (2023):
Time-domain transformer-based audiovisual speaker separation.
Proceedings of INTERSPEECH-23, pp. 3472-3476.
- Wang H. and Wang D.L. (2023):
Cross-domain diffusion based speech enhancement for very noisy speech.
Proceedings of ICASSP-23, 5 pages.
- Taherian H. and Wang D.L. (2023):
Multi-resolution location-based training for multi-channel continuous speech separation.
Proceedings of ICASSP-23, 5 pages.
- Wang H. et al. (2023):
Data2Vec-SG: Improving self-supervised learning representations for speech generation tasks.
Proceedings of ICASSP-23, 5 pages.
- Liu H., Choi W., Liu X., Kong Q., Tian Q., and Wang D.L. (2022):
Neural vocoder is all you need for speech super-resolution.
Proceedings of INTERSPEECH-22, pp. 4227-4231.
- Liu H., Liu X., Kong Q., Tian Q., Zhao Y., Wang D.L., Huang C., and Wang Y. (2022):
VoiceFixer: A unified framework for high-fidelity speech restoration.
Proceedings of INTERSPEECH-22, pp. 4232-4236.
- Pandey A. and Wang D.L. (2022):
Attentive training: A new training framework for talker-independent speaker extraction.
Proceedings of INTERSPEECH-22, pp. 201-205.
- Pandey A., Xu B., Kumar A., Donley J., Calamia P., and Wang D.L. (2022):
Time-domain ad-hoc array speech enhancement using a triple-path network.
Proceedings of INTERSPEECH-22, pp. 729-733.
- Zhang H., Pandey A., and Wang D.L. (2022):
Attentive recurrent network for low-latency active noise control.
Proceedings of INTERSPEECH-22, pp. 956-960.
- Zhang Y., Wang H., and Wang D.L. (2022):
Densely-connected convolutional recurrent network for fundamental frequency estimation in noisy speech.
Proceedings of INTERSPEECH-22, pp. 401-405.
- Pandey A., Xu B., Kumar A., Donley J., Calamia P., and Wang D.L. (2022b):
Multichannel speech enhancement without beamforming.
Proceedings of ICASSP-22, pp. 6502-6506.
- Pandey A., Xu B., Kumar A., Donley J., Calamia P., and Wang D.L. (2022a):
TPARN: Triple-path attentive recurrent network for time-domain multichannel speech enhancement.
Proceedings of ICASSP-22, pp. 6497-6501.
- Taherian H., Tan K., and Wang D.L. (2022):
Location-based training for multi-channel talker-independent speaker separation.
Proceedings of ICASSP-22, pp. 696-700.
- Wang H. and Wang D.L. (2022):
Cross-domain speech enhancement with a neural cascade architecture.
Proceedings of ICASSP-22, pp. 7862-7866.
- Wang H., Zhang X., and Wang D.L. (2022):
Attention-based fusion for bone-conducted and air-conducted speech enhancement in the complex domain.
Proceedings of ICASSP-22, pp. 7757-7761.
- Wang H., Qian Y., Wang X., Wang Y., Wang C., Liu S., Yoshioka T., Li J., and Wang D.L. (2022):
Improving noise robustness of contrastive speech representation learning with speech reconstruction.
Proceedings of ICASSP-22, pp. 6062-6066.
- Wang Z.-Q. and Wang D.L. (2022):
Localization based sequential grouping for continuous speech separation.
Proceedings of ICASSP-22, pp. 281-285.
- Yu F. et al. (2022):
Summary on the ICASSP 2022 multi-channel multi-party meeting transcription grand challenge.
Proceedings of ICASSP-22, pp. 9156-9160.
- Zhang H. and Wang D.L. (2022):
Neural cascade architecture for joint acoustic echo and noise suppression.
Proceedings of ICASSP-22, pp. 671-675.
- Zhang H. and Wang D.L. (2021b):
A deep learning approach to multi-channel and multi-microphone acoustic echo cancellation.
Proceedings of INTERSPEECH-21, pp. 1139-1143.
- Zhang H. and Wang D.L. (2021a):
A deep learning method to multi-channel active noise control.
Proceedings of INTERSPEECH-21, pp. 681-685.
- Wang Z.-Q. and Wang D.L. (2021):
Count and separate: incorporating speaker counting for continuous speaker separation.
Proceedings of ICASSP-21, pp. 11-15.
- Zhang Y., Liu Y., and Wang D.L. (2021):
Complex ratio masking for singing voice separation.
Proceedings of ICASSP-21, pp. 41-45.
- Taherian H. and Wang D.L. (2021):
Time-domain loss modulation based on overlap ratio for monaural conversational speaker separation.
Proceedings of ICASSP-21, pp. 5744-5748.
- Tan K., Zhang X., and Wang D.L. (2021):
Real-time speech enhancement for mobile communication based on dual-channel complex spectral mapping.
Proceedings of ICASSP-21, pp. 6134-6138.
- Tan K. and Wang D.L. (2021):
Compressing deep neural networks for efficient speech enhancement.
Proceedings of ICASSP-21, pp. 8358-8362.
- Li H., Wang D.L., Zhang X., and Gao G. (2020):
Frame-level signal-to-noise ratio estimation using deep learning.
Proceedings of INTERSPEECH-20, pp. 4626-4630.
- Pandey A. and Wang D.L. (2020):
Learning complex spectral mapping for speech enhancement with improved cross-corpus generalization.
Proceedings of INTERSPEECH-20, pp. 4511-4515.
- Zhao Y. and Wang D.L. (2020):
Noisy-reverberant speech enhancement using DenseUNet with time-frequency attention.
Proceedings of INTERSPEECH-20, pp. 3261-3265.
- Zhang H. and Wang D.L. (2020):
A deep learning approach to active noise control.
Proceedings of INTERSPEECH-20, pp. 1141-1145.
- Wang Z.-Q. and Wang D.L. (2020):
Multi-microphone complex spectral mapping for speech dereverberation.
Proceedings of ICASSP-20, pp. 486-490.
- Wang H. and Wang D.L. (2020):
Time-frequency loss for CNN based speech super-resolution.
Proceedings of ICASSP-20, pp. 861-865.
- Liu Y., Delfarah M., and Wang D.L. (2020):
Deep CASA for talker-independent monaural speech separation.
Proceedings of ICASSP-20, pp. 6354-6358.
- Pandey A. and Wang D.L. (2020):
Densely connected neural network with dilated convolutions for real-time speech enhancement in the time domain.
Proceedings of ICASSP-20, pp. 6629-6633.
- Tan K. and Wang D.L. (2020):
Improving robustness of deep learning based monaural speech enhancement against processing artifacts.
Proceedings of ICASSP-20, pp. 6914-6918.
- Delfarah M., Liu Y., and Wang D.L. (2020):
Talker-independent speaker separation in reverberant conditions.
Proceedings of ICASSP-20, pp. 8723-8727.
- Wang P., Tan K., and Wang D.L. (2019):
Bridging the gap between monaural speech enhancement and recognition with distortion-independent acoustic modeling.
Proceedings of INTERSPEECH-19, pp. 471-475.
- Wang P. and Wang D.L. (2019):
Enhanced spectral features for distortion-independent acoustic modeling.
Proceedings of INTERSPEECH-19, pp. 476-480.
- Taherian H., Wang Z.-Q., and Wang D.L. (2019):
Deep learning based multi-channel speaker recognition in noisy and reverberant environments.
Proceedings of INTERSPEECH-19, pp. 4070-4074.
- Zhang H., Tan K., and Wang D.L. (2019):
Deep learning for joint acoustic echo and noise cancellation with nonlinear distortions.
Proceedings of INTERSPEECH-19, pp. 4255-4259.
- Wang Z.-Q., Tan K., and Wang D.L. (2019):
Deep learning based phase reconstruction for speaker separation: A trigonometric perspective.
Proceedings of ICASSP-19, pp. 71-75.
- Xie J., Jin D., Zhang W., Zhang X.-L., Chen J., and Wang D.L. (2019):
Robust sparse multichannel active noise control.
Proceedings of ICASSP-19, pp. 521-525.
- Tan K., Zhang X., and Wang D.L. (2019):
Real-time speech enhancement using an efficient convolutional recurrent network for dual-microphone mobile phones in close-talk scenarios.
Proceedings of ICASSP-19, pp. 5751-5755.
- Tan K. and Wang D.L. (2019):
Complex spectral mapping with a convolutional recurrent network for monaural speech enhancement.
Proceedings of ICASSP-19, pp. 6865-6869.
- Pandey A. and Wang D.L. (2019):
TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain.
Proceedings of ICASSP-19, pp. 6875-6879.
- Pandey A. and Wang D.L. (2019):
Exploring deep complex networks for complex spectrogram enhancement.
Proceedings of ICASSP-19, pp. 6885-6889.
- Pandey A. and Wang D.L. (2018):
A new framework for supervised speech enhancement in the time domain.
Proceedings of INTERSPEECH-18, pp. 1136-1140.
- Tan K. and Wang D.L. (2018):
A convolutional recurrent neural network for real-time speech enhancement.
Proceedings of INTERSPEECH-18, pp. 3229-3233.
(Related Source Code.)
- Tan K. and Wang D.L. (2018):
A two-stage approach to noisy cochannel speech separation with gated residual networks.
Proceedings of INTERSPEECH-18, pp. 3484-3488.
- Wang Z.-Q. and Wang D.L. (2018):
Integrating spectral and spatial features for multi-channel speaker separation.
Proceedings of INTERSPEECH-18, pp. 2718-2722.
- Wang Z.-Q. and Wang D.L. (2018):
All-neural multi-channel speech enhancement.
Proceedings of INTERSPEECH-18, pp. 3234-3238.
- Wang Z.-Q., Zhang X., and Wang D.L. (2018):
Robust TDOA estimation based on time-frequency masking and deep neural networks.
Proceedings of INTERSPEECH-18, pp. 322-326.
- Wang Z.-Q., Le Roux J., Wang D.L., and Hershey J.R. (2018):
End-to-end speech separation with unfolded iterative phase reconstruction.
Proceedings of INTERSPEECH-18, pp. 2708-2712.
- Zhang H. and Wang D.L. (2018):
Deep learning for acoustic echo cancellation in noisy and double-talk scenarios.
Proceedings of INTERSPEECH-18, pp. 3239-3243.
- Chakrabarty S., Wang D.L., and Habets E.A.P. (2018):
Time-frequency masking based online speech enhancement with multi-channel data using convolutional neural networks.
Proceedings of IWAENC-18, pp. 476-480.
- Delfarah M. and Wang D.L. (2018):
Recurrent neural networks for cochannel speech separation in reverberant environments.
Proceedings of ICASSP-18, pp. 5404-5408.
- Liu Y. and Wang D.L. (2018):
Permutation invariant training for speaker-independent multi-pitch tracking.
Proceedings of ICASSP-18, pp. 5594-5598.
- Liu Y. and Wang D.L. (2018):
A CASA approach to deep learning based speaker-independent co-channel speech separation.
Proceedings of ICASSP-18, pp. 5399-5403.
- Pandey A. and Wang D.L. (2018):
On adversarial training and loss functions for speech enhancement.
Proceedings of ICASSP-18, pp. 5414-5418.
- Tan K., Chen J., and Wang D.L. (2018):
Gated residual networks with dilated convolutions for supervised speech separation.
Proceedings of ICASSP-18, pp. 21-25.
- Wang P. and Wang D.L. (2018):
Filter-and-convolve: a CNN based multichannel complex concatenation acoustic model.
Proceedings of ICASSP-18, pp. 5564-5568.
- Wang P. and Wang D.L. (2018):
Utterance-wise recurrent dropout and iterative speaker adaptation for robust monaural speech recognition.
Proceedings of ICASSP-18, pp. 4814-4818.
- Wang Z.-Q. and Wang D.L. (2018):
Mask weighted STFT ratios for relative transfer function estimation and its application to robust ASR.
Proceedings of ICASSP-18, pp. 5619-5623.
- Wang Z.-Q. and Wang D.L. (2018):
On spatial features for supervised speech separation and its application to beamforming and robust ASR.
Proceedings of ICASSP-18, pp. 5709-5713.
- Zhao Y., Wang D.L., Xu B., and Zhang T. (2018):
Late reverberation suppression using recurrent neural networks with long short-term memory.
Proceedings of ICASSP-18, pp. 5434-5438.
- Zhang X. and Wang D.L. (2017):
Binaural reverberant speech separation based on deep neural networks.
Proceedings of INTERSPEECH-17, pp. 2018-2022.
- Zhang X., Wang Z.-Q., and Wang D.L. (2017):
A speech enhancement algorithm by iterating single- and multi-microphone processing and its application to robust ASR.
Proceedings of ICASSP-17, pp. 276-280.
- Wang Z.-Q. and Wang D.L. (2017):
Unsupervised speaker adaptation of batch normalized acoustic models for robust ASR.
Proceedings of ICASSP-17, pp. 4890-4894.
- Wang Z.-Q. and Wang D.L. (2017):
Recurrent deep stacking networks for supervised speech separation.
Proceedings of ICASSP-17, pp. 71-75.
- Zhao Y., Wang Z.-Q., and Wang D.L. (2017):
A two-stage algorithm for noisy and reverberant speech enhancement.
Proceedings of ICASSP-17, pp. 5580-5584.
- Williamson D. and Wang D.L. (2017):
Speech dereverberation and denoising using complex ratio masks.
Proceedings of ICASSP-17, pp. 5590-5594.
- Chang J. and Wang D.L. (2017):
Robust speaker recognition based on DNN/i-vectors and speech separation.
Proceedings of ICASSP-17, pp. 5415-5419.
- Liu Y. and Wang D.L. (2017):
Time and frequency domain long short-term memory for noise robust pitch tracking.
Proceedings of ICASSP-17, pp. 5600-5604.
- Chen J. and Wang D.L. (2016):
Long short-term memory for speaker generalization in supervised speech separation.
Proceedings of INTERSPEECH-16, pp. 3314-3318.
- Delfarah M. and Wang D.L. (2016):
A feature study for masking-based reverberant speech separation.
Proceedings of INTERSPEECH-16, pp. 555-559.
- Williamson D.S., Wang Y., and Wang D.L. (2016):
Complex ratio masking for joint enhancement of magnitude and phase.
Proceedings of ICASSP-16, pp. 5220-5224.
- Liu Y. and Wang D.L. (2016):
Robust pitch tracking in noisy speech using speaker-dependent deep neural networks.
Proceedings of ICASSP-16, pp. 5255-5259.
- Wang Z.-Q., Zhao Y., and Wang D.L. (2016):
Phoneme-specific speech separation.
Proceedings of ICASSP-16, pp. 146-150.
- Wang Z.-Q. and Wang D.L. (2016):
Robust speech recognition from ratio masks.
Proceedings of ICASSP-16, pp. 5720-5724.
- Zhao Y., Wang D.L., Merks I., and Zhang T. (2016):
DNN-based enhancement of noisy and reverberant speech.
Proceedings of ICASSP-16, pp. 6525-6529.
- Wang Z.-Q. and Wang D.L. (2015):
Joint training of speech separation, filterbank and acoustic model for robust
automatic speech recognition.
Proceedings of INTERSPEECH-15, pp. 2839-2843.
- Liu Y. and Wang D.L. (2015):
Speaker-dependent multipitch tracking using deep neural networks.
Proceedings of INTERSPEECH-15, pp. 3279-3283.
- Zhang X.-L. and Wang D.L. (2015):
Multi-resolution stacking for speech separation based on boosted DNN.
Proceedings of INTERSPEECH-15, pp. 1745-1749.
- Han K., He Y., Bagchi D., Fosler-Lussier E., and Wang D.L. (2015):
Deep neural network based spectral feature mapping for robust speech
recognition.
Proceedings of INTERSPEECH-15, pp. 2484-2488.
- Chen J., Wang Y., and Wang D.L. (2015):
Noise perturbation improves supervised speech separation.
Proceedings of LVA/ICA-15, pp. 83-90.
- Wang Y. and Wang D.L. (2015):
A deep neural network for time-domain signal reconstruction.
Proceedings of ICASSP-15, pp. 4390-4394
(The first author, Yuxuan Wang, received the 2015 ICASSP Starkey Signal Processing Research Award for this paper as a graduate student).
- Zhao X., Wang Y., and Wang D.L. (2015):
Deep neural networks for cochannel speaker identification.
Proceedings of ICASSP-15, pp. 4824-4828.
- Williamson D.S., Wang Y., and Wang D.L. (2015):
Deep neural networks for estimating speech model activations.
Proceedings of ICASSP-15, pp. 5113-5117.
- Zhang X.-L. and Wang D.L. (2014):
Boosted deep neural networks and multi-resolution cochleagram features for
voice activity detection.
Proceedings of INTERSPEECH-14, pp. 1534-1538.
- Jiang Y., Wang D.L., and Liu R.S. (2014):
Binaural deep neural network classification for reverberant speech segregation.
Proceedings of INTERSPEECH-14, pp. 2400-2404.
- Han K. and Wang D.L. (2014):
Neural networks for supervised pitch tracking in noise.
Proceedings of ICASSP-14, pp. 1502-1506.
- Han K., Wang Y., and Wang D.L. (2014):
Learning spectral mapping for speech dereverberation.
Proceedings of ICASSP-14, pp. 4661-4665.
- Williamson D.S., Wang Y., and Wang D.L. (2014):
A two-stage approach for improving the perceptual quality of separated speech.
Proceedings of ICASSP-14, pp. 7084-7088.
- Wang Y. and Wang D.L. (2014):
A structure-preserving training target for supervised speech separation.
Proceedings of ICASSP-14, pp. 6148-6152.
- Chen J., Wang Y., and Wang D.L. (2014):
A feature study for classification-based speech separation at very low signal-to-noise ratio.
Proceedings of ICASSP-14, pp. 7089-7093.
- Zhao X., Wang Y., and Wang D.L. (2014):
Robust speaker identification in noisy and reverberant conditions.
Proceedings of ICASSP-14, pp. 4025-4029.
- Narayanan A. and Wang D.L. (2014):
Joint noise adaptive training for robust automatic speech recognition.
Proceedings of ICASSP-14, pp. 2523-2527.
- Narayanan A. and Wang D.L. (2013):
Coupling binary masking and robust ASR.
Proceedings of ICASSP-13, pp. 6817-6821.
- Narayanan A. and Wang D.L. (2013):
Ideal ratio mask estimation using deep neural networks for robust speech recognition.
Proceedings of ICASSP-13, pp. 7092-7096.
- Williamson D.S., Wang Y., and Wang D.L. (2013):
A sparse representation approach for perceptual quality improvement of separated speech.
Proceedings of ICASSP-13, pp. 7015-7019.
- Zhao X. and Wang D.L. (2013):
Analyzing noise robustness of MFCC and GFCC features in speaker identification.
Proceedings of ICASSP-13, pp. 7204-7208.
- Wang Y. and Wang D.L. (2013):
Feature denoising for speech separation in unknown noisy environments.
Proceedings of ICASSP-13, pp. 7472-7476.
- Han K. and Wang D.L. (2013):
Learning invariant features for speech separation.
Proceedings of ICASSP-13, pp. 7492-7496.
- Wang Y. and Wang D.L. (2012):
Cocktail party processing via structured prediction.
Proceedings of NIPS-12, pp. 224-232.
- Wang Y. and Wang D.L. (2012):
Boosting classification based speech separation using temporal dynamics.
Proceedings of INTERSPEECH-12, pp. 1528-1531.
- Wang Y., Han K., and Wang D.L. (2012):
Acoustic features for classification based speech separation.
Proceedings of INTERSPEECH-12, pp. 1532-1535.
- Narayanan A. and Wang D.L. (2012):
On the role of binary mask pattern in automatic speech recognition.
Proceedings of INTERSPEECH-12, pp. 1239-1242.
- Hu K. and Wang D.L. (2012):
SVM-based separation of unvoiced-voiced speech in cochannel conditions.
Proceedings of ICASSP-12, pp. 4545-4548.
- Han K. and Wang D.L. (2012):
On generalization of classification based speech separation.
Proceedings of ICASSP-12, pp. 4541-4544.
- Woodruff J. and Wang D.L. (2012):
Binaural speech segregation based on pitch and azimuth tracking.
Proceedings of ICASSP-12, pp. 241-244.
- Yuan J., Wang D.L., and Li R. (2011):
Image segmentation based on local spectral histograms and linear regression.
Proceedings of IJCNN-11, pp. 482-488.
- Woodruff J. and Wang D.L. (2011):
Directionality-based speech enhancement for hearing aids.
Proceedings of ICASSP-11, pp. 297-300.
- Hsu C.-L., Wang D.L., and Jang J.-S.R. (2011):
A trend estimation algorithm for singing pitch detection in musical recordings.
Proceedings of ICASSP-11, pp. 393-396.
- Narayanan A. and Wang D.L. (2011):
On the use of ideal binary masks for improving phonetic classification.
Proceedings of ICASSP-11, pp. 4632-4635.
- Hu K. and Wang D.L. (2011):
An approach to sequential grouping in cochannel speech.
Proceedings of ICASSP-11, pp. 4636-4639.
- Narayanan A., Zhao X., Wang D.L., and Fosler-Lussier E. (2011):
Robust speech recognition using multiple prior models for speech reconstruction.
Proceedings of ICASSP-11, pp. 4800-4803.
- Han K. and Wang D.L. (2011):
An SVM based classification approach to speech separation.
Proceedings of ICASSP-11, pp. 5212-5215.
- Zhao X., Shao Y., and Wang D.L. (2011):
Robust speaker identification using a CASA front-end.
Proceedings of ICASSP-11, pp. 5468-5471.
- Woodruff J., Prabhavalkar R., Fosler-Lussier E., and Wang D.L. (2010):
Combining monaural and binaural evidence for reverberant speech segregation.
Proceedings of INTERSPEECH-10, pp. 406-409.
- Hu K. and Wang D.L. (2010):
Unvoiced speech segregation based on CASA and spectral subtraction.
Proceedings of INTERSPEECH-10, pp. 2786-2789.
- Hu K. and Wang D.L. (2010):
Unsupervised sequential organization for cochannel speech separation.
Proceedings of INTERSPEECH-10, pp. 2790-2793.
- Kjems U., Pedersen M.S., Boldt J.B., Lunner T., and Wang D.L. (2010):
Speech intelligibility of ideal binary masked mixtures.
Proceedings of EUSIPCO-10, pp. 1909-1913.
- Yan L., Yuan J., Cheng L., Wang D.L., and Li R. (2010):
A biologically and geometrically inspired approach to target extraction from multiple-source remote-sensing imagery.
Proceedings of ASPRS-10.
- Jin Z. and Wang D.L. (2010):
A multipitch tracking algorithm for noisy and reverberant speech.
Proceedings of ICASSP-10, pp. 4218-4221.
- Woodruff J. and Wang D.L. (2010):
Integrating monaural and binaural analysis for localizing multiple reverberant
sound sources.
Proceedings of ICASSP-10, pp. 2706-2709.
- Quiles M.G., Wang D.L., Zhao L., Romero R.A.F., and Huang D.-S. (2009):
An oscillatory correlation model of object-based attention.
Proceedings of IJCNN-09, pp. 2596-2602.
- Yuan J., Wang D.L., Wu B., Yan L., and Li R. (2009):
Automatic road extraction from satellite imagery using LEGION networks.
Proceedings of IJCNN-09, pp. 3471-3476.
- Jan T., Wang W., and Wang D.L. (2009):
A multistage approach for blind separation of convolutive speech mixtures.
Proceedings of ICASSP-09, pp. 1713-1716.
- Woodruff J. and Wang D.L. (2009):
On the role of localization cues in binaural segregation of reverberant speech.
Proceedings of ICASSP-09, pp. 2205-2208.
- Hu K. and Wang D.L. (2009):
Incorporating spectral subtraction and noise type for unvoiced speech segregation.
Proceedings of ICASSP-09, pp. 4425-4428.
- Shao Y., Jin Z., Wang D.L., and Srinivasan S. (2009):
An auditory-based feature for robust speech recognition.
Proceedings of ICASSP-09, pp. 4625-4628.
- Jin Z. and Wang D.L. (2009):
Learning to maximize signal-to-noise ratio for reverberant speech segregation.
Proceedings of ICASSP-09, pp. 4689-4692.
- Wu B., Zhou Y., Yan L., Yuan J., Li R., and Wang D.L. (2009):
Object detection from HS/MS and multi-platform remote sensing imagery by the integration of biologically and geometrically inspired approaches.
Proceedings of ASPRS-09.
- Hu K., Divenyi P., Ellis D.P.W., Jin Z., Shinn-Cunningham B.G., and Wang D.L. (2008):
Preliminary intelligibility tests of a monaural speech segregation system.
ISCA Tutorial and Research Workshop on Statistical and
Perceptual Audition (SAPA-08).
- Woodruff J., Li Y., and Wang D.L. (2008):
Resolving overlapping harmonics for monaural musical sound separation using pitch and common amplitude modulation.
Proceedings of ISMIR-08, pp. 538-543.
(Related Sound Demo.)
- Boldt J.B., Kjems U., Pedersen M.S., Lunner T., and Wang D.L. (2008):
Estimation of the ideal binary mask using directional systems.
Proceedings of IWAENC-08.
- Li Y. and Wang D.L. (2008):
Musical sound separation using pitch-based labeling and binary
time-frequency masking.
Proceedings of ICASSP-08, pp. 173-176.
- Li Y. and Wang D.L. (2008):
On the optimality of ideal binary time-frequency masks.
Proceedings of ICASSP-08, pp. 3501-3504.
- Shao Y. and Wang D.L. (2008):
Robust speaker identification using auditory features and computational
auditory scene analysis.
Proceedings of ICASSP-08, pp. 1589-1592.
- Jin Z. and Wang D.L. (2007):
A supervised learning approach to monaural segregation of reverberant speech.
Proceedings of ICASSP-07, pp. IV.921-924.
- Li Y. and Wang D.L. (2007):
Pitch detection in polyphonic music using instrument tone models.
Proceedings of ICASSP-07, pp. II.481-484.
- Shao Y., Srinivasan S., and Wang D.L. (2007):
Incorporating auditory feature uncertainties in robust speaker identification.
Proceedings of ICASSP-07, pp. IV.277-280. (Related Source Code.)
- Srinivasan S., Roman N., and Wang D.L. (2007):
Exploiting uncertainties for binaural speech recognition.
Proceedings of ICASSP-07, pp. IV.789-792.
- Li Y. and Wang D.L. (2006):
Singing voice separation from monaural recordings.
Proceedings of ISMIR-06, pp. 176-179.
- Srinivasan S., Shao Y., Jin Z., and Wang D.L. (2006):
A computational auditory scene analysis system for robust speech recognition.
Proceedings of Interspeech-06, pp. 73-76.
- Roman N., Srinivasan S., and Wang D.L. (2006):
Speech recognition in multisource reverberant environments with binaural inputs.
Proceedings of ICASSP-06, pp. I.309-312.
- Srinivasan S. and Wang D.L. (2006):
A supervised learning approach to uncertainty decoding for robust speech recognition.
Proceedings of ICASSP-06, pp. I.297-300.
- Shao Y. and Wang D.L. (2006):
Robust speaker recognition using binary time-frequency masks.
Proceedings of ICASSP-06, pp. I.645-648.
- Wang D.L. and Hu G. (2006):
Unvoiced speech segregation.
Proceedings of ICASSP-06, pp. V.953-956.
- Pedersen M.S., Wang D.L., Larsen J., and Kjems U. (2006):
Separating underdetermined convolutive speech mixtures.
Proceedings of Independent Component Analysis and Blind Signal Separation, pp. 674-681.
- Pedersen M.S., Wang D.L., Larsen J., and Kjems U. (2005):
Overcomplete blind source separation by combining ICA and binary
time-frequency masking.
Proceedings of IEEE Workshop on Machine Learning for Signal Processing, pp. 15-20.
(Related Sound Demo.)
- Roman N. and Wang D.L. (2005):
A pitch-based model for separation of reverberant speech.
Proceedings of Interspeech-05, pp. 2109-2112.
- Srinivasan S. and Wang D.L. (2005):
Modeling the perception of multitalker speech.
Proceedings of Interspeech-05, pp. 1265-1268.
- Srinivasan S. and Wang D.L. (2005):
Robust speech recognition by integrating speech separation
and hypothesis testing.
Proceedings of ICASSP-05, pp. I.89-92.
- Hu G. and Wang D.L. (2005):
Separation of fricatives and affricates.
Proceedings of ICASSP-05, pp. I.1001-1004.
- Wu M. and Wang D.L. (2005):
A two-stage algorithm for enhancement of reverberant speech.
Proceedings of ICASSP-05, pp. I.1085-1088.
- Li Y. and Wang D.L. (2005):
Detecting pitch of singing voice in polyphonic audio.
Proceedings of ICASSP-05, pp. III.17-20.
- Shao Y. and Wang D.L. (2004):
Model-based sequential organization for cochannel speaker identification.
Proceedings of ICSLP-04.
- Srinivasan S., Roman N., and Wang D.L. (2004):
On binary and ratio time-frequency masks for robust speech recognition.
Proceedings of ICSLP-04.
- Hu G. and Wang D.L. (2004):
Auditory segmentation based on event detection.
ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing (SAPA-04).
- Wang D.L. (2004):
A comparison of CNN and LEGION networks.
Proceedings of IJCNN-04, pp. 1735-1740.
- Roman N. and Wang D.L. (2004):
Binaural sound segregation for multisource reverberant environments.
Proceedings of ICASSP-04, pp. II.373-376.
- Roman N., Wang D.L., and Brown G.J. (2004):
A classification-based cocktail-party processor.
Proceedings of NIPS-03.
(Related Sound Demo.)
- Srinivasan S. and Wang D.L. (2003):
Schema-based modeling of phonemic restoration.
Proceedings of EUROSPEECH-03, pp. 2053-2056.
- Liu X., Srivastava A., and Wang D.L. (2003):
On intrinsic generalization of low dimensional representations
of images for recognition.
Proceedings of IJCNN-03, pp. 182-187.
- Roman N. and Wang D.L. (2003):
Binaural tracking of multiple moving sources.
Proceedings of ICASSP-03, pp. V.149-152.
- Wu M. and Wang D.L. (2003):
A one-microphone algorithm for reverberant speech enhancement.
Proceedings of ICASSP-03, pp. I.844-847.
- Hu G. and Wang D.L. (2003):
Separation of stop consonants.
Proceedings of ICASSP-03, pp. II.749-752.
- Shao Y. and Wang D.L. (2003):
Co-channel speaker identification using usable speech extraction based
on multi-pitch tracking.
Proceedings of ICASSP-03, pp. II.205-208.
- Hu G. and Wang D.L. (2003):
Monaural speech separation.
Proceedings of NIPS-02.
(Related Sound Demo.)
- Roman N., Wang D.L., and Brown G.J. (2002):
Localization-based sound segregation.
Proceedings of ICASSP-02, pp. I.1013-1016.
- Hu G. and Wang D.L. (2002):
Monaural speech segregation based on pitch tracking and amplitude
modulation.
Proceedings of ICASSP-02, pp. I.553-556.
- Wu M., Wang D.L., and Brown G.J. (2002):
A multi-pitch tracking algorithm for noisy speech.
Proceedings of ICASSP-02, pp. I.369-372.
- Roman N., Wang D.L., and Brown G.J. (2002):
Localization-based sound segregation.
Proceedings of IJCNN-02, pp. 2299-2303.
- Hu G. and Wang D.L. (2002):
On amplitude modulation for monaural speech segregation.
Proceedings of IJCNN-02, pp. 69-74.
- Chen K., Wang D.L., and Liu X. (2001):
Image segmentation by weight adaptation and oscillatory correlation.
Proceedings of International Conference on Neural Information Processing, invited paper.
- Liu X., Wang D.L., and Srivastava A. (2001):
Image segmentation using local spectral histograms.
Proceedings of International Conference on Image Processing,
pp. 70-73.
- Palomaki K., Brown G.J., and Wang D.L. (2001):
A binaural model for missing data speech recognition in noisy and
reverberant conditions.
Web Proceedings of Workshop on Consistent and Reliable Acoustic Cues
for Sound Analysis.
- Hu G. and Wang D.L. (2001):
Speech segregation based on pitch tracking and amplitude modulation.
Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA-01), pp. 79-82.
- Brown G.J., Barker J., and Wang D.L. (2001):
A neural oscillator sound separator for missing data speech
recognition.
Proceedings of IJCNN-01, pp. 2907-2912.
- Roman N., Wang D.L., and Brown G.J. (2001):
Speech segregation based on sound localization.
Proceedings of IJCNN-01, pp. 2861-2866.
- Liu X. and Wang D.L. (2001):
Appearance-based recognition using perceptual components.
Proceedings of IJCNN-01, pp. 1943-1948.
- Hu G. and Wang D.L. (2001):
An extended model for speech segregation.
Proceedings of IJCNN-01, pp. 1089-1094.
- Liu X. and Wang D.L. (2001):
A spectral histogram model for textons and texture
discrimination.
Proceedings of IJCNN-01, pp. 1083-1088.
- Wu M., Wang D.L., and Brown G.J. (2001):
Pitch tracking based on statistical anticipation.
Proceedings of IJCNN-01, pp. 866-871.
- Liu X. and Wang D.L. (2000): Spectral histograms for texton modeling
and discrimination.
Proceedings of the Sixteenth World Computer Congress
(Intelligent Information Processing), pp. 202-205.
- Cesmeli E., Lindsey D.L., and Wang D.L. (2000):
An oscillatory correlation model of human motion perception.
Proceedings of IJCNN-00, pp. IV.267-272.
- Brown G.J. and Wang D.L. (2000):
An oscillatory correlation
framework for computational auditory scene analysis.
Proceedings of NIPS-99, pp. 747-753.
- Liu X. and Wang D.L. (2000):
Perceptual organization based on temporal dynamics.
Proceedings of NIPS-99, pp. 38-44.
- Chen K. and Wang D.L. (1999):
Perceiving without learning: from spirals to
inside/outside relations. Advances in Neural Information Processing
Systems 11 (NIPS-98), pp. 10-16.
- Liu X. and Wang D.L. (1999):
Perceptual organization based on temporal dynamics.
Proceedings of IJCNN-99.
- Liu X. and Wang D.L. (1999):
A boundary pair representation for perception modeling.
Proceedings of IJCNN-99.
- Cesmeli E. and Wang D.L. (1999):
Image segmentation based on
motion/luminance integration and oscillatory correlation.
Proceedings of IJCNN-99.
- Brown G.J. and Wang D.L. (1999):
The separation of speech from
interfering sounds: an oscillatory correlation approach.
Proceedings of IJCNN-99.
- Chen K. and Wang D.L. (1999):
Image segmentation based on a dynamically
coupled neural oscillator network.
Proceedings of IJCNN-99.
- Wang D.L. (1998):
Object selection by oscillatory correlation.
Proceedings of IJCNN-98, pp. 1182-1187.
- Campbell S. and Wang D.L. (1998):
Synchrony and desynchrony in integrate-and-fire
oscillators. Proceedings of IJCNN-98, pp. 1498-1503.
- Liu X.W., Wang D.L., and Ramirez J.R. (1998):
Extracting hydrographic
objects from satellite images using a two-layer neural network.
Proceedings of IJCNN-98, pp. 897-902.
- Chen K. and Wang D.L. (1998):
Perceiving spirals and inside/outside
relations by a neural oscillator network.
Proceedings of IJCNN-98, pp. 619-624.
- Cesmeli E., Wang D.L., Lindsey D.L., and Todd J.T. (1998):
Motion segmentation using temporal block matching and LEGION.
Proceedings of IJCNN-98, pp. 2069-2074.
- Liu X., Wang D.L., and Ramirez J.R. (1998):
A two-layer neural network
for robust image segmentation and its application in revising hydrographic
features. International Archives of Photogrammetry and Remote
Sensing,
vol. 32, part 3/1, pp. 464-472.
- Liu X., Wang D.L., and Ramirez J.R. (1998):
Oriented statistical nonlinear smoothing filter. Proceedings of the International Conference on
Image Processing, vol. 2, pp. 848-852.
- Cesmeli E. and Wang D.L. (1998):
"Gauss Markov Rasgele Alanlari ve
Salingan Sinir Aglariyla Doku Bolutlemesi," SIU-98, Ankara
TURKEY (in Turkish).
- van der Kouwe A.J.W. and Wang D.L. (1997):
Temporal alignment, spatial
spread and the linear independence criterion for blind separation of voices.
Proceedings of IEEE EMBS, pp. 1994-1996.
- Wang D.L. (1997):
Object selection by a neural oscillator network.
Proceedings of International Conference on Neural Information
Processing, pp. 1137-1140.
- Liu X.W. and Wang D.L. (1997):
Range image segmentation using an
oscillatory network. Proceedings of ICNN-97, pp. 1656-1660.
- Cesmeli E. and Wang D.L. (1997):
Texture segmentation using Gaussian
Markov random fields and LEGION. Proceedings of ICNN-97,
pp. 1529-1534.
- Campbell S. and Wang D.L. (1997):
Relaxation oscillator networks with
time delays. Proceedings of ICNN-97, pp. 645-650.
- Brown G.J. and Wang D.L. (1997):
Modelling the perceptual separation of
concurrent vowels with a network of neural oscillators.
Proceedings of ICNN-97, pp. 569-574.
- Wang D.L. and Yuwono B. (1996):
Incremental learning of complex temporal patterns. Proceedings of WCNN-96, pp. 757-762.
- Shareef N., Wang D.L., and Yagel R. (1996):
Segmentation of medical data using locally excitatory globally inhibitory oscillator networks.
Proceedings of WCNN-96, pp. 1245-1248.
- Campbell S. and Wang D.L. (1996):
Loose synchrony in networks of relaxation oscillators with time delays. Proceedings of
WCNN-96, pp. 717-720.
- Wang D.L. and Yuwono B. (1996):
A neural model of sequential memory. Proceedings of ICNN-96, pp. 828-833.
- Campbell S. and Wang D.L. (1996):
Loose synchrony in relaxation oscillator networks with time delays. Proceedings of ICNN-96,
pp. 828-833.
- Wang D.L. and Terman D. (1996):
Image segmentation by neural oscillator
networks. Proceedings of ICNN-96, pp. 1534-1539.
- Wang D.L. and Terman D. (1995):
Image segmentation by a neural
oscillator network. Proceedings of International Conference
on Neural Information Processing, pp. 722-726.
- Campbell S. and Wang D.L. (1995):
Relaxation oscillators with time
delay coupling.
Proceedings of WCNN-95, pp. I.258-261.
- Wang D.L. and Terman D. (1995):
Image segmentation based on
oscillatory correlation.
Proceedings of WCNN-95, pp. II.521-525.
- Liu X.M., Wang D.L., and Ahalt S.C. (1995):
On the temporal
generalization capability of simple recurrent networks.
Proceedings of the 1995 SPIE Conference on Applications and Science
of Artificial Neural Networks IV, Orlando, FL, pp. 392-403.
- Wang D.L. and Terman D. (1994):
Locally excitatory globally inhibitory
oscillator networks: Theory and application to pattern segmentation.
Proceedings of IEEE Conference on Neural Networks for Signal
Processing, pp. 136-145.
- Wang D.L. (1994):
Auditory stream segregation based on oscillatory
correlation. Proceedings of IEEE Conference on Neural Networks
for Signal Processing, pp. 624-632.
- Wang D.L. and Terman D. (1994):
Synchrony and desynchrony in neural
oscillator networks. Advances in Neural Information Processing
Systems 7 (NIPS-94), pp. 199-206.
- Wang D.L. and Terman D. (1994):
Locally excitatory globally inhibitory
oscillator networks: Theory and application to scene segmentation.
Proceedings of the 23rd Artificial Intelligence and Pattern
Recognition Workshop, SPIE Proceedings 2368, pp. 624-632.
- Wang D.L. (1994):
An oscillation model of auditory stream segregation.
Proceedings of the International Conference on Pattern
Recognition, pp. C.198-200. Jerusalem, Israel.
- Wang D.L. (1994):
Modeling global synchrony in the visual cortex by
locally coupled neural oscillators. In: Eeckman F.H. (ed.),
Proceedings of CNS-93, pp. 109-114, Kluwer.
- Wang D.L. and Terman D. (1994):
Locally excitatory globally inhibitory
oscillator networks: Theory and application to pattern segmentation.
Proceedings of ICNN-94, pp. 945-950. Orlando, FL.
- Campbell S. and Wang D.L. (1994):
Synchronization and desynchronization
in locally coupled Wilson-Cowan oscillators. Proceedings of
ICNN-94, pp. 964-969.
- Wang D.L. and Yuwono B. (1994):
Temporal pattern generation based on
anticipation. Proceedings of ICNN-94, pp. 3148-3153.
- Wang D.L. and Terman D. (1994):
Locally excitatory globally inhibitory
oscillator networks. Proceedings of WCNN-94,
pp. IV.745-750. San Diego, CA.
- Wang D.L. and Yuwono B. (1994):
Self-organization of temporal pattern
generation based on anticipation. Proceedings of
WCNN-94, pp. IV.149-154.
- Wang D.L. (1993):
Modeling global synchrony in the visual cortex by
locally coupled neural oscillators. Proceedings of the 15th
Annual Conference of the Cognitive Science Society, pp. 1058-1063.
Boulder, CO.
- Wang D.L. (1993):
Modeling stimulus specific habituation: The role of
primordial hippocampus. In: Eeckman F.H. and Bower J.M. (eds.),
Proceedings of CNS-92, pp. 103-107,
Kluwer, Boston.
- Wang D.L. (1993):
A neural architecture for complex temporal pattern
generation. Proceedings of the 3rd International Conference for
Young Computer Scientists, pp. 3.25-3.38, Beijing.
- Wang D.L. (1993):
Global phase synchrony in the visual cortex: A local
mechanism. Proceedings of WCNN-93, pp. I29-I32.
- Wang D.L. and Arbib M.A. (1991):
Hierarchical dishabituation of visual
discrimination in toads. In: Meyer J.-A. and Wilson S. (eds),
Simulation of adaptive behavior: From animals to animats,
pp. 77-88, MIT Press, Cambridge, MA.
- Wang D.L. and Arbib M.A. (1991):
A neural model of temporal sequence
generation with interval maintenance. Proceedings of the 13th
Annual Conference of the Cognitive Science Society, pp. 944-948.
Chicago, IL.
- Wang D.L. and Arbib M.A. (1990):
Mechanisms of pattern discrimination in
the toad's visual system. Proceedings of IJCNN-90, pp. II.477-482.
San Diego, CA.
- Wang D.L. and Arbib M.A. (1990):
A computational model of visual pattern
discrimination in toads. Proceedings of the 12th Annual Conference of
the Cognitive Science Society, pp. 598-605. Cambridge, MA.
- Wang D.L. (1989):
An extended model of the Neocognitron for pattern
partitioning and pattern composition. Proceedings of IJCNN-90, pp. II.267-274. Washington, DC.
- Wang D.L. and King I. (1988):
Three neural models which process temporal
information. Proceedings of the First Annual Conference of
the International Neural Network Society, p. 227, Boston, MA.
Archived/Technical Reports