Shi-Xiong (Austin) Zhang

Cited by

	All	Since 2019
Citations	2698	2324
h-index	29	27
i10-index	50	45

Citations per year

Year	Citations
2009	7
2010	7
2011	11
2012	18
2013	15
2014	50
2015	44
2016	55
2017	62
2018	94
2019	138
2020	242
2021	378
2022	555
2023	648
2024	357

Public access

7 articles available

1 article not available

Based on funding mandates

Co-authors

Meng YU	Tencent AI Lab	Verified email at tencent.com
Yong Xu	Principal Researcher, Tencent America, Bellevue, USA	Verified email at tencent.com
Dong Yu (俞栋)	Distinguished Scientist @ Tencent AI Lab, ACM/IEEE/ISCA Fellow	Verified email at global.tencent.com
Rongzhi Gu	Tencent AI Lab	Verified email at pku.edu.cn
Yifan Gong	Principal Science Manager, Microsoft Corp.	Verified email at microsoft.com
Mark Gales	Cambridge University	Verified email at eng.cam.ac.uk
Jinyu Li	Partner Applied Science Manager, Microsoft	Verified email at microsoft.com
Shinji Watanabe	Carnegie Mellon University	Verified email at cmu.edu
M.W. Mak	The Hong Kong Polytechnic University	Verified email at polyu.edu.hk
Xunying Liu	Chinese University of Hong Kong	Verified email at se.cuhk.edu.hk
Yong Zhao	Microsoft Corporation	Verified email at microsoft.com
Kaisheng Yao	Google	Verified email at google.com
Fahimeh Bahmaninezhad	Microsoft	Verified email at microsoft.com
Jianwei Yu	Tencent AI Lab	Verified email at tencent.com
Kate Knill	University of Cambridge	Verified email at eng.cam.ac.uk
Philip Woodland	Professor of Information Engineering, Cambridge University Engineering Department	Verified email at eng.cam.ac.uk
Yajie Miao	Carnegie Mellon University	Verified email at cs.cmu.edu
Rui Zhao	Microsoft	Verified email at microsoft.com
Rogier van Dalen	Samsung AI Center	Verified email at samsung.com

Shi-Xiong (Austin) Zhang

Other names: Shi-Xiong Zhang, Shixiong Zhang

Sr. Director | AI Foundations@Capital One | ex-Microsoft, ex-Tencent, Cambridge PhD

Verified email at capitalone.com

Multi-modal Foundation Models, ASR, Speech Processing, NLP


Title	Cited by	Year
An overview of deep-learning-based audio-visual speech enhancement and separation. D Michelsanti, ZH Tan, SX Zhang, Y Xu, M Yu, D Yu, J Jensen. IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 1368-1396, 2021	242	2021
End-to-end attention based text-dependent speaker verification. SX Zhang, Z Chen, Y Zhao, J Li, Y Gong. 2016 IEEE Spoken Language Technology Workshop (SLT), 171-178, 2016	204	2016
ADL-MVDR: All deep learning MVDR beamformer for target speech separation. Z Zhang, Y Xu, M Yu, SX Zhang, L Chen, D Yu. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and …, 2021	121	2021
Time Domain Audio Visual Speech Separation. J Wu, Y Xu, SX Zhang, LW Chen, M Yu, L Xie, D Yu. Automatic Speech Recognition and Understanding Workshop, ASRU 2019, 2019	120	2019
Computerized intelligent assistant for conferences. A Diamant, KM Ben-Dor, E Krupka, R Halaly, Y Smolin, I Gurvich, ... US Patent 10,867,610, 2020	106	2020
Multi-modal multi-channel target speech separation. R Gu, SX Zhang, Y Xu, L Chen, Y Zou, D Yu. IEEE Journal of Selected Topics in Signal Processing 14 (3), 530-541, 2020	106	2020
Investigation of Multilingual Deep Neural Networks for Spoken Term Detection. K Knill, MJF Gales, S Rath, P Woodland, SX Zhang. ASRU, 2013	102	2013
Audio-visual Recognition of Overlapped speech for the LRS2 dataset. J Yu, SX Zhang, J Wu, S Ghorbani, B Wu, S Kang, S Liu, X Liu, H Meng, ... ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020	100	2020
Simplifying long short-term memory acoustic models for fast training and decoding. Y Miao, J Li, Y Wang, S Zhang, Y Gong. ICASSP, 2016	100	2016
Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information. R Gu, L Chen, SX Zhang, J Zheng, Y Xu, M Yu, D Su, Y Zou, D Yu	95	2019
A comprehensive study of speech separation: spectrogram vs waveform separation. F Bahmaninezhad, J Wu, R Gu, SX Zhang, Y Xu, M Yu, D Yu. arXiv preprint arXiv:1905.07497, 2019	90	2019
End-to-end multi-channel speech separation. R Gu, J Wu, SX Zhang, L Chen, Y Xu, M Yu, D Su, Y Zou, D Yu. arXiv preprint arXiv:1905.06286, 2019	86	2019
New era for robust speech recognition: exploiting deep learning. S Watanabe, M Delcroix, F Metze, JR Hershey, et al. Springer, 2017	64*	2017
Enhancing End-to-End Multi-Channel Speech Separation Via Spatial Feature Learning. R Gu, SX Zhang, L Chen, Y Xu, M Yu, D Su, Y Zou, D Yu. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020	62	2020
Audio-visual speech separation and dereverberation with a two-stage multimodal network. K Tan, Y Xu, SX Zhang, M Yu, D Yu. IEEE Journal of Selected Topics in Signal Processing 14 (3), 542-553, 2020	55	2020
FAST-RIR: Fast neural diffuse room impulse response generator. A Ratnarajah, SX Zhang, M Yu, Z Tang, D Manocha, D Yu. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022	50	2022
Structured SVMs for automatic speech recognition. SX Zhang, MJF Gales. IEEE Transactions on Audio, Speech, and Language Processing 21 (3), 544-555, 2012	50	2012
Deep neural support vector machines for speech recognition. SX Zhang, C Liu, K Yao, Y Gong. ICASSP 2015, 2015	46	2015
Neural Spatio-Temporal Beamformer for Target Speech Separation. Y Xu, M Yu, SX Zhang, L Chen, C Weng, J Liu, D Yu. arXiv preprint arXiv:2005.03889, 2020	42	2020
Far-Field Location Guided Target Speech Extraction Using End-to-End Speech Recognition Objectives. AS Subramanian, C Weng, M Yu, SX Zhang, Y Xu, S Watanabe, D Yu. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020	42	2020
