Yu Zhang
Google
Verified email at csail.mit.edu
Title
Cited by
Year
SpecAugment: A simple data augmentation method for automatic speech recognition
DS Park, W Chan, Y Zhang, CC Chiu, B Zoph, ED Cubuk, QV Le
arXiv preprint arXiv:1904.08779, 2019
Cited by: 1412; Year: 2019
Natural TTS synthesis by conditioning WaveNet on mel spectrogram predictions
J Shen, R Pang, RJ Weiss, M Schuster, N Jaitly, Z Yang, Z Chen, Y Zhang, ...
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018
Cited by: 1384; Year: 2018
An introduction to computational networks and the computational network toolkit
MS Dong Yu, Adam Eversole, Mike Seltzer, Kaisheng Yao, Zhiheng Huang, Brian ...
Tech. Rep. MSR, Microsoft Research, 2014, http://codebox/cntk, 2014
Cited by: 451*; Year: 2014
Very deep convolutional networks for end-to-end speech recognition
Y Zhang, W Chan, N Jaitly
2017 IEEE international conference on acoustics, speech and signal …, 2017
Cited by: 419; Year: 2017
Style tokens: Unsupervised style modeling, control and transfer in end-to-end speech synthesis
Y Wang, D Stanton, Y Zhang, RJS Ryan, E Battenberg, J Shor, Y Xiao, ...
International Conference on Machine Learning, 5180-5189, 2018
Cited by: 411; Year: 2018
Conformer: Convolution-augmented transformer for speech recognition
A Gulati, J Qin, CC Chiu, N Parmar, Y Zhang, J Yu, W Han, S Wang, ...
arXiv preprint arXiv:2005.08100, 2020
Cited by: 396; Year: 2020
Transfer learning from speaker verification to multispeaker text-to-speech synthesis
Y Jia, Y Zhang, RJ Weiss, Q Wang, J Shen, F Ren, Z Chen, P Nguyen, ...
arXiv preprint arXiv:1806.04558, 2018
Cited by: 396; Year: 2018
Spoken language understanding using long short-term memory neural networks
K Yao, B Peng, Y Zhang, D Yu, G Zweig, Y Shi
IEEE SLT, 2014
Cited by: 317; Year: 2014
Highway long short-term memory RNNs for distant speech recognition
Y Zhang, G Chen, D Yu, K Yao, S Khudanpur, J Glass
2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016
Cited by: 300; Year: 2016
Unsupervised learning of disentangled and interpretable representations from sequential data
WN Hsu, Y Zhang, J Glass
arXiv preprint arXiv:1709.07902, 2017
Cited by: 274; Year: 2017
Advances in joint CTC-attention based end-to-end speech recognition with a deep CNN encoder and RNN-LM
T Hori, S Watanabe, Y Zhang, W Chan
arXiv preprint arXiv:1706.02737, 2017
Cited by: 255; Year: 2017
LibriTTS: A corpus derived from LibriSpeech for text-to-speech
H Zen, V Dang, R Clark, Y Zhang, RJ Weiss, Y Jia, Z Chen, Y Wu
arXiv preprint arXiv:1904.02882, 2019
Cited by: 203; Year: 2019
Simple recurrent units for highly parallelizable recurrence
T Lei, Y Zhang, SI Wang, H Dai, Y Artzi
arXiv preprint arXiv:1709.02755, 2017
Cited by: 169; Year: 2017
Training RNNs as fast as CNNs
T Lei, Y Zhang, Y Artzi
Cited by: 164; Year: 2018
Deep beamforming networks for multi-channel speech recognition
X Xiao, S Watanabe, H Erdogan, L Lu, J Hershey, ML Seltzer, G Chen, ...
2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016
Cited by: 153; Year: 2016
Hierarchical generative modeling for controllable speech synthesis
WN Hsu, Y Zhang, RJ Weiss, H Zen, Y Wu, Y Wang, Y Cao, Y Jia, Z Chen, ...
arXiv preprint arXiv:1810.07217, 2018
Cited by: 135; Year: 2018
I-Vector Based Clustering Training Data in Speech Recognition
Q Huo, ZJ Yan, Y Zhang, J Xu
US Patent App. 13/640,804, 2015
Cited by: 135; Year: 2015
Learning latent representations for speech generation and transformation
WN Hsu, Y Zhang, J Glass
arXiv preprint arXiv:1704.04222, 2017
Cited by: 126; Year: 2017
Lingvo: a modular and scalable framework for sequence-to-sequence modeling
J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ...
arXiv preprint arXiv:1902.08295, 2019
Cited by: 112; Year: 2019
A streaming on-device end-to-end model surpassing server-side conventional model quality and latency
TN Sainath, Y He, B Li, A Narayanan, R Pang, A Bruguier, S Chang, W Li, ...
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 101; Year: 2020