Xiaozhe Ren
Title | Cited by | Year
PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
W Zeng, X Ren, T Su, H Wang, Y Liao, Z Wang, X Jiang, ZZ Yang, K Wang, ...
arXiv preprint arXiv:2104.12369, 2021
Cited by 181 | 2021
NEZHA: Neural contextualized representation for Chinese language understanding
J Wei, X Ren, X Li, W Huang, Y Liao, Y Wang, J Lin, X Jiang, X Chen, ...
arXiv preprint arXiv:1909.00204, 2019
Cited by 122 | 2019
PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing
X Ren, P Zhou, X Meng, X Huang, Y Wang, W Wang, P Li, X Zhang, ...
arXiv preprint arXiv:2303.10845, 2023
Cited by 37 | 2023
SparseBERT: Rethinking the importance analysis in self-attention
H Shi, J Gao, X Ren, H Xu, X Liang, Z Li, JTY Kwok
International Conference on Machine Learning, 9547-9557, 2021
Cited by 37 | 2021
AutoBERT-Zero: Evolving BERT backbone from scratch
J Gao, H Xu, H Shi, X Ren, LH Philip, X Liang, X Jiang, Z Li
Proceedings of the AAAI Conference on Artificial Intelligence 36 (10), 10663 …, 2022
Cited by 31 | 2022
EfficientBERT: Progressively searching multilayer perceptron via warm-up knowledge distillation
C Dong, G Wang, H Xu, J Peng, X Ren, X Liang
arXiv preprint arXiv:2109.07222, 2021
Cited by 19 | 2021
NumGPT: Improving numeracy ability of generative pre-trained models
Z Jin, X Jiang, X Wang, Q Liu, Y Wang, X Ren, H Qu
arXiv preprint arXiv:2109.03137, 2021
Cited by 14 | 2021
A survey of reasoning with foundation models
J Sun, C Zheng, E Xie, Z Liu, R Chu, J Qiu, J Xu, M Ding, H Li, M Geng, ...
arXiv preprint arXiv:2312.11562, 2023
Cited by 12 | 2023
Large-scale deep learning optimizations: A comprehensive survey
X He, F Xue, X Ren, Y You
arXiv preprint arXiv:2111.00856, 2021
Cited by 12 | 2021
One student knows all experts know: From sparse to dense
F Xue, X He, X Ren, Y Lou, Y You
arXiv preprint arXiv:2201.10890, 2022
Cited by 9 | 2022
PanGu-α
W Zeng, X Ren, T Su, H Wang, Y Liao, Z Wang, X Jiang, ZZ Yang, K Wang, ...
Large-scale Autoregressive Pretrained Chinese Language Models with Auto …, 2021
Cited by 7 | 2021
PanGu-α: Large-scale autoregressive pretrained Chinese language models with auto-parallel computation. arXiv 2021
W Zeng, X Ren, T Su, H Wang, Y Liao, Z Wang, X Jiang, Z Yang, K Wang, ...
arXiv preprint arXiv:2104.12369, 2021
Cited by 7 | 2021
Deeper vs wider: A revisit of transformer configuration
F Xue, J Chen, A Sun, X Ren, Z Zheng, X He, X Jiang, Y You
arXiv preprint arXiv:2205.10505 2 (3), 2022
Cited by 6 | 2022
EdgeFM: Leveraging foundation model for open-set learning on the edge
B Yang, L He, N Ling, Z Yan, G Xing, X Shuai, X Ren, X Jiang
arXiv preprint arXiv:2311.10986, 2023
Cited by 5 | 2023
Response length perception and sequence scheduling: An LLM-empowered LLM inference pipeline
Z Zheng, X Ren, F Xue, Y Luo, X Jiang, Y You
Advances in Neural Information Processing Systems 36, 2024
Cited by 4 | 2024
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
J Chen, C Ge, E Xie, Y Wu, L Yao, X Ren, Z Wang, P Luo, H Lu, Z Li
arXiv preprint arXiv:2403.04692, 2024
Cited by 2 | 2024
CAME: Confidence-guided Adaptive Memory Efficient Optimization
Y Luo, X Ren, Z Zheng, Z Jiang, X Jiang, Y You
arXiv preprint arXiv:2307.02047, 2023
Cited by 2 | 2023
A study on transformer configuration and training objective
F Xue, J Chen, A Sun, X Ren, Z Zheng, X He, Y Chen, X Jiang, Y You
International Conference on Machine Learning, 38913-38925, 2023
Cited by 2 | 2023
A Survey of Reasoning with Foundation Models: Concepts, Methodologies, and Outlook
J Sun, C Zheng, E Xie, Z Liu, R Chu, J Liu, J Xu, M Ding, H Li, M Geng, ...
OSF
Articles 1–19