Zhengyuan Yang

Cited by

	All	Since 2019
Citations	4043	4029
h-index	26	26
i10-index	35	35

Citations per year: 2018: 12, 2019: 59, 2020: 130, 2021: 305, 2022: 632, 2023: 1863, 2024: 1034

Public access

14 articles available
0 articles not available

Based on funding mandates

Co-authors

Lijuan Wang, Microsoft GenAI (verified email at microsoft.com)
Jianfeng Wang, Microsoft (verified email at microsoft.com)
Zicheng Liu, Microsoft (verified email at microsoft.com)
Jiebo Luo, Albert Arendt Hopeman Professor of Engineering, University of Rochester (verified email at cs.rochester.edu)
Linjie (Lindsey) Li, Senior Researcher, Microsoft (verified email at microsoft.com)
Kevin Lin, Microsoft (verified email at microsoft.com)
Zhe Gan, Research Scientist, Apple (verified email at apple.com)
Liwei Wang, Assistant Professor at The Chinese University of Hong Kong (verified email at cse.cuhk.edu.hk)
Ce Liu, Partner Research Manager, Microsoft GenAI; IEEE Fellow (verified email at microsoft.com)
Jinsong Su, Xiamen University (verified email at xmu.edu.cn)
Jiajun Deng (邓家俊), University of Adelaide, Australian Institute for Machine Learning (verified email at adelaide.edu.au)
Yuncheng Li, Google (verified email at google.com)
Jianwei Yang, Principal Researcher, Microsoft Research, Redmond (verified email at microsoft.com)
Chenglei Si, Stanford University (verified email at stanford.edu)
Boqing Gong, Research Scientist, Google (verified email at google.com)


Zhengyuan Yang

Researcher, Microsoft

Verified email at microsoft.com

Computer Vision, Multimedia, Vision + Language, Multimodal


Title	Cited by	Year
GIT: A generative image-to-text transformer for vision and language. J Wang, Z Yang, X Hu, L Li, K Lin, Z Gan, Z Liu, C Liu, L Wang. Transactions on Machine Learning Research (TMLR), 2022	350	2022
A fast and accurate one-stage approach to visual grounding. Z Yang, B Gong, L Wang, W Huang, D Yu, J Luo. IEEE International Conference on Computer Vision (ICCV), 4683-4693, 2019	310	2019
An empirical study of GPT-3 for few-shot knowledge-based VQA. Z Yang, Z Gan, J Wang, X Hu, Y Lu, Z Liu, L Wang. Proceedings of the AAAI Conference on Artificial Intelligence 36 (3), 3081-3089, 2022	295	2022
TransVG: End-to-End Visual Grounding with Transformers. J Deng, Z Yang, T Chen, W Zhou, H Li. IEEE International Conference on Computer Vision (ICCV), 2021	247	2021
The dawn of LMMs: Preliminary explorations with GPT-4V(ision). Z Yang, L Li, K Lin, J Wang, CC Lin, Z Liu, L Wang. arXiv preprint arXiv:2309.17421 9 (1), 1, 2023	225	2023
Scaling up vision-language pre-training for image captioning. X Hu, Z Gan, J Wang, Z Yang, Z Liu, Y Lu, L Wang. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022	215	2022
MM-REACT: Prompting ChatGPT for multimodal reasoning and action. Z Yang, L Li, J Wang, K Lin, E Azarnasab, F Ahmed, Z Liu, C Liu, M Zeng, ... arXiv preprint arXiv:2303.11381, 2023	195	2023
Improving One-stage Visual Grounding by Recursive Sub-query Construction. Z Yang, T Chen, L Wang, J Luo. European Conference on Computer Vision (ECCV), 2020	187	2020
End-to-end multi-modal multi-task vehicle control for self-driving cars with visual perceptions. Z Yang, Y Zhang, J Yu, J Cai, J Luo. 2018 24th International Conference on Pattern Recognition (ICPR), 2289-2294, 2018	185	2018
Action recognition with spatio–temporal visual attention on skeleton image sequences. Z Yang, Y Li, J Yang, J Luo. IEEE Transactions on Circuits and Systems for Video Technology 29 (8), 2405-2415, 2018	180	2018
Attentive relational networks for mapping images to scene graphs. M Qi, W Li, Z Yang, Y Wang, J Luo. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 3957-3966, 2019	169	2019
Prompting GPT-3 to be reliable. C Si, Z Gan, Z Yang, S Wang, J Wang, J Boyd-Graber, L Wang. International Conference on Learning Representations (ICLR 23), 2022	155	2022
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption. Z Yang, Y Lu, J Wang, X Yin, D Florencio, L Wang, C Zhang, L Zhang, ... IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021	147	2021
MM-Vet: Evaluating large multimodal models for integrated capabilities. W Yu, Z Yang, L Li, J Wang, K Lin, Z Liu, X Wang, L Wang. arXiv preprint arXiv:2308.02490, 2023	124	2023
A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation. Y Yin, F Meng, J Su, C Zhou, Z Yang, J Zhou, J Luo. Annual Meeting of the Association for Computational Linguistics (ACL), 2020	124	2020
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling. Z Yang, Z Gan, J Wang, X Hu, F Ahmed, Z Liu, Y Lu, L Wang. European Conference on Computer Vision (ECCV), 521-539, 2022	121*	2022
Multimodal foundation models: From specialists to general-purpose assistants. C Li, Z Gan, Z Yang, J Yang, L Li, L Wang, J Gao. arXiv preprint arXiv:2309.10020 1 (2), 2, 2023	73	2023
PromptCap: Prompt-guided task-aware image captioning. Y Hu, H Hua, Z Yang, W Shi, NA Smith, J Luo. arXiv preprint arXiv:2211.09699, 2022	72*	2022
Dynamic context-guided capsule network for multimodal machine translation. H Lin, F Meng, J Su, Y Yin, Z Yang, Y Ge, J Zhou, J Luo. Proceedings of the 28th ACM International Conference on Multimedia, 1320-1329, 2020	70	2020
SAT: 2D Semantics Assisted Training for 3D Visual Grounding. Z Yang, S Zhang, L Wang, J Luo. IEEE International Conference on Computer Vision (ICCV), 2021	68	2021
