Jihyung Kil
Title
Cited by
Year
GPT-4V(ision) is a Generalist Web Agent, if Grounded
B Zheng, B Gou, J Kil, H Sun, Y Su
arXiv preprint arXiv:2401.01614, 2024
Cited by: 22, Year: 2024
Discovering the Unknown Knowns: Turning Implicit Knowledge in the Dataset into Explicit Training Examples for Visual Question Answering
J Kil, C Zhang, D Xuan, WL Chao
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Cited by: 21, Year: 2021
PreSTU: Pre-Training for Scene-Text Understanding
J Kil, S Changpinyo, X Chen, H Hu, S Goodman, WL Chao, R Soricut
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
Cited by: 19, Year: 2023
One Step at a Time: Long-Horizon Vision-and-Language Navigation with Milestones
CH Song, J Kil, TY Pan, BM Sadler, WL Chao, Y Su
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Cited by: 19, Year: 2022
Revisiting Document Representations for Large-Scale Zero-Shot Learning
J Kil, WL Chao
NAACL, 2021
Cited by: 6, Year: 2021
Dual-View Visual Contextualization for Web Navigation
J Kil, CH Song, B Zheng, X Deng, Y Su, WL Chao
IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2024
Cited by: 1, Year: 2024
II-MMR: Identifying and Improving Multi-modal Multi-hop Reasoning in Visual Question Answering
J Kil, F Tavazoee, D Kang, JK Kim
arXiv preprint arXiv:2402.11058, 2024
Year: 2024
Articles 1–7