Hi, I'm Kai Zhang.

I'm a founding MTS at NeoCognition, building agents.

Previously, I got my PhD at OSU NLP and worked at Meta MSL, MSR, and Google DeepMind.

What's New

Apr 2026

I graduated and joined NeoCognition as a founding MTS (we are hiring). Early Experience was accepted to ICML'26.

Sept 2025

WebDreamer was accepted to TMLR'25; ARM (Spotlight), Mind2Web 2, and CPathAgent were accepted to NeurIPS'25. I will serve as an Area Chair for ICLR'26.

Jan 2025

PathGen-1.6M (Oral) and MuirBench were accepted to ICLR'25, and Planning Analysis was accepted to NAACL'25.

Aug 2024

PathMMU was accepted to ECCV'24 as Best Paper Finalist (0.2%).

May 2024

MagicLens (Oral) and TravelPlanner (Spotlight) were accepted to ICML'24.

Mar 2024

Excited to present MagicLens, done at Google DeepMind: next-generation image retrieval models with SOTA results on 10 benchmarks across multimodality-to-image, image-to-image, and text-to-image retrieval.

Feb 2024

MMMU was accepted to CVPR'24 as Best Paper Finalist (0.2%), and I will be at MSR this summer. See you in Seattle :)

Jan 2024

Three papers were accepted to ICLR'24: KnowledgeConflict (Spotlight), MUFFIN, and ImagenHub.

Sept 2023

MagicBrush was accepted to NeurIPS'23 Datasets and Benchmarks Track.

Aug 2023

Excited to start my internship at Google DeepMind (previously Google Brain)!


Selected Publications

See full list in Publications.

  • Agent Learning via Early Experience

    Kai Zhang, Xiangchao Chen, Bo Liu, Tianci Xue, Zeyi Liao, Zhihan Liu, Xiyao Wang, Yuting Ning, Zhaorun Chen, Xiaohan Fu, Jian Xie, Yuxuan Sun, Boyu Gou, et al.

  • Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents

    Yu Gu*, Kai Zhang*, Yuting Ning*, Boyuan Zheng*, Boyu Gou, Tianci Xue, Cheng Chang, Sanjari Srivastava, Yanan Xie, Peng Qi, Huan Sun, Yu Su

  • MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions

    Kai Zhang, Yi Luan, Hexiang Hu, Kenton Lee, Siyuan Qiao, Wenhu Chen, Yu Su, Ming-Wei Chang

  • TravelPlanner: A Benchmark for Real-World Planning with Language Agents

    Jian Xie*, Kai Zhang*, Jiangjie Chen, Tinghui Zhu, Renze Lou, Yuandong Tian, Yanghua Xiao, Yu Su

  • MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

    Xiang Yue*, Yuansheng Ni*, Kai Zhang*, Tianyu Zheng*, Ruoqi Liu, Ge Zhang, Samuel Stevens, Dongfu Jiang, Weiming Ren, Yuxuan Sun, Cong Wei, Botao Yu, Ruibin Yuan, Renliang Sun, Ming Yin, Boyuan Zheng, Zhenzhu Yang, Yibo Liu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen

  • Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts

    Jian Xie*, Kai Zhang*, Jiangjie Chen, Renze Lou, Yu Su

Contact
Email: [LAST_NAME].13253@osu.edu OR drogo[LAST_NAME]@gmail.com