WaveWave 2

Research

Our research focuses on building grounded AI systems that enable users to interact through natural language with digital and physical environments. We develop AI agents that translate language and perception into executable code and actions, empowering people to perform data science, control computers, and collaborate with robots. Our work spans three core areas: code generation for data science, grounding language in the digital world, and grounding language in the physical world.

Papers

All
Code Generation for Data Science
Grounding Language in the Digital World
Grounding Language in the Physical World
Others : RAG, Tool use, Efficient LLM, Data synthesis etc.
OpenCUA: Open Foundations for Computer-Use Agents

OpenCUA: Open Foundations for Computer-Use Agents

Xinyuan Wang*, Bowen Wang*, Dunjie Lu*, Junlin Yang*, Tianbao Xie*, Junli Wang*, Jiaqi Deng, Xiaole Guo, Yiheng Xu, Chen Henry Wu, Zhennan Shen, Zhuokai Li, Ryan Li, Xiaochuan Li, Junda Chen, Boyuan Zheng, Peihang Li, Fangyu Lei, Ruisheng Cao, Yeqiao Fu, Dongchan Shin, Martin Shin, Jiarui Hu, Yuyan Wang, Jixuan Chen, Yuxiao Ye, Danyang Zhang, Dikang Du, Hao Hu, Huarong Chen, Zaida Zhou, Haotian Yao, Ziwei Chen, Qizheng Gu, Yipu Wang, Heng Wang, Diyi Yang, Victor Zhong, Flood Sung, Y.Charles, Zhilin Yang, Tao Yu

NeurIPS 2025 Spotlight

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Tianbao Xie*, Jiaqi Deng*, Xiaochuan Li*, Junlin Yang*, Haoyuan Wu, Jixuan Chen, Wenjing Hu, Xinyuan Wang, Yuhui Xu, Zekun Wang, Yiheng Xu, Junli Wang, Doyen Sahoo, Tao Yu, Caiming Xiong

NeurIPS 2025 Spotlight

Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Yiheng Xu*, Zekun Wang*, Junli Wang*, Dunjie Lu, Tianbao Xie, Amrita Saha, Doyen Sahoo, Tao Yu, Caiming Xiong

ICML 2025

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

Fangyu Lei*, Jixuan Chen*, Yuxiao Ye, Ruisheng Cao, Dongchan Shin, Hongjin Su, Zhaoqing Suo, Hongcheng Gao, Wenjing Hu, Pengcheng Yin, Victor Zhong, Caiming Xiong, Ruoxi Sun, Qian Liu, Sida Wang, Tao Yu

ICLR 2025 Oral

BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval

BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval

Hongjin Su*, Howard Yen*, Mengzhou Xia*, Weijia Shi, Niklas Muennighoff, Han-yu Wang, Haisu Liu, Quan Shi, Zachary S. Siegel, Michael Tang, Ruoxi Sun, Jinsung Yoon, Sercan Ö. Arik, Danqi Chen, Tao Yu

ICLR 2025 Spotlight

Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments

Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments

Hongjin Su, Ruoxi Sun, Jinsung Yoon, Pengcheng Yin, Tao Yu and Sercan Ö. Arık

ICLR 2025

AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials

AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials

Yiheng Xu*, Dunjie Lu*, Zhennan Shen*, Junli Wang, Zekun Wang, Yuchen Mao, Caiming Xiong, Tao Yu

ICLR 2025

Generative Representational Instruction Tuning

Generative Representational Instruction Tuning

Niklas Muennighoff, Hongjin Su, Liang Wang, Nan Yang, Furu Wei, Tao Yu, Amanpreet Singh, Douwe Kiela

ICLR 2025

Attacking Vision-Language Computer Agents via Pop-ups

Attacking Vision-Language Computer Agents via Pop-ups

Yanzhe Zhang, Tao Yu, Diyi Yang

ACL 2025

EVOR: Evolving Retrieval for Code Generation

EVOR: Evolving Retrieval for Code Generation

Hongjin Su, Shuyang Jiang, Yuhang Lai, Haoyuan Wu, Boao Shi, Che Liu, Qian Liu, Tao Yu

EMNLP 2024 Findings

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

Ruisheng Cao, Fangyu Lei, Haoyuan Wu, Jixuan Chen, Yeqiao Fu, Hongcheng Gao, Xinzhuang Xiong, Hanchong Zhang, Yuchen Mao, Wenjing Hu, Tianbao Xie, Hongsheng Xu, Danyang Zhang, Sida Wang, Ruoxi Sun, Pengcheng Yin, Caiming Xiong, Ansong Ni, Qian Liu, Victor Zhong, Lu Chen, Kai Yu, Tao Yu

NeurIPS 2024

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu

NeurIPS 2024

OpenAgents: An Open Platform for Language Agents in the Wild

OpenAgents: An Open Platform for Language Agents in the Wild

Tianbao Xie*, Fan Zhou*, Zhoujun Cheng*, Peng Shi*, Luoxuan Weng*, Yitao Liu*, Toh Jing Hua, Junning Zhao, Qian Liu, Che Liu, Leo Z. Liu, Yiheng Xu, Hongjin Su, Dongchan Shin, Caiming Xiong, Tao Yu

COLM 2024

Lemur: Harmonizing Natural Language and Code for Language Agents

Lemur: Harmonizing Natural Language and Code for Language Agents

Yiheng Xu*, Hongjin Su*, Chen Xing*, Boyu Mi, Qian Liu, Weijia Shi, Binyuan Hui, Fan Zhou, Yitao Liu, Tianbao Xie, Zhoujun Cheng, Siheng Zhao, Lingpeng Kong, Bailin Wang, Caiming Xiong, Tao Yu

ICLR 2024 Spotlight

Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning

Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning

Tianbao Xie*, Siheng Zhao*, Chen Henry Wu, Yitao Liu, Qian Luo, Victor Zhong, Yanchao Yang, Tao Yu

ICLR 2024 Spotlight

Instructor Embeddings: One Embedder, Any Task: Instruction-Finetuned Text Embeddings

Instructor Embeddings: One Embedder, Any Task: Instruction-Finetuned Text Embeddings

Hongjin Su*, Weijia Shi*, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, Tao Yu

ACL 2023 Findings

DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation

DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation

Yuhang Lai*, Chengxi Li*, Yiming Wang*, Tianyi Zhang*, Ruiqi Zhong*, Luke Zettlemoyer, Scott Wen-tau Yih, Daniel Fried, Sida Wang, Tao Yu

ICML 2023

Coder Reviewer Reranking for Code Generation

Coder Reviewer Reranking for Code Generation

Tianyi Zhang, Tao Yu, Tatsunori B. Hashimoto, Mike Lewis, Wen-tau Yih, Daniel Fried, Sida I. Wang

ICML 2023

Compositional Exemplars for In-context Learning

Compositional Exemplars for In-context Learning

Jiacheng Ye, Zhiyong Wu, Jiangtao Feng, Tao Yu, Lingpeng Kong.

ICML 2023

Binder: Binding Language Models in Symbolic Languages

Binder: Binding Language Models in Symbolic Languages

Zhoujun Cheng*, Tianbao Xie*, Peng Shi, Chengzu Li, Rahul Nadkarni, Yushi Hu, Caiming Xiong, Dragomir Radev, Mari Ostendorf, Luke Zettlemoyer, Noah A Smith, Tao Yu

ICLR 2023

Selective Annotation Makes Language Models Better Few-Shot Learners

Selective Annotation Makes Language Models Better Few-Shot Learners

Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu

ICLR 2023

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

Tianbao Xie*, Chen Henry Wu*, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu.

EMNLP 2022

ZeroGen: Efficient Zero-shot Learning via Dataset Generation

ZeroGen: Efficient Zero-shot Learning via Dataset Generation

Jiacheng Ye*, Jiahui Gao*, Qintong Li, Hang Xu, Jiangtao Feng, Zhiyong Wu, Tao Yu, Lingpeng Kong

EMNLP 2022

In-Context Learning for Few-Shot Dialogue State Tracking

In-Context Learning for Few-Shot Dialogue State Tracking

Yushi Hu, Chia-Hsuan Lee, Tianbao Xie, Tao Yu, Noah A. Smith, Mari Ostendorf

EMNLP Findings 2022

NL2INTERFACE: Interactive Visualization Interface Generation from Natural Language Queries

NL2INTERFACE: Interactive Visualization Interface Generation from Natural Language Queries

Yiru Chen, Ryan Li, Austin Mac, Tianbao Xie, Tao Yu, Eugene Wu

IEEE Visualization Conference NLVIZ Workshop 2022

GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing

GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing

Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, Bailin Wang, Yi Chern Tan, Xinyi Yang, Dragomir Radev, Richard Socher, Caiming Xiong

ICLR 2021

Semantic Evaluation for Text-to-SQL with Distilled Test Suites

Semantic Evaluation for Text-to-SQL with Distilled Test Suites

Ruiqi Zhong, Tao Yu, Dan Klein

EMNLP 2020

CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases

CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases

Tao Yu, Rui Zhang, Heyang Er, Suyi Li, Eric Xue, Bo Pang, Xi Victoria Lin, Yi Chern Tan, Tianze Shi, Zihan Li, Youxuan Jiang, Michihiro Yasunaga, Sungrok Shim, Tao Chen, Alexander Fabbri, Zifan Li, Luyao Chen, Yuwen Zhang, Shreya Dixit, Vincent Zhang, Caiming Xiong, Richard Socher, Walter Lasecki, Dragomir Radev

EMNLP 2019

SParC: Cross-Domain Semantic Parsing in Context

SParC: Cross-Domain Semantic Parsing in Context

Tao Yu, Rui Zhang, Michihiro Yasunaga, Yi Chern Tan, Xi Victoria Lin, Suyi Li, Heyang Er, Irene Li, Bo Pang, Tao Chen, Emily Ji, Shreya Dixit, David Proctor, Sungrok Shim, Jonathan Kraft, Vincent Zhang, Caiming Xiong, Richard Socher, Dragomir Radev

ACL 2018

Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task

Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task

Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li, Qingning Yao, Shanelle Roman, Zilin Zhang, Dragomir Radev

EMNLP 2018

Xlang
© Copyright 2023 XLANG Lab. All right reserved.