A recent focus in AI is autonomous agents, which are usually powered by large language models (LLMs) and can follow language instructions to carry out complex tasks in real-world environments. However, the concept of an agent has been part of AI since its dawn, so what is different this time around? I argue that the most fundamental change is the capability of using language. Contemporary AI agents use language as a vehicle for both communication and thought, which enables them to do complex reasoning, understand heterogeneous environmental percepts, and easily communicate with humans. The use of language has proven critical to the evolution of biological intelligence: while animals like mice or even worms already have limited reasoning and learning capabilities, human intelligence is much more sophisticated with the help of language. Now artificial intelligence is following a similar evolutionary path. Therefore, I suggest that these contemporary AI agents be called "language agents," because language is their most salient trait. Developing and understanding such language agents is a necessary step towards more general artificial intelligence. In this talk, I will first describe a potential conceptual framework for language agents and briefly touch on important topics such as memory, tool use, grounding, and reasoning. I will then briefly introduce several of our recent works on language agents, including 1) Mind2Web, which aims to develop generalist language agents that work on any real-world website, 2) LLM-Planner, which leverages LLMs for robot planning to interact with physical environments, and 3) Pangu, a generic neurosymbolic framework for developing language agents for different environments (e.g., knowledge graphs, databases, and websites), which features a symbolic agent and a neural LM working in a concerted fashion.
Yu Su is a Distinguished Assistant Professor of Engineering at The Ohio State University. He obtained his Ph.D. from the University of California, Santa Barbara and his bachelor's degree from Tsinghua University. He co-directs the OSU NLP group and serves in leadership roles in multiple national AI institutes. He has broad interests in developing artificial intelligence, with a primary interest in the role of language, as a vehicle of thought and communication, in both artificial and human intelligence. His work at Microsoft has led to a new conversational interface for Microsoft Outlook. His research has been recognized with awards such as Outstanding Paper Awards at ACL 2023 and COLING 2022, the Outstanding Dissertation Award from UC Santa Barbara, and the third-place honor in the inaugural Amazon Alexa Prize TaskBot Challenge.