Machine Learning and Friends Lunch: Kuan Fang, Open-World Robot Dexterity via Physically Grounded Reasoning
Speaker
Kuan Fang (Cornell University)
Abstract
Many recent approaches to robot intelligence are motivated by a simple premise: generalization will emerge from scaling data, compute, and model capacity, much as it has in language and vision. However, robotics confronts a combinatorial diversity of environments, behaviors, and instructions that scaling alone cannot easily overcome. In this talk, I will present an alternative paradigm that bridges pre-trained foundation models with dexterous robot control using structured reasoning over affordances, contacts, and motion. I will first describe an approach that reformulates motion planning as a series of visual question-answering problems that pre-trained vision-language models can solve by marking keypoint affordances directly on images. Then, I will introduce a framework that trains trajectory-conditioned policies to enable dexterous quadruped loco-manipulation through flexible inter-limb coordination. Lastly, I will propose a structured affordance-grounding framework that represents tasks as affordance–entity pairs, providing a unified, modality-agnostic interface for visuomotor learning across language, pixels, and demonstrations. Together, these works point toward a path for scaling robot dexterity to the open world.
Speaker Bio
Kuan Fang is an Assistant Professor of Computer Science at Cornell University. His research lies at the intersection of robotics, machine learning, and computer vision, focusing on scalable learning-based approaches for robust and generalizable robot perception and control in unstructured environments. He received his Ph.D. and M.S. from Stanford University and his bachelor’s degree from Tsinghua University. Prior to joining Cornell, he was a postdoctoral researcher at UC Berkeley and a researcher at the Robotics and AI Institute. His work has been recognized with a CRA/CCC Computing Innovation Fellowship, as well as faculty fellowships from Amazon and Nvidia.