Pragmatic-Pedagogic Value Alignment
The following Great Innovative Idea is from Jaime Fernandez Fisac, a Ph.D. Candidate in the Electrical Engineering and Computer Sciences Department at the University of California, Berkeley, working in the area of Control, Intelligent Systems and Robotics. Fisac was one of the Blue Sky Award winners at the International Symposium on Robotics Research (ISRR 17) in Puerto Varas, Chile, for his paper, Pragmatic-Pedagogic Value Alignment.
Advances in robotics and AI are making robots increasingly capable and autonomous, but how will we ensure they understand what they should and should not do? Our insight is that a competent robot collaborator should behave like a keen apprentice: humans are naturally skilled at social collaboration, and robots can exploit this fact to tap into their users’ natural abilities. This can be done by imbuing our robots with the equivalent of what cognitive scientists refer to as a “theory of mind”: the ability to reason about what others may be thinking and, importantly, how they may be reasoning about our own thoughts. Cognitive scientists have shown that people can proficiently solve these mutual reasoning problems in order to teach and learn from each other, coordinate their actions, and work together. We have incorporated these cognitive models into the robot’s reasoning and decision making: at the core of our game-theoretic formulation is a naturally arising “pragmatic-pedagogic” equilibrium between what actions the human should take to indicate to the robot what the objective is and how the robot should, in turn, interpret these actions to update its understanding on the fly.
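To make the mutual reasoning concrete, here is a minimal toy sketch in Python. It is not the formulation from the paper: it assumes a single round of interaction and a simple recursive-reasoning scheme in which a "literal" robot applies Bayes' rule, a pedagogic human favors actions that would lead that robot to the right belief, and a pragmatic robot interprets actions knowing the human is being pedagogic. The objectives, actions, consistency table, and the parameter beta are all hypothetical and chosen only for illustration.

```python
# Toy illustration only: objectives, actions and values below are assumptions,
# not taken from the paper.
OBJECTIVES = ["ham_only", "ham_and_cheese"]
ACTIONS = ["grab_ham", "grab_cheese"]

# CONSISTENT[objective][action] is 1.0 if the action makes sense for that objective.
CONSISTENT = {
    "ham_only": {"grab_ham": 1.0, "grab_cheese": 0.0},
    "ham_and_cheese": {"grab_ham": 1.0, "grab_cheese": 1.0},
}


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights.values())
    return {key: value / total for key, value in weights.items()}


def literal_robot(action, prior):
    """Bayesian update that treats every consistent action as equally likely."""
    return normalize({th: prior[th] * CONSISTENT[th][action] for th in prior})


def pedagogic_human(objective, prior, beta=1.0):
    """Human picks actions in proportion to how well they teach the literal robot.

    beta is an assumed rationality parameter: larger values make the human
    more strongly prefer the most informative action.
    """
    scores = {a: literal_robot(a, prior)[objective] ** beta for a in ACTIONS}
    return normalize(scores)


def pragmatic_robot(action, prior, beta=1.0):
    """Bayesian update that assumes the human is choosing actions pedagogically."""
    return normalize(
        {th: prior[th] * pedagogic_human(th, prior, beta)[action] for th in prior}
    )


prior = {th: 1.0 / len(OBJECTIVES) for th in OBJECTIVES}
literal_belief = literal_robot("grab_ham", prior)
pragmatic_belief = pragmatic_robot("grab_ham", prior)
print(literal_belief)    # both objectives remain equally likely
print(pragmatic_belief)  # "ham_only" rises to approximately 0.75
```

In this toy setting, grabbing the ham is consistent with both objectives, so the literal robot learns nothing from it. The pragmatic robot reasons that a human who wanted cheese would more likely have reached for the cheese to make that clear, and therefore shifts its belief toward the ham-only objective.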
As robots step out of factory floors and into our homes, workplaces and public spaces, they will gradually begin to work more closely with people and assist us in a growing range of tasks. This increase in autonomy and capability comes with an important challenge, referred to as “value alignment”: once a robot is able to take actions in the world very quickly and efficiently with little or no oversight, how does it ensure its actions are appropriate for its user’s objective? As the instructions we give our robots evolve from “hand me that object” to “help me make a sandwich”, many underlying decisions will be left to the robot’s interpretation: in order to achieve satisfactory performance, let alone maintain safety, the robot will need to follow its user’s lead to understand what kind of sandwich she wants, what part of the process it should help her with and, of course, not to wave the cutting knife uncomfortably close to her face. And since the user’s goals and preferences will change from one task to the next, the robot must infer them from her behavior, and the user will expect it to keep up! Our work on pragmatic-pedagogic reasoning integrates cognitive studies of human theory of mind into the robot’s planning, resulting in a tractable solution approach to the human-robot value alignment problem. We see this as a first step toward addressing what we believe will become an increasingly important challenge in robotics over the coming years.
I am interested in the challenge of introducing robotics and AI into the public space, allowing autonomous systems to safely and efficiently interact with people. My research seeks to bring together techniques from control and decision theory, machine learning, and cognitive science to design human-centered systems that can leverage synergies and guarantee safety. In particular, I have done research in safe reinforcement learning for robotics, collision-free trajectory generation for large multi-vehicle fleets and increasing fluency in human-robot collaboration.
I am a Ph.D. Candidate in AI and Robotics at UC Berkeley. A control theorist by training, I obtained my engineering degree from the Universidad Politécnica de Madrid in Spain and a master's degree in aerospace from Cranfield University in the UK. As part of my training, I have also worked in industry for brief periods, including at Aerialtronics in 2012 and Apple in 2016.