What Will You Do Next? A Cognitive Model for Understanding Others’ Intentions Based on Shared Representations

Abstract

Goal-directed action selection is the problem of deciding what to do next in order to progress towards goal achievement. This problem is computationally more complex in joint action settings, where two or more agents coordinate their actions in space and time to bring about a common goal: actions performed by one agent influence the action possibilities of the other agents, and ultimately goal achievement. While humans apparently effortlessly engage in complex joint actions, a number of questions remain to be solved to achieve similar performance in artificial agents: How do agents represent and understand actions being performed by others? How does this understanding influence the choice of an agent's own future actions? How is the interaction process biased by prior information about the task? What is the role of more abstract cues such as others' beliefs or intentions?

In the last few years, researchers in computational neuroscience have begun investigating how control-theoretic models of individual motor control can be extended to explain various complex social phenomena, including action and intention understanding, imitation, and joint action. The two cornerstones of control-theoretic models of motor control are the goal-directed nature of action and a widespread use of internal modeling. Indeed, when the control-theoretic view is applied to the realm of social interactions, it is assumed that the inverse and forward internal models used in individual action planning and control are re-enacted in simulation in order to understand others' actions and to infer their intentions.

This motor simulation view of social cognition has been adopted to explain a number of advanced mindreading abilities such as action, intention, and belief recognition, often in contrast with more classical cognitive theories, derived from rationality principles and conceptual theories of others' minds, that emphasize the dichotomy between action and perception. Here we embrace the idea that implementing mindreading abilities is a necessary step towards a more natural collaboration between humans and robots in joint tasks. To collaborate efficiently, agents need to continuously estimate their teammates' proximal goals and distal intentions in order to choose what to do next.

We present a probabilistic hierarchical architecture for joint action which takes inspiration from the idea of motor simulation above. The architecture models the causal relations between observables (e.g., observed movements) and their hidden causes (e.g., action goals, intentions, and beliefs) at two deeply intertwined levels: at the lowest level, the same circuitry used to execute my own actions is re-enacted in simulation to infer and predict the (proximal) actions performed by my interaction partner, while the highest level encodes more abstract task representations which govern each agent's observable behavior. Here we assume that the decision of what to do next can be taken by knowing 1) what the current task is and 2) what my teammate is currently doing. While these could be inferred via a costly (and inaccurate) process of inverting the generative model above, given the observed data, we will show how our organization facilitates such an inferential process by allowing agents to share a subset of hidden variables, alleviating the need for complex inferential processes such as explicit task allocation or sophisticated communication strategies.
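The two-level idea sketched in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the tasks, actions, observations, and probabilities are all hypothetical. A shared hidden task variable generates each agent's action, which in turn generates noisy observable movements; inverting this generative model with Bayes' rule lets an observer infer the partner's current action and the ongoing task, and then choose a complementary action of its own.

```python
# Toy two-level generative model for joint action (illustrative only):
#   task -> partner action -> observed movement.
# All names and probabilities below are made up for the sketch.

TASKS = ["set_table", "clear_table"]
ACTIONS = ["grasp_plate", "grasp_cup"]

# P(task): prior over the joint task (could encode prior task information)
p_task = {"set_table": 0.5, "clear_table": 0.5}

# P(action | task): which partner action each task makes likely
p_action_given_task = {
    "set_table":   {"grasp_plate": 0.8, "grasp_cup": 0.2},
    "clear_table": {"grasp_plate": 0.3, "grasp_cup": 0.7},
}

# P(movement | action): likelihood of an observed movement feature,
# standing in for the low-level motor simulation re-enacted on
# the observer's own action circuitry
p_obs_given_action = {
    "grasp_plate": {"reach_low": 0.9, "reach_high": 0.1},
    "grasp_cup":   {"reach_low": 0.2, "reach_high": 0.8},
}

def infer(observation):
    """Posterior P(task, partner_action | observation) via Bayes' rule."""
    joint = {}
    for task in TASKS:
        for action in ACTIONS:
            joint[(task, action)] = (
                p_task[task]
                * p_action_given_task[task][action]
                * p_obs_given_action[action][observation]
            )
    z = sum(joint.values())  # normalizing constant
    return {k: v / z for k, v in joint.items()}

def choose_next(observation):
    """Pick own next action: complement the partner under the likeliest task."""
    posterior = infer(observation)
    (task, partner_action), _ = max(posterior.items(), key=lambda kv: kv[1])
    own_action = next(a for a in ACTIONS if a != partner_action)
    return task, partner_action, own_action

# Observing a high reach makes "grasp_cup" (and hence "clear_table")
# most likely, so the observer picks the complementary "grasp_plate".
task, partner, own = choose_next("reach_high")
print(task, partner, own)  # → clear_table grasp_cup grasp_plate
```

Sharing the task variable between agents, as the abstract proposes, would mean both agents condition on the same `p_task`, so each can invert only the small lower level of the model instead of re-inferring the whole hierarchy from raw observations.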


AU - Chella, Antonio

AU - Dindo, Haris

PY - 2013
