Learning in Human Communication


 We acquire sophisticated communication skills, including gestures and the use of human language, in a surprisingly short time after birth. The goal of our research group is to elucidate the neural mechanism of this most mysterious human ability using computational models and non-invasive brain imaging techniques.


Research Topics

1. Neural Mechanism of Reward-based Behavioral Learning

In many cases of cognitive learning, no explicit teacher signal (error) is provided. Nevertheless, we can learn efficiently through interactions with the external world. Reinforcement learning theory describes how appropriate behaviors can be learned from only a rough evaluation (reward) of actions provided by the external world. The basic idea of the theory is to change behaviors in proportion to the difference between the predicted reward and the actual reward.
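The prediction-error idea can be illustrated with a minimal sketch. It assumes a simple delta-rule update of a per-action reward prediction; the function name `update_value`, the `learning_rate` parameter, and the reward sequence are hypothetical choices for illustration and are not taken from our experiment or analysis.

```python
def update_value(values, action, reward, learning_rate=0.2):
    """Delta-rule update (illustrative assumption, not the study's model).

    The prediction for the chosen action moves toward the actual reward
    by a fraction (learning_rate) of the prediction error.
    """
    prediction_error = reward - values[action]
    values[action] += learning_rate * prediction_error
    return prediction_error


# Hypothetical example: the "left" button is rewarded on every trial.
values = {"left": 0.0, "right": 0.0}
errors = []
for _ in range(20):
    errors.append(update_value(values, "left", reward=1.0))
```

In this sketch the prediction error is large on early trials and shrinks as the prediction approaches the actual reward, so the amount of change per trial decreases as learning converges.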
 To investigate the neural mechanism of human reinforcement learning, we conducted an fMRI experiment using a stochastic decision task, in which a start disk and a goal disk appeared in either of two boxes on the computer screen. The subject was required to move the start disk to the goal by pushing either the left or the right button. The subject received a monetary reward for a success and a monetary penalty of the same amount for a failure. The actual disk movement following a button push was governed by probabilities. Subjects therefore had to maximize their rewards by learning, through trial and error, the stochastic regularities that controlled the disk movements.
 We investigated four types of information processing that are key to reinforcement learning theory. The activity in the caudate nucleus (1, 2) and prefrontal cortex (7) was correlated with how subjects changed behaviors in the early phase of learning (red, learning rate index). The activity of the caudate nucleus was also correlated with the short-term reward (blue). These observations are the first piece of experimental evidence that the caudate nucleus plays a central role in reinforcement learning, in which behavioral change is guided by reward information processing. In contrast, the activity in the dorsal premotor cortex (3, 4), supplementary motor area (5) and lateral cerebellum (6) was correlated with how learning progressed and converged (green, learning convergence index). In summary, the caudate nucleus was involved in the early phase of learning, which requires a large amount of behavioral change, while the dorsal premotor cortex, supplementary motor area and lateral cerebellum were involved in the later phase of learning, which draws on what has already been learned while learning is still in progress. The activity in the orbitofrontal cortex (8) was correlated with the accumulated reward (yellow), suggesting that this area is involved in monitoring cumulative reward.


2. Computational Model for Generation and Recognition of Hierarchical Motor Sequences

Another important aspect of cognitive learning is how to extract and utilize hierarchical structures that exist in the external world and the internal representation used in our behavioral selection. For example, we can generate a variety of structured motor sequences such as writing or speech, and learn to combine elemental actions in novel orders.
 We proposed a computational model called HMOSAIC (Hierarchical MOdular Selection and Identification for Control) to explain such hierarchical information processing. Each layer of HMOSAIC consists of a set of paired control and predictive models. At the lowest level, the control model computes a motor command, while the predictive model predicts the consequence of the ongoing command. The responsibility signal (posterior probability) of a module represents the accuracy of the prediction generated by that module's predictive model. These responsibility signals are used not only to weight the outputs from each control model, but also to guide competitive learning of the predictive and control models, resulting in the self-organization of elementary movements.
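The role of the responsibility signal at the lowest level can be sketched as follows. This is a minimal illustration, assuming scalar predictions, a Gaussian likelihood of each module's prediction error, and equal priors across modules; the names `responsibilities`, `blended_command` and `sigma` are hypothetical and do not come from the HMOSAIC implementation.

```python
import math


def responsibilities(predictions, observed, sigma=1.0):
    """Normalized likelihoods of each module's prediction (assumed Gaussian).

    A module whose predictive model is more accurate receives a larger
    share of the responsibility; the shares sum to 1.
    """
    likelihoods = [
        math.exp(-((observed - p) ** 2) / (2.0 * sigma ** 2))
        for p in predictions
    ]
    total = sum(likelihoods)
    return [lk / total for lk in likelihoods]


def blended_command(commands, resp):
    """Weight each control model's output by its module's responsibility."""
    return sum(r * c for r, c in zip(resp, commands))


# Hypothetical example with three modules.
resp = responsibilities(predictions=[0.0, 1.0, 5.0], observed=1.1)
command = blended_command(commands=[-1.0, 0.5, 2.0], resp=resp)
```

Here the second module predicts the observation best, so it dominates the blended motor command. In a learning setting the same weights would also scale each module's update, which is what makes the learning competitive.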
 In contrast, the higher level receives two inputs: an abstract (symbolic) desired trajectory and the posterior probabilities of its subordinate level, which indicate the modules that are playing a crucial role at the lower level in the current behavioral situation. The higher control model generates, as its motor command, prior probabilities for the lower-level modules, and therefore prioritizes which lower-level modules should be selected. The higher predictive model learns to estimate the posterior probability at the next time step. As at the lower level, the precision of the prediction weights both the outputs from the controllers and the learning of the predictors and controllers. Thus, the lower- and higher-level modules interact bi-directionally during the learning and control of hierarchically organized movements. Our simulation confirmed that HMOSAIC can automatically learn both elementary movements and their hierarchical temporal order through sensorimotor learning, and that the sequence-specific neural firing pattern at the higher level is similar to the neural activity of the monkey supplementary motor area in sequential motor control tasks.
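The bi-directional interaction can be sketched as a product of top-down priors and bottom-up likelihoods. This is an illustration only, assuming that the posterior of each lower-level module is its prior times its likelihood, normalized across modules; the function name `posterior` and all numerical values are hypothetical.

```python
def posterior(priors, likelihoods):
    """Combine top-down priors with bottom-up likelihoods (assumed Bayes rule)."""
    joint = [p * lk for p, lk in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical likelihoods: modules 0 and 1 predict equally well.
likelihoods = [0.5, 0.5, 0.1]

# Without guidance from the higher level, the two modules tie.
flat = posterior([1 / 3, 1 / 3, 1 / 3], likelihoods)

# A higher-level prior favoring module 1 breaks the tie.
biased = posterior([0.1, 0.8, 0.1], likelihoods)
```

When the lower-level evidence is ambiguous, the prior supplied by the higher level decides which module is selected, and the resulting posterior is what would be passed back up as input to the higher level.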

People Involved in the above Topics

- Masahiko Haruno
- Satoshi Tada
- Brian Coe
- Mitsuo Kawato

Collaborator
- Daniel Wolpert