Inferring the best response from a large range of possible actions frequently involves difficult computations that the brain is unlikely to perform rapidly. Nevertheless, humans often do well in such situations. Donoso et al. demonstrate that a computational model designed to integrate reward learning with probabilistic inference (i.e., computing the odds) and a form of hypothesis testing can approximate the optimal solution in a neurobiologically plausible manner. Moreover, the model provides a good fit to human behavior and, as seen by functional magnetic resonance imaging (fMRI), is represented in the activity patterns of specific prefrontal and striatal brain regions.
Figure - Reasoning regions - Activity in specific brain regions tracks the reliability of executed strategies (medial prefrontal cortex), alternative strategies (frontopolar cortex, not shown), the need to explore new strategies (dorsal anterior cingulate cortex), and the confirmation of strategies as valid (ventral striatum).

The Donoso et al. abstract:
The prefrontal cortex (PFC) subserves reasoning in the service of adaptive behavior. Little is known, however, about the architecture of reasoning processes in the PFC. Using computational modeling and neuroimaging, we show here that the human PFC has two concurrent inferential tracks: (i) one from ventromedial to dorsomedial PFC regions that makes probabilistic inferences about the reliability of the ongoing behavioral strategy and arbitrates between adjusting this strategy versus exploring new ones from long-term memory, and (ii) another from polar to lateral PFC regions that makes probabilistic inferences about the reliability of two or three alternative strategies and arbitrates between exploring new strategies versus exploiting these alternative ones. The two tracks interact and, along with the striatum, realize hypothesis testing for accepting versus rejecting newly created strategies.
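To make the idea of "probabilistic inferences about the reliability" of strategies a little more concrete, here is a toy sketch in Python. It is not the model Donoso et al. fitted. The strategy names, the catch-all "none" hypothesis, the likelihood values, and the 0.5 threshold are all illustrative assumptions. It shows only the general pattern the abstract describes: update a belief about how reliable each monitored strategy is after an outcome, then either exploit the most reliable strategy or explore a new one.

```python
def update_reliabilities(prior, likelihood):
    """Bayes' rule: posterior is proportional to prior times outcome likelihood.

    Both arguments are dicts keyed by strategy name. The hypothetical key
    "none" stands for "no monitored strategy currently applies".
    """
    unnormalized = {name: prior[name] * likelihood[name] for name in prior}
    total = sum(unnormalized.values())
    return {name: value / total for name, value in unnormalized.items()}


def arbitrate(reliability, threshold=0.5):
    """Exploit the most reliable monitored strategy, or explore a new one.

    The 0.5 threshold is an assumption made for this sketch.
    """
    monitored = {k: v for k, v in reliability.items() if k != "none"}
    best = max(monitored, key=monitored.get)
    if monitored[best] > threshold:
        return ("exploit", best)
    return ("explore", None)


# Illustrative beliefs: strategy "A" is currently trusted.
prior = {"A": 0.6, "B": 0.2, "none": 0.2}

# Assumed probability of the observed outcome (an unexpected failure
# while acting on "A") under each hypothesis.
likelihood = {"A": 0.1, "B": 0.5, "none": 0.5}

decision_before = arbitrate(prior)
posterior = update_reliabilities(prior, likelihood)
decision_after = arbitrate(posterior)
```

With these made-up numbers, the belief in "A" falls below the threshold after a single surprising failure and no other monitored strategy rises above it, so the sketch switches from exploiting "A" to exploring. The hypothesis-testing step described in the abstract, in which a newly created strategy is accepted or rejected, is not modeled here.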