Showing posts with label vision. Show all posts

Monday, March 11, 2024

How AI’s GPT engines work - Lanier’s forest and trees metaphor.

Jaron Lanier does a piece in The New Yorker titled "How to Picture A.I." (if you hit the paywall by clicking the link, try opening an 'empty tab' on your browser, then copy and paste in the URL that got you the paywall). I tried to do my usual sampling of small chunks of text to convey the message, but found that very difficult, and so I pass along several early paragraphs and urge you to read the whole article. Lanier's metaphors give me a better sense of what is going on in a GPT engine, but I'm still largely mystified. Anyway, here's some text:
In this piece, I hope to explain how such A.I. works in a way that floats above the often mystifying technical details and instead emphasizes how the technology modifies—and depends on—human input.
Let’s try thinking, in a fanciful way, about distinguishing a picture of a cat from one of a dog. Digital images are made of pixels, and we need to do something to get beyond just a list of them. One approach is to lay a grid over the picture that measures something a little more than mere color. For example, we could start by measuring the degree to which colors change in each grid square—now we have a number in each square that might represent the prominence of sharp edges in that patch of the image. A single layer of such measurements still won’t distinguish cats from dogs. But we can lay down a second grid over the first, measuring something about the first grid, and then another, and another. We can build a tower of layers, the bottommost measuring patches of the image, and each subsequent layer measuring the layer beneath it. This basic idea has been around for half a century, but only recently have we found the right tweaks to get it to work well. No one really knows whether there might be a better way still.
Here I will make our cartoon almost like an illustration in a children’s book. You can think of a tall structure of these grids as a great tree trunk growing out of the image. (The trunk is probably rectangular instead of round, since most pictures are rectangular.) Inside the tree, each little square on each grid is adorned with a number. Picture yourself climbing the tree and looking inside with an X-ray as you ascend: numbers that you find at the highest reaches depend on numbers lower down.
Alas, what we have so far still won’t be able to tell cats from dogs. But now we can start “training” our tree. (As you know, I dislike the anthropomorphic term “training,” but we’ll let it go.) Imagine that the bottom of our tree is flat, and that you can slide pictures under it. Now take a collection of cat and dog pictures that are clearly and correctly labelled “cat” and “dog,” and slide them, one by one, beneath its lowest layer. Measurements will cascade upward toward the top layer of the tree—the canopy layer, if you like, which might be seen by people in helicopters. At first, the results displayed by the canopy won’t be coherent. But we can dive into the tree—with a magic laser, let’s say—to adjust the numbers in its various layers to get a better result. We can boost the numbers that turn out to be most helpful in distinguishing cats from dogs. The process is not straightforward, since changing a number on one layer might cause a ripple of changes on other layers. Eventually, if we succeed, the numbers on the leaves of the canopy will all be ones when there’s a dog in the photo, and they will all be twos when there’s a cat.
Now, amazingly, we have created a tool—a trained tree—that distinguishes cats from dogs. Computer scientists call the grid elements found at each level “neurons,” in order to suggest a connection with biological brains, but the similarity is limited. While biological neurons are sometimes organized in “layers,” such as in the cortex, they are not always; in fact, there are fewer layers in the cortex than in an artificial neural network. With A.I., however, it’s turned out that adding a lot of layers vastly improves performance, which is why you see the term “deep” so often, as in “deep learning”—it means a lot of layers.
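Lanier's tower of grids, with a "magic laser" adjusting the numbers until the canopy gives the right answer, is essentially a small neural network trained by gradient descent. Here is a minimal toy sketch of that idea, with synthetic data standing in for the labeled cat and dog pictures (not a real image pipeline, and far simpler than a GPT engine):

```python
import numpy as np

rng = np.random.default_rng(0)

# 200 toy "images" of 16 numbers each (think: one measurement per grid square).
# "Dogs" (label 0) cluster around -1, "cats" (label 1) around +1.
X = np.concatenate([rng.normal(-1.0, 1.0, (100, 16)),
                    rng.normal(+1.0, 1.0, (100, 16))])
y = np.concatenate([np.zeros(100), np.ones(100)])[:, None]

W1 = rng.normal(0, 0.1, (16, 8))   # lower layer: measures the image
W2 = rng.normal(0, 0.1, (8, 1))    # canopy layer: measures the layer below

def forward(X):
    h = np.tanh(X @ W1)                  # numbers inside the tree
    p = 1.0 / (1.0 + np.exp(-h @ W2))    # numbers displayed at the canopy
    return p, h

# The "magic laser": nudge every number to reduce the labeling error,
# letting a change on one layer ripple through the other (backpropagation).
for _ in range(500):
    p, h = forward(X)
    err = p - y
    gW2 = h.T @ err / len(X)
    gW1 = X.T @ ((err @ W2.T) * (1.0 - h**2)) / len(X)
    W2 -= 0.5 * gW2
    W1 -= 0.5 * gW1

accuracy = ((forward(X)[0] > 0.5) == (y > 0.5)).mean()
```

After training, sliding a new picture under the tree (a forward pass) makes the canopy read near 0 for dogs and near 1 for cats.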

Monday, October 16, 2023

Using AI to find retinal biomarkers for patient sex that ophthalmologists can be trained to see.

Delavari et al. demonstrate the use of AI to find aspects of human retinal images that identify whether they come from male or female patients, a distinction that clinical ophthalmologists had not previously been able to make. Here is their abstract:
We present a structured approach to combine explainability of artificial intelligence (AI) with the scientific method for scientific discovery. We demonstrate the utility of this approach in a proof-of-concept study where we uncover biomarkers from a convolutional neural network (CNN) model trained to classify patient sex in retinal images. This is a trait that is not currently recognized by diagnosticians in retinal images, yet, one successfully classified by CNNs. Our methodology consists of four phases: In Phase 1, CNN development, we train a visual geometry group (VGG) model to recognize patient sex in retinal images. In Phase 2, Inspiration, we review visualizations obtained from post hoc interpretability tools to make observations, and articulate exploratory hypotheses. Here, we listed 14 hypotheses on retinal sex differences. In Phase 3, Exploration, we test all exploratory hypotheses on an independent dataset. Out of 14 exploratory hypotheses, nine revealed significant differences. In Phase 4, Verification, we re-tested the nine flagged hypotheses on a new dataset. Five were verified, revealing (i) significantly greater length, (ii) more nodes, and (iii) more branches of retinal vasculature, (iv) greater retinal area covered by the vessels in the superior temporal quadrant, and (v) darker peripapillary region in male eyes. Finally, we trained a group of ophthalmologists (N=26) to recognize the novel retinal features for sex classification. While their pretraining performance was not different from chance level or the performance of a nonexpert group (N=31), after training, their performance increased significantly (p<0.001, d=2.63). These findings showcase the potential for retinal biomarker discovery through CNN applications, with the added utility of empowering medical practitioners with new diagnostic capabilities to enhance their clinical toolkit.
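The Exploration and Verification phases amount to standard significance testing of each candidate feature on held-out data. Here is a toy sketch with entirely made-up numbers (a hypothetical "vessel length" feature and a permutation test; the actual study used independent clinical datasets and 14 specific hypotheses):

```python
import numpy as np

rng = np.random.default_rng(1)
male = rng.normal(110, 10, 80)    # hypothetical vessel-length values, male eyes
female = rng.normal(100, 10, 80)  # hypothetical vessel-length values, female eyes

observed = male.mean() - female.mean()

# Permutation test: if sex were irrelevant to this feature, shuffling the
# labels should produce differences as large as the observed one fairly often.
pooled = np.concatenate([male, female])
extreme = 0
n_perm = 10_000
for _ in range(n_perm):
    rng.shuffle(pooled)
    if pooled[:80].mean() - pooled[80:].mean() >= observed:
        extreme += 1
p_value = (extreme + 1) / (n_perm + 1)   # one-sided p, with the usual +1 correction
```

A hypothesis that survives this test on one independent dataset (Phase 3) is then re-tested on yet another dataset (Phase 4) before being taken seriously.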

Monday, October 10, 2022

A sleeping touch improves vision.

Interesting work reported by Onuki et al. in the Journal of Neuroscience:

Tactile sensations can bias our visual perception as a form of cross-modal interaction. However, it was reported only in the awake state. Here we show that repetitive directional tactile motion stimulation on the fingertip during slow wave sleep selectively enhanced subsequent visual motion perception. Moreover, the visual improvement was positively associated with sleep slow wave activity. The tactile motion stimulation during slow wave activity increased the activation at the high beta frequency over the occipital electrodes. The visual improvement occurred in agreement with a hand-centered reference frame. These results suggest that our sleeping brain can interpret tactile information based on a hand-centered reference frame, which can cause the sleep-dependent improvement of visual motion detection.


Tactile sensations can bias visual perception in the awake state while visual sensitivity is known to be facilitated by sleep. It remains unknown, however, whether the tactile sensation during sleep can bias the visual improvement after sleep. Here, we performed nap experiments in human participants (n = 56, 18 males, 38 females) to demonstrate that repetitive tactile motion stimulation on the fingertip during slow wave sleep selectively enhanced subsequent visual motion detection. The visual improvement was associated with slow wave activity. The high activation at the high beta frequency was found in the occipital electrodes after the tactile motion stimulation during sleep, indicating a visual-tactile cross-modal interaction during sleep. Furthermore, a second experiment (n = 14, 14 females) to examine whether a hand- or head-centered coordination is dominant for the interpretation of tactile motion direction showed that the biasing effect on visual improvement occurs according to the hand-centered coordination. These results suggest that tactile information can be interpreted during sleep, and can induce the selective improvement of post-sleep visual motion detection.

Wednesday, March 02, 2022

AI-synthesized faces are indistinguishable from real faces and more trustworthy

A sobering open-source article from Nightingale and Farid that suggests the impossibility of sorting out the real from the fake. Synthesized faces tend to look more like average faces, which are themselves deemed more trustworthy.
Artificial intelligence (AI)–synthesized text, audio, image, and video are being weaponized for the purposes of nonconsensual intimate imagery, financial fraud, and disinformation campaigns. Our evaluation of the photorealism of AI-synthesized faces indicates that synthesis engines have passed through the uncanny valley and are capable of creating faces that are indistinguishable—and more trustworthy—than real faces.

Wednesday, July 07, 2021

Some people have no "Mind's Eye"

Carl Zimmer does a nice piece on the tens of millions of people who don't experience a mental camera. The condition has been named aphantasia, and millions more experience extraordinarily strong mental imagery, called hyperphantasia. British neurologist Adam Zeman estimates that 2.6 percent of people have hyperphantasia and that 0.7 percent have aphantasia...a website called the Aphantasia Network has grown into a hub for people with the condition and for researchers studying them.
The vast majority of people who report a lack of a mind’s eye have no memory of ever having had one, suggesting that they had been born without it. Yet...they had little trouble recalling things they had seen. When asked whether grass or pine tree needles are a darker shade of green, for example, they correctly answered that the needles are.
Researchers are .. starting to use brain scans to find the circuitry that gives rise to aphantasia and hyperphantasia. So far, that work suggests that mental imagery emerges from a network of brain regions that talk to each other...Decision-making regions at the front of the brain send signals to regions at the back, which normally make sense of information from the eyes. Those top-down signals can cause the visual regions to produce images that aren’t there.
In a study published in May, Dr. Zeman and his colleagues scanned the brains of 24 people with aphantasia, 25 people with hyperphantasia and 20 people with neither condition...The people with hyperphantasia had stronger activity in regions linking the front and back of the brain. They may be able to send more potent signals from decision-making regions of the front of the brain to the visual centers at the back.

Thursday, June 03, 2021

Optogenetics used to induce pair bonding and restore vision.

I want to note two striking technical advances that make use of the light-activated protein rhodopsin that I spent 36 years of my laboratory life studying. Using genetic techniques, a version of this protein found in algae, called channelrhodopsin, can be inserted into nerve cells so that they become activated by light. Hughes provides a lucid explanation of a technical tour de force in bioengineering reported by Yang et al. They used transgenic mice in which light-sensitive dopaminergic (DA) neurons in the ventral tegmental area (VTA) brain region (involved in processing reward and promoting social behavior) can be activated by blue light pulses from a tiny LED device implanted under the skull. It is known that some VTA areas fire in synchrony when two mice (or humans) are cooperating or bonding. When two male mice were dropped into a cage, they exhibited mild animus towards each other, but when both were zapped with blue light at the same high frequency they clung to and started grooming each other! (Aside from being forbidden and impractical in humans, how about this means of getting someone to like you!...all you would have to do is control the transmitters controlling VTA DA neuron activity in yourself and your intended.) 

A second striking use of optogenetics is reported in Zimmer's summary of work of Sahel et al., who have partially restored sight in one eye of a blind man with retinitis pigmentosa, a hereditary disease that destroys light-sensitive photoreceptor cells in the retina but spares the ganglion cell layer whose axons normally send visual information to the brain. Here is the Sahel et al. abstract:

Optogenetics may enable mutation-independent, circuit-specific restoration of neuronal function in neurological diseases. Retinitis pigmentosa is a neurodegenerative eye disease where loss of photoreceptors can lead to complete blindness. In a blind patient, we combined intraocular injection of an adeno-associated viral vector encoding ChrimsonR with light stimulation via engineered goggles. The goggles detect local changes in light intensity and project corresponding light pulses onto the retina in real time to activate optogenetically transduced retinal ganglion cells. The patient perceived, located, counted and touched different objects using the vector-treated eye alone while wearing the goggles. During visual perception, multichannel electroencephalographic recordings revealed object-related activity above the visual cortex. The patient could not visually detect any objects before injection with or without the goggles or after injection without the goggles. This is the first reported case of partial functional recovery in a neurodegenerative disease after optogenetic therapy.

Friday, April 16, 2021

Vision: What’s so special about words?

Readers are sensitive to the statistics of written language. New work by Vidal et al. suggests that this sensitivity may be driven by the same domain-general mechanisms that enable the visual system to detect statistical regularities in the visual environment. 


• Readers presented with orthographic-like stimuli are sensitive to bigram frequencies 
• An analogous effect emerges with images of made-up objects and visual gratings 
• These data suggest that the reading system might rely on general-purpose mechanisms 
• This calls for considering reading in the broader context of visual neuroscience
As writing systems are a relatively novel invention (slightly over 5 kya), they could not have influenced the evolution of our species. Instead, reading might recycle evolutionary older mechanisms that originally supported other tasks and preceded the emergence of written language. Accordingly, it has been shown that baboons and pigeons can be trained to distinguish words from nonwords based on orthographic regularities in letter co-occurrence. This suggests that part of what is usually considered reading-specific processing could be performed by domain-general visual mechanisms. Here, we tested this hypothesis in humans: if the reading system relies on domain-general visual mechanisms, some of the effects that are often found with orthographic material should also be observable with non-orthographic visual stimuli. We performed three experiments using the same exact design but with visual stimuli that progressively departed from orthographic material. Subjects were passively familiarized with a set of composite visual items and tested in an oddball paradigm for their ability to detect novel stimuli. Participants showed robust sensitivity to the co-occurrence of features (“bigram” coding) with strings of letter-like symbols but also with made-up 3D objects and sinusoidal gratings. This suggests that the processing mechanisms involved in the visual recognition of novel words also support the recognition of other novel visual objects. These mechanisms would allow the visual system to capture statistical regularities in the visual environment. We hope that this work will inspire models of reading that, although addressing its unique aspects, place it within the broader context of vision.

Tuesday, January 19, 2021

A third visual pathway specialized for social perception.

Forty years after Ungerleider and Mishkin proposed our current model of the primate cortex as using two major visual pathways along its ventral and dorsal surfaces that respectively specialize in computing the 'what' and 'where' content of visual stimuli, Pitcher and Ungerleider now summarize evidence that this picture has to be expanded to include a third pathway specialized for moving social visual perceptions, especially of faces. Here are their core points, followed by a descriptive graphic from their article.
The two-visual pathway model of primate visual cortex needs to be updated. We propose the existence of a third visual pathway on the lateral brain surface that is anatomically segregated from the dorsal and ventral pathways.
The third pathway exists in human and non-human primates. In humans, the third pathway projects from early visual cortex into the superior temporal sulcus (STS). In macaques the third pathway projects from early visual cortex into the dorsal bank and fundus of the STS.
The third pathway has distinct functional properties. It selectively responds to moving faces and bodies. Visual field-mapping studies show that the third pathway responds to faces across the visual field to a greater extent than the ventral pathway.
The third pathway computes a range of higher sociocognitive functions based on dynamic social cues. These include facial expression recognition, eye gaze discrimination, the audiovisual integration of speech, and interpreting the actions and behaviors of other biological organisms.


ADDED NOTE: Leslie Ungerleider died as 2020 drew to a close. She was a towering figure in the neuroscience community. This obituary by Sabine Kastner in Neuron pays her a fitting tribute.

Friday, August 21, 2020

Gaze deflection reveals how gaze cueing is tuned to extract the mind behind the eyes

Here is a fascinating bit from Colombatto et al.:  

We report an empirical study of gaze deflection—a common experience in which you turn to look in a different direction when someone “catches” you staring at them. We show that gaze cueing (the automatic orienting of attention to locations at which others are looking) is far weaker for such displays, even when the actual eye and head movements are identical to more typical intentional gazes. This demonstrates how gaze cueing is driven by the perception of minds, not eyes, and it serves as a case study of both how social dynamics can shape visual attention in a sophisticated manner and how vision science can contribute to our understanding of common social phenomena.
Suppose you are surreptitiously looking at someone, and then when they catch you staring at them, you immediately turn away. This is a social phenomenon that almost everyone experiences occasionally. In such experiences—which we will call gaze deflection—the “deflected” gaze is not directed at anything in particular but simply away from the other person. As such, this is a rare instance where we may turn to look in a direction without intending to look there specifically. Here we show that gaze cues are markedly less effective at orienting an observer’s attention when they are seen as deflected in this way—even controlling for low-level visual properties. We conclude that gaze cueing is a sophisticated mental phenomenon: It is not merely driven by perceived eye or head motions but is rather well tuned to extract the “mind” behind the eyes.

Friday, April 17, 2020

Looking at pictures makes your brain’s visual cortex swell!

Wow, talk about dynamic neuroplasticity...Mansson et al (open source) take observations of how rapidly our brains can change to a whole new level. They show that our visual cortex gets bigger when viewing a picture versus a simple fixation cross.
Measuring brain morphology with non-invasive structural magnetic resonance imaging is common practice, and can be used to investigate neuroplasticity. Brain morphology changes have been reported over the course of weeks, days, and hours in both animals and humans. If such short-term changes occur even faster, rapid morphological changes while being scanned could have important implications. In a randomized within-subject study on 47 healthy individuals, two high-resolution T1-weighted anatomical images were acquired (263 s each) per individual. The images were acquired during passive viewing of pictures or a fixation cross. Two common pipelines for analyzing brain images were used: voxel-based morphometry on gray matter (GM) volume and surface-based cortical thickness. We found that the measures of both GM volume and cortical thickness showed increases in the visual cortex while viewing pictures relative to a fixation cross. The increase was distributed across the two hemispheres and significant at a corrected level. Thus, brain morphology enlargements were detected in less than 263 s. Neuroplasticity is a far more dynamic process than previously shown, suggesting that individuals’ current mental state affects indices of brain morphology. This needs to be taken into account in future morphology studies and in everyday clinical practice.

Friday, December 13, 2019

Our visual system uses recurrence in its representational dynamics

Fundamental work from Kietzmann et al. shows how recurrence - lateral and top-down feedback from higher visual areas to the primary visual areas that first register visual input - occurs as visual representations are formed. This process is missing from the engineering and neuroscience models that emphasize feedforward neural networks. (Click the link to the article and scroll down to see a fascinating video of their real-time magnetoencephalography (MEG) measurements.)

Understanding the computational principles that underlie human vision is a key challenge for neuroscience and could help improve machine vision. Feedforward neural network models process their input through a deep cascade of computations. These models can recognize objects in images and explain aspects of human rapid recognition. However, the human brain contains recurrent connections within and between stages of the cascade, which are missing from the models that dominate both engineering and neuroscience. Here, we measure and model the dynamics of human brain activity during visual perception. We compare feedforward and recurrent neural network models and find that only recurrent models can account for the dynamic transformations of representations among multiple regions of visual cortex.
The human visual system is an intricate network of brain regions that enables us to recognize the world around us. Despite its abundant lateral and feedback connections, object processing is commonly viewed and studied as a feedforward process. Here, we measure and model the rapid representational dynamics across multiple stages of the human ventral stream using time-resolved brain imaging and deep learning. We observe substantial representational transformations during the first 300 ms of processing within and across ventral-stream regions. Categorical divisions emerge in sequence, cascading forward and in reverse across regions, and Granger causality analysis suggests bidirectional information flow between regions. Finally, recurrent deep neural network models clearly outperform parameter-matched feedforward models in terms of their ability to capture the multiregion cortical dynamics. Targeted virtual cooling experiments on the recurrent deep network models further substantiate the importance of their lateral and top-down connections. These results establish that recurrent models are required to understand information processing in the human ventral stream.
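The feedforward-versus-recurrent contrast can be sketched in a few lines: a feedforward layer computes one static response to an input, while adding lateral recurrent connections makes the response to the very same input keep evolving over time steps, which is the kind of representational dynamics the authors measure. (Random untrained weights; a structural illustration only, not their model.)

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 10)               # a fixed visual input
W_ff = rng.normal(0, 0.3, (10, 10))    # feedforward weights
W_rec = rng.normal(0, 0.3, (10, 10))   # lateral (recurrent) weights

h_ff = np.tanh(W_ff @ x)               # feedforward model: one static response

h = np.zeros(10)                       # recurrent model: unroll over time
states = []
for t in range(5):
    h = np.tanh(W_ff @ x + W_rec @ h)  # each step mixes the input with the
    states.append(h.copy())            # network's own previous state

# At the first step the two models agree; afterward the recurrent response
# keeps transforming even though the input never changes.
```

That continuing transformation under a constant stimulus is what a purely feedforward model, by construction, cannot produce.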

Friday, August 23, 2019

What we see is biased by ongoing neural activity.

Rassi et al. (open source) show that ongoing neural activity in our brain, as in the fusiform face area, can influence what we perceive in an ambiguous sensory stimulus such as the Rubin face/vase illusion.

Ongoing neural activity influences stimulus detection—that is, whether or not an object is seen. Here, we uncover how it could influence the content of what is seen. In ambiguous situations, for instance, ongoing neural fluctuations might bias perception toward one or the other interpretation. Indeed, we show increased information flow from category-selective brain regions (here, the fusiform face area [FFA]) to the primary visual cortex before participants subsequently report seeing faces rather than a vase in the Rubin face/vase illusion. Our results identify a neural connectivity pathway that biases future perception and helps determine mental content.
Ongoing fluctuations in neural excitability and in networkwide activity patterns before stimulus onset have been proposed to underlie variability in near-threshold stimulus detection paradigms—that is, whether or not an object is perceived. Here, we investigated the impact of prestimulus neural fluctuations on the content of perception—that is, whether one or another object is perceived. We recorded neural activity with magnetoencephalography (MEG) before and while participants briefly viewed an ambiguous image, the Rubin face/vase illusion, and required them to report their perceived interpretation in each trial. Using multivariate pattern analysis, we showed robust decoding of the perceptual report during the poststimulus period. Applying source localization to the classifier weights suggested early recruitment of primary visual cortex (V1) and ∼160-ms recruitment of the category-sensitive fusiform face area (FFA). These poststimulus effects were accompanied by stronger oscillatory power in the gamma frequency band for face vs. vase reports. In prestimulus intervals, we found no differences in oscillatory power between face vs. vase reports in V1 or in FFA, indicating similar levels of neural excitability. Despite this, we found stronger connectivity between V1 and FFA before face reports for low-frequency oscillations. Specifically, the strength of prestimulus feedback connectivity (i.e., Granger causality) from FFA to V1 predicted not only the category of the upcoming percept but also the strength of poststimulus neural activity associated with the percept. Our work shows that prestimulus network states can help shape future processing in category-sensitive brain regions and in this way bias the content of visual experiences.

Friday, July 26, 2019

Deindividuation of outgroup faces occurs at the earliest stages of visual perception.

From Hughes et al:
A hallmark of intergroup biases is the tendency to individuate members of one’s own group but process members of other groups categorically. While the consequences of these biases for stereotyping and discrimination are well-documented, their early perceptual underpinnings remain less understood. Here, we investigated the neural mechanisms of this effect by testing whether high-level visual cortex is differentially tuned in its sensitivity to variation in own-race versus other-race faces. Using a functional MRI adaptation paradigm, we measured White participants’ habituation to blocks of White and Black faces that parametrically varied in their groupwise similarity. Participants showed a greater tendency to individuate own-race faces in perception, showing both greater release from adaptation to unique identities and increased sensitivity in the adaptation response to physical difference among faces. These group differences emerge in the tuning of early face-selective cortex and mirror behavioral differences in the memory and perception of own- versus other-race faces. Our results suggest that biases for other-race faces emerge at some of the earliest stages of sensory perception.

Friday, July 19, 2019

It’s never simple...The tidy textbook story about the primary visual cortex is wrong.

When I was a postdoc in the Harvard Neurobiology department in the mid-1960s I used to have afternoon tea with the Hubel and Wiesel group. These are the guys who got a Nobel prize for, among other things, finding that the primary visual cortex is organized into cortical columns of cells that respond to lines of different preferred orientations. Another grouping of columns, called ‘blobs,’ responded selectively to color and brightness but not orientation. These two different kinds of groups sent their outputs to higher visual areas that were supposed to integrate the information. My neurobiology course lectures and my Biology of Mind book showed drawings illustrating these tidy distinctions.

Sigh… now Garg et al. come along with two-photon calcium imaging to probe a very large spatial and chromatic visual stimulus space and map functional microarchitecture of thousands of neurons with single-cell resolution. They show that processing of orientation and color is combined at the earliest stages of visual processing, totally challenging the existing model. Their abstract:
Previous studies support the textbook model that shape and color are extracted by distinct neurons in primate primary visual cortex (V1). However, rigorous testing of this model requires sampling a larger stimulus space than previously possible. We used stable GCaMP6f expression and two-photon calcium imaging to probe a very large spatial and chromatic visual stimulus space and map functional microarchitecture of thousands of neurons with single-cell resolution. Notable proportions of V1 neurons strongly preferred equiluminant color over achromatic stimuli and were also orientation selective, indicating that orientation and color in V1 are mutually processed by overlapping circuits. Single neurons could precisely and unambiguously code for both color and orientation. Further analyses revealed systematic spatial relationships between color tuning, orientation selectivity, and cytochrome oxidase histology.

Tuesday, March 19, 2019

MRI can detect the content of decisions 11 seconds before they are made.

Goldhill points to and summarizes work of Koenig-Robert and Pearson. Their abstract:
Is it possible to predict the freely chosen content of voluntary imagery from prior neural signals? Here we show that the content and strength of future voluntary imagery can be decoded from activity patterns in visual and frontal areas well before participants engage in voluntary imagery. Participants freely chose which of two images to imagine. Using functional magnetic resonance (fMRI) and multi-voxel pattern analysis, we decoded imagery content as far as 11 seconds before the voluntary decision, in visual, frontal and subcortical areas. Decoding in visual areas in addition to perception-imagery generalization suggested that predictive patterns correspond to visual representations. Importantly, activity patterns in the primary visual cortex (V1) from before the decision, predicted future imagery vividness. Our results suggest that the contents and strength of mental imagery are influenced by sensory-like neural representations that emerge spontaneously before volition.
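Multi-voxel pattern analysis is, at heart, a classifier trained on voxel activity patterns. Here is a toy sketch with synthetic "prestimulus" patterns and a simple nearest-centroid decoder (the study used real fMRI data and more elaborate classifiers; every number here is invented):

```python
import numpy as np

rng = np.random.default_rng(2)
n_vox = 50
pattern_a = rng.normal(0, 1, n_vox)    # hypothetical voxel pattern, image A
pattern_b = rng.normal(0, 1, n_vox)    # hypothetical voxel pattern, image B

def simulate_trials(pattern, n):
    # Weak pattern-related signal buried in trial-to-trial noise.
    return pattern * 0.5 + rng.normal(0, 1, (n, n_vox))

train_a, train_b = simulate_trials(pattern_a, 40), simulate_trials(pattern_b, 40)
test_a, test_b = simulate_trials(pattern_a, 20), simulate_trials(pattern_b, 20)

cent_a, cent_b = train_a.mean(0), train_b.mean(0)   # mean pattern per choice

def decode(trials):
    # Label each trial by its nearer centroid: 0 = image A, 1 = image B.
    d_a = np.linalg.norm(trials - cent_a, axis=1)
    d_b = np.linalg.norm(trials - cent_b, axis=1)
    return (d_b < d_a).astype(int)

accuracy = np.mean(np.concatenate([decode(test_a) == 0, decode(test_b) == 1]))
```

Above-chance accuracy on held-out trials is what licenses the claim that the upcoming imagery content was "decodable" from activity recorded before the decision.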

Thursday, February 21, 2019

Watching social influence start to bias perceptual integration as children develop

From Large et al.:
The opinions of others have a profound influence on decision making in adults. The impact of social influence appears to change during childhood, but the underlying mechanisms and their development remain unclear. We tested 125 neurotypical children between the ages of 6 and 14 years on a perceptual decision task about 3D-motion figures under informational social influence. In these children, a systematic bias in favor of the response of another person emerged at around 12 years of age, regardless of whether the other person was an age-matched peer or an adult. Drift diffusion modeling indicated that this social influence effect in neurotypical children was due to changes in the integration of sensory information, rather than solely a change in decision behavior. When we tested a smaller cohort of 30 age- and IQ-matched autistic children on the same task, we found some early decision bias to social influence, but no evidence for the development of systematic integration of social influence into sensory processing for any age group. Our results suggest that by the early teens, typical neurodevelopment allows social influence to systematically bias perceptual processes in a visual task previously linked to the dorsal visual stream. That the same bias did not appear to emerge in autistic adolescents in this study may explain some of their difficulties in social interactions.
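Drift diffusion modeling treats a choice as noisy evidence accumulating toward one of two bounds; the authors' distinction is that social influence changed the drift rate (biased sensory integration) rather than only the decision criterion. A toy simulation with invented parameters:

```python
import numpy as np

rng = np.random.default_rng(3)

def ddm_trial(drift, start=0.0, bound=1.0, dt=0.01, noise=1.0):
    """Accumulate noisy evidence until it hits +bound (the socially cued
    option) or -bound (the other option); return which bound was hit."""
    x = start
    while abs(x) < bound:
        x += drift * dt + noise * np.sqrt(dt) * rng.normal()
    return 1 if x > 0 else 0

# With zero drift, choices split roughly evenly; a drift toward the cued
# option, modeling biased integration of the sensory evidence, shifts the
# proportion of cued choices well above chance.
p_neutral = np.mean([ddm_trial(drift=0.0) for _ in range(2000)])
p_biased = np.mean([ddm_trial(drift=0.8) for _ in range(2000)])
```

Fitting such a model to choices and response times is what lets the authors say where in the decision process the social bias enters.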

Tuesday, August 07, 2018

How pupil mimicry promotes trust.

Prochazkova et al. show that pupil mimicry promotes trust through the brain's theory-of-mind network:

Trusting others is central for cooperative endeavors to succeed. To decide whether to trust or not, people generally make eye contact. As pupils of interaction partners align, mimicking pupil size helps them to make well-informed trust decisions. How the brain integrates information from the partner and from their own bodily feedback to make such decisions was unknown because previous research investigated these processes separately. Herein, we take a multimethod approach and demonstrate that pupil mimicry is regulated by the theory-of-mind network, and informs decisions of trust by activating the precuneus. This evolutionarily ancient neurophysiological mechanism that is active in human adults, infants, and chimpanzees promotes affiliation, bonding, and trust through mimicry.
The human eye can provide powerful insights into the emotions and intentions of others; however, how pupillary changes influence observers’ behavior remains largely unknown. The present fMRI–pupillometry study revealed that when the pupils of interacting partners synchronously dilate, trust is promoted, which suggests that pupil mimicry affiliates people. Here we provide evidence that pupil mimicry modulates trust decisions through the activation of the theory-of-mind network (precuneus, temporo-parietal junction, superior temporal sulcus, and medial prefrontal cortex). This network was recruited during pupil-dilation mimicry compared with interactions without mimicry or compared with pupil-constriction mimicry. Furthermore, the level of theory-of-mind engagement was proportional to an individual’s susceptibility to pupil-dilation mimicry. These data reveal a fundamental mechanism by which an individual’s pupils trigger neurophysiological responses within an observer: when interacting partners synchronously dilate their pupils, humans come to feel reflections of the inner states of others, which fosters trust formation.

Friday, July 13, 2018

Playing with proteins in virtual reality.

Much of my mental effort while I was doing laboratory research on the mechanisms of visual transduction (changing light into a nerve signal in our retinal rod and cone photoreceptor cells) was devoted to trying to visualize how proteins might interact with each other. I spent many hours using molecular model kits of color-coded plastic atoms one could plug together with flexible joints, like the Tinkertoys of my childhood. If only I had had the system now described by O'Connor et al.! Have a look at the video below showing the manipulation of molecular dynamics in a VR environment, and here is their abstract:
We describe a framework for interactive molecular dynamics in a multiuser virtual reality (VR) environment, combining rigorous cloud-mounted atomistic physics simulations with commodity VR hardware, which we have made accessible to readers. It allows users to visualize and sample, with atomic-level precision, the structures and dynamics of complex molecular structures “on the fly” and to interact with other users in the same virtual environment. A series of controlled studies, in which participants were tasked with a range of molecular manipulation goals (threading methane through a nanotube, changing helical screw sense, and tying a protein knot), quantitatively demonstrate that users within the interactive VR environment can complete sophisticated molecular modeling tasks more quickly than they can using conventional interfaces, especially for molecular pathways and structural transitions whose conformational choreographies are intrinsically three-dimensional. This framework should accelerate progress in nanoscale molecular engineering areas including conformational mapping, drug development, synthetic biology, and catalyst design. More broadly, our findings highlight the potential of VR in scientific domains where three-dimensional dynamics matter, spanning research and education.

Sampling molecular conformational dynamics in virtual reality from david glowacki on Vimeo.

Wednesday, March 28, 2018

Logic in babies.

Halberda does a commentary on work of Cesana-Arlotti et al. showing that one essential form of logical inference, process of elimination, is within the toolkit of 12-month-old infants. They used the fact that visual behaviors - such as a shift in one's gaze or a prolonged stare - can be diagnostic of internal thoughts to demonstrate that preverbal infants can formulate a logical structure called a disjunctive syllogism: if A or B is true, and A is false, then B must be true. Presenting infants with scenes whose outcome contradicted this inference evoked looks of surprise:
Cesana-Arlotti et al. asked whether prelinguistic 12- and 19-month-old infants would spontaneously reason using process of elimination. This is a form of inference also known as disjunctive syllogism or modus tollendo ponens—it is any argument of the form: A or B, not A, therefore B. Cesana-Arlotti et al. relied on one of the few behaviors babies voluntarily engage in—looking at whatever they find most interesting. They measured infants' looking at computerized vignettes in which two different objects (A and B) were shown being hidden behind a wall. Infants watched as a cup scooped one of the objects from behind the wall, and then came to rest next to the wall—critically, only the topmost edge of the contained object could be seen peeking out of the cup, such that infants could not tell for sure whether the object was A or B. At this moment, infants could have formed a disjunctive thought—for example, “either the object in the cup is object A or it is object B.” Next, this ambiguity was resolved: The wall dropped to reveal that object A was behind the wall, but the contents of the cup remained hidden. This is the moment of potential elimination, and an opportunity for infants to draw a key inference—“because object A is not in the cup, object B must be in the cup.” Finally, infants' expectations for the cup's contents were tested: Either the expected object (object B) or, surprisingly, another object A emerged from the cup. Infants looked longer at the surprising outcome—an indication that their expectations were violated and a hint that they were seeking further information to resolve the conflict.
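The inference itself is about as simple as logic gets - elimination over a two-element set of possibilities. A toy encoding (purely my illustration, not anything from the paper):

```python
def disjunctive_syllogism(possibilities, ruled_out):
    """Given 'A or B' as a set of possibilities, eliminate what is
    observed to be false; return the conclusion if one is forced."""
    remaining = possibilities - ruled_out
    if len(remaining) == 1:
        return remaining.pop()  # the inference succeeds: 'therefore B'
    return None                 # still ambiguous, no conclusion forced

# The vignette: either object A or object B is hidden in the cup.
cup_contents = {"A", "B"}
# The wall drops: object A is revealed behind the wall, so it is not in the cup.
conclusion = disjunctive_syllogism(cup_contents, {"A"})
print(conclusion)  # -> B
```

The infants' longer looking when object A then emerged from the cup suggests they had already committed to the conclusion `B`.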

Thursday, March 15, 2018

We project our own spatial bias onto others.

Graziano's group has done an experiment suggesting that our spatial bias - many of us process objects in the right visual field better than the left, while others show the reverse - is projected onto others when we infer their perceptions in a theory-of-mind task. This suggests that common underlying mechanisms are at work when we process our own visual space and when we imagine someone else's processing of theirs:

Most people have an intrinsic spatial bias—many are better at processing objects to the left, whereas some are biased to the right. Here, we found that this subtle bias in one’s own awareness is mirrored in one’s ability to process what is likely to be in other people’s minds. If you are biased toward processing your own right side of space, then you may be faster at recognizing when someone else processes an object to his or her right side. One possible interpretation is that we process the space around us, and understand how others process the space around them, using at least partially shared mechanisms.
Many people show a left-right bias in visual processing. We measured spatial bias in neurotypical participants using a variant of the line bisection task. In the same participants, we measured performance in a social cognition task. This theory-of-mind task measured whether each participant had a processing-speed bias toward the right of, or left of, a cartoon agent about which the participant was thinking. Crucially, the cartoon was rotated such that what was left and right with respect to the cartoon was up and down with respect to the participant. Thus, a person’s own left-right bias could not align directly onto left and right with respect to the cartoon head. Performance on the two tasks was significantly correlated. People who had a natural bias toward processing their own left side of space were quicker to process how the cartoon might think about objects to the left side of its face, and likewise for a rightward bias. One possible interpretation of these results is that the act of processing one’s own personal space shares some of the same underlying mechanisms as the social cognitive act of reconstructing someone else’s processing of their space.
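The core analysis here is a correlation between two per-participant bias scores: one from the line bisection task, one from the theory-of-mind task. A minimal sketch with entirely fabricated data (the study's actual numbers are nothing like these):

```python
import random

random.seed(2)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Fabricated per-participant scores: negative = leftward bias, positive = rightward.
own_bias = [random.gauss(0, 1) for _ in range(40)]          # line-bisection bias
# Suppose each person's bias when modeling the cartoon tracks their own, plus noise.
other_bias = [b + random.gauss(0, 0.8) for b in own_bias]   # theory-of-mind task bias

print(f"r = {pearson_r(own_bias, other_bias):.2f}")
```

A reliably positive r, as the paper reports, is what licenses the "shared mechanisms" interpretation - though, as the authors note, it is only one possible reading of the correlation.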