2016 — Minerva Foundation

2016: Simoncelli

Every day, our sensory systems trick us into believing that what we perceive is a direct reflection of the physical world around us. However, scientists have recognized for centuries that perception is a process of inference, in which incoming information is fused with internal expectations. Since the early 1990s, Eero Simoncelli, a professor of Neural Science, Mathematics, and Psychology at New York University, has used theories of coding efficiency and statistical inference to understand the means by which percepts arise from neural responses.

“Our sensory systems provide us with a remarkably reliable interpretation of the world, allowing us to make predictions, recognize patterns, and perform difficult tasks with surprising accuracy,” Simoncelli said. “How do these capabilities arise from the underlying neural circuitry? Specifically, how do populations of neurons encode sensory information, and how do subsequent populations extract that information for recognition, decisions, and action?”

The goal of the Simoncelli lab is to answer these questions, using a combination of computational theory and modeling, coupled with perceptual and physiological experiments. Over the past several decades, Simoncelli has made important contributions to our understanding of how the mammalian visual system optimally processes images projected onto the retina and translates them into percepts of the physical world. To test his statistical models, he has also developed innovative experimental paradigms, including novel stimuli and analysis methods for both physiological experiments and perceptual studies in humans.

“Eero Simoncelli is the foremost investigator in the world in the field of computational vision,” said Bill Newsome, a professor of neurobiology at Stanford University and former Golden Brain Award recipient. “He has a unique perspective as a theory and data analytics maven, which enables him to frame fundamental problems in vision science in a clear, concise manner that has had a dramatic influence on the course of vision research over the past two decades.”

For his seminal contributions to the field of visual neuroscience, Simoncelli has been named the recipient of the 2016 Golden Brain Award from the Berkeley, California-based Minerva Foundation. The award, now in its 32nd year, recognizes outstanding contributions in vision and brain research. Simoncelli was honored in a private ceremony in New York City. “It’s an honor to receive this award,” Simoncelli said. “It’s especially nice to be recognized, given that my work is highly interdisciplinary and not restricted to a specific traditional field.”

Simoncelli started his higher education as a physics major at Harvard University, and then attended Cambridge University to study mathematics for a year and a half. After building a strong base of quantitative skills, he decided that he wanted to focus his research on understanding the brain as a signal processing engine. To do so, he returned to the United States to pursue a doctoral degree in electrical engineering and computer science at the Massachusetts Institute of Technology.

“For my PhD work, I studied the representation of visual motion, in terms of a common set of principles that had implications for computer vision, neurobiology, and perception,” Simoncelli said. “The synergies of that cross-disciplinary experience became a prototype for the kind of work I wanted to do, and have done ever since.”

After earning his PhD in 1993, Simoncelli joined the faculty of the Computer and Information Science department at the University of Pennsylvania. In 1996, he joined the Center for Neural Science, as part of the Sloan-Swartz Center for Theoretical Visual Neuroscience, at New York University. He received a National Science Foundation Faculty Early Career Development (CAREER) grant in 1996 for research and teaching in visual information processing and a Sloan Research Fellowship in 1998, and he became an Investigator of the Howard Hughes Medical Institute in 2000 under its new computational biology program. In 2008, Simoncelli was elected a Fellow of the Institute of Electrical and Electronics Engineers.

Over the years, Simoncelli has been widely recognized by the scientific community for constructing computational models of vision that are consistent with the properties of the visual world, the requirements of visual tasks, and the constraints of biological implementation. Through a series of incisive theoretical and statistical studies, he has shown that regularities in the statistics of natural visual scenes strongly constrain how any given pattern of light that falls on the retina can be interpreted by the brain, thereby narrowing down the very large number of possible interpretations of what is actually out there in the 3D visual world.

In collaboration with physiologists, he has shown that the properties of neurons in both the retina and the central visual system incorporate these constraints in the representation of the visual image. “This combined theoretical/experimental insight has explained a number of observations about the central visual system that are otherwise anomalous,” Newsome said.

One major arm of Simoncelli’s research has focused on the optimal encoding of visual information. It has long been assumed that visual systems are adapted, at evolutionary, developmental, and behavioral timescales, to the images to which they are exposed. Since not all images are equally likely, it is natural to assume that the visual system devotes its limited resources to the images that occur most frequently, exploiting the statistical properties of the environment.

Since the mid-1990s, Simoncelli has developed successively more powerful models describing the statistical properties of local regions of natural images. Moreover, he has demonstrated the power of these models by using them to understand the structure and function of both visual and auditory neurons, and to develop state-of-the-art solutions to classical engineering problems, such as compression, transmission, and image enhancement.

Simoncelli’s results have provided strong support for the ecological hypothesis that neural computations are well matched to the statistics of the environment. He has found that sensory systems are optimized to represent signals that occur more frequently in the natural environment of an organism. For example, a study published in Neural Computation in 2014 revealed that more cells are dedicated to processing more common stimuli, resulting in enhanced perceptual sensitivity for these stimuli.

His body of work has shed light on how sensory systems maximize information transmitted to the brain. For example, a study published in Nature in 2008 showed that the activity of populations of retinal neurons is far more precise and predictable than the highly variable responses of individual neurons. In other words, the whole is greater than the sum of its parts: Correlated activity among neural populations enables the visual system to extract more information from a scene than uncorrelated activity.

Simoncelli’s statistical models have been used to explain human perception of visual texture patterns, motion speed, the orientation of contours, and complex sounds. For example, he has shown that people are more efficient at processing vertical and horizontal contours, which are more prevalent in the natural environment. His models also account for the responses of neurons in the retina and primary visual cortex (area V1), motion-sensitive neurons in the middle temporal cortex (area MT), texture-sensitive neurons in secondary visual cortex (area V2), and auditory neurons that respond to complex sounds such as rain, swarms of insects, or an audience applauding.

“We offer concrete examples of fairly abstract principles that govern the operation of sensory systems, or indeed, any machine that must process visual signals,” Simoncelli said. “Our work has had an impact on neuroscience and perception, but also engineering, including image processing, computer vision, and the design of visual displays.”

These findings have been used to design better man-made systems for processing sensory signals. In 2015, Simoncelli received an Engineering Emmy Award from the Television Academy for developing an algorithm for estimating the perceived quality of images and videos. This algorithm, known as Structural Similarity (SSIM), uses powerful neuroscience-inspired models of the human visual system to achieve breakthrough quality prediction performance. Its computational simplicity and ability to accurately predict human assessment of visual quality have made it a standard tool in broadcast and post-production houses throughout the television industry.
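To give a rough sense of why SSIM is computationally simple, the following is a minimal sketch of the standard SSIM index evaluated once over a whole grayscale image, passed in as a flat list of pixel intensities. Practical implementations typically compute the index over small local windows and average the results; the single-window simplification, the function name global_ssim, and the sample pixel values are assumptions made here for illustration, and the constants k1 = 0.01 and k2 = 0.03 are the commonly used defaults.

```python
from statistics import fmean


def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equal-length sequences of pixel values.

    Simplification (an assumption of this sketch): the statistics are taken
    over the whole image instead of over local sliding windows.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and the same length")

    # Small constants that keep the ratio stable when means or variances are near zero.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2

    # Mean intensity, variance (contrast), and covariance (shared structure).
    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean((a - mu_x) ** 2 for a in x)
    var_y = fmean((b - mu_y) ** 2 for b in y)
    cov_xy = fmean((a - mu_x) * (b - mu_y) for a, b in zip(x, y))

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 8-pixel "images", used only to exercise the function.
reference = [52, 55, 61, 59, 79, 61, 76, 61]
distorted = [60, 50, 70, 52, 85, 55, 70, 68]

print(global_ssim(reference, reference))  # identical images score (approximately) 1
print(global_ssim(reference, distorted))  # any distortion scores below 1
```

The index needs only means, variances, and a covariance, all of which are inexpensive to compute, and it equals 1 only when the two images are identical.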

Unlike previous complex error models that required special hardware, SSIM can easily be applied in real time in software on common processors. The algorithm is now a widely used perceptual video quality measure that serves to test and refine video quality throughout the global cable and satellite TV industry, and it directly affects the viewing experiences of tens of millions of viewers daily. The Emmy reflects the interdisciplinary and applied nature of Simoncelli’s work, and it has not gone unnoticed among his peers. “Eero is the only vision scientist I know of who has won an Emmy Award,” Newsome said.