Keynote Speakers

Stephen Grossberg

Affiliation:

Center for Adaptive Systems; Graduate Program in Cognitive and Neural Systems; Center for Computational Neuroscience and Neural Technology; Departments of Mathematics, Psychology, and Biomedical Engineering, Boston University

Abstract: How do we learn what a visually seen object is? How do our brains learn, without supervision, to link multiple views of the same object into an invariant object category while our eyes search a scene, even before we have a concept of the object? Indeed, why do we not link together views of different objects when there is no teacher to correct us? Why don't our eyes move around randomly? How do they explore interesting features of novel objects and thereby enable us to learn view-, size-, and position-invariant object categories? How do representations of a scene remain binocularly fused as our eyes explore it? How is the gist of a scene computed? How do we solve the Where's Waldo problem and thereby efficiently search for desired objects in a scene using object and spatial contextual information? This talk will summarize the ARTSCAN family of neural models, which clarifies how the brain solves these problems in a unified way by coordinating processes of 3D vision and figure-ground separation, spatial and object attention, object and scene category learning, predictive remapping, and eye movement search. ARTSCAN illustrates new computational paradigms for how the brain computes: Complementary Computing clarifies the nature of brain specialization, and Laminar Computing clarifies why all neocortical circuits exhibit a layered architecture. ARTSCAN also provides unified explanations and simulations of brain and behavioral data, along with computer simulation benchmarks that support the model, which provides a blueprint for developing a new type of system for active vision and autonomous learning and recognition.

Biography: Stephen Grossberg is Wang Professor of Cognitive and Neural Systems; Professor of Mathematics, Psychology, and Biomedical Engineering; and Director of the Center for Adaptive Systems at Boston University. He is a principal founder and current research leader in computational neuroscience, connectionist cognitive science, and neuromorphic technology. Grossberg introduced the paradigm of using nonlinear systems of differential equations to model how brain mechanisms give rise to behavioral functions. In 1957-1958, he introduced widely used equations for short-term memory (STM), or neuronal activation; medium-term memory (MTM), or activity-dependent habituation; and long-term memory (LTM), or neuronal learning. His work focuses on how individuals adapt autonomously in real time to unexpected environmental challenges, and includes models of vision and visual cognition; object, scene, and event recognition; audition, speech, and language; development; cognitive information processing; reinforcement learning and cognitive-emotional interactions; navigation; social cognition; sensory-motor control and planning; mental disorders; and neuromorphic technology. Grossberg founded key infrastructure of the field of neural networks, including the International Neural Network Society and the journal Neural Networks. He is a fellow of AERA, APA, APS, IEEE, INNS, MDRS, and SEP. He has published 17 books or journal special issues and over 500 research articles, and holds 7 patents.
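The STM equations mentioned in the biography can be illustrated with a toy shunting on-center off-surround network, one well-known form in Grossberg's work. The sketch below is illustrative only, not code from the talk; the parameter values (A = B = 1) and the simple Euler integration are assumptions.

```python
import numpy as np

def shunting_stm(I, A=1.0, B=1.0, dt=0.01, steps=2000):
    """Euler-integrate a shunting on-center off-surround STM network:

        dx_i/dt = -A*x_i + (B - x_i)*I_i - x_i * sum_{k != i} I_k

    At equilibrium each activity settles at x_i = B*I_i / (A + sum_k I_k),
    so the stored pattern reflects input *ratios* (contrast normalization),
    and total activity stays bounded no matter how large the inputs are.
    """
    I = np.asarray(I, dtype=float)
    x = np.zeros_like(I)          # activities start at rest
    total = I.sum()
    for _ in range(steps):
        excite = (B - x) * I           # shunted on-center excitation
        inhibit = x * (total - I)      # shunted off-surround inhibition
        x += dt * (-A * x + excite - inhibit)
    return x

inputs = np.array([1.0, 2.0, 4.0])
x = shunting_stm(inputs)
# analytic equilibrium for comparison: B*I / (A + sum(I))
expected = 1.0 * inputs / (1.0 + inputs.sum())
```

Because the equilibrium depends only on input ratios, doubling all inputs changes the stored activities very little, which is the normalization property the equation is known for.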

Thrasos Pappas

Affiliation:

Department of Electrical Engineering and Computer Science, Northwestern University

Abstract: Texture is an important visual attribute both for human perception and image analysis systems. We present new structural texture similarity metrics and applications that critically depend on such metrics, with emphasis on image compression and content-based retrieval. The new metrics account for human visual perception and the stochastic nature of textures. They rely entirely on local image statistics and allow substantial point-by-point deviations between textures that according to human judgment are similar or essentially identical. We also present new testing procedures for objective texture similarity metrics. We identify three operating domains for evaluating the performance of such similarity metrics: the top of the similarity scale, where a monotonic relationship between metric values and subjective scores is desired; the ability to distinguish between perceptually similar and dissimilar textures; and the ability to retrieve "identical" textures. Each domain has different performance goals and requires different testing procedures. Experimental results demonstrate both the performance of the proposed metrics and the effectiveness of the proposed subjective testing procedures.
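The idea of tolerating point-by-point deviations while matching local statistics can be illustrated with a toy sketch. This is not the authors' structural texture similarity metric; the block size, the SSIM-style mean and variance terms, and the stabilizing constant c are illustrative assumptions.

```python
import numpy as np

def local_stats(img, win=8):
    """Mean and variance over non-overlapping win x win blocks."""
    h, w = img.shape
    blocks = img[:h - h % win, :w - w % win].reshape(h // win, win, w // win, win)
    blocks = blocks.transpose(0, 2, 1, 3).reshape(-1, win * win)
    return blocks.mean(axis=1), blocks.var(axis=1)

def stat_similarity(a, b, win=8, c=1e-6):
    """Score two textures by comparing local means and variances only,
    so textures that differ pixel-by-pixel but share the same local
    statistics still score near 1 (the maximum)."""
    ma, va = local_stats(a, win)
    mb, vb = local_stats(b, win)
    mean_term = (2 * ma * mb + c) / (ma**2 + mb**2 + c)
    var_term = (2 * np.sqrt(va * vb) + c) / (va + vb + c)
    return float(np.mean(mean_term * var_term))

rng = np.random.default_rng(0)
base = rng.normal(0.5, 0.1, (64, 64))
shuffled = rng.permutation(base.ravel()).reshape(64, 64)  # same pixels, new layout
# point-by-point error between base and shuffled is large, yet the
# statistics-based score stays high, unlike MSE-style metrics
score = stat_similarity(base, shuffled)
```

A pixel-wise metric such as MSE would judge `base` and `shuffled` very different, which is exactly the failure mode for stochastic textures that statistics-based metrics are designed to avoid.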

Biography: Thrasos Pappas received the Ph.D. degree in electrical engineering and computer science from MIT in 1987. From 1987 until 1999, he was a Member of the Technical Staff at Bell Laboratories, Murray Hill, NJ. He joined the EECS Department at Northwestern in 1999. His research interests are in image and video quality and compression, image and video analysis, content-based retrieval, perceptual models for multimedia processing, model-based halftoning, and tactile and multimodal interfaces. Prof. Pappas is a Fellow of the IEEE and SPIE. He has served as editor-in-chief of the IEEE Transactions on Image Processing (2010-12), elected member of the Board of Governors of the Signal Processing Society of IEEE (2004-07), chair of the IEEE Image and Multidimensional Signal Processing Technical Committee (2002-03), and technical program co-chair of ICIP-01 and ICIP-09. Since 1997 he has been co-chair of the SPIE/IS&T Conference on Human Vision and Electronic Imaging.

Jean Ponce

Affiliation:

Département d'informatique, École normale supérieure, Paris, France

Title: Weakly supervised modeling and interpretation of video streams

Abstract: This talk addresses the problem of understanding the visual content of videos using a weak form of supervision, such as the textual information available in television or film scripts. I will discuss two instances of this problem: the joint localization and identification of movie characters and their actions, and the assignment of action labels to video frames using temporal ordering constraints. Both problems can be tackled using a discriminative clustering framework, and I will present the underlying models, appropriate relaxations of the combinatorial optimization problems associated with learning these models, and efficient algorithms for solving the resulting convex optimization problems. I will also present experimental results on feature-length films.
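The temporal ordering constraint can be illustrated with a toy dynamic program: given per-frame costs for each action in a known script order, find the minimum-cost labeling in which labels never move backward in time. This is only an illustrative sketch, not the discriminative clustering model of the talk; the cost matrix and function are hypothetical.

```python
import numpy as np

def ordered_labeling(cost):
    """cost[t, k]: cost of assigning frame t the k-th action of a fixed
    script order. Return the min-cost labeling in which labels are
    non-decreasing over time (the temporal ordering constraint)."""
    T, K = cost.shape
    dp = np.full((T, K), np.inf)
    dp[0, 0] = cost[0, 0]  # the video must start with the first action
    for t in range(1, T):
        for k in range(K):
            stay = dp[t - 1, k]                          # same action as before
            advance = dp[t - 1, k - 1] if k > 0 else np.inf  # move to next action
            dp[t, k] = cost[t, k] + min(stay, advance)
    # backtrack from the last action at the last frame
    labels = [K - 1]
    k = K - 1
    for t in range(T - 1, 0, -1):
        if k > 0 and dp[t - 1, k - 1] <= dp[t - 1, k]:
            k -= 1
        labels.append(k)
    labels.reverse()
    return labels

# toy example: 6 frames, 3 actions; a low cost marks each frame's best action
cost = np.array([[0, 5, 5],
                 [0, 5, 5],
                 [5, 0, 5],
                 [5, 0, 5],
                 [5, 5, 0],
                 [5, 5, 0]], dtype=float)
labels = ordered_labeling(cost)  # -> [0, 0, 1, 1, 2, 2]
```

Without the ordering constraint, each frame would independently take its cheapest label; the constraint couples the decisions so noisy per-frame costs cannot make the labeling jump back and forth between actions.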

Biography: Jean Ponce is professor and head of the computer science department of École Normale Supérieure, where he leads the joint ENS/CNRS/Inria project-team Willow. His main research interest is computer vision, and he is the author of the textbook "Computer Vision: A Modern Approach", which has been translated into Chinese, Japanese, and Russian. Jean Ponce is an IEEE Fellow, a senior member of the Institut Universitaire de France, and the recipient of an ERC Advanced Grant.