Advanced Lecture Series in Pattern Recognition
TITLE: Mud-proof, Moments, and Memory: The Three Essential Ingredients for Bringing Computer Vision to the Masses
SPEAKER: Prof. Jianbo Shi (University of Pennsylvania, USA)
CHAIR: Prof. Chenglin Liu
TIME: March 13 (Friday), 2015, 10:00 AM
VENUE: No. 1 Conference Room (3rd floor), Intelligence Building
In the first part, I will discuss 3D photography using a camera array, along with myths and lessons learned. In the second part, I will focus on social vision, relating human motion and action to social attention. We predict social saliency, the likelihood of joint attention in a third-person image or video, by learning from social interactions captured by first-person cameras. Inspired by the electric dipole moment, we introduce a social formation feature designed to capture the geometric relationship between joint attention and social formation. We train an ensemble classifier to learn the locations of joint attention from social formation features computed on first-person social interaction data, in which we can precisely measure joint attention and the locations of its associated members in 3D. We apply this classifier to predict social saliency in real-world social scenes with multiple social cliques, including sports game scenes. Our representation does not require directional measurements such as gaze direction. A geometric analysis of social interactions in terms of existing qualitative studies, such as F-formation and proxemics, is also presented.
Jianbo Shi studied Computer Science and Mathematics as an undergraduate at Cornell University, where he received his B.A. in 1994. He received his Ph.D. in Computer Science from the University of California at Berkeley in 1998. From 1999 to 2002, he was a research faculty member at the Robotics Institute at Carnegie Mellon University. In 2003 he joined the University of Pennsylvania, where he is currently a Professor of Computer and Information Science. In 2007, he was awarded the Longuet-Higgins Prize for his work on Normalized Cuts. According to Google Scholar, his papers have been cited over 24,000 times. His primary research interests are in computer vision, including 1) human recognition, with the ultimate goal of developing computational algorithms to understand human behavior in video; and 2) image segmentation and object recognition, with the goal of extracting “interesting” patterns from data and guiding the grouping process to achieve specific vision tasks, such as recognizing familiar object shapes.