Advanced Lecture Series in Pattern Recognition
TITLE: Computer Vision: The Next Decade - Towards Human-Level AI
SPEAKER: Prof. Larry Davis (University of Maryland, USA)
CHAIR: Dr. Ran He
TIME: July 12 (Tuesday), 2016, 3:00 PM
VENUE: No.1 Conference Room (3rd floor), Intelligence Building
The field of computer vision has advanced remarkably during the past 10-15 years. This is due to a variety of factors, including the availability of the large annotated data sets needed to train deep learning models, software like Amazon Mechanical Turk (AMT) that enables the collection of these data sets at reasonable cost, important engineering improvements to the training methodologies of deep networks, dramatic improvements in the price/performance ratios of computing systems (especially GPUs) and memory systems, widespread availability of source code that researchers share with one another worldwide, and inexpensive sensors and robotic platforms like Kinect, GoPros and UAVs. So, while the fundamental vision problems of detection and recognition of objects and human movements are not solved, they have improved to the point where it is important to ask: What's next? A workshop was held in the US late last year to address exactly that question (chaired by me, Fei-Fei Li and Devi Parikh). This talk will discuss the conclusions of that workshop and illustrate research in some of those future directions with work from the University of Maryland, in particular research on visual search. I will describe a general strategy for object detection that, instead of passively evaluating all object detectors at all possible locations in an image, employs a divide-and-conquer approach: it actively and sequentially evaluates contextual cues related to the query, based on the scene and previous evaluations (like playing a "20 Questions" game), to decide where to search for the object. The problem is formulated as a Markov Decision Process, and a search policy is learned by reinforcement learning. To demonstrate the efficacy of the algorithm, the 20 Questions approach is applied within the recent framework of simultaneous object detection and segmentation.
Larry S. Davis received his B.A. from Colgate University in 1970 and his M.S. and Ph.D. in Computer Science from the University of Maryland in 1974 and 1976, respectively. From 1977 to 1981 he was an Assistant Professor in the Department of Computer Science at the University of Texas, Austin. He returned to the University of Maryland as an Associate Professor in 1981. From 1985 to 1994 he was the Director of the University of Maryland Institute for Advanced Computer Studies. He was Chair of the Department of Computer Science from 1999 to 2012. He is currently a Professor in the Institute and the Computer Science Department, as well as Director of the Center for Automation Research. He was named a Fellow of the IEEE in 1997 and of the ACM in 2013. Prof. Davis is known for his research in computer vision and high-performance computing. He has published over 100 papers in journals and 200 conference papers and has supervised over 40 Ph.D. students. During the past ten years his research has focused on visual surveillance and general video analysis. He has served as program or general chair for most of the field's major conferences, including the 5th, 11th and 14th International Conferences on Computer Vision and the 2004, 2010 and 2019 Computer Vision and Pattern Recognition Conferences.