Advanced Topics in Computer Vision

Seminar, Winter 2006, 3 credits/contact hours.

Synopsis: We will read seminal research papers in computer vision and showcase some recent inventions and discoveries from the most important conferences in the field. The format will be alternating instructor and student presentations: I will give an introduction to a paper's general area, the background and the problem setting, then we'll read the paper as homework, and one student will present it in the following lecture. We will meet two hours a week - you will get credit for the third hour because of the amount of reading involved. This is a pass/fail course.

Prerequisites: CS4330 Computer Vision, or basic knowledge of pattern classification and recognition and an interest in computer vision applications. If you are unsure about the math required to understand most of the methods, have a look at Gonzalez and Woods' review of the essentials. That should suffice.

Sign up now in Python for "CS4921 - Advanced Topics in Computer Science I." Even though it doesn't say so, that is the Computer Vision seminar.

Meeting Times: Tuesdays and Wednesdays at 11:00am in the MOVES Conference room (WA-273). Exception: Wed 1/18/06 location TBA.

The textbook (Forsyth, Ponce: Computer Vision, A Modern Approach) is an excellent resource for anybody doing work in this area. It is listed as required because this makes it easier for you to get it from the Exchange. While it sure will come in handy throughout the seminar, you don't necessarily have to purchase it.

Schedule
Date      Topic and presenter
1/17/06   Donnie: localization and data fusion
1/24/06   Patty: ASM
1/31/06   Doug: Viola-Jones detector
2/7/06    Ryan: multi-flash
2/14/06   Kevin: SIFT
2/21/06   Donnie: terrain maps
2/28/06   Patty: stereo
3/7/06    Doug: uncertainty in robotics
3/14/06   Ryan: J Lim, D Kriegman: Tracking Humans using Prior and Learned Representations of Shape and Appearance

Topics and Papers: If you have a particular interest, you may suggest a paper of your own choosing. Otherwise, we will pick papers from the following list:
Robert Hanek and Thorsten Schmitt: Computer vision for RoboCup players is the topic of their paper titled "Vision-Based Localization and Data Fusion in a System of Cooperating Mobile Robots." It showcases a number of techniques and their application to a "real-world" problem. If you haven't heard of RoboCup, check out these amazing videos from one of the long-time champions.
Y. Morvan: More computer vision for RoboCup, this time on an embedded processor (DSP). It's a long report (click on Cached: PDF), but it's easy to read and not difficult to understand.
Takeo Kanade et al.: How to make a helicopter fly based on computer vision: AAV Vision. AAV? Yes, it's autonomous, not just unmanned!
David Lowe: SIFT. This is one of the best feature detectors available. Go to his web page to download various papers and a demo program. We will discuss the journal paper in the course.
Ramesh Raskar et al.: multi-flash imaging. A simple yet very powerful method of using structured light, that is, scene illumination with known light source parameters. This SIGGRAPH 2004 paper explains the principles and shows a diverse set of applications of this technique. This video gives a quick overview.
J. Sun, Y. Li, S.B. Kang, and H.-Y. Shum: Symmetric stereo matching for occlusion handling. One of the many stereo matching algorithms. This one produces very good results, but does not run in real-time. Presented at one of the major computer vision conferences, CVPR 2005.
Paul Viola and Michael Jones: Robust Real-time Object Detection. This is a very fast, very accurate method to detect objects such as faces in images and video. A good description of the techniques is given in the IJCV 2004 journal paper and in the shorter CVPR 2001 conference paper.
Tim Cootes: Active Shape Models. An ASM is a statistical model of how a shape can deform. ASMs can help in detecting and tracking deformable objects such as faces or cell bodies. Tim Cootes' ASM page has links to papers and a little demo program kit. We will discuss this paper.
Wren et al.: Pfinder (paper) was a groundbreaking people detection and tracking system in 1997. It demonstrated computer vision as a live interaction mechanism.