203.4780
Course overview
Meeting Times: Monday 9-12, Room 462
Instruction Hour: Wednesday 11:00-12:00, Room 410 (Jacobs)
§ No class on 5.5;
§ The reviews for the topic "Patch-based Representations" are due on 5.5;
§ The reviews for the topic "Detection as a binary decision" are due on 12.5;
§ Both topics will be presented on 12.5. (If we don't have enough time for the second topic, we will continue it on 19.5);
§ All announcements and guidelines will be distributed by email.
§ Those who do not send their contact address on time will not be added to the contact list!
§ You must send me an email (rita[at]cs[dot]haifa.ac.il) by March 1 from your active address with the subject "course 4780".
General: This is a graduate
course in computer vision. We will survey and discuss vision papers
relating to object and activity recognition and scene understanding. The
goal of the course is to understand classical and modern approaches to some
important problems, analyzing their strengths and weaknesses, and identifying
interesting open questions.
Requirements: Students will be responsible for writing a paper review each week, participating in discussions, completing a programming project, and presenting one topic in class.
Note that presentations are due one week before the slot for which your presentation is scheduled. This means you will need to read the papers, create slides, etc. one week before the date you are signed up for, to leave time for improvement. Note also that you must get my approval for your presentation.
More details on the requirements and grading breakdown are here.
A. Recognizing specific objects
Global features:
1. Linear Subspaces
2. Detection as a binary decision
Local features:
3. Local features, matching for object instances
4. Visual Vocabularies and Bag of Words
Region-based methods:
5. Mid-Level Representations
B. Beyond Single objects (using additional information)
1. Saliency
2. Attributes
3. Context
C. Scalability problems
1. Scaling with the large number of categories
D. Action recognition in video and images
Schedule and papers:
Note: * = required reading.
Additional papers are provided for reference, and as a starting point for
background reading for projects.
Paper presentations: Cover the starred papers.
Date | Topics | Papers and links | Presenters |
3.3 | Course intro | [slides] | Instructor |
10.3 | Introduction to Object and Event Recognition | [slides] | Instructor |
17.3 | Introduction to Object and Event Recognition | | Instructor |
24.3 | No class | | |
| Linear Subspaces: global appearance models for object recognition, dimensionality reduction. | *Eigenfaces for Recognition, Turk and Pentland, 1991. [pdf]; *P.N. Belhumeur, J.P. Hespanha, D.J. Kriegman, Eigenfaces vs. Fisherfaces: Recognition using Class Specific Linear Projection, 1996. [pdf]; Face Database [here] | Mor [pdf] |
7.4 | Cyber Day | | |
28.4 | Local features and matching for object instances | *Object Recognition from Local Scale-Invariant Features, Lowe, ICCV 1999. [pdf] [code] [other implementations of SIFT] [IJCV]; *Selected pages from: Local Invariant Feature Detectors: A Survey, Tuytelaars and Mikolajczyk. Foundations and Trends in Computer Graphics and Vision, 2008. [pdf] [Oxford code] [Read pp. 178-188, 216-220, 254-255]; Oxford group interest point software; Andrea Vedaldi's VLFeat code, including SIFT, MSER, hierarchical k-means; INRIA LEAR team's software, including interest points, shape features; FLANN - Fast Library for Approximate Nearest Neighbors, Marius Muja et al.; Kooaba | Guy [pdf] |
12.5 | Patch-based Representations: visual vocabularies, bag-of-words and SPK for scene classification | *Visual Categorization with Bags of Keypoints, C. Dance, J. Willamowski, L. Fan, C. Bray, and G. Csurka, ECCV International Workshop on Statistical Learning in Computer Vision, 2004. [pdf]; *Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, Lazebnik, Schmid, and Ponce, CVPR. [pdf] [code] [data] | Assaf [pdf] |
12.5 | Detection as a binary decision: sliding window detection, detection as a binary decision problem. | *Histograms of Oriented Gradients for Human Detection, Dalal and Triggs, CVPR 2005. [pdf] [code] [PASCAL datasets]; *Rapid Object Detection Using a Boosted Cascade of Simple Features, Viola and Jones, CVPR 2001. [pdf] [code]; LIBSVM library for support vector machines; PASCAL VOC Visual Object Classes Challenge | Majd [pdf] |
| Context and scenes | *Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification. L-J. Li, H. Su, E. Xing, L. Fei-Fei. NIPS 2010. [pdf] [code] | [pdf] |
| Recognizing and localizing human actions in video or static images | | |
| Importance and saliency | | |
| Large-scale image/object search and mining: scalable retrieval algorithms, mining for visual themes, particularly for object instances | *Semi-supervised hashing for large-scale search. J. Wang, S. Kumar, S.-F. Chang. [pdf]; *Supervised Hashing with Kernels. W. Liu, J. Wang, R. Ji, Y. Jiang, S.-F. Chang. CVPR 2012. [pdf] | |
* This course is based on the UT-Austin course Special Topics in Computer Vision, by Kristen Grauman.