
On some deterministic dictionaries supporting sparsity


Nir Sochen, Shamgar Gurevich, and Ronny Hadani – TAU


We describe two deterministic constructions of dictionaries
of functions on the finite line which support a certain degree of
sparsity.


Learning Appearances with Low-Rank SVM


Lior Wolf, Hueihan Jhuang, Tamir Hazan – TAU


Several authors have noticed that the common representation of images
as vectors is suboptimal. The process of vectorization eliminates spatial
relations between some of the nearby image measurements and produces a
vector of a dimension which is the product of the measurements' dimensions.
It seems that images may be better represented when taking into account
their structure as a 2D (or multi-D) array.
Our framework, "Low-Rank separators", studies
the use of separating hyperplanes which are
constrained to have the structure of low-rank matrices. We first prove that
the low-rank constraint provides preferable generalization properties. We
then define two "Low-rank SVM problems" and propose algorithms to
solve these. Finally, we provide supporting experimental evidence for the
framework.
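
To make the low-rank constraint concrete, here is a minimal sketch of how a separator whose weight matrix has the form W = sum_k u_k v_k^T can score a 2D image X directly, without vectorizing it. This is not the authors' training algorithm; the names low_rank_score, full_score, us, vs and b are illustrative assumptions.

```python
def low_rank_score(X, us, vs, b=0.0):
    """Return <W, X> + b for W = sum_k u_k v_k^T, without forming W.

    X is a list of rows; us and vs are lists of vectors (assumed names).
    """
    total = b
    for u, v in zip(us, vs):
        # X v: one inner product per image row
        Xv = [sum(x * vj for x, vj in zip(row, v)) for row in X]
        # u^T (X v): contribution of this rank-1 term
        total += sum(ui * xvi for ui, xvi in zip(u, Xv))
    return total


def full_score(X, us, vs, b=0.0):
    """Reference computation: build W explicitly, then take <W, X> + b."""
    rows, cols = len(X), len(X[0])
    W = [[sum(u[i] * v[j] for u, v in zip(us, vs)) for j in range(cols)]
         for i in range(rows)]
    return b + sum(W[i][j] * X[i][j] for i in range(rows) for j in range(cols))
```

A rank-k separator of this form has k(m + n) free parameters for an m-by-n image, compared with mn for an unconstrained hyperplane.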


On Biometry, Isometry, and Intrinsic Symmetry


R. Kimmel, Dan Raviv, A. Bronstein, and M. Bronstein – Technion


It can be shown that a person's identity is associated
with the intrinsic geometry of the facial surface, while facial expressions are related
to the extrinsic geometry. Our first attempt in face recognition was to
represent the intrinsic geometry of the surface by isometrically
embedding it into a low-dimensional Euclidean space. The result is an expression-invariant
representation of the face, which we called the canonical form. Next, we embedded surfaces into
simple non-Euclidean spaces. In particular, two- and three-dimensional spheres were found to
be appealing for the representation of faces, as the resulting metric distortion
is usually smaller compared to a Euclidean space. Last year, we introduced
the Generalized Multidimensional Scaling (GMDS), which allows embedding into
manifolds with an arbitrary geometric structure. Instead of embedding the
facial surfaces into a common embedding space, we embed one surface into the
other and use the metric distortion as a measure of their dissimilarity. The
GMDS allows surface recognition even when significant parts are missing.
Finally, mapping a surface into its own geometry can be used in order to
study self (isometric) symmetries of a given object.
In this talk I will review the relation between these
measures, the underlying theory, the resulting numerical machinery, and
applications ranging from recognition of faces in shape analysis to
morphing, warping, intrinsic symmetry, and texture mapping.
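
As a small illustration of what "metric distortion" measures, the sketch below compares a table of pairwise (for example geodesic) distances with the Euclidean distances between embedded points. It only evaluates the distortion of a given embedding; the embedding methods in the abstract search for the embedding that makes it small. The function name and the worst-case (maximum) criterion are assumptions made for illustration.

```python
import math


def metric_distortion(distances, points):
    """Largest absolute gap between the given pairwise distances and the
    Euclidean distances of the embedded points (hypothetical helper)."""
    worst = 0.0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            embedded = math.dist(points[i], points[j])
            worst = max(worst, abs(distances[i][j] - embedded))
    return worst
```

For three samples along a path with distances 1, 1 and 2, placing the points on a straight line gives zero distortion, while bending the path changes the end-to-end distance and yields a positive value.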


Super-resolution with no explicit motion estimation


Matan Protter, Miki Elad – Technion


Super-resolution reconstruction proposes a fusion of
several low-quality
images into one higher-quality result with better optical resolution.
Classic super-resolution techniques strongly rely on the availability of
accurate motion estimation for this fusion task. When the motion is
estimated inaccurately, as often happens for non-global motion fields, this
results in annoying artifacts in the super-resolved outcome. Encouraged by
recent developments on the video denoising problem,
where state-of-the-art algorithms are formed with no explicit motion
estimation, we seek a super-resolution algorithm of similar nature. In this
talk we present our solution, a novel super-resolution algorithm that is
based on the Non-Local-Means (NLM) algorithm. We show how this denoising method is generalized to become a relatively
simple super-resolution algorithm with no explicit motion estimation. Results
on several different movies show that the proposed method is very successful in
providing super-resolution on general sequences.
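
For readers unfamiliar with NLM, the following is a minimal sketch of the basic denoising step on a 1D signal: each sample is replaced by a weighted average of nearby samples, with weights that depend on how similar the surrounding patches are. It is plain NLM denoising, not the super-resolution generalization presented in the talk; the parameter names and default values are assumptions.

```python
import math


def nlm_1d(signal, patch_radius=1, search_radius=3, h=0.5):
    """Basic Non-Local-Means on a 1D list of floats (illustrative sketch)."""
    n = len(signal)

    def patch(i):
        # Patch centred at i; indices are clamped at the borders.
        return [signal[min(max(i + d, 0), n - 1)]
                for d in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        p_i = patch(i)
        num = den = 0.0
        for j in range(max(0, i - search_radius), min(n, i + search_radius + 1)):
            p_j = patch(j)
            # Mean squared difference between the two patches.
            d2 = sum((a - b) ** 2 for a, b in zip(p_i, p_j)) / len(p_i)
            w = math.exp(-d2 / (h * h))  # similar patches get large weights
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out
```

Because the weights come from patch similarity rather than from an estimated displacement, no explicit motion (or shift) estimate is needed.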


Dense mirroring surface recovery from 1D homographies and sparse correspondences


Stas Rozenfeld, Ilan Shimshoni, Micha Lindenbaum – Haifa + Technion


In this work we recover the 3D shape of mirroring objects such as
mirrors, sunglasses, and stainless steel objects. A computer
monitor displays several images of parallel stripes,
each image at a different angle. Reflections of these stripes in a
mirroring surface are captured by the camera. For every image point,
the directions of the displayed stripes and their reflections in the
image are related by a 1D homography which can be computed robustly,
using the statistically accurate heteroscedastic model, without
monitor-image correspondence, which is generally required by other
techniques. In addition, each image of stripes is followed by its
negative image. The direction of the stripes is estimated using the
subtraction of the two images. This allows the shape recovery
procedure to be performed not only for purely mirroring surfaces. Focusing on a small
set of image points for which monitor-image correspondence is
computed, the depth and the local shape may be calculated relying on
this homography. This is done by an optimization process which is
related to the one proposed by Savarese, Chen and Perona
(2005) but is different and more stable due to the minimization of an angle
error, which is a more "geometric" measure than the matrix degeneracy
constraint. The connection between our cost function and the one used by Savarese, Chen and Perona is
addressed. Then dense surface recovery is performed using constrained
interpolation, which does not simply interpolate the surface depth
values, but rather solves for the depth, the correspondence, and the local
surface shape simultaneously at each interpolated point. Consistency with
the 1D homography is thus required. The proposed
method, as well as the method described by Savarese,
Chen and Perona, is inherently unstable on a
small part of the surface. We propose a method to detect these instabilities
and correct them. Additionally, we provide an algebraic constraint sufficient
for the point to be unstable, and give a simple geometric interpretation of
this constraint in the case of planar and spherical surfaces. The method was
implemented and the shapes of a mirror, sunglasses, and a stainless
steel ashtray were recovered at sub-millimeter accuracy.


Optimizing Gabor Filter Design for Texture Edge Detection and Classification


Roman Sandler, Micha Lindenbaum – Technion


An effective and efficient texture analysis method, based on a new
criterion for designing Gabor filter sets, is proposed.
The commonly used filter sets are usually designed for optimal
signal representation. We propose here an alternative criterion for
designing the filter set. We consider a set of filters and its
response to pairs of harmonic signals. Two signals are considered
separable if the corresponding two sets of vector responses are
disjoint in at least one of the components. We propose an algorithm
for deriving the set of Gabor filters that maximizes the fraction of
separable harmonic signal pairs in a given frequency range. The
resulting filters differ significantly from the traditional ones.
We test these maximal harmonic discrimination (MHD) filters in
several texture analysis tasks: clustering, recognition, and edge
detection. It turns out that the proposed filters perform much
better than the traditional ones in these tasks. They can achieve
performance similar to that of state-of-the-art, distribution-based
(texton) methods, while being simpler and more computationally
efficient.
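
The separability criterion can be sketched as a simple check. Each signal is represented by a set of filter-bank response vectors, and the pair counts as separable when the two sets do not overlap in at least one component. The sketch assumes that "disjoint in a component" means the value ranges in that component do not intersect; this reading and the function name are assumptions, not the authors' definition.

```python
def separable(responses_a, responses_b):
    """True if, in at least one filter component, the range of responses
    for signal A does not intersect the range for signal B (assumed test)."""
    dims = len(responses_a[0])
    for k in range(dims):
        a = [r[k] for r in responses_a]
        b = [r[k] for r in responses_b]
        if max(a) < min(b) or max(b) < min(a):
            return True
    return False
```

A filter-set design criterion of the kind described would then count, over many pairs of harmonic signals in the frequency range of interest, the fraction for which this check succeeds.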


Matching Local Self-Similarities across Images and Videos


Eli Shechtman, Michal Irani – Weizmann


We present an approach for measuring similarity between
visual entities (images or videos) based on matching internal
self-similarities. What is correlated across images (or across video
sequences) is the internal
layout of local self-similarities (up to some distortions),
even though the patterns generating those local self-similarities are quite
different in each of the images/videos. These internal self-similarities are
efficiently captured by a compact local "self-similarity
descriptor", measured densely throughout the image/video, at multiple
scales, while accounting for local and global geometric distortions. This
gives rise to matching capabilities of complex visual data, including
detection of objects in real cluttered images using only rough hand sketches,
handling textured objects with no clear boundaries, and detecting complex
actions in cluttered video data with no prior learning. We compare our
measure to commonly used image-based and video-based
similarity measures, and demonstrate its applicability to object detection,
retrieval, and action detection.


Webcam Synopsis and Indexing


Yael Pritch, Alex Rav-Acha, Shmuel Peleg – HUJI


The world is covered with millions of webcams (or
security cameras). However, the video recorded by such cameras is rarely
watched, as there are more cameras than people to watch them and their
content is not very interesting most of the time.
Video Synopsis can be used to create a synopsis of
endless video streams, as generated by webcams and by security cameras. It
can address queries like "Show in one minute the highlights of this camera
broadcast during the past day". This
process includes two major phases: (i) an online
conversion of the endless video stream into a database of objects and
activities (rather than frames); (ii)
a response phase, generating the video synopsis as a response to the user's
query.
The synopsis video can also be used as an index into the
original video. Several methodologies for video indexing based on video
synopsis will be described.




Discriminative approach for Wavelet denoising


Yacov Hel-Or, Doron Shaked – IDC + HP


This work suggests a discriminative approach for wavelet denoising
where a set of shrinkage functions (SFs)
is designed to perform optimally (in an MSE sense) with respect
to a given set of images. Using the suggested scheme, a new set of SFs is generated, which is shown to be different from
the traditional soft/hard thresholding in the
overcomplete case. These SFs
are demonstrated to obtain state-of-the-art denoising
performance. As opposed to the descriptive approaches, modeling of image or
noise priors is not required here, and the SFs are
learned directly from the example set. Thus, the framework enables the
shrinkage operation to be customized seamlessly to a new set of restoration
problems, such as image deblurring, JPEG artifact removal, and different
types of additive noise.
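
For reference, the traditional soft and hard thresholding rules that the learned SFs are compared against can be written as scalar functions applied to each wavelet coefficient. The sketch below shows only these two classical rules; the learned shrinkage functions themselves are not reproduced here, and the function names are illustrative.

```python
import math


def soft_threshold(x, t):
    """Shrink x toward zero by t; coefficients with |x| <= t become zero."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def hard_threshold(x, t):
    """Keep x unchanged if |x| > t, otherwise set it to zero."""
    return x if abs(x) > t else 0.0
```

Both rules are fixed curves governed by a single threshold t, whereas a discriminative scheme of the kind described fits the shape of the curve to example images.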


Applying Property Testing to Image and Video Segmentation


Daniel Keren – University of Haifa


Property testing is a rapidly growing field of research. Typically,
a property testing algorithm proceeds by quickly determining whether
an input can satisfy some condition, under the assumption that most
inputs do not satisfy it. If the input is "far" from satisfying
the condition, the algorithm is guaranteed to reject it with high
probability.
We suggest that property testing is especially suitable to image
detection, since typically most inputs are far from the sought
pattern. We analyze the problem of deciding whether a binary
image can be segmented to a given rectangular grid, and reduce
it to a problem whose size is a constant independent of the size
of the original image, and which is, with high probability,
equivalent to the original problem.
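
The flavor of a property tester can be shown on a much simpler property than grid segmentation: deciding whether a binary image is all zeros or "far" from it (at least an epsilon fraction of ones). The sketch below is an illustration of the general idea only, not the algorithm from the talk; the function name and the choice of sample count are assumptions.

```python
import random


def looks_all_zero(image, epsilon=0.1, rng=None):
    """Accept an all-zero image; reject, with high probability, any image
    in which at least an epsilon fraction of the pixels are nonzero."""
    rng = rng or random.Random(0)
    rows, cols = len(image), len(image[0])
    # The number of queries depends only on epsilon, not on the image size.
    samples = int(2 / epsilon) + 1
    for _ in range(samples):
        r, c = rng.randrange(rows), rng.randrange(cols)
        if image[r][c] != 0:
            return False  # a single witness is enough to reject
    return True
```

An all-zero image is always accepted. An image that is epsilon-far survives every query with probability at most (1 - epsilon)^samples, which is below e^-2 for this sample count.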


Robust Real-Time Pattern Matching using Bayesian Sequential Hypothesis Testing


Ofir Pele, Michael Werman – HUJI


The talk describes a method for robust real-time
pattern matching. We first introduce a family of image distance
measures, the "Image Hamming Distance Family". Members of this
family are robust to occlusion, small geometrical transforms, light changes
and non-rigid deformations. We then present a novel Bayesian framework for
sequential hypothesis testing on finite populations. Based on this framework,
we design an optimal rejection/acceptance sampling algorithm. This algorithm
quickly determines whether two images are similar with respect to a member of
the Image Hamming Distance Family. We also present a fast framework that
designs a near-optimal sampling algorithm. Finally, we describe a method that
accelerates pattern matching by exploiting image smoothness to adaptively slide
the window, often by more than one pixel. The decision on how far we can slide
is based on a novel rank we define for each feature in the pattern. Extensive
experimental results show that the method's performance is excellent.
Implemented on a Pentium 4 3GHz processor, detection of a pattern with 2197
pixels, in 640x480 pixel frames, where in each frame the pattern was rotated and
highly occluded, takes only 7.2 ms per frame.


Brightness contrast-contrast induction model predicts assimilation and inverted assimilation effects


Yuval Barkan, Hedva Spitzer – TAU


In classical assimilation effects, intermediate luminance
patches appear lighter when their immediate surround is comprised of white
patches, and appear darker when their immediate surround is comprised of dark
patches. With patches either darker or lighter than both inducing patches,
the direction of the brightness effect is reversed and termed the “inverted
assimilation effect”. Several explanations and models have been suggested;
some are relevant to specific stimulus geometry, anchoring theory and models
which involve high-level cortical processing (such as scission, etc.). None of
these studies predicted the variety of types of assimilation effects and their
inverted effects. We suggest here a
compound brightness model, based on contrast-contrast induction
(a second-order adaptation mechanism), which predicts the various types of
brightness assimilation effects and their inverted effects, in addition to
the prediction of the dual effects (contrast enhancement and suppression) of
contrast-contrast induction. The model is composed of three main stages:
composing the On-center retinal opponent cells' receptive fields, performing
second-order adaptation (gain control of “curve-shifting” of the contrast
domain), and, as the final stage, transforming
the “perceived” adapted response into a luminance image by utilizing a variation
of the Jacobi iteration process, to enable elegant edge integration.


An Evolutionary Patch Pattern Approach for Texture Discrimination


Maxim Shoshani – Technion


A new evolutionary approach is presented, based on
implicit pattern – process relationships. For implementing this approach, any
gray level texture image is decomposed into a progressive sequence of binary
patch patterns that describe a process of change from background to
foreground domination. Each of the binary patterns throughout these
sequences is parameterized, using several metrics that describe, for example,
its fragmentation level, both for the background (e.g., white) and foreground
(e.g., black) patch patterns. Any texture type is
then assumed to have a unique evolutionary path represented by a distinctive
region in the feature space of metrics characterizing these patterns and
their change. Application of hierarchical clustering based on a few (3
or 4) metrics representing characteristic stages in the patterns' change
process allowed us to accurately discriminate between 50 samples of 10 Brodatz texture types.
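
One simple way to obtain such a progressive sequence of binary patterns is to threshold the gray-level image at increasing levels, so that the dark foreground grows from a few pixels to most of the image. The sketch below assumes this thresholding scheme and uses the foreground fraction as a stand-in metric; neither is confirmed as the decomposition or the metrics used in the talk.

```python
def binary_sequence(image, thresholds):
    """One binary pattern per threshold: a pixel is foreground (1) when its
    gray level is at or below the threshold (dark = foreground, assumed)."""
    return [[[1 if px <= t else 0 for px in row] for row in image]
            for t in thresholds]


def foreground_fraction(pattern):
    """Fraction of foreground pixels in one binary pattern (stand-in metric)."""
    total = sum(len(row) for row in pattern)
    return sum(sum(row) for row in pattern) / total
```

Tracking such a metric along the sequence of thresholds gives a curve per texture sample, which is the kind of feature vector a hierarchical clustering step could operate on.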


The use of Non-Extensive Divergence for Robust 'bag of features' Algorithms for Visual Recognition


Amnon Shashua, Tamir Hazan – HUJI


Images are sometimes represented by a set of informative
features ("bag of visual terms"). pLSA
is a method to analyze the features-images co-occurrence matrix and
recover hidden categories of images using a KL-divergence low-rank fit.
We introduce the Tsallis divergence error
measure, showing much improved performance in the presence of noise. We
provide an optimization framework which extends the Maximum Likelihood
framework and is theoretically guaranteed to provide robustness under
clutter, noise and outliers. Specifically, our approach excels
when the co-occurrence array is sparse,
which happens in the application domain of "bag of visual words".
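
As a point of reference, the standard Tsallis relative entropy between two discrete distributions p and r, with entropic index q, can be computed as below; it reduces to the KL divergence in the limit q -> 1. Whether the talk uses exactly this form is an assumption, and the function names are illustrative. The sketch also assumes r_i > 0 wherever p_i > 0.

```python
import math


def kl_divergence(p, r):
    """Kullback-Leibler divergence sum_i p_i * log(p_i / r_i)."""
    return sum(pi * math.log(pi / ri) for pi, ri in zip(p, r) if pi > 0)


def tsallis_divergence(p, r, q):
    """Tsallis relative entropy (sum_i p_i^q * r_i^(1-q) - 1) / (q - 1)."""
    if q == 1:
        return kl_divergence(p, r)  # limiting case
    s = sum(pi ** q * ri ** (1 - q) for pi, ri in zip(p, r) if pi > 0)
    return (s - 1.0) / (q - 1.0)
```

Like the KL divergence, this quantity is zero when the two distributions coincide; the index q controls how strongly mismatched entries are penalized.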


Blind Source Separation, Deconvolution and Localization using Sparse Signal Representations


Michael Zibulevsky – Technion

The blind source separation problem is concerned with extraction of the
underlying unknown source signals from a set of their linear mixtures,
where the mixing matrix is unknown. The blind deconvolution problem
consists in recovery of a signal/image blurred by an unknown convolution
kernel. These two problems are closely related and are encountered in
acoustics, radio, radar, medical signal and image processing,
hyperspectral imaging, and more, hence the
tremendous interest in them.
We show that exploiting the sparsity of
wavelet-type and other representations of the sources dramatically improves
the quality of separation/deconvolution. This leads
to an L1-norm minimization problem, which can be solved efficiently by the
proposed relative Newton
method combined with the Smoothing Method of Multipliers.
We also show how to use the sparsity priors for
super-resolution source
localization using multi-sensor observations.




