Geodesics on
Manifolds: Theory, Computation, and Application
|
Guillermo Sapiro – University
of Minnesota
|
In this talk I will describe the importance of geodesic
curves on manifolds for a number of computer vision problems, including bending
invariant shape recognition, surface matching, and image and video
colorization.
We will start by defining geodesics and describing recent
results on efficient ways of computing them, and then present theory and
computational algorithms for real problems. These include the use and
extension of Gromov theory for shape recognition
and the use and extension of Lipschitz minimizing
maps for non-rigid shape matching. I will also briefly show how efficient
geodesic computations lead to real time image and video colorization and
special effects.
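To make the computational part concrete, the following is a minimal sketch (in Python) of geodesic-distance computation on a weighted image grid using Dijkstra's algorithm, a simpler graph-based cousin of the fast-marching schemes the talk discusses; fast marching solves the continuous eikonal equation that this discretization approximates. The weight map and function names are illustrative assumptions, not the speaker's implementation. For colorization, the weights would typically be derived from local intensity differences so that distances follow image structure.

    # Minimal sketch: geodesic distances from seed pixels on a weighted grid.
    import heapq
    import numpy as np

    def geodesic_distances(weight, seeds):
        """weight: 2D array of local traversal costs; seeds: list of (i, j)."""
        h, w = weight.shape
        dist = np.full((h, w), np.inf)
        heap = []
        for s in seeds:
            dist[s] = 0.0
            heapq.heappush(heap, (0.0, s))
        while heap:
            d, (i, j) = heapq.heappop(heap)
            if d > dist[i, j]:
                continue  # stale heap entry
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    nd = d + 0.5 * (weight[i, j] + weight[ni, nj])
                    if nd < dist[ni, nj]:
                        dist[ni, nj] = nd
                        heapq.heappush(heap, (nd, (ni, nj)))
        return dist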
|
Detecting
irregularities in images and in video
|
Oren Boiman
and Michal Irani – Weizmann
|
We address the problem of detecting irregularities in visual
data, e.g., detecting suspicious behaviors in video sequences, or identifying
salient patterns in images. The term "irregular" depends on the context
in which "regular" or "valid" is defined. Yet, it is not
realistic to expect an explicit definition of all possible valid configurations
for a given context. We pose the problem of determining
the validity of visual data as a
process of constructing a puzzle: we try to compose a new observed image
region or a new video segment ("the query") using chunks of data
("pieces of the puzzle") extracted from previous visual examples ("the
database"). Regions in the observed data which can be composed using
large contiguous chunks of data from the database are considered very likely,
whereas regions in the observed data which cannot be composed from the
database (or can be composed, but only using small fragmented pieces) are
regarded as unlikely/suspicious. The problem is posed as an inference
process in a probabilistic graphical model. We show applications of this
approach to identifying saliency in images and video, and to suspicious
behavior recognition.
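As a concrete illustration of the composition idea (a toy stand-in, not the authors' graphical-model inference), the sketch below matches each query patch to its nearest database patch and scores a patch as suspicious when its neighbors do not map to adjacent database patches, i.e., when the region cannot be explained by one large contiguous puzzle piece. The patch size and the fragmentation test are assumptions made for this sketch.

    # Toy irregularity score: can the query be composed from large,
    # contiguous chunks of the database image?
    import numpy as np

    def patch_grid(img, size):
        """Split a grayscale image into a grid of flattened size-x-size patches."""
        h, w = img.shape
        gh, gw = h // size, w // size
        return np.stack([img[i*size:(i+1)*size, j*size:(j+1)*size].ravel()
                         for i in range(gh) for j in range(gw)]).reshape(gh, gw, -1)

    def suspiciousness(query, db, size=8):
        q = patch_grid(query, size)
        d = patch_grid(db, size)
        dw = d.shape[1]
        flat_db = d.reshape(-1, d.shape[-1])
        # nearest database patch for every query patch
        idx = np.array([[np.argmin(((flat_db - q[i, j])**2).sum(1))
                         for j in range(q.shape[1])] for i in range(q.shape[0])])
        score = np.zeros(idx.shape)
        for i in range(idx.shape[0]):
            for j in range(idx.shape[1]):
                ii, jj = divmod(idx[i, j], dw)
                # neighbors drawn from adjacent database patches support a large
                # contiguous "puzzle piece"; penalize neighbors that are not
                for di, dj in ((1, 0), (0, 1)):
                    ni, nj = i + di, j + dj
                    if ni < idx.shape[0] and nj < idx.shape[1]:
                        nii, njj = divmod(idx[ni, nj], dw)
                        if abs(nii - (ii + di)) + abs(njj - (jj + dj)) > 1:
                            score[i, j] += 1.0  # fragmented composition
        return score  # high score = hard to compose = suspicious/irregular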
|
Balanced
Exploration and Exploitation Model Search for Efficient Epipolar
Geometry Estimation
|
Liran Goshen
and Ilan Shimshoni – Technion and Haifa
|
A new robust matching algorithm is proposed. The Balanced
Exploration and Exploitation Model Search (BEEM) algorithm is simple and very
efficient for epipolar geometry estimation. It
works very well for difficult scenes where the putative correspondences
include a low percentage of inliers and/or a large subset of the inliers is
consistent with a degenerate configuration of the epipolar geometry that is
totally incorrect. Low percentages of inliers often occur when the images
have undergone a significant deformation due to a wide camera baseline. The
second difficult situation occurs when the scene includes a degenerate or
near-degenerate configuration, for example when a large subset of the
inliers lies in a small region of both images.
In these cases, standard epipolar geometry estimation algorithms incur a
high computational cost and often return an epipolar geometry that is
consistent with a high number of inliers belonging only to the degeneracy,
yet is totally incorrect.
The algorithm handles these two difficult cases in a
unified manner. The algorithm includes the following main features: (1)
Balanced use of three search techniques: global random exploration, local
exploration near the current best solution and local exploitation to improve
the quality of the model. (2) Exploits available prior information to
accelerate the search process. (3) Uses the best found model to guide the
search process, escape from degenerate models and to define an efficient
stopping criterion. (4) Presents a simple and efficient method to estimate
the epipolar geometry from three SIFT
correspondences. (5) Uses the locality-sensitive hashing (LSH) approximate
nearest neighbor algorithm for fast generation of putative correspondences.
When tested on real images with or without degenerate configurations, the
resulting algorithm yields high-quality estimates and achieves significant
speedups compared to state-of-the-art algorithms.
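The search strategy can be summarized in a short, generic skeleton (illustrative only; the paper's actual components, such as prior-guided sampling, the three-SIFT-correspondence estimator, and LSH matching, are not reproduced here, and the fixed mixing probabilities stand in for the algorithm's balancing scheme):

    # Generic exploration/exploitation model-search skeleton in the spirit of BEEM.
    import random

    def beem_style_search(sample_global, sample_local, refine, score,
                          iters=1000, p_explore=0.5, p_local=0.3):
        best, best_score = None, float("-inf")
        for _ in range(iters):
            r = random.random()
            if best is None or r < p_explore:
                model = sample_global()        # global random exploration
            elif r < p_explore + p_local:
                model = sample_local(best)     # local exploration near the best model
            else:
                model = refine(best)           # local exploitation: polish the best model
            s = score(model)
            if s > best_score:
                best, best_score = model, s
        return best, best_score

Here sample_global, sample_local, refine, and score are placeholders for the hypothesis generators and the inlier-counting objective of a robust epipolar-geometry estimator.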
|
Distance Dependent
Regularization
|
Yuval Averbuch and Yoav Schechner – Technion
|
In this work we present several scenarios in which image
signal-to-noise ratio varies with the object distance. Examples include
imaging in bad weather or underwater. They also include imaging using active
illumination as in flash photography or radar, where the signal becomes weak
with the object distance. As is typical for noisy images, regularization
is required. However, standard regularization would affect all image
areas the same way, irrespective of their corresponding object distance (and
its associated image SNR). This leads to excessive blurring in areas that
have high SNR, or too little regularization in areas of low SNR. To counter
this problem, we introduce a method for distance-dependent regularization. It
adapts to the distance-dependent causes for SNR deterioration, in a spatially
varying way. Moreover, this approach is extended beyond denoising
to applications in image rendering, where it creates arbitrary image
blurring effects (such as defocus) with ease.
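A minimal sketch of the idea, assuming a known per-pixel distance map: blend each pixel between the noisy input and a heavily smoothed version, with the blend weight growing where distance (and hence noise) is large. The exponential weighting is an illustrative choice, not the paper's regularization functional.

    # Distance-dependent smoothing: regularize more where the SNR is lower.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def distance_dependent_denoise(img, distance, sigma=3.0, k=0.5):
        smooth = gaussian_filter(img, sigma)
        # farther objects -> weaker signal -> stronger regularization
        w = 1.0 - np.exp(-k * distance)       # weight in [0, 1), grows with distance
        return (1.0 - w) * img + w * smooth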
|
Isometric matching
of surfaces with partial occlusion
|
Ron Kimmel,
Michael Bronstein and Alex Bronstein – Technion
|
Recently, in France, the face of a woman was reconstructed by implanting
tissue from a donor. Questions regarding the identity and looks of the
reconstructed face were immediately brought up for public discussion.
In this talk we explore the information captured by the geometry of a
facial surface and try to answer questions about the uniqueness of
the identity conveyed by a given individual's face. Toward that goal, we
conduct empirical experiments and develop computational tools supported by
novel theories that provide some hints on the geometry of our faces. As an
example, we show that facial expressions are isometries. That
is, the identity of the facial surface is captured by its intrinsic geometry,
while the expression is governed by the way the surface is embedded in 3D
space, i.e., by its extrinsic geometry.
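As a toy illustration of expression-invariant comparison (a stand-in for the canonical-forms machinery behind this line of work): if expressions are near-isometries, the distribution of pairwise geodesic distances on the facial surface is an intrinsic signature that should be stable across expressions of the same person. The mesh representation and histogram signature below are assumptions made for this sketch.

    # Intrinsic (bending-invariant) signature of a triangulated surface.
    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse.csgraph import dijkstra

    def geodesic_signature(verts, edges, bins=64):
        """verts: (n, 3) vertex array; edges: list of (i, j) mesh edges."""
        n = len(verts)
        w = [float(np.linalg.norm(verts[i] - verts[j])) for i, j in edges]
        rows = [i for i, j in edges] + [j for i, j in edges]
        cols = [j for i, j in edges] + [i for i, j in edges]
        g = csr_matrix((w + w, (rows, cols)), shape=(n, n))
        d = dijkstra(g, directed=False)        # all-pairs geodesic distances
        h, _ = np.histogram(d[np.isfinite(d)], bins=bins, density=True)
        return h

    def intrinsic_dissimilarity(sig_a, sig_b):
        return np.abs(sig_a - sig_b).sum()     # L1 distance between signatures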
|
A Closed Form Solution
to Natural Image Matting
|
Anat Levin, Dani
Lischinski and Yair Weiss
– Hebrew U.
|
Interactive digital matting, the process of extracting a
foreground
object from an image based on limited user input, is an important task in
image and video editing. From a computer vision perspective, this task is
extremely challenging because it is massively ill-posed: at each pixel
we must estimate the foreground and the background colors, as well as the
foreground opacity ("alpha matte") from a single color measurement.
Current approaches either restrict the estimation to a small part of the
image, estimating foreground and background colors based on nearby pixels
where they are known, or perform iterative nonlinear estimation by
alternating foreground and background color estimation with alpha
estimation.
In this paper we present a closed form solution to natural image matting.
We derive a cost function from local smoothness assumptions on foreground
and background colors, and show that in the resulting expression it is
possible to analytically eliminate the foreground and background colors to
obtain a quadratic cost function in alpha. This allows us to find the
globally optimal alpha matte by solving a sparse linear system of
equations. Furthermore, the closed form formula allows us to predict the
properties of the solution by analyzing the eigenvectors of a sparse
matrix, closely related to matrices used in spectral image segmentation
algorithms. We show that high quality mattes can be obtained on natural
images from a surprisingly small amount of user input.
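The construction can be sketched compactly for a single (grayscale) channel; the paper handles color with a 3x3 covariance per local window. The parameter values and scribble encoding below are illustrative.

    # Sketch: build the matting Laplacian over 3x3 windows (grayscale case)
    # and solve one sparse linear system for the alpha matte.
    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import spsolve

    def matting_laplacian_gray(I, eps=1e-5):
        h, w = I.shape
        n = h * w
        ids = np.arange(n).reshape(h, w)
        rows, cols, vals = [], [], []
        for y in range(1, h - 1):
            for x in range(1, w - 1):          # each 3x3 window
                win = ids[y-1:y+2, x-1:x+2].ravel()
                p = I.ravel()[win]
                mu, var = p.mean(), p.var()
                d = p - mu
                a = (1.0 + np.outer(d, d) / (var + eps / 9.0)) / 9.0
                m = np.eye(9) - a              # window's contribution to L
                rows.extend(np.repeat(win, 9))
                cols.extend(np.tile(win, 9))
                vals.extend(m.ravel())
        return sp.csr_matrix((vals, (rows, cols)), shape=(n, n))

    def solve_alpha(I, scribble_mask, scribble_vals, lam=100.0):
        """scribble_mask: 1 where the user marked a pixel; scribble_vals: 0 (bg) or 1 (fg)."""
        L = matting_laplacian_gray(I)
        D = sp.diags(scribble_mask.ravel().astype(float))
        b = lam * (scribble_mask * scribble_vals).ravel()
        alpha = spsolve((L + lam * D).tocsc(), b)
        return np.clip(alpha, 0.0, 1.0).reshape(I.shape)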
|
3D reconstruction
by example
|
Ronen Basri,
Ira Kemelmacher and Tal Hassner – Weizmann
|
Assume we have knowledge of the shape and appearance of several example
3D objects, all from a specific class. How can this knowledge be used to
recover the 3D shape of a novel object of the same class from a single view?
In this talk we try to answer this question by presenting two new
methods for 3D reconstruction from single images. The first method uses
examples of feasible appearance-to-depth mappings, extracted from a
pre-collected set of 3D objects, to produce reconstructions of novel objects
belonging to a variety of classes (e.g., hands, human figures). An on-the-fly
example synthesis scheme is used to cope with unknown viewing conditions and
large example sets. The second method focuses on human faces, allowing 3D
reconstruction using only a single reference model of a different person's
face. Assuming Lambertian reflectance and rough
alignment of the input image to the reference model, this method seeks shape,
albedo, and lighting that best fit the image while
preserving the rough structure of the model. We demonstrate this approach by
providing accurate reconstructions of novel faces overcoming significant
differences in shape due to gender, race, and facial expressions.
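One concrete ingredient of the face method is the Lambertian lighting fit: given rough alignment, the reference model's normals and albedo let us recover low-order lighting by plain least squares. The sketch below uses a first-order spherical-harmonics basis (constant factors dropped) and illustrative variable names; the full method also refines shape and albedo, which is not shown here.

    # Recover low-order lighting from an image and a rough reference model.
    import numpy as np

    def lighting_basis(normals, albedo):
        """First-order spherical-harmonics basis [1, nx, ny, nz], scaled by albedo."""
        return albedo[:, None] * np.hstack([np.ones((len(normals), 1)), normals])

    def fit_lighting(image, normals, albedo):
        """image: (n,) intensities; normals: (n, 3) unit normals; albedo: (n,)."""
        B = lighting_basis(normals, albedo)
        coeffs, *_ = np.linalg.lstsq(B, image, rcond=None)
        return coeffs                          # 4 lighting coefficients

    def render(normals, albedo, coeffs):
        return lighting_basis(normals, albedo) @ coeffs   # image under fitted light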
|
Learning to
classify by ongoing feature selection
|
Dan Levi and Shimon Ullman - Weizmann
|
Existing classification algorithms use a set of training
examples to select classification features, which are then used for all
future applications of the classifier. A major problem with this approach is
the selection of a training set: a small set will result in reduced
performance, and a large set will require extensive training. In this
paper we propose a solution to this problem by developing an on-line feature
selection method, which continuously modifies and improves the features used
for classification based on the examples provided so far. The method is used
for learning a new class, and to continuously improve classification
performance as new data becomes available. In ongoing learning, examples are
continuously presented to the system, and new features arise from these
examples. The method continuously measures the value of the selected features
using mutual information, and uses these values to efficiently update the set
of selected features when new training information becomes available. The problem
is challenging because at each stage the training process uses a small subset
of the training data. Surprisingly, with sufficient training data the on-line
process reaches the same performance as a scheme that has complete access
to the entire training data.
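A minimal sketch of the bookkeeping behind such ongoing selection (illustrative; it shows the mutual-information ranking, not the authors' full update scheme): keep running co-occurrence counts of each binary feature with the class label, and re-rank the features as examples stream in.

    # Online mutual-information ranking of binary features.
    import numpy as np

    class OnlineFeatureSelector:
        def __init__(self, n_features, k):
            self.counts = np.ones((n_features, 2, 2))    # Laplace-smoothed counts
            self.k = k

        def update(self, feats, label):
            """feats: binary vector of feature detections; label: 0 or 1."""
            for f, v in enumerate(feats):
                self.counts[f, int(v), label] += 1

        def selected(self):
            p = self.counts / self.counts.sum(axis=(1, 2), keepdims=True)
            pf = p.sum(axis=2, keepdims=True)            # P(feature value)
            pc = p.sum(axis=1, keepdims=True)            # P(class)
            mi = (p * np.log(p / (pf * pc))).sum(axis=(1, 2))
            return np.argsort(mi)[::-1][:self.k]         # indices of the top-k features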
|
A Computational Adaptation Model: Predictions for First- and Second-Order
Color Induction, with an Algorithm for Color Constancy and Color Contrast
|
Yuval Barkan and Hedva Spitzer - TAU
|
The appearance of a patch of color or its contrast
depends not only on the stimulus itself but also on the surrounding stimuli
(induction effects – simultaneous contrast). A comprehensive computational
physiological model is presented to describe chromatic adaptation of the
first (retinal) and second (cortical) orders, to predict the different
chromatic induction effects, and at the same time to serve as an algorithm.
We propose that the chromatic induction of the first
order that yields perceived complementary colors can be predicted by retinal
adaptation mechanisms, contrary to previous suggestions. The second order of
the proposed adaptation mechanism succeeds in predicting the perceived
inhibition or facilitation of the central contrast of a texture stimulus,
depending on the surrounding contrast. Furthermore, contrary to other models,
this model is also able to predict the effect of a variegated surround on
the perceived central color. We will show that this model predicts the
above effects and, at the same time, can perform color constancy and color
contrast.
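A toy sketch of the first-order (retinal-type) mechanism, per opponent channel: the local response is gain-controlled by a remote surround average, pushing the appearance toward the complementary color. The Gaussian scales and gain below are illustrative, not the model's fitted values.

    # Divisive center/remote-surround adaptation for one opponent channel.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def adapt_first_order(channel, center_sigma=1.0, surround_sigma=15.0, c=0.8):
        center = gaussian_filter(channel, center_sigma)
        surround = gaussian_filter(channel, surround_sigma)  # remote adapting area
        return center / (1.0 + c * surround)                 # divisive gain control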
|
Certainty and
Uncertainty in Filter Bank Design Methodology
|
Chen Sagiv, Nir Sochen and Yehoshua Zeevi - U. of Bremen,
TAU, Technion
|
Signal and image processing tasks often call for the
implementation of filter banks. A key issue then is the selection or design
of the "best" filter bank for the application at hand. We address
two fundamental aspects of this issue. The first is concerned with the choice
of the generating function, the so-called 'mother wavelet'. The second aspect
of design of the filter bank is concerned with the tessellation of the
combined position-frequency space with the family of functions obtained
by the action of a proper group on the generating function.
A proper criterion for selecting the generating function is minimal
entropy, i.e., minimal uncertainty in the resolution along the canonical
variables that define the combined space. Gabor functions are
the minimizers of the information uncertainty in
the case of the set of functions obtained by the action of the Weyl-Heisenberg group, whereas Gabor
wavelets are generated by the action of an affine group. A natural and
interesting question is whether the Gabor wavelets
are also the minimizers of the position-scale
uncertainty related to the $SIM(2)$ or the affine group in $2D$. We address
this issue using the affine Weyl-Heisenberg group,
which accounts for spatial and frequency translations as well as for spatial
dilations. Restricting ourselves to dilations that are inversely related to
frequency translations, we obtain a framework closely related to Gabor
wavelets. In this work we obtain possible minimizers for more generalized
uncertainty relations
which are related to the affine Weyl-Heisenberg
group or its sub-groups, and provide insight into their significance in
image processing applications.
Proceeding from local to global issues, we consider the optimal tessellation
of the position-frequency space. There are many possible tessellations of the
combined space, and quantitative guidelines to obtain a "good"
pavement are called for. The vast research on wavelets provides a mechanism
to check whether a selected bank of filters constitutes a frame. In this
work, we generalize these ideas and offer possible pavements of the frequency
plane using Gabor functions, which have the
interesting property that the number of orientations required depends on the
scale. We also calculate the frame bounds and check the usefulness of this
theoretical derivation with respect to analysis and
synthesis of images.
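To illustrate the pavement idea, the sketch below builds a bank of complex Gabor filters directly in the frequency domain, letting the number of orientations grow with scale, and evaluates crude frame bounds as the extrema of the summed squared spectra. All constants are illustrative choices.

    # Gabor pavement of the frequency plane with scale-dependent orientations.
    import numpy as np

    def gabor_bank_spectrum(size=128, n_scales=4, base_orients=4, sigma=0.15):
        fx, fy = np.meshgrid(np.fft.fftfreq(size), np.fft.fftfreq(size))
        total = np.zeros((size, size))
        for s in range(n_scales):
            f0 = 0.05 * 2.0 ** s                 # center frequency of this scale
            n_theta = base_orients * 2 ** s      # more orientations at finer scales
            for t in range(n_theta):
                th = np.pi * t / n_theta
                ux, uy = f0 * np.cos(th), f0 * np.sin(th)
                g = np.exp(-((fx - ux)**2 + (fy - uy)**2) / (2 * (sigma * f0)**2))
                total += g**2                    # |g_hat|^2 of one filter
        A, B = total.min(), total.max()          # crude frame bounds of the bank
        return total, (A, B)

A tight ratio B/A indicates a near-uniform covering of the frequency plane; gaps (A close to zero, e.g., near DC here) signal frequencies the bank cannot stably reconstruct.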
|
Integrated
Segmentation and Classification Approach Applied to
Multiple Sclerosis Analysis
|
Ayelet Akselrod-Ballin,
Meirav Galun, Ronen Basri, and Achi Brandt - Weizmann
|
We present a novel multiscale
approach that combines segmentation with classification to detect abnormal
brain structures in medical imagery, and demonstrate its utility in detecting
multiple sclerosis lesions in 3D MRI data. Our method uses segmentation to
obtain a hierarchical decomposition of a multi-channel, anisotropic MRI scan.
It then produces a rich set of features describing the segments in terms of
intensity, shape, location, and neighborhood relations. These features are
then fed into a decision tree-based classifier, trained with data labeled by
experts, enabling the detection of lesions at all scales. Unlike common
approaches that use voxel-by-voxel analysis, our
system can utilize regional properties that are often important for
characterizing abnormal brain structures. We provide experiments showing
successful detections of lesions in both simulated and real MR images.
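The classification stage can be pictured with a short sketch: each segment of the hierarchical decomposition is summarized by a feature vector and fed to a decision tree trained on expert-labeled data. The specific feature names below are illustrative placeholders for the intensity, shape, location, and neighborhood measurements described above.

    # Train a decision tree on per-segment features (illustrative names).
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def segment_features(seg):
        """seg: dict of per-segment measurements."""
        return [seg["mean_intensity"], seg["intensity_var"],
                seg["volume"], seg["elongation"],
                seg["dist_to_ventricles"], seg["neighbor_contrast"]]

    def train_lesion_classifier(segments, labels):
        X = np.array([segment_features(s) for s in segments])
        clf = DecisionTreeClassifier(max_depth=8)
        return clf.fit(X, np.array(labels))    # labels: expert 0/1 per segment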
|
Feature
Hierarchies for Object Classification
|
Boris Epshtein and Shimon Ullman -
Weizmann
|
In a number of recent classification schemes, image
patches, or fragments, are successfully used as informative features for
object detection and classification. The fragments are typically chosen
during a training stage, according to the amount of information they deliver
about the class being recognized. To perform classification, the features are
searched for in the input images, followed by a decision stage based on the
detected features. In the present
work, we extend this approach by decomposing the fragments into sub-fragments
in a hierarchical fashion. The presence of a fragment is then detected by the
presence of its components. For example, a fragment depicting the eye region
can be subdivided into sub-fragments depicting different eye parts, an eyebrow, etc.,
which are detected separately. A higher-level fragment can use different
types of sub-features (e.g., different eyebrows), and allow a controlled
amount of independent movement and other deformations of the sub-parts.
There are two motivations for such a hierarchical
construction. The first is dealing with variability during the matching
process: the same motivation that applies to the classification of complete
objects by image fragments, applies also to the detection of object parts. As
we will demonstrate, the decomposition by our method increases the amount of
information delivered by the fragments, improves the detection rate, and
increases the tolerance for local distortions and illumination changes. The
second motivation comes from the structure of the primate visual system, in
which objects are represented in a hierarchy of increasingly complex
features, ranging from local oriented features in V1, to complex shapes and
partial or complete object views at high visual areas.
I will describe an algorithm for automatically constructing a full feature
hierarchy, which proceeds by learning from examples, in a top-down manner.
The highest-level, class-specific fragments are extracted first and then
subdivided into simpler sub-fragments in a recursive manner. The
decomposition at each level is performed by the application of the same
algorithm, guided by the maximization of mutual information. The subdivision
stops when the mutual information cannot be improved by further
decomposition, producing a set of simple, atomic features. For
different natural classes, the depth of the feature hierarchy is typically up
to four levels. In addition to the decomposition of a feature into its
sub-parts, the method also determines the optimal parameters, such as detection
thresholds and the amount of allowed displacements of the sub-features.
Finally, we demonstrate that fragments selected at the higher levels of the
hierarchy are typically complex and class-specific, while the lower-level
fragments are simple and often shared between different object classes.
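The stopping rule can be made concrete with a small sketch, under the simplifying assumption of binary detection vectors (one entry per training image): a fragment is decomposed only if detecting it through its sub-fragments raises the mutual information with the class label. The combination rule (at least min_parts sub-fragments present) is an illustrative choice.

    # Mutual-information test for decomposing a fragment into sub-fragments.
    import numpy as np

    def mutual_info(detections, labels):
        mi = 0.0
        for d in (0, 1):
            for c in (0, 1):
                p = np.mean((detections == d) & (labels == c)) + 1e-12
                px, py = np.mean(detections == d), np.mean(labels == c)
                mi += p * np.log(p / (px * py + 1e-12))
        return mi

    def should_decompose(frag_det, subfrag_dets, labels, min_parts=2):
        """Decompose if the combined sub-fragment detector is more informative."""
        combined = (subfrag_dets.sum(axis=0) >= min_parts).astype(int)
        return mutual_info(combined, labels) > mutual_info(frag_det, labels)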
|
3D Shape Recovery
of Smooth Surfaces: Dropping the Fixed Viewpoint Assumption
|
Yael Moses and Ilan
Shimshoni – IDC and Haifa
|
We present a new method for recovering the 3D
shape of a featureless smooth surface from three or more calibrated
images. The main contribution of this paper is the ability to handle general
images which are taken from unconstrained viewpoints and unconstrained
illumination directions. To the best of our knowledge, no other method
is currently capable of handling such images, since correspondence between
such images is hard to compute. Our method combines geometric and photometric
information in order to recover a dense correspondence between the images and
successfully computes an accurate 3D shape of the surface. The method
is based on a single pass of local computations and does not make use of
global optimization over the whole surface. While we assume a Lambertian reflectance function, our method can be easily
modified to handle more general reflectance models as long as it is possible
to recover local normals from photometric
information.
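The local photometric step can be sketched in a few lines: with Lambertian reflectance and three or more known light directions, the scaled normal at a point is the least-squares solution of I = L n (classic photometric-stereo reasoning; the talk's method interleaves such local estimates with geometric correspondence across unconstrained views, which is not shown here).

    # Recover a surface normal and albedo from Lambertian intensities.
    import numpy as np

    def recover_normal(intensities, lights):
        """intensities: (m,) measurements; lights: (m, 3) unit light directions."""
        n, *_ = np.linalg.lstsq(lights, intensities, rcond=None)
        albedo = np.linalg.norm(n)
        return n / albedo, albedo              # unit normal and Lambertian albedo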
|