Geodesics on
Manifolds: Theory, Computation, and Application
|
Guillermo Sapiro – University
of Minnesota
|
In this talk I will describe the importance of geodesic
curves on manifolds for a number of computer vision problems, including bending
invariant shape recognition, surface matching, and image and video
colorization.
We will start by defining geodesics and describing recent
results on efficient ways of computing them, and then present theory and
computational algorithms for real problems. These include the use and
extension of Gromov theory for shape recognition
and the use and extension of Lipschitz minimizing
maps for non-rigid shape matching. I will also briefly show how efficient
geodesic computations lead to real time image and video colorization and
special effects.
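To make the computational part concrete, the following is a minimal sketch (in Python) of geodesic-distance computation on a weighted image grid using Dijkstra's algorithm, a simpler graph-based cousin of the fast-marching schemes the talk discusses; fast marching solves the continuous eikonal equation that this discretization approximates. The weight map and function names are illustrative assumptions, not the speaker's implementation. For colorization, the weights would typically be derived from local intensity differences so that distances follow image structure.

    # Minimal sketch: geodesic distances from seed pixels on a weighted grid.
    import heapq
    import numpy as np

    def geodesic_distances(weight, seeds):
        """weight: 2D array of local traversal costs; seeds: list of (i, j)."""
        h, w = weight.shape
        dist = np.full((h, w), np.inf)
        heap = []
        for s in seeds:
            dist[s] = 0.0
            heapq.heappush(heap, (0.0, s))
        while heap:
            d, (i, j) = heapq.heappop(heap)
            if d > dist[i, j]:
                continue  # stale heap entry
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    nd = d + 0.5 * (weight[i, j] + weight[ni, nj])
                    if nd < dist[ni, nj]:
                        dist[ni, nj] = nd
                        heapq.heappush(heap, (nd, (ni, nj)))
        return dist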
|
Detecting
irregularities in images and in video
|
Oren Boiman
and Michal Irani – Weizmann
|
We address the problem of detecting irregularities in visual
data, e.g., detecting suspicious behaviors in video sequences, or identifying
salient patterns in images. The term "irregular" depends on the context
in which "regular" or "valid" is defined. Yet, it is not
realistic to expect an explicit definition of all possible valid configurations
for a given context. We pose the problem of determining
the validity of visual data as a
process of constructing a puzzle: we try to compose a new observed image
region or a new video segment ("the query") using chunks of data
("pieces of the puzzle") extracted from previous visual examples ("the
database"). Regions in the observed data which can be composed using
large contiguous chunks of data from the database are considered very likely,
whereas regions in the observed data which cannot be composed from the
database (or can be composed, but only using small fragmented pieces) are
regarded as unlikely/suspicious. The problem is posed as an inference
process in a probabilistic graphical model. We show applications of this
approach to identifying saliency in images and video, and to suspicious
behavior recognition.
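As a concrete illustration of the composition idea (a toy stand-in, not the authors' graphical-model inference), the sketch below matches each query patch to its nearest database patch and scores a patch as suspicious when its neighbors do not map to adjacent database patches, i.e., when the region cannot be explained by one large contiguous puzzle piece. The patch size and the fragmentation test are assumptions made for this sketch.

    # Toy irregularity score: can the query be composed from large,
    # contiguous chunks of the database image?
    import numpy as np

    def patch_grid(img, size):
        """Split a grayscale image into a grid of flattened size-x-size patches."""
        h, w = img.shape
        gh, gw = h // size, w // size
        return np.stack([img[i*size:(i+1)*size, j*size:(j+1)*size].ravel()
                         for i in range(gh) for j in range(gw)]).reshape(gh, gw, -1)

    def suspiciousness(query, db, size=8):
        q = patch_grid(query, size)
        d = patch_grid(db, size)
        dw = d.shape[1]
        flat_db = d.reshape(-1, d.shape[-1])
        # nearest database patch for every query patch
        idx = np.array([[np.argmin(((flat_db - q[i, j])**2).sum(1))
                         for j in range(q.shape[1])] for i in range(q.shape[0])])
        score = np.zeros(idx.shape)
        for i in range(idx.shape[0]):
            for j in range(idx.shape[1]):
                ii, jj = divmod(idx[i, j], dw)
                # neighbors drawn from adjacent database patches support a large
                # contiguous "puzzle piece"; penalize neighbors that are not
                for di, dj in ((1, 0), (0, 1)):
                    ni, nj = i + di, j + dj
                    if ni < idx.shape[0] and nj < idx.shape[1]:
                        nii, njj = divmod(idx[ni, nj], dw)
                        if abs(nii - (ii + di)) + abs(njj - (jj + dj)) > 1:
                            score[i, j] += 1.0  # fragmented composition
        return score  # high score = hard to compose = suspicious/irregular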
|
Balanced
Exploration and Exploitation Model Search for Efficient Epipolar
Geometry Estimation
|
Liran Goshen
and Ilan Shimshoni – Technion and Haifa
|
A new robust matching algorithm is proposed. The Balanced
Exploration and Exploitation Model Search (BEEM) algorithm is simple and very
efficient for epipolar geometry estimation. It
works very well for difficult scenes where the putative correspondences
include a low percentage of inliers and/or a large subset of the inliers is
consistent with a degenerate configuration of the epipolar geometry that is
totally incorrect. Low percentages of inliers often occur when the images
have undergone a significant deformation due to a wide camera baseline. The
second difficult situation occurs when the scene includes a degenerate or
near-degenerate configuration, for example when a large subset of the
inliers lies in a small region of both images.
In these cases, standard epipolar geometry estimation algorithms incur a
high computational cost and often return an epipolar geometry that is
consistent with a high number of inliers belonging only to the degeneracy,
yet is totally incorrect.
The algorithm handles these two difficult cases in a
unified manner. The algorithm includes the following main features: (1)
Balanced use of three search techniques: global random exploration, local
exploration near the current best solution and local exploitation to improve
the quality of the model. (2) Exploits available prior information to
accelerate the search process. (3) Uses the best found model to guide the
search process, escape from degenerate models and to define an efficient
stopping criterion. (4) Presents a simple and efficient method to estimate
the epipolar geometry from three SIFT
correspondences. (5) Uses the locality-sensitive hashing (LSH) approximate
nearest neighbor algorithm for fast generation of putative correspondences.
When tested on real images with or without degenerate configurations, the
resulting algorithm yields high-quality estimates and achieves significant
speedups compared to state-of-the-art algorithms.
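The search strategy can be summarized in a short, generic skeleton (illustrative only; the paper's actual components, such as prior-guided sampling, the three-SIFT-correspondence estimator, and LSH matching, are not reproduced here, and the fixed mixing probabilities stand in for the algorithm's balancing scheme):

    # Generic exploration/exploitation model-search skeleton in the spirit of BEEM.
    import random

    def beem_style_search(sample_global, sample_local, refine, score,
                          iters=1000, p_explore=0.5, p_local=0.3):
        best, best_score = None, float("-inf")
        for _ in range(iters):
            r = random.random()
            if best is None or r < p_explore:
                model = sample_global()        # global random exploration
            elif r < p_explore + p_local:
                model = sample_local(best)     # local exploration near the best model
            else:
                model = refine(best)           # local exploitation: polish the best model
            s = score(model)
            if s > best_score:
                best, best_score = model, s
        return best, best_score

Here sample_global, sample_local, refine, and score are placeholders for the hypothesis generators and the inlier-counting objective of a robust epipolar-geometry estimator.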
|
Distance Dependent
Regularization
|
Yuval Averbuch and Yoav Schechner – Technion
|
In this work we present several scenarios in which image
signal-to-noise ratio varies with the object distance. Examples include
imaging in bad weather or underwater. They also include imaging using active
illumination as in flash photography or radar, where the signal becomes weak
with the object distance. As is typical for noisy images, regularization
is required. However, standard regularization would affect all image
areas the same way, irrespective of their corresponding object distance (and
its associated image SNR). This leads to excessive blurring in areas that
have high SNR, or too little regularization in areas of low SNR. To counter
this problem, we introduce a method for distance-dependent regularization. It
adapts to the distance-dependent causes for SNR deterioration, in a spatially
varying way. Moreover, this approach is extended beyond denoising
to applications in image rendering, where it creates arbitrary image
blurring effects (such as defocus) with ease.
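A minimal sketch of the idea, assuming a known per-pixel distance map: blend each pixel between the noisy input and a heavily smoothed version, with the blend weight growing where distance (and hence noise) is large. The exponential weighting is an illustrative choice, not the paper's regularization functional.

    # Distance-dependent smoothing: regularize more where the SNR is lower.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def distance_dependent_denoise(img, distance, sigma=3.0, k=0.5):
        smooth = gaussian_filter(img, sigma)
        # farther objects -> weaker signal -> stronger regularization
        w = 1.0 - np.exp(-k * distance)       # weight in [0, 1), grows with distance
        return (1.0 - w) * img + w * smooth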
|
Isometric matching
of surfaces with partial occlusion
|
Ron Kimmel,
Michael Bronstein and Alex Bronstein – Technion
|
Recently, in France, the face of a woman was reconstructed by implanting
tissue from a donor. Questions regarding the identity and looks of the
reconstructed face were immediately brought up for public discussion.
In this talk we explore the information captured by the geometry of a
facial surface and try to answer questions about the uniqueness of
the identity conveyed by a given individual's face. Toward that goal, we
conduct empirical experiments and develop computational tools supported by
novel theories that provide some hints on the geometry of our faces. As an
example, we show that facial expressions are isometries. That
is, the identity of the facial surface is captured by its intrinsic geometry,
while the expression is governed by the way the surface is embedded in 3D
space, i.e., by its extrinsic geometry.
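As a toy illustration of expression-invariant comparison (a stand-in for the canonical-forms machinery behind this line of work): if expressions are near-isometries, the distribution of pairwise geodesic distances on the facial surface is an intrinsic signature that should be stable across expressions of the same person. The mesh representation and histogram signature below are assumptions made for this sketch.

    # Intrinsic (bending-invariant) signature of a triangulated surface.
    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse.csgraph import dijkstra

    def geodesic_signature(verts, edges, bins=64):
        """verts: (n, 3) vertex array; edges: list of (i, j) mesh edges."""
        n = len(verts)
        w = [float(np.linalg.norm(verts[i] - verts[j])) for i, j in edges]
        rows = [i for i, j in edges] + [j for i, j in edges]
        cols = [j for i, j in edges] + [i for i, j in edges]
        g = csr_matrix((w + w, (rows, cols)), shape=(n, n))
        d = dijkstra(g, directed=False)        # all-pairs geodesic distances
        h, _ = np.histogram(d[np.isfinite(d)], bins=bins, density=True)
        return h

    def intrinsic_dissimilarity(sig_a, sig_b):
        return np.abs(sig_a - sig_b).sum()     # L1 distance between signatures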
|
A Closed Form Solution
to Natural Image Matting
|
Anat Levin, Dani
Lischinski and Yair Weiss
– Hebrew U.
|
Interactive digital matting, the process of extracting a
foreground
object from an image based on limited user input, is an important task in
image and video editing. From a computer vision perspective, this task is
extremely challenging because it is massively ill-posed: at each pixel
we must estimate the foreground and the background colors, as well as the
foreground opacity ("alpha matte") from a single color measurement.
Current approaches either restrict the estimation to a small part of the
image, estimating foreground and background colors based on nearby pixels
where they are known, or perform iterative nonlinear estimation by
alternating foreground and background color estimation with alpha
estimation.
In this paper we present a closed form solution to natural image matting.
We derive a cost function from local smoothness assumptions on foreground
and background colors, and show that in the resulting expression it is
possible to analytically eliminate the foreground and background colors to
obtain a quadratic cost function in alpha. This allows us to find the
globally optimal alpha matte by solving a sparse linear system of
equations. Furthermore, the closed form formula allows us to predict the
properties of the solution by analyzing the eigenvectors of a sparse
matrix, closely related to matrices used in spectral image segmentation
algorithms. We show that high quality mattes can be obtained on natural
images from a surprisingly small amount of user input.
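The construction can be sketched compactly for a single (grayscale) channel; the paper handles color with a 3x3 covariance per local window. The parameter values and scribble encoding below are illustrative.

    # Sketch: build the matting Laplacian over 3x3 windows (grayscale case)
    # and solve one sparse linear system for the alpha matte.
    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import spsolve

    def matting_laplacian_gray(I, eps=1e-5):
        h, w = I.shape
        n = h * w
        ids = np.arange(n).reshape(h, w)
        rows, cols, vals = [], [], []
        for y in range(1, h - 1):
            for x in range(1, w - 1):          # each 3x3 window
                win = ids[y-1:y+2, x-1:x+2].ravel()
                p = I.ravel()[win]
                mu, var = p.mean(), p.var()
                d = p - mu
                a = (1.0 + np.outer(d, d) / (var + eps / 9.0)) / 9.0
                m = np.eye(9) - a              # window's contribution to L
                rows.extend(np.repeat(win, 9))
                cols.extend(np.tile(win, 9))
                vals.extend(m.ravel())
        return sp.csr_matrix((vals, (rows, cols)), shape=(n, n))

    def solve_alpha(I, scribble_mask, scribble_vals, lam=100.0):
        """scribble_mask: 1 where the user marked a pixel; scribble_vals: 0 (bg) or 1 (fg)."""
        L = matting_laplacian_gray(I)
        D = sp.diags(scribble_mask.ravel().astype(float))
        b = lam * (scribble_mask * scribble_vals).ravel()
        alpha = spsolve((L + lam * D).tocsc(), b)
        return np.clip(alpha, 0.0, 1.0).reshape(I.shape)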
|
3D reconstruction
by example
|
Ronen Basri,
Ira Kemelmacher and Tal Hassner – Weizmann
|
Assume we have knowledge of the shape and appearance of several example
3D objects, all from a specific class. How can this knowledge be used to
recover the 3D shape of a novel object of the same class from a single view?
In this talk we try to answer this question by presenting two new
methods for 3D reconstruction from single images. The first method uses
examples of feasible appearance-to-depth mappings, extracted from a
pre-collected set of 3D objects, to produce reconstructions of novel objects
belonging to a variety of classes (e.g., hands, human figures). An on-the-fly
example synthesis scheme is used to cope with unknown viewing conditions and
large example sets. The second method focuses on human faces, allowing 3D
reconstruction using only a single reference model of a different person's
face. Assuming Lambertian reflectance and rough
alignment of the input image to the reference model, this method seeks shape,
albedo, and lighting that best fit the image while
preserving the rough structure of the model. We demonstrate this approach by
providing accurate reconstructions of novel faces overcoming significant
differences in shape due to gender, race, and facial expressions.
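One concrete ingredient of the face method is the Lambertian lighting fit: given rough alignment, the reference model's normals and albedo let us recover low-order lighting by plain least squares. The sketch below uses a first-order spherical-harmonics basis (constant factors dropped) and illustrative variable names; the full method also refines shape and albedo, which is not shown here.

    # Recover low-order lighting from an image and a rough reference model.
    import numpy as np

    def lighting_basis(normals, albedo):
        """First-order spherical-harmonics basis [1, nx, ny, nz], scaled by albedo."""
        return albedo[:, None] * np.hstack([np.ones((len(normals), 1)), normals])

    def fit_lighting(image, normals, albedo):
        """image: (n,) intensities; normals: (n, 3) unit normals; albedo: (n,)."""
        B = lighting_basis(normals, albedo)
        coeffs, *_ = np.linalg.lstsq(B, image, rcond=None)
        return coeffs                          # 4 lighting coefficients

    def render(normals, albedo, coeffs):
        return lighting_basis(normals, albedo) @ coeffs   # image under fitted light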
|
Learning to
classify by ongoing feature selection
|
Dan Levi and Shimon Ullman - Weizmann
|
Existing classification algorithms use a set of training
examples to select classification features, which are then used for all
future applications of the classifier. A major problem with this approach is
the selection of a training set: a small set will result in reduced
performance, and a large set will require extensive training. In this
paper we propose a solution to this problem by developing an on-line feature
selection method, which continuously modifies and improves the features used
for classification based on the examples provided so far. The method is used
for learning a new class, and to continuously improve classification
performance as new data becomes available. In ongoing learning, examples are
continuously presented to the system, and new features arise from these
examples. The method continuously measures the value of the selected features
using mutual information, and uses these values to efficiently update the set
of selected features when new training information becomes available. The problem
is challenging because at each stage the training process uses a small subset
of the training data. Surprisingly, with sufficient training data the on-line
process reaches the same performance as a scheme that has complete access
to the entire training data.
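A minimal sketch of the bookkeeping behind such ongoing selection (illustrative; it shows the mutual-information ranking, not the authors' full update scheme): keep running co-occurrence counts of each binary feature with the class label, and re-rank the features as examples stream in.

    # Online mutual-information ranking of binary features.
    import numpy as np

    class OnlineFeatureSelector:
        def __init__(self, n_features, k):
            self.counts = np.ones((n_features, 2, 2))    # Laplace-smoothed counts
            self.k = k

        def update(self, feats, label):
            """feats: binary vector of feature detections; label: 0 or 1."""
            for f, v in enumerate(feats):
                self.counts[f, int(v), label] += 1

        def selected(self):
            p = self.counts / self.counts.sum(axis=(1, 2), keepdims=True)
            pf = p.sum(axis=2, keepdims=True)            # P(feature value)
            pc = p.sum(axis=1, keepdims=True)            # P(class)
            mi = (p * np.log(p / (pf * pc))).sum(axis=(1, 2))
            return np.argsort(mi)[::-1][:self.k]         # indices of the top-k features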
|
A Computational Adaptation Model: Predictions for First- and Second-Order
Color Induction, with an Algorithm for Color Constancy and Color Contrast
|
Yuval Barkan and Hedva Spitzer - TAU
|
The appearance of a patch of color or its contrast
depends not only on the stimulus itself but also on the surrounding stimuli
(induction effects – simultaneous contrast). A comprehensive computational
physiological model is presented to describe chromatic adaptation of the
first (retinal) and second (cortical) orders, to predict the different
chromatic induction effects, and at the same time to serve as an algorithm.
We propose that the chromatic induction of the first
order that yields perceived complementary colors can be predicted by retinal
adaptation mechanisms, contrary to previous suggestions. The second order of
the proposed adaptation mechanism succeeds in predicting the perceived
inhibition or facilitation of the central contrast of a texture stimulus,
depending on the surrounding contrast. Furthermore, contrary to other models,
this model is also able to predict the effect of a variegated surround on
the perceived central color. We will show that this model predicts the
above effects and, at the same time, can perform color constancy and color
contrast.
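A toy sketch of the first-order (retinal-type) mechanism, per opponent channel: the local response is gain-controlled by a remote surround average, pushing the appearance toward the complementary color. The Gaussian scales and gain below are illustrative, not the model's fitted values.

    # Divisive center/remote-surround adaptation for one opponent channel.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def adapt_first_order(channel, center_sigma=1.0, surround_sigma=15.0, c=0.8):
        center = gaussian_filter(channel, center_sigma)
        surround = gaussian_filter(channel, surround_sigma)  # remote adapting area
        return center / (1.0 + c * surround)                 # divisive gain control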
|
Certainty and
Uncertainty in Filter Bank Design Methodology
|
Chen Sagiv, Nir Sochen and Yehoshua Zeevi - U. of Bremen,
TAU, Technion
|
Signal and image processing tasks often call for the
implementation of filter banks. A key issue then is the selection or design
of the "best" filter bank for the application at hand. We address
two fundamental aspects of this issue. The first is concerned with the choice
of the generating function, the so-called 'mother wavelet'. The second aspect
of design of the filter bank is concerned with the tessellation of the
combined position-frequency space with the family of functions obtained
by the action of a proper group on the generating function.
A proper criterion for selecting the generating function is minimal
entropy, i.e., minimal uncertainty in the resolution along the canonical
variables that define the combined space. Gabor functions are
the minimizers of the information uncertainty in
the case of the set of functions obtained by the action of the Weyl-Heisenberg group, whereas Gabor
wavelets are generated by the action of an affine group. A natural and
interesting question is whether the Gabor wavelets
are also the minimizers of the position-scale
uncertainty related to the $SIM(2)$ or the affine group in $2D$. We address
this issue using the affine Weyl-Heisenberg group,
which accounts for spatial and frequency translations as well as for spatial
dilations. Restricting ourselves to dilations that are inversely related to
frequency translations, we obtain a framework closely related to Gabor
wavelets. In this work we obtain possible minimizers for more generalized
uncertainty relations
which are related to the affine Weyl-Heisenberg
group or its sub-groups, and provide insight into their significance in
image processing applications.
Proceeding from local to global issues, we consider the optimal tessellation
of the position-frequency space. There are many possible tessellations of the
combined space, and quantitative guidelines to obtain a "good"
pavement are called for. The vast research on wavelets provides a mechanism
to check whether a selected bank of filters constitutes a frame. In this
work, we generalize these ideas and offer possible pavements of the frequency
plane using Gabor functions, which have the
interesting property that the number of orientations required depends on the
scale. We also calculate the frame bounds and check the usefulness of this
theoretical derivation with respect to analysis and
synthesis of images.
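To illustrate the pavement idea, the sketch below builds a bank of complex Gabor filters directly in the frequency domain, letting the number of orientations grow with scale, and evaluates crude frame bounds as the extrema of the summed squared spectra. All constants are illustrative choices.

    # Gabor pavement of the frequency plane with scale-dependent orientations.
    import numpy as np

    def gabor_bank_spectrum(size=128, n_scales=4, base_orients=4, sigma=0.15):
        fx, fy = np.meshgrid(np.fft.fftfreq(size), np.fft.fftfreq(size))
        total = np.zeros((size, size))
        for s in range(n_scales):
            f0 = 0.05 * 2.0 ** s                 # center frequency of this scale
            n_theta = base_orients * 2 ** s      # more orientations at finer scales
            for t in range(n_theta):
                th = np.pi * t / n_theta
                ux, uy = f0 * np.cos(th), f0 * np.sin(th)
                g = np.exp(-((fx - ux)**2 + (fy - uy)**2) / (2 * (sigma * f0)**2))
                total += g**2                    # |g_hat|^2 of one filter
        A, B = total.min(), total.max()          # crude frame bounds of the bank
        return total, (A, B)

A tight ratio B/A indicates a near-uniform covering of the frequency plane; gaps (A close to zero, e.g., near DC here) signal frequencies the bank cannot stably reconstruct.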
|
Integrated
Segmentation and Classification Approach Applied to
Multiple Sclerosis Analysis
|
Ayelet Akselrod-Ballin,
Meirav Galun, Ronen Basri, and Achi Brandt - Weizmann
|
We present a novel multiscale
approach that combines segmentation with classification to detect abnormal
brain structures in medical imagery, and demonstrate its utility in detecting
multiple sclerosis lesions in 3D MRI data. Our method uses segmentation to
obtain a hierarchical decomposition of a multi-channel, anisotropic MRI scan.
It then produces a rich set of features describing the segments in terms of
intensity, shape, location, and neighborhood relations. These features are
then fed into a decision tree-based classifier, trained with data labeled by
experts, enabling the detection of lesions at all scales. Unlike common
approaches that use voxel-by-voxel analysis, our
system can utilize regional properties that are often important for
characterizing abnormal brain structures. We provide experiments showing
successful detections of lesions in both simulated and real MR images.
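The classification stage can be pictured with a short sketch: each segment of the hierarchical decomposition is summarized by a feature vector and fed to a decision tree trained on expert-labeled data. The specific feature names below are illustrative placeholders for the intensity, shape, location, and neighborhood measurements described above.

    # Train a decision tree on per-segment features (illustrative names).
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def segment_features(seg):
        """seg: dict of per-segment measurements."""
        return [seg["mean_intensity"], seg["intensity_var"],
                seg["volume"], seg["elongation"],
                seg["dist_to_ventricles"], seg["neighbor_contrast"]]

    def train_lesion_classifier(segments, labels):
        X = np.array([segment_features(s) for s in segments])
        clf = DecisionTreeClassifier(max_depth=8)
        return clf.fit(X, np.array(labels))    # labels: expert 0/1 per segment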
|
Feature
Hierarchies for Object Classification
|
Boris Epshtein and Shimon Ullman -
Weizmann
|
In a number of recent classification schemes, image
patches, or fragments, are successfully used as informative features for
object detection and classification. The fragments are typically chosen
during a training stage, according to the amount of information they deliver
about the class being recognized. To perform classification, the features are
searched for in the input images, followed by a decision stage based on the
detected features. In the present
work, we extend this approach by decomposing the fragments into sub-fragments
in a hierarchical fashion. The presence of a fragment is then detected by the
presence of its components. For example, a fragment depicting the eye region
can be subdivided into sub-fragments depicting different eye parts, an eyebrow, etc.,
which are detected separately. A higher-level fragment can use different
types of sub-features (e.g., different eyebrows), and allow a controlled
amount of independent movement and other deformations of the sub-parts.
There are two motivations for such a hierarchical
construction. The first is dealing with variability during the matching
process: the same motivation that applies to the classification of complete
objects by image fragments, applies also to the detection of object parts. As
we will demonstrate, the decomposition by our method increases the amount of
information delivered by the fragments, improves the detection rate, and
increases the tolerance for local distortions and illumination changes. The
second motivation comes from the structure of the primate visual system, in
which objects are represented in a hierarchy of increasingly complex
features, ranging from local oriented features in V1, to complex shapes and
partial or complete object views at high visual areas.
I will describe an algorithm for automatically constructing a full feature
hierarchy, which proceeds by learning from examples, in a top-down manner.
The highest-level, class-specific fragments are extracted first and then
subdivided into simpler sub-fragments in a recursive manner. The
decomposition at each level is performed by the application of the same
algorithm, guided by the maximization of mutual information. The subdivision
stops when the mutual information cannot be improved by further
decomposition, producing a set of simple, atomic features. For
different natural classes, the depth of the feature hierarchy is typically up
to four levels. In addition to the decomposition of a feature into its
sub-parts, the method also determines the optimal parameters, such as detection
thresholds and the amount of allowed displacements of the sub-features.
Finally, we demonstrate that fragments selected at the higher levels of the
hierarchy are typically complex and class-specific, while the lower-level
fragments are simple and often shared between different object classes.
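The stopping rule can be made concrete with a small sketch, under the simplifying assumption of binary detection vectors (one entry per training image): a fragment is decomposed only if detecting it through its sub-fragments raises the mutual information with the class label. The combination rule (at least min_parts sub-fragments present) is an illustrative choice.

    # Mutual-information test for decomposing a fragment into sub-fragments.
    import numpy as np

    def mutual_info(detections, labels):
        mi = 0.0
        for d in (0, 1):
            for c in (0, 1):
                p = np.mean((detections == d) & (labels == c)) + 1e-12
                px, py = np.mean(detections == d), np.mean(labels == c)
                mi += p * np.log(p / (px * py + 1e-12))
        return mi

    def should_decompose(frag_det, subfrag_dets, labels, min_parts=2):
        """Decompose if the combined sub-fragment detector is more informative."""
        combined = (subfrag_dets.sum(axis=0) >= min_parts).astype(int)
        return mutual_info(combined, labels) > mutual_info(frag_det, labels)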
|
3D Shape Recovery
of Smooth Surfaces: Dropping the Fixed Viewpoint Assumption
|
Yael Moses and Ilan
Shimshoni – IDC and Haifa
|
We present a new method for recovering the 3D
shape of a featureless smooth surface from three or more calibrated
images. The main contribution of this paper is the ability to handle general
images which are taken from unconstrained viewpoints and unconstrained
illumination directions. To the best of our knowledge, no other method
is currently capable of handling such images, since correspondence between
such images is hard to compute. Our method combines geometric and photometric
information in order to recover a dense correspondence between the images and
successfully computes an accurate 3D shape of the surface. The method
is based on a single pass of local computations and does not make use of
global optimization over the whole surface. While we assume a Lambertian reflectance function, our method can be easily
modified to handle more general reflectance models as long as it is possible
to recover local normals from photometric
information.
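The local photometric step can be sketched in a few lines: with Lambertian reflectance and three or more known light directions, the scaled normal at a point is the least-squares solution of I = L n (classic photometric-stereo reasoning; the talk's method interleaves such local estimates with geometric correspondence across unconstrained views, which is not shown here).

    # Recover a surface normal and albedo from Lambertian intensities.
    import numpy as np

    def recover_normal(intensities, lights):
        """intensities: (m,) measurements; lights: (m, 3) unit light directions."""
        n, *_ = np.linalg.lstsq(lights, intensities, rcond=None)
        albedo = np.linalg.norm(n)
        return n / albedo, albedo              # unit normal and Lambertian albedo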
|