The Vision Day will be held on

December 28, 2003!




Computer Vision Day

The Interdisciplinary Center Herzliya




Speaker and Collaborators






Daphna Weinshall, Aharon Bar-Hillel, Tomer Hertz, Noam Shental



Learning Distance Functions for Image Retrieval by Query


Nir Sochen, Chen Sagiv, Daniel Cremers, Christoph Schnoerr


Segmentation: From Descartes to Kant
   or does it exist when we don't look at it?


Nachum Kiryati, Leah Bar, Nir Sochen


Variational Pairing of Image Segmentation and Blind Restoration


Coffee Break


Yael Moses, Shai Avidan and Yoram Moses


Multi-view correspondence in a distributed setting


Michal Irani and Bernard Sarel


Separating Transparent Layers through Layer Information Exchange


Lunch break


Micha Lindenbaum, E. Engbers and A. Smeulders


An Information-based Measure for Grouping Quality


Amnon Shashua and Lior Wolf


A New Paradigm for Feature Selection with some surprising results


Ronen Basri, Lena Gorelick, Meirav Galun, Eitan Sharon, Achi Brandt


Shape Representation and Classification Using the Poisson Equation


Coffee Break


Yoav Schechner, Shree Nayar and Peter Belhumeur


Codes for Multiplexing Images and Lighting


Eero Simoncelli


Wavelet-domain Gaussian Scale Mixture Models for Images




General: It has been several years since the annual Vision meetings held at TAU came to an end.

We would like to reinstate this tradition, hopefully on an annual basis. This year IDC is happy to host the first Israeli Computer Vision Day. We hope it will be an academically fruitful and pleasant conference.



Location and Directions: The Vision Day will take place at the Interdisciplinary Center (IDC), Herzliya, in the Ivtzer Auditorium. For driving instructions, see the map.

A convenient way to arrive is by train; see the time schedule here. Get off at the Herzliya station and order a taxi by phone. Two taxi companies provide this service: Moniyot Av-Yam (09 9501263 or 09 9563111) and Moniyot Pituach (09 9582288 or 09 9588001). The fare for a taxi ride from the railway station to IDC is around 20 NIS.






Wavelet-domain Gaussian Scale Mixture Models for Images
Eero P. Simoncelli - New York University

I'll describe some of our recent work in modeling the joint statistics of images when represented in a multiscale oriented basis. Local neighborhoods of coefficients, associated with basis functions at nearby positions, scales and orientations, may be described as arising from the product of a Gaussian random vector and a hidden scalar variable. This model captures a number of non-Gaussian behaviors that are typical of natural images, and it also proves to be quite tractable in applications such as denoising.
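As a rough illustration of the model (not the exact prior from the talk; the lognormal multiplier and all parameters below are assumptions), a coefficient can be simulated as a Gaussian sample scaled by the square root of a hidden positive variable, and its marginal comes out markedly heavy-tailed:

```python
import math
import random

random.seed(0)

# Gaussian scale mixture: a coefficient is c = sqrt(z) * u, where u is
# Gaussian and z is a hidden positive scalar. The lognormal choice for
# z, and all parameters here, are illustrative assumptions.
def gsm_sample(n):
    out = []
    for _ in range(n):
        z = math.exp(random.gauss(0.0, 1.0))   # hidden scalar multiplier
        u = random.gauss(0.0, 1.0)             # Gaussian component
        out.append(math.sqrt(z) * u)
    return out

def excess_kurtosis(xs):
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 / (m2 ** 2) - 3.0

samples = gsm_sample(50000)

# The marginal is heavy-tailed: excess kurtosis is well above the
# Gaussian value of 0, as observed for natural-image wavelet coefficients.
print(excess_kurtosis(samples))
```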


Colorimetric Imaging with Minimum Noise

Hagit Hel-Or, Ulrich Barnhoefer, Brian Wandell - U. of Haifa


Images acquired by digital cameras are affected by sensor noise and by color deviations. It can be shown that there is a tradeoff between these two image artifacts which is dependent on the sensor sensitivities and the ensuing transformations applied to the sensor outputs.

We study the sensors associated with a special class of cameras, namely Colorimetric Cameras. These sensors are capable of capturing colors that span the human visual subspace, and are thus able to produce zero ΔE error, which in turn implies no color deviations in the captured image. We find the optimal sensors in this set with respect to minimizing sensor noise.

It was found that two of the three optimal sensors are surprisingly similar to the CIE-X and CIE-Z color matching functions. We show that this is not accidental. In fact, in choosing the XYZ primaries, the CIE unknowingly chose those whose matching functions (when considered as sensors) minimize sensor noise. We show that the criteria set by the CIE in choosing the XYZ primaries, imply minimization of sensor noise within our camera model.


Segmentation: From Descartes to Kant or does it exist when we don't look at it?

Nir Sochen, Chen Sagiv, Daniel Cremers, Christoph Schnoerr - Tel-Aviv U.


The partition of the image domain into significant regions is a long-standing problem in computer vision. I will review several approaches that parallel various philosophical schools. This will lead to a formalism that puts low-level and high-level vision on the same footing. In particular, segmentation with shape prior(s) and dynamic labeling will be demonstrated.



Variational Pairing of Image Segmentation and Blind Restoration
Leah Bar, Nir Sochen, Nachum Kiryati - Tel-Aviv U.
Segmentation and blind restoration are both classical problems that are known to be difficult and have attracted major research efforts. This paper shows that these problems are tightly coupled and can be successfully solved together.


Multi-view correspondence in a distributed setting

Yael Moses, Shai Avidan, and Yoram Moses - The Interdisciplinary Center


We present a probabilistic algorithm for finding correspondences across
multiple images. The algorithm runs in a distributed setting, where each camera is attached to a separate computing unit, and the cameras communicate over a network.  No central computer is involved in the computation. The algorithm runs with low computational and communication cost.  Our distributed algorithm assumes access to a standard pairwise wide-baseline stereo matching algorithm (WBS) and our goal is to minimize the number of images transmitted over the network, as well as the number of times the WBS is computed.  We employ the theory of random graphs to provide an efficient probabilistic algorithm that performs WBS on a small number of image pairs, followed by a correspondence propagation phase.  The heart of the work is a theoretical analysis of the number of times WBS must be performed to ensure that an overwhelming portion of the correspondence information is extracted.  The analysis is extended to show how to combat computer and communication failures, which are expected to occur in such settings, as well as correspondence misses.  This analysis yields an efficient distributed algorithm, but it can also be used to improve the performance of centralized algorithms for correspondence.



Separating Transparent Layers through Layer Information Exchange
Michal Irani and Bernard Sarel - Weizmann Inst.


We present an approach for separating two transparent layers in images and video sequences. Given two initial unknown physical mixtures (I1 and I2) of real scene layers (L1 and L2), we seek a layer separation which minimizes the structural correlations across the two layers at every image point. Such a separation is achieved by transferring local structure from one image to the other wherever it is highly correlated with the underlying local structure in the other image, and vice versa. This bi-directional transfer operation, which we call the "layer information exchange", is performed on diminishing window sizes, from global image windows (i.e., the entire image) down to local image windows, thus detecting similar structures at varying scales across pixel positions.

We show the applicability of this approach to various real-world scenarios, including image and video transparency separation. In particular, we show that this approach can be used for separating transparent layers in images obtained under different polarizations, as well as for separating complex non-rigid transparent motions in video sequences. These separations can be done without prior knowledge of the layer mixing model (simple additive, alpha-matted composition with an unknown alpha-map, or other), and under unknown complex temporal changes (e.g., unknown varying lighting conditions).


An Information-based Measure for Grouping Quality

Micha Lindenbaum, E. Engbers and A. Smeulders - Technion

Grouping is an essential process in computer vision. However, evaluation of grouping results is not straightforward and is often heuristic. We propose a method for measuring grouping quality, based on the following observation: a better grouping result provides more information about the true, unknown grouping.

The amount of information is evaluated by the number of queries required to specify the true grouping. An automatic procedure, relying on the given hypothesized grouping, generates (homogeneity) queries about the true grouping and answers them using an oracle. The process terminates once the queries suffice to specify the true grouping. The number of queries is a measure of the hypothesis non-informativeness.

A related measure of informativeness is the uncertainty of the true grouping, characterized using a probabilistic model and common information theory terms such as surprise and entropy. This relation between the measures is established and experimentally supported. The proposed method suggests two main innovations and advantages relative to existing approaches:

Generality and fairness - Most previous similarity-based measures involve unavoidably arbitrary choices. The proposed information-based quality measure is free from such arbitrary choices, treats different types of grouping errors in a uniform way and does not favor any algorithm.
Non-heuristic justification - Unlike previous approaches, the number of queries may be interpreted as a surprise in an information theory context.  The query count was found to be approximately monotonic in the entropy, independent of the grouping error type, indicating that this interpretation is valid and that the query count is an adequate unbiased means for comparing grouping results.

Moreover, we found that the query count measure approximates human judgment better than other methods and as such, gives better results when used to optimize a segmentation algorithm, as demonstrated in our experiments.



A New Paradigm for Feature Selection with some surprising results

Amnon Shashua and Lior Wolf - Hebrew U.


The problem of selecting a subset of relevant features in a potentially overwhelming quantity of data is classic and found in many branches of science. Examples in computer vision, text processing and, more recently, bio-informatics are abundant. In text classification tasks, for example, it is not uncommon to have 10^4 to 10^7 features (one per vocabulary word, holding word frequency counts), with the expectation that only a small fraction of them are relevant. Typical examples include the automatic sorting of URLs into a web directory and the detection of spam email.

In this work we present a definition of "relevancy" based on spectral properties of the Laplacian of the features' measurement matrix. The feature selection process is then based on a continuous ranking of the features defined by a least-squares optimization process. A remarkable property of the feature relevance function is that sparse solutions for the ranking values naturally emerge as a result of a "biased non-negativity" of a key matrix in the process. As a result, a simple least-squares optimization process converges onto a sparse solution, i.e., a selection of a subset of features which form a local maximum over the relevance function. The feature selection algorithm can be embedded in both unsupervised and supervised inference problems, and empirical evidence shows that the feature selections typically achieve high accuracy even when only a small fraction of the features are relevant.



Shape Representation and Classification Using the Poisson Equation
Ronen Basri, Lena Gorelick, Meirav Galun, Eitan Sharon, Achi Brandt - Weizmann Inst.

Silhouettes contain rich information about the shape of objects that can be used for recognition and classification. We present a novel approach that allows us to reliably compute many useful properties of a silhouette. Our approach assigns to every internal point of the silhouette a value reflecting the mean time required for a random walk beginning at that point to hit the boundaries. This function can be computed by solving Poisson's equation, with the silhouette contours providing the boundary conditions.

We show how this function can be used to reliably extract various shape properties, including part structure and rough skeleton, local orientation and aspect ratio of different parts, and convex and concave sections of the boundaries. In addition, we discuss properties of the solution and show how to compute it efficiently using multigrid algorithms. We demonstrate the utility of the extracted properties by using them for shape classification, and show how they can be incorporated in a hierarchical segmentation scheme to enforce smooth continuation of segment boundaries.
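The discrete version of this construction can be sketched in a few lines: on a hypothetical rectangular silhouette, the mean hitting time U of a lattice random walk satisfies U = 1 + (average of the four neighbors) inside, with U = 0 on the boundary, and can be computed by plain relaxation (the multigrid solvers mentioned in the abstract would be far faster; the grid size and iteration count here are arbitrary):

```python
# Mean first-hitting time of the boundary for a lattice random walk
# inside a silhouette, i.e. a discrete Poisson equation with zero
# boundary conditions, solved by Jacobi-style relaxation.
W, H = 15, 7                      # hypothetical rectangular silhouette
inside = [[0 < x < W - 1 and 0 < y < H - 1 for x in range(W)]
          for y in range(H)]
U = [[0.0] * W for _ in range(H)]

for _ in range(2000):             # enough sweeps to converge on this grid
    newU = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            if inside[y][x]:
                newU[y][x] = 1.0 + 0.25 * (U[y][x - 1] + U[y][x + 1]
                                           + U[y - 1][x] + U[y + 1][x])
    U = newU

# The solution peaks in the interior and vanishes at the boundary,
# which is what makes it useful for skeletons and part structure.
center = U[H // 2][W // 2]
near_edge = U[1][1]
print(center, near_edge)
```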

Codes for Multiplexing Images and Lighting
Yoav Schechner, Shree Nayar and Peter Belhumeur - Technion

Imaging of objects under variable lighting directions is an important and frequent practice in computer vision and image-based rendering. We introduce an approach that significantly improves the quality of such images, practically at no cost.

Traditional methods for acquiring images under variable illumination directions use only a single light source per acquired image. In contrast, our approach is based on a multiplexing principle, in which multiple light sources illuminate the object simultaneously from different directions. Thus, the object irradiance is much higher. The acquired images are then computationally demultiplexed.

We give the optimal code by which the illumination should be multiplexed to obtain the highest quality output. We then demonstrate its utility in experiments using high directional resolution lighting. The mathematical principle behind this approach is useful in other domains of imaging, unrelated to the regime of illumination.
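The multiplex/demultiplex principle can be illustrated with a small Hadamard-type S-matrix code (a standard choice for such problems; whether it coincides with the talk's optimal code is not claimed here, and the single-pixel "images" are a toy assumption):

```python
# S-matrix multiplexing for 3 light sources: each row says which
# sources are on in one acquired frame (two at a time, so the object
# irradiance per frame is roughly doubled).
S = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1]]
# Known inverse of this S-matrix, used for computational demultiplexing.
S_inv = [[0.5, 0.5, -0.5],
         [0.5, -0.5, 0.5],
         [-0.5, 0.5, 0.5]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v)))
            for i in range(len(M))]

# True single-source intensities (one pixel each, for brevity).
x = [3.0, 7.0, 2.0]

# Acquisition: each frame records the sum of the sources that are on.
measurements = matvec(S, x)

# Demultiplexing recovers the single-source images exactly.
recovered = matvec(S_inv, measurements)
print(recovered)  # [3.0, 7.0, 2.0]
```

The noise advantage comes from each frame collecting light from several sources at once, while the linear inversion redistributes (rather than amplifies) the sensor noise across the demultiplexed images.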


Learning Distance Functions for Image Retrieval by Query
Daphna Weinshall, Aharon Bar-Hillel, Tomer Hertz, Noam Shental - Hebrew U.

Image retrieval critically relies on the distance function used to compare the query image to the images in the database. We suggest learning such distance functions by training binary classifiers with margins, where the classifiers are defined over the product space of pairs of images. The classifiers are trained to distinguish whether two points come from the same class, and their signed margin is used as a distance function. We explored several variants of this idea, based on using SVM and boosting algorithms as product-space classifiers. Our main contribution is a distance learning method which combines boosting hypotheses over the product space with a weak learner based on partitioning the original feature space. I will show comparative results of image retrieval in a distributed learning paradigm, using two databases: a large database of facial images (YaleB) and a database of natural images taken from a commercial CD. In both cases our combined boosting method outperforms all other methods, and its generalization to unseen classes is superior.
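A toy rendering of the product-space idea (1-D descriptors, a single threshold classifier, and synthetic data are all illustrative assumptions; the actual method uses SVM and boosting over real image features): encode a pair by the difference of its descriptors, train a classifier to predict "same class", and read the signed margin as a learned distance.

```python
import random

random.seed(0)

# Synthetic pairs: "same class" pairs have nearly identical descriptors,
# "different class" pairs are drawn far apart.
def make_pair(same):
    a = random.gauss(0, 1)
    b = a + random.gauss(0, 0.1) if same else random.gauss(5, 1)
    return abs(a - b), same

data = [make_pair(i % 2 == 0) for i in range(200)]

# Train the simplest possible product-space classifier: a threshold on
# the pair encoding, chosen to maximize training accuracy.
best_t, best_acc = 0.0, 0.0
for t, _ in data:
    acc = sum((d < t) == s for d, s in data) / len(data)
    if acc > best_acc:
        best_t, best_acc = t, acc

def learned_distance(a, b):
    # Signed margin of the classifier: negative for pairs it calls
    # "same class", positive for "different class".
    return abs(a - b) - best_t

print(best_acc, learned_distance(0.0, 0.05), learned_distance(0.0, 5.0))
```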