Israel Vision Day 2009

2009 Israel Computer Vision Day
Sunday, December 13, 2009

The Efi Arazi School of Computer Science

I.D.C. Herzliya

Supported by GM - Advanced Technical Center - Israel

Previous Vision Days Web Page: 2003, 2004, 2005, 2006, 2007, 2008.

Vision Day Schedule

Time	Speaker and Collaborators	Affiliation	Title
09:00-09:30	Gathering
09:30-09:50	Yael Pritch, Eitam Kav-Venaki, Shmuel Peleg	HUJI	Shift-Map Image Editing
09:50-10:10	Fredo Durand	MIT	Fourier to the rescue of Photography and Image Synthesis
10:10-10:30	Eli Shechtman, Connelly Barnes, Adam Finkelstein, Dan Goldman	Adobe, Princeton	PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing
10:30-10:50	Daniel Glasner, Shai Bagon, Michal Irani	Weizmann	Super-Resolution From a Single Image
10:50-11:30	Coffee Break
11:30-11:50	Hadas Kogan, Ron Maurer, Renato Keshet	HP	Vanishing Points Estimation by Self-Similarity
11:50-12:10	Arik Shamir, Tao Chen, Ming-Ming Cheng, Shi-Min Hu, Ping Tan	IDC, Tsinghua U., U. of Singapore	Sketch2Photo: Internet Image Montage
12:10-12:30	Ofir Pele, Michael Werman	HUJI	Fast and Robust Earth Mover's Distances
12:30-12:50	Anat Levin, Yair Weiss, Fredo Durand, Bill Freeman	Weizmann, HUJI, MIT	Understanding and evaluating blind deconvolution algorithms
12:50-14:30	Lunch
14:30-14:50	Tal Hassner, Lior Wolf, Yaniv Taigman	OpenU, TAU, face.com	Multiple One-Shots for Utilizing Class Label Information
14:50-15:10	Lior Wolf, Rotem Littman, Naama Mayer, Nachum Dershowitz, R. Shweka, Y. Choueka	TAU, Genizah Project	Automatically Identifying Join Candidates in the Cairo Genizah
15:10-15:30	Simon Polak, Amnon Shashua	HUJI	A system for recognizing instances from a large set of object classes using a novel shape model
15:30-15:50	Sefy Kagarlitsky, Yael Moses, Yacov Hel-Or	IDC	Piecewise-consistent Color Mappings of Images Acquired Under Various Conditions
15:50-16:10	Coffee Break
16:10-16:30	Michael Kolomenkin, Ilan Shimshoni, Ayellet Tal	Haifa, Technion	On Edge Detection on Surfaces
16:30-16:50	Adrian Stern, Ofer Levi	BGU	Optically Compressed Sensing by Undersampling the Polar Fourier Plane
16:50-17:10	Raanan Fattal	HUJI	Edge-Avoiding Wavelets and their Applications
17:10-17:30	Daniel Freedman, Zachi Karni, Craig Gotsman	HP, Technion	Energy-Based Shape Deformation

General: This is the seventh Israel Computer Vision Day. It will be hosted at IDC.

For more details, or to be added to the mailing list, please contact:

hagit@cs.haifa.ac.il toky@idc.ac.il

Location and Directions: The Vision Day will take place at the Interdisciplinary Center (IDC), Herzliya, in the Ivtzer Auditorium. For driving instructions see map.

A convenient option is to arrive by train, see time schedule here. Get off at the Herzliya Station, and order a taxi ride by phone. There are two taxi stations that provide this service: Moniyot Av-Yam (09 9501263 or 09 9563111), and Moniyot Pituach (09 9582288 or 09 9588001).

Abstracts

Shift-Map Image Editing

Yael Pritch, Eitam Kav-Venaki, Shmuel Peleg - HUJI

Geometric rearrangement of images includes operations such as image retargeting, object removal, or object rearrangement. Each such operation can be characterized by a shift-map: the relative shift of every pixel in the output image from its source in an input image.

We describe a new representation of these operations as an optimal graph labeling, where the shift-map represents the selected label for each output pixel. Two terms are used in computing the optimal shift-map: (i) A data term which indicates constraints such as the change in image size, object rearrangement, a possible saliency map, etc. (ii) A smoothness term, minimizing the new discontinuities in the output image caused by discontinuities in the shift-map.

This graph labeling problem can be solved using graph cuts. Since the optimization is global and discrete, it outperforms state-of-the-art methods in most cases. Efficient hierarchical solutions for graph cuts are presented, and operations on 1M images can take only a few seconds.
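
To make the shift-map definition above concrete, here is a minimal Python sketch of how a given shift-map produces an output image. It does not perform the graph-cut optimization described in the abstract; the function names, the list-of-lists image layout, and the simple discontinuity count (a crude stand-in for the smoothness term) are illustrative assumptions.

```python
def apply_shift_map(image, shift_map):
    """Build the output image: output pixel (r, c) is copied from
    input pixel (r + dy, c + dx), where (dy, dx) = shift_map[r][c]."""
    out = []
    for r, shift_row in enumerate(shift_map):
        out_row = []
        for c, (dy, dx) in enumerate(shift_row):
            out_row.append(image[r + dy][c + dx])
        out.append(out_row)
    return out


def count_discontinuities(shift_map):
    """Count horizontally or vertically adjacent output pixels whose
    shifts differ (assumed proxy for a smoothness cost)."""
    count = 0
    for r, row in enumerate(shift_map):
        for c, shift in enumerate(row):
            if c + 1 < len(row) and row[c + 1] != shift:
                count += 1
            if r + 1 < len(shift_map) and shift_map[r + 1][c] != shift:
                count += 1
    return count


# Retarget a 1x6 row to 1x4 by skipping the two middle columns.
image = [[10, 20, 30, 40, 50, 60]]
shift_map = [[(0, 0), (0, 0), (0, 2), (0, 2)]]
retargeted = apply_shift_map(image, shift_map)
seams = count_discontinuities(shift_map)
```

In this toy case the shift-map has a single discontinuity, which corresponds to one seam in the retargeted row.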

Fourier to the rescue of Photography and Image Synthesis

Fredo Durand - MIT

New analysis of light transport and image formation enables novel imaging strategies that reduce motion blur and depth of field as well as acceleration algorithms for computer graphics.

We analyze phenomena such as light transport in a scene, integration during the shutter interval and defocus blur with the Fourier transform of the domain of light rays and space-time. For imaging applications, this offers both new theoretical insights on upper bounds of achievable sharpness and signal-to-noise ratio in the presence of motion blur and depth of field as well as new lens and camera designs that, combined with computation, can reduce blur. In image synthesis, similar analysis enables algorithms that use adaptive sampling and novel reconstruction to simulate effects such as motion blur and depth of field with dramatic speedups.

PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing

Eli Shechtman, Connelly Barnes, Adam Finkelstein, Dan Goldman - Adobe, Princeton

We present a new randomized algorithm for quickly finding approximate nearest neighbor matches between image patches for interactive image editing. Previous research in graphics and vision has leveraged such nearest-neighbor searches to provide a variety of high-level digital image editing tools. However, the cost of computing a field of such matches for an entire image has eluded previous efforts to provide interactive performance. Our algorithm offers substantial performance improvements over the previous state of the art (20-100x), enabling its use in interactive editing tools. The key insights driving the algorithm are that some good patch matches can be found via random sampling, and that natural coherence in the imagery allows us to propagate such matches quickly to surrounding areas. We offer a theoretical analysis of the convergence properties of the algorithm, as well as empirical and practical evidence for its high quality and performance. This one simple algorithm forms the basis for a variety of tools – image retargeting, completion and reshuffling – that can be used together in the context of a high-level image editing application. Finally, we provide additional intuitive constraints on the synthesis process that offer the user a level of control unavailable in previous methods.
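
The two key insights named above (random sampling finds some good matches, and coherence lets them propagate to neighbors) can be sketched on 1D signals. This is a simplified analogue written for illustration, not the authors' implementation: the 1D setting, the sum-of-squared-differences patch distance, the halving search radius, and all function and parameter names are assumptions.

```python
import random


def patch_dist(a, b, i, j, p):
    """Sum of squared differences between patches a[i:i+p] and b[j:j+p]."""
    return sum((a[i + k] - b[j + k]) ** 2 for k in range(p))


def patchmatch_1d(a, b, p=3, iters=4, seed=0):
    """Approximate nearest-neighbor field from patches of a to patches of b."""
    rng = random.Random(seed)
    na, nb = len(a) - p + 1, len(b) - p + 1
    # Random initialization of the nearest-neighbor field.
    nnf = [rng.randrange(nb) for _ in range(na)]
    cost = [patch_dist(a, b, i, nnf[i], p) for i in range(na)]

    def try_candidate(i, j):
        # Keep candidate j for patch i only if it is valid and better.
        if 0 <= j < nb:
            d = patch_dist(a, b, i, j, p)
            if d < cost[i]:
                nnf[i], cost[i] = j, d

    for it in range(iters):
        # Alternate the scan direction between iterations.
        step = 1 if it % 2 == 0 else -1
        order = range(na) if step == 1 else range(na - 1, -1, -1)
        for i in order:
            # Propagation: reuse the already-visited neighbor's match,
            # shifted by one position.
            prev = i - step
            if 0 <= prev < na:
                try_candidate(i, nnf[prev] + step)
            # Random search around the current match with a shrinking radius.
            radius = nb
            while radius >= 1:
                try_candidate(i, nnf[i] + rng.randint(-radius, radius))
                radius //= 2
    return nnf, cost
```

Because candidates are accepted only when they lower the patch distance, each iteration can only keep or improve the field produced by the random initialization.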

Super-Resolution From a Single Image

Daniel Glasner, Shai Bagon, Michal Irani - Weizmann

Methods for super-resolution (SR) can be broadly classified into two families of methods: (i) The classical multi-image super-resolution (combining images obtained at subpixel misalignments), and (ii) Example-Based super-resolution (learning correspondence between low and high resolution image patches from a database). In this work we propose a unified framework for combining these two families of methods. We further show how this combined approach can be applied to obtain super resolution from as little as a single image (with no database or prior examples). Our approach is based on the observation that patches in a natural image tend to redundantly recur many times inside the image, both within the same scale, as well as across different scales. Recurrence of patches within the same image scale (at subpixel misalignments) gives rise to the classical super-resolution, whereas recurrence of patches across different scales of the same image gives rise to example-based super-resolution. Our approach attempts to recover at each pixel its best possible resolution increase based on its patch redundancy within and across scales.

Vanishing Points Estimation by Self-Similarity

Hadas Kogan, Ron Maurer, Renato Keshet - HP

We propose a new self-similarity based approach for the problem of vanishing point estimation in man-made scenes. A vanishing point (VP) is the convergence point of a pencil (a concurrent line set), that is a perspective projection of a corresponding parallel line set in the scene. Unlike traditional VP detection that relies on extraction and grouping of individual straight lines, our approach detects entire pencils based on a property of 1D affine-similarity between parallel cross-sections of a pencil. Our approach is not limited to real pencils. Under some conditions (normally met in man-made scenes), our method can detect pencils made of virtual lines passing through similar image features, and hence can detect VPs from repeating patterns that do not contain straight edges. We demonstrate that detecting entire pencils rather than individual lines improves the detection robustness in that it improves VP detection in challenging conditions, such as very-low resolution or weak edges, and simultaneously reduces VP false-detection rate when only a small number of lines are detectable.
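
As a small worked example of the definition above, a vanishing point can be computed as the intersection of two image lines from the same pencil using homogeneous coordinates. Note that this illustrates the traditional line-based view that the abstract contrasts itself with, not the self-similarity method; the function names and sample coordinates are made up for illustration.

```python
def line_through(p, q):
    """Homogeneous line (a, b, c) with a*x + b*y + c = 0 through points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def intersect(l1, l2):
    """Intersection of two homogeneous lines, or None if they are parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    w = a1 * b2 - a2 * b1
    if w == 0:
        return None  # parallel in the image: the point lies at infinity
    x = b1 * c2 - b2 * c1
    y = c1 * a2 - c2 * a1
    return (x / w, y / w)


# Two image lines that converge at the point (2, 2).
vp = intersect(line_through((0, 0), (1, 1)), line_through((4, 0), (3, 1)))
```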

Sketch2Photo: Internet Image Montage

Arik Shamir, Tao Chen, Ming-Ming Cheng, Shi-Min Hu, Ping Tan - IDC, Tsinghua University, University of Singapore

We present a system that composes a realistic picture from a simple freehand sketch annotated with text labels. The composed picture is generated by seamlessly stitching several photographs in agreement with the sketch and text labels; these are found by searching the Internet. Although online image search generates many inappropriate results, our system is able to automatically select suitable photographs to generate a high quality composition, using a filtering scheme to exclude undesirable images. We also provide a novel image blending algorithm to allow seamless image composition. Each blending result is given a numeric score, allowing us to find an optimal combination of discovered images. Experimental results show the method is very successful; we also evaluate our system using the results from two user studies.

Fast and Robust Earth Mover's Distances

Ofir Pele, Michael Werman - HUJI

We present a new Earth Mover's Distance (EMD) variant. We show that it is a metric (unlike the original EMD, which is a metric only for normalized histograms). Moreover, it is a natural extension of the L1 metric. We propose a linear time algorithm for the computation of the EMD variant, with a robust ground distance for oriented gradients.

We also present a new algorithm for a robust family of Earth Mover's Distances - EMDs with thresholded ground distances. We compute the EMD an order of magnitude faster than the original algorithm, which makes it possible to compute the EMD on large histograms and databases. In addition, we show that EMDs with thresholded ground distances have many desirable properties. First, they correspond to the way humans perceive distances. Second, they are robust to outlier noise and quantization effects. Third, they are metrics. Finally, experimental results show that thresholding the ground distance of the EMD improves both accuracy and speed.
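
For readers unfamiliar with the terms, the following Python sketch shows two building blocks mentioned above. It is not the authors' algorithm: it computes the classical EMD between equal-mass 1D histograms with ground distance |i - j| (which reduces to the L1 distance between cumulative sums), and it builds a thresholded ground-distance matrix. The function names and the choice of |i - j| as the base distance are assumptions.

```python
from itertools import accumulate


def emd_1d(p, q):
    """Classical EMD between two equal-mass 1D histograms with ground
    distance |i - j|, computed from the cumulative sums."""
    if len(p) != len(q) or abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have equal length and equal mass")
    return sum(abs(cp - cq) for cp, cq in zip(accumulate(p), accumulate(q)))


def thresholded_ground_distance(n, t):
    """n-by-n ground-distance matrix min(|i - j|, t): distances beyond the
    threshold t all cost the same."""
    return [[min(abs(i - j), t) for j in range(n)] for i in range(n)]


far_apart = emd_1d([1, 0, 0], [0, 0, 1])         # all mass moves two bins
one_step = emd_1d([0.5, 0.5, 0], [0, 0.5, 0.5])  # half the mass moves two bins
ground = thresholded_ground_distance(4, 2)
```

With a thresholded ground distance, every bin pair farther apart than t costs the same, which is one intuition for why such distances are less sensitive to outliers.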

Understanding and evaluating blind deconvolution algorithms

Anat Levin, Yair Weiss, Fredo Durand, Bill Freeman - Weizmann, HUJI, MIT

Blind deconvolution is the recovery of a sharp version of a blurred image when the blur kernel is unknown. Recent algorithms have afforded dramatic progress, yet many aspects of the problem remain challenging and hard to understand. The goal of our work is to analyze and evaluate recent blind deconvolution algorithms both theoretically and experimentally. We explain the previously reported failure of the naive MAP approach by demonstrating that it mostly favors no-blur explanations. On the other hand, we show that, since the kernel size is often smaller than the image size, a MAP estimation of the kernel alone can be well constrained and accurately recover the true blur.

The plethora of recent deconvolution techniques makes an experimental evaluation on ground-truth data important. We have collected blur data with ground truth and compared recent algorithms under equal settings. Additionally, our data demonstrates that the shift-invariant blur assumption made by most algorithms is often violated.

Multiple One-Shots for Utilizing Class Label Information

Tal Hassner, Lior Wolf, Yaniv Taigman - OpenU, TAU, face.com

The One-Shot Similarity measure has recently been introduced as a means of boosting the performance of face recognition systems. Given two vectors, their One-Shot Similarity score reflects the likelihood of each vector belonging to the same class as the other vector and not in a class defined by a fixed set of "negative" examples. An appealing aspect of this approach is that it does not require class labeled training data. This talk will explain how the One-Shot Similarity may nevertheless benefit from the availability of such labels. We claim the following contributions: (a) We present a system utilizing subject and pose information to improve facial image pair-matching performance using multiple One-Shot scores; (b) we show how separating pose and identity may lead to better face recognition rates in unconstrained, "wild" facial images; (c) we explore how far we can get using a single descriptor with different similarity tests as opposed to the popular multiple descriptor approaches; and (d) we demonstrate the benefit of learned metrics for improved One-Shot performance. We test the performance of our system on the challenging Labeled Faces in the Wild unrestricted benchmark and present results that exceed by a large margin the best results reported to date for this test.
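
The structure of a One-Shot Similarity score, as described above, can be sketched in a few lines of Python. This is a toy version under stated assumptions: the discriminant used here (a nearest-mean linear classifier separating one vector from the mean of the negative set) is a simple stand-in chosen for illustration, not necessarily the classifier used by the authors, and all names are hypothetical.

```python
def one_shot_similarity(x, y, negatives):
    """Symmetric one-shot score: average of (model of x vs. negatives,
    applied to y) and (model of y vs. negatives, applied to x)."""
    dim = len(x)
    mean_neg = [sum(n[k] for n in negatives) / len(negatives) for k in range(dim)]

    def score(a, b):
        # Assumed classifier: linear discriminant with direction from the
        # negative mean to a, thresholded at their midpoint. A positive
        # value means b falls on a's side of the boundary.
        w = [a[k] - mean_neg[k] for k in range(dim)]
        mid = [(a[k] + mean_neg[k]) / 2 for k in range(dim)]
        return sum(w[k] * (b[k] - mid[k]) for k in range(dim))

    return (score(x, y) + score(y, x)) / 2


negatives = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
same = one_shot_similarity([5.0, 5.0], [5.0, 4.0], negatives)
different = one_shot_similarity([5.0, 5.0], [0.5, 0.4], negatives)
```

Here the pair of nearby vectors scores positive, while the pair in which one vector sits among the negatives scores negative. No class labels are needed, only the fixed negative set.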

Automatically Identifying Join Candidates in the Cairo Genizah

Lior Wolf, Rotem Littman, Naama Mayer, Nachum Dershowitz, R. Shweka, Y. Choueka - TAU Genizah Project

A join is a set of manuscript fragments that are known to originate from the same original work. The Cairo Genizah is a collection containing approximately 250,000 fragments of mainly Jewish texts discovered in the late 19th century. The fragments are today spread out in libraries and private collections worldwide, and there is an ongoing effort to document and catalogue all extant fragments. The task of finding joins is currently conducted manually by experts, and presumably only a small fraction of the existing joins have been discovered. In this work, we study the problem of automatically finding candidate joins, so as to streamline the task. The proposed method is based on a combination of local descriptors and learning techniques. To evaluate the performance of various join-finding methods, without relying on the availability of human experts, we construct a benchmark dataset that is modeled on the Labeled Faces in the Wild benchmark for face recognition. Using this benchmark, we evaluate several alternative image representations and learning techniques. Finally, a set of newly discovered join candidates has been identified using our method and validated by a human expert.

A system for recognizing instances from a large set of object classes using a novel shape model

Simon Polak, Amnon Shashua - HUJI

We propose a model for classification and detection of object classes where the number of classes may be large and where multiple instances of object classes may be present in an image. The algorithm combines a bottom-up, low-level procedure (a bag-of-words naive Bayes phase for winnowing out unlikely object classes) with a high-level procedure for detection and classification. The high-level process is a hybrid of a voting method where votes are filtered using beliefs computed by a class-specific graphical model (using sum-TRBP). In that sense, shape is both explicit (determining the voting pattern) and implicit (each object part votes independently); hence we call our approach the "semi-explicit shape model".

Piecewise-consistent Color Mappings of Images Acquired Under Various Conditions

Sefy Kagarlitsky, Yael Moses, Yacov Hel-Or - IDC

Many applications in computer vision require comparisons between two images of the same scene. Comparison applications usually assume that corresponding regions in the two images have similar colors. However, this assumption is not always true. One way to deal with this problem is to apply a color mapping to one of the images. We address the challenge of computing color mappings between pairs of images acquired under different acquisition conditions, and possibly by different cameras. For images taken from different viewpoints, our proposed method overcomes the lack of pixel correspondence. For images taken under different illumination, we show that no single color mapping exists, and we address and solve a new problem of computing a minimal set of piecewise color mappings. When both viewpoint and illumination vary, our method can only handle planar regions of the scene. In this case, the scene planar regions are simultaneously co-segmented in the two images, and piecewise color mappings for these regions are calculated. We demonstrate applications of the proposed method for each of these cases.

On Edge Detection on Surfaces

Michael Kolomenkin, Ilan Shimshoni, Ayellet Tal - Haifa, Technion

Edge detection in images has been a fundamental problem in computer vision from its early days. Edge detection on surfaces, on the other hand, has received much less attention. The most common edges on surfaces are ridges and valleys, used for processing range images in computer vision, as well as for non-photorealistic rendering in computer graphics. We propose a new type of edges on surfaces, termed relief edges. Intuitively, the surface can be considered as an unknown smooth manifold, on top of which a local height image is placed. Relief edges are the edges of this local image. We show how to compute these edges from the local differential geometric surface properties, by fitting a local edge model to the surface. We also show how the underlying manifold and the local images can be roughly approximated and exploited in the edge detection process. Last but not least, we demonstrate the application of relief edges to artifact illustration in archaeology.

Optically Compressed Sensing by Undersampling the Polar Fourier Plane

Adrian Stern, Ofer Levi - BGU

The recently introduced theory of compressed sensing (CS) has attracted the interest of theoreticians and practitioners alike and has initiated a fast emerging research field. CS theory shows that one can recover certain signals and images from far fewer samples or measurements than traditional methods use. CS provides a new framework for simultaneous sampling and compression of signals. Optically compressed sensing is a natural implementation of CS theory because of the high redundancy typical of most optical data. In a previous work we presented a compressed imaging approach that uses a linear rotating sensor to indirectly capture polar strips of the Fourier transform of the image. Here we present further developments of this technique and present new results. The advantages of our technique, compared to other optically compressed imaging techniques, are that its optical implementation is relatively easy, that it does not require complicated calibrations, and that it can be implemented in near-real time.
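
To give a feel for what undersampling the polar Fourier plane means, the sketch below builds a boolean mask that marks the frequencies lying on a few radial lines through the origin of a centered frequency grid. It only illustrates the sampling pattern, not the optical setup or the reconstruction; the grid discretization, the half-pixel stepping, and the function names are assumptions.

```python
import math


def radial_mask(n, num_lines):
    """n-by-n boolean mask over a centered frequency grid; True marks
    frequencies on one of num_lines evenly spaced radial lines."""
    c = n // 2
    mask = [[False] * n for _ in range(n)]
    for k in range(num_lines):
        theta = math.pi * k / num_lines
        dx, dy = math.cos(theta), math.sin(theta)
        # Walk along the line in half-pixel steps, from -n/2 to +n/2.
        for s in range(-n, n + 1):
            t = s / 2
            col = c + round(t * dx)
            row = c + round(t * dy)
            if 0 <= row < n and 0 <= col < n:
                mask[row][col] = True
    return mask


def sampled_fraction(mask):
    """Fraction of frequency samples covered by the mask."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


mask = radial_mask(32, 8)
fraction = sampled_fraction(mask)
```

The mask covers only part of the frequency plane, with the densest coverage near the origin, where the radial lines overlap.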

Edge-Avoiding Wavelets and their Applications

Raanan Fattal - HUJI

We propose a new family of second-generation wavelets constructed using a robust data-prediction lifting scheme. The support of these new wavelets is constructed based on the edge content of the image and avoids having pixels from both sides of an edge. Multi-resolution analysis, based on these new edge-avoiding wavelets, shows a better decorrelation of the data compared to common linear translation-invariant multi-resolution analyses. The reduced inter-scale correlation allows us to avoid halo artifacts in band-independent multi-scale processing without taking any special precautions. We thus achieve nonlinear data-dependent multi-scale edge-preserving image filtering and processing at computation times which are linear in the number of image pixels. The new wavelets encode, in their shape, the smoothness information of the image at every scale. We use this to derive a new edge-aware interpolation scheme that achieves results, previously computed by an inhomogeneous Laplace equation, through an explicit computation. We thus avoid the difficulties in solving large and poorly-conditioned systems of equations.

We demonstrate the effectiveness of the new wavelet basis for various computational photography applications such as multi-scale dynamic-range compression, edge-preserving smoothing and detail enhancement, and image colorization.
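
The idea of a data-dependent prediction step in a lifting scheme can be sketched in 1D. This is an illustrative toy, not the construction from the talk: it uses only a predict step (no update step), and the weight function 1 / (|difference|^alpha + eps), the parameter values, and the function names are all assumptions.

```python
def forward_lift(signal, alpha=1.0, eps=1e-5):
    """One predict-only lifting step. Odd samples are predicted from their
    even neighbors with weights that shrink across large jumps (edges)."""
    even, odd = signal[0::2], signal[1::2]
    details, weights = [], []
    for i, x in enumerate(odd):
        left = even[i]
        # Mirror the left neighbor when there is no right neighbor.
        right = even[i + 1] if i + 1 < len(even) else even[i]
        wl = 1.0 / (abs(x - left) ** alpha + eps)
        wr = 1.0 / (abs(x - right) ** alpha + eps)
        pred = (wl * left + wr * right) / (wl + wr)
        details.append(x - pred)
        weights.append((wl, wr))
    return even, details, weights


def inverse_lift(even, details, weights):
    """Invert forward_lift using the stored weights."""
    out = []
    for i, e in enumerate(even):
        out.append(e)
        if i < len(details):
            left = even[i]
            right = even[i + 1] if i + 1 < len(even) else even[i]
            wl, wr = weights[i]
            pred = (wl * left + wr * right) / (wl + wr)
            out.append(details[i] + pred)
    return out


step_edge = [0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0]
coarse, detail, w = forward_lift(step_edge)
restored = inverse_lift(coarse, detail, w)
```

On this step edge the detail coefficients stay close to zero, because the sample next to the edge is predicted almost entirely from the neighbor on its own side. A plain average of both neighbors would instead leave a detail coefficient of magnitude 5 there.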

Energy-Based Shape Deformation

Daniel Freedman, Zachi Karni, Craig Gotsman - HP, Technion

The talk will present a general approach to energy-based shape deformation, and applications of this approach to the problems of 2D shape deformation and image resizing. The expression for the deformation energy generalizes that found in the prior art, while still admitting an efficient "local-global" algorithm for its optimization. The key advantage of the energy function is the flexibility with which the set of "legal transformations" may be expressed; these transformations are the ones which are not considered to be distorting. This flexibility allows us to pose the problem of 2D shape deformation (possibly within an image), as well as image resizing, in sensible ways, and to generate minimally distorted results. Results of both algorithms show the effectiveness of this approach.