2009 Israel Computer Vision Day
Sunday, December 13, 2009

The Efi Arazi School of Computer Science

I.D.C. Herzliya


Supported by GM - Advanced Technical Center - Israel








Previous Vision Days web pages: 2003, 2004, 2005, 2006, 2007, 2008.



Vision Day Schedule



Speaker and Collaborators






Yael Pritch, Eitam Kav-Venaki, Shmuel Peleg
Shift-Map Image Editing

Fredo Durand
Fourier to the rescue of Photography and Image Synthesis

Eli Shechtman, Connelly Barnes, Adam Finkelstein, Dan Goldman
PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing

Daniel Glasner, Shai Bagon, Michal Irani
Super-Resolution From a Single Image

Coffee Break

Hadas Kogan, Ron Maurer, Renato Keshet
Vanishing Points Estimation by Self-Similarity

Arik Shamir, Tao Chen, Ming-Ming Cheng, Shi-Min Hu, Ping Tan (Tsinghua U., U of Singapore)
Sketch2Photo: Internet Image Montage

Ofir Pele, Michael Werman
Fast and Robust Earth Mover's Distances

Anat Levin, Yair Weiss, Fredo Durand, Bill Freeman
Understanding and evaluating blind deconvolution algorithms

Lunch

Tal Hassner, Lior Wolf, Yaniv Taigman
Multiple One-Shots for Utilizing Class Label Information

Lior Wolf, Rotem Littman, Naama Mayer, Nachum Dershowitz, R. Shweka, Y. Choueka (Genizah Project)
Automatically Identifying Join Candidates in the Cairo Genizah

Simon Polak, Amnon Shashua
A system for recognizing instances from a large set of object classes using a novel shape model

Sefy Kagarlitsky, Yael Moses, Yacov Hel-Or
Piecewise-consistent Color Mappings of Images Acquired Under Various Conditions

Coffee Break

Michael Kolomenkin, Ilan Shimshoni, Ayellet Tal
On Edge Detection on Surfaces

Adrian Stern, Ofer Levi
Optically Compressed Sensing by Undersampling the Polar Fourier Plane

Raanan Fattal
Edge-Avoiding Wavelets and their Applications

Daniel Freedman, Zachi Karni, Craig Gotsman
Energy-Based Shape Deformation




General: This is the seventh Israel Computer Vision Day. It will be hosted at IDC.

For more details, requests to be added to the mailing list, etc., please contact:

hagit@cs.haifa.ac.il toky@idc.ac.il


Location and Directions: The Vision Day will take place at the Interdisciplinary Center (IDC), Herzliya, in the Ivtzer Auditorium. For driving instructions see map.

A convenient option is to arrive by train, see time schedule here. Get off at the Herzliya Station, and order a taxi ride by phone. There are two taxi stations that provide this service: Moniyot Av-Yam (09 9501263 or 09 9563111), and Moniyot Pituach (09 9582288 or 09 9588001).








Shift-Map Image Editing

Yael Pritch, Eitam Kav-Venaki, Shmuel Peleg - HUJI


Geometric rearrangement of images includes operations such as image retargeting, object removal, or object rearrangement. Each such operation can be characterized by a shift-map: the relative shift of every pixel in the output image from its source in an input image.


We describe a new representation of these operations as an optimal graph labeling, where the shift-map represents the selected label for each output pixel. Two terms are used in computing the optimal shift-map: (i) A data term which indicates constraints such as the change in image size, object rearrangement, a possible saliency map, etc. (ii) A smoothness term, minimizing the new discontinuities in the output image caused by discontinuities in the shift-map.


This graph labeling problem can be solved using graph cuts. Since the optimization is global and discrete, it outperforms state-of-the-art methods in most cases. Efficient hierarchical solutions for graph cuts are presented, and operations on images of 1M pixels can take only a few seconds.
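As a minimal sketch of the shift-map representation described above (not the authors' graph-cut optimization), the following Python snippet applies a given shift-map to an input image and counts the places where neighboring output pixels carry different shifts, which is where the smoothness term would have to pay a cost. The function names `apply_shift_map` and `count_shift_discontinuities`, the `(dy, dx)` shift convention, and the toy image are assumptions made for illustration.

```python
def apply_shift_map(src, shift):
    """Build the output image: out[y][x] = src[y + dy][x + dx], where (dy, dx) = shift[y][x]."""
    out = []
    for y, row in enumerate(shift):
        out_row = []
        for x, (dy, dx) in enumerate(row):
            sy, sx = y + dy, x + dx
            if not (0 <= sy < len(src) and 0 <= sx < len(src[0])):
                raise ValueError(f"shift at ({y}, {x}) points outside the input image")
            out_row.append(src[sy][sx])
        out.append(out_row)
    return out


def count_shift_discontinuities(shift):
    """Count pairs of 4-connected output pixels whose shifts differ."""
    h, w = len(shift), len(shift[0])
    count = 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w and shift[y][x] != shift[y][x + 1]:
                count += 1
            if y + 1 < h and shift[y][x] != shift[y + 1][x]:
                count += 1
    return count


# Hypothetical retargeting example: shrink a 4x6 image to 4x4 by skipping input columns 2 and 3.
src = [[6 * y + x for x in range(6)] for y in range(4)]
shift = [[(0, 0) if x < 2 else (0, 2) for x in range(4)] for y in range(4)]
out = apply_shift_map(src, shift)
seams = count_shift_discontinuities(shift)
```

Here every output row keeps input columns 0, 1, 4 and 5, and the only shift discontinuities lie along the vertical seam between output columns 1 and 2.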


Fourier to the rescue of Photography and Image Synthesis

Fredo Durand - MIT


New analysis of light transport and image formation enables novel imaging strategies that reduce motion blur and depth-of-field blur, as well as acceleration algorithms for computer graphics.


We analyze phenomena such as light transport in a scene, integration during the shutter interval, and defocus blur with the Fourier transform of the domain of light rays and space-time. For imaging applications, this offers both new theoretical insights on upper bounds of achievable sharpness and signal-to-noise ratio in the presence of motion blur and depth of field, and new lens and camera designs that, combined with computation, can reduce blur. In image synthesis, similar analysis enables algorithms that use adaptive sampling and novel reconstruction to simulate effects such as motion blur and depth of field with dramatic speedups.



PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing

Eli Shechtman, Connelly Barnes, Adam Finkelstein, Dan Goldman - Adobe, Princeton


We present a new randomized algorithm for quickly finding approximate nearest neighbor matches between image patches for interactive image editing. Previous research in graphics and vision has leveraged such nearest-neighbor searches to provide a variety of high-level digital image editing tools. However, the cost of computing a field of such matches for an entire image has eluded previous efforts to provide interactive performance. Our algorithm offers substantial performance improvements over the previous state of the art (20-100x), enabling its use in interactive editing tools. The key insights driving the algorithm are that some good patch matches can be found via random sampling, and that natural coherence in the imagery allows us to propagate such matches quickly to surrounding areas. We offer a theoretical analysis of the convergence properties of the algorithm, as well as empirical and practical evidence for its high quality and performance. This one simple algorithm forms the basis for a variety of tools (image retargeting, completion and reshuffling) that can be used together in the context of a high-level image editing application. Finally, we provide additional intuitive constraints on the synthesis process that offer the user a level of control unavailable in previous methods.
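The two key insights named in the abstract, random sampling and propagation of good matches to neighbors, can be illustrated with a small sketch. This is a simplified pure-Python version written for illustration, not the authors' implementation: the function names, the sum-of-squared-differences patch distance, the radius-halving random search and all default parameters are assumptions.

```python
import random


def patch_ssd(a, b, ay, ax, by, bx, p):
    """Sum of squared differences between the p x p patches at (ay, ax) in a and (by, bx) in b."""
    return sum((a[ay + i][ax + j] - b[by + i][bx + j]) ** 2
               for i in range(p) for j in range(p))


def patchmatch(a, b, p=3, iters=4, seed=0):
    """Approximate nearest-neighbor field mapping each patch of a to a patch of b."""
    rng = random.Random(seed)
    ah, aw = len(a) - p + 1, len(a[0]) - p + 1  # number of valid patch origins in a
    bh, bw = len(b) - p + 1, len(b[0]) - p + 1  # number of valid patch origins in b
    # Random initialization of the match field and its costs.
    nnf = [[(rng.randrange(bh), rng.randrange(bw)) for _ in range(aw)] for _ in range(ah)]
    cost = [[patch_ssd(a, b, y, x, nnf[y][x][0], nnf[y][x][1], p) for x in range(aw)]
            for y in range(ah)]

    def try_match(y, x, by, bx):
        """Adopt candidate (by, bx) for patch (y, x) only if it is valid and strictly better."""
        if 0 <= by < bh and 0 <= bx < bw:
            d = patch_ssd(a, b, y, x, by, bx, p)
            if d < cost[y][x]:
                cost[y][x], nnf[y][x] = d, (by, bx)

    for it in range(iters):
        step = 1 if it % 2 == 0 else -1  # alternate the scan direction each iteration
        ys = range(ah) if step == 1 else range(ah - 1, -1, -1)
        xs = range(aw) if step == 1 else range(aw - 1, -1, -1)
        for y in ys:
            for x in xs:
                # Propagation: reuse the (shifted) match of the already visited neighbors.
                if 0 <= y - step < ah:
                    ny, nx = nnf[y - step][x]
                    try_match(y, x, ny + step, nx)
                if 0 <= x - step < aw:
                    ny, nx = nnf[y][x - step]
                    try_match(y, x, ny, nx + step)
                # Random search around the current best match with a shrinking radius.
                radius = max(bh, bw)
                while radius >= 1:
                    cy, cx = nnf[y][x]
                    try_match(y, x, cy + rng.randint(-radius, radius),
                              cx + rng.randint(-radius, radius))
                    radius //= 2
    return nnf, cost


# Toy usage: match a small random image against itself and compare total cost
# before (iters=0, random initialization only) and after a few iterations.
_rng = random.Random(1)
img = [[_rng.randrange(256) for _ in range(12)] for _ in range(12)]
_, cost_init = patchmatch(img, img, iters=0)
nnf_final, cost_final = patchmatch(img, img, iters=4)
total_init = sum(map(sum, cost_init))
total_final = sum(map(sum, cost_final))
```

Because a candidate is accepted only when it lowers the cost, the total matching cost never increases from one iteration to the next in this sketch.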


Super-Resolution From a Single Image

Daniel Glasner, Shai Bagon, Michal Irani - Weizmann


Methods for super-resolution (SR) can be broadly classified into two families of methods: (i) the classical multi-image super-resolution (combining images obtained at subpixel misalignments), and (ii) example-based super-resolution (learning correspondence between low and high resolution image patches from a database). In this work we propose a unified framework for combining these two families of methods. We further show how this combined approach can be applied to obtain super-resolution from as little as a single image (with no database or prior examples). Our approach is based on the observation that patches in a natural image tend to redundantly recur many times inside the image, both within the same scale and across different scales. Recurrence of patches within the same image scale (at subpixel misalignments) gives rise to the classical super-resolution, whereas recurrence of patches across different scales of the same image gives rise to example-based super-resolution. Our approach attempts to recover at each pixel its best possible resolution increase based on its patch redundancy within and across scales.


Vanishing Points Estimation by Self-Similarity

Hadas Kogan, Ron Maurer, Renato Keshet - HP


We propose a new self-similarity based approach for the problem of vanishing point estimation in man-made scenes. A vanishing point (VP) is the convergence point of a pencil (a concurrent line set) that is the perspective projection of a corresponding parallel line set in the scene. Unlike traditional VP detection that relies on extraction and grouping of individual straight lines, our approach detects entire pencils based on a property of 1D affine-similarity between parallel cross-sections of a pencil. Our approach is not limited to real pencils. Under some conditions (normally met in man-made scenes), our method can detect pencils made of virtual lines passing through similar image features, and hence can detect VPs from repeating patterns that do not contain straight edges. We demonstrate that detecting entire pencils rather than individual lines improves the detection robustness in that it improves VP detection in challenging conditions, such as very low resolution or weak edges, and simultaneously reduces the VP false-detection rate when only a small number of lines are detectable.
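For readers unfamiliar with the definition used above, the sketch below shows the traditional line-based view that the abstract contrasts itself with: a vanishing point computed as the intersection of two image lines in homogeneous coordinates. It does not implement the self-similarity method of the talk; the function names and the example coordinates are assumptions chosen for illustration.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def line_through(p, q):
    """Homogeneous line through two image points p = (x, y) and q = (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def intersect(l1, l2, eps=1e-12):
    """Intersection of two homogeneous lines, or None if they are parallel in the image."""
    x, y, w = cross(l1, l2)
    if abs(w) < eps:
        return None  # the lines meet at infinity
    return (x / w, y / w)


# Two image lines belonging to the same pencil (hypothetical coordinates).
line_a = line_through((0.0, 0.0), (2.0, 2.0))
line_b = line_through((0.0, 6.0), (2.0, 5.0))
vp = intersect(line_a, line_b)  # convergence point of the two lines
```

With more than two lines, a practical detector would fit the point in a least-squares or voting sense; that grouping step is exactly what becomes fragile when few lines are detectable.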


Sketch2Photo: Internet Image Montage

Arik Shamir, Tao Chen, Ming-Ming Cheng, Shi-Min Hu, Ping Tan - IDC, Tsinghua University, University of Singapore


We present a system that composes a realistic picture from a simple freehand sketch annotated with text labels. The composed picture is generated by seamlessly stitching several photographs in agreement with the sketch and text labels; these are found by searching the Internet. Although online image search generates many inappropriate results, our system is able to automatically select suitable photographs to generate a high quality composition, using a filtering scheme to exclude undesirable images. We also provide a novel image blending algorithm to allow seamless image composition. Each blending result is given a numeric score, allowing us to find an optimal combination of discovered images. Experimental results show the method is very successful; we also evaluate our system using the results from two user studies.


Fast and Robust Earth Mover's Distances

Ofir Pele, Michael Werman - HUJI


We present a new Earth Mover's Distance (EMD) variant. We show that it is a metric (unlike the original EMD, which is a metric only for normalized histograms). Moreover, it is a natural extension of the L1 metric. We propose a linear time algorithm for the computation of the EMD variant, with a robust ground distance for oriented gradients.

We also present a new algorithm for a robust family of Earth Mover's Distances: EMDs with thresholded ground distances. We compute the EMD an order of magnitude faster than the original algorithm, which makes it possible to compute the EMD on large histograms and databases. In addition, we show that EMDs with thresholded ground distances have many desirable properties. First, they correspond to the way humans perceive distances. Second, they are robust to outlier noise and quantization effects. Third, they are metrics. Finally, experimental results show that thresholding the ground distance of the EMD improves both accuracy and speed.
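As background for the two ideas in this abstract, the sketch below shows the classic closed form of the EMD for one-dimensional histograms of equal total mass (with ground distance |i - j|, it equals the L1 distance between the cumulative histograms) and what a thresholded ground distance looks like. It is not the authors' algorithm or their EMD variant; the function names and the threshold parameter `t` are assumptions for illustration.

```python
from itertools import accumulate


def emd_1d(p, q):
    """EMD between two 1D histograms of equal total mass, with ground distance |i - j|.

    In this special case the EMD equals the L1 distance between the cumulative sums.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("this closed form assumes equal total mass")
    return sum(abs(cp - cq) for cp, cq in zip(accumulate(p), accumulate(q)))


def thresholded_distance(i, j, t):
    """Ground distance between bins i and j, saturated at threshold t."""
    return min(abs(i - j), t)


far = emd_1d([1, 0, 0], [0, 0, 1])            # one unit of mass moved across two bins
near = emd_1d([0.5, 0.5, 0], [0, 0.5, 0.5])   # half a unit moved across two bins
capped = thresholded_distance(0, 5, 2)        # distant bins cost no more than the threshold
```

Saturating the ground distance means that a bin far from any plausible match contributes a bounded cost, which is one way to read the robustness to outliers claimed above.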


Understanding and evaluating blind deconvolution algorithms

Anat Levin, Yair Weiss, Fredo Durand, Bill Freeman - Weizmann, HUJI, MIT


Blind deconvolution is the recovery of a sharp version of a blurred image when the blur kernel is unknown. Recent algorithms have afforded dramatic progress, yet many aspects of the problem remain challenging and hard to understand. The goal of our work is to analyze and evaluate recent blind deconvolution algorithms both theoretically and experimentally. We explain the previously reported failure of the naive MAP approach by demonstrating that it mostly favors no-blur explanations. On the other hand, we show that, since the kernel size is often smaller than the image size, a MAP estimation of the kernel alone can be well constrained and accurately recover the true blur.

The plethora of recent deconvolution techniques makes an experimental evaluation on ground-truth data important. We have collected blur data with ground truth and compared recent algorithms under equal settings. Additionally, our data demonstrates that the shift-invariant blur assumption made by most algorithms is often violated.


Multiple One-Shots for Utilizing Class Label Information

Tal Hassner, Lior Wolf, Yaniv Taigman - OpenU, TAU, face.com


The One-Shot Similarity measure has recently been introduced as a means of boosting the performance of face recognition systems. Given two vectors, their One-Shot Similarity score reflects the likelihood of each vector belonging to the same class as the other vector and not in a class defined by a fixed set of "negative" examples. An appealing aspect of this approach is that it does not require class labeled training data. This talk will explain how the One-Shot Similarity may nevertheless benefit from the availability of such labels. We claim the following contributions: (a) We present a system utilizing subject and pose information to improve facial image pair-matching performance using multiple One-Shot scores; (b) we show how separating pose and identity may lead to better face recognition rates in unconstrained, "wild" facial images; (c) we explore how far we can get using a single descriptor with different similarity tests as opposed to the popular multiple descriptor approaches; and (d) we demonstrate the benefit of learned metrics for improved One-Shot performance. We test the performance of our system on the challenging Labeled Faces in the Wild unrestricted benchmark and present results that exceed by a large margin the best results reported to date for this test.


Automatically Identifying Join Candidates in the Cairo Genizah

Lior Wolf, Rotem Littman, Naama Mayer, Nachum Dershowitz, R. Shweka, Y. Choueka - TAU, Genizah Project


A join is a set of manuscript fragments that are known to originate from the same original work. The Cairo Genizah is a collection containing approximately 250,000 fragments of mainly Jewish texts discovered in the late 19th century. The fragments are today spread out in libraries and private collections worldwide, and there is an ongoing effort to document and catalogue all extant fragments. The task of finding joins is currently conducted manually by experts, and presumably only a small fraction of the existing joins have been discovered. In this work, we study the problem of automatically finding candidate joins, so as to streamline the task. The proposed method is based on a combination of local descriptors and learning techniques. To evaluate the performance of various join-finding methods, without relying on the availability of human experts, we construct a benchmark dataset that is modeled on the Labeled Faces in the Wild benchmark for face recognition. Using this benchmark, we evaluate several alternative image representations and learning techniques. Finally, a set of newly discovered join candidates has been identified using our method and validated by a human expert.


A system for recognizing instances from a large set of object classes using a novel shape model

Simon Polak, Amnon Shashua - HUJI


We propose a model for classification and detection of object classes where the number of classes may be large and where multiple instances of object classes may be present in an image. The algorithm combines a bottom-up, low-level procedure (a bag-of-words naive Bayes phase) for winnowing out unlikely object classes with a high-level procedure for detection and classification. The high-level process is a hybrid voting method in which votes are filtered using beliefs computed by a class-specific graphical model (using sum-TRBP). In that sense, shape is both explicit (determining the voting pattern) and implicit (each object part votes independently); hence we call our approach the "semi-explicit shape model".


Piecewise-consistent Color Mappings of Images Acquired Under Various Conditions

Sefy Kagarlitsky, Yael Moses, Yacov Hel-Or - IDC


Many applications in computer vision require comparisons between two images of the same scene. Comparison applications usually assume that corresponding regions in the two images have similar colors. However, this assumption is not always true. One way to deal with this problem is to apply a color mapping to one of the images. We address the challenge of computing color mappings between pairs of images acquired under different acquisition conditions, and possibly by different cameras. For images taken from different viewpoints, our proposed method overcomes the lack of pixel correspondence. For images taken under different illumination, we show that no single color mapping exists, and we address and solve a new problem of computing a minimal set of piecewise color mappings. When both viewpoint and illumination vary, our method can only handle planar regions of the scene. In this case, the scene planar regions are simultaneously co-segmented in the two images, and piecewise color mappings for these regions are calculated. We demonstrate applications of the proposed method for each of these cases.


On Edge Detection on Surfaces

Michael Kolomenkin, Ilan Shimshoni, Ayellet Tal - Haifa, Technion


Edge detection in images has been a fundamental problem in computer vision from its early days. Edge detection on surfaces, on the other hand, has received much less attention. The most common edges on surfaces are ridges and valleys, used for processing range images in computer vision, as well as for non-photorealistic rendering in computer graphics. We propose a new type of edges on surfaces, termed relief edges. Intuitively, the surface can be considered as an unknown smooth manifold, on top of which a local height image is placed. Relief edges are the edges of this local image. We show how to compute these edges from the local differential geometric surface properties, by fitting a local edge model to the surface. We also show how the underlying manifold and the local images can be roughly approximated and exploited in the edge detection process. Last but not least, we demonstrate the application of relief edges to artifact illustration in archaeology.


Optically Compressed Sensing by Undersampling the Polar Fourier Plane

Adrian Stern, Ofer Levi - BGU


The recently introduced theory of compressed sensing (CS) has attracted the interest of theoreticians and practitioners alike and has initiated a fast emerging research field. CS theory shows that one can recover certain signals and images from far fewer samples or measurements than traditional methods use. CS provides a new framework for simultaneous sampling and compression of signals. Optically compressed sensing is a natural implementation of CS theory because of the high redundancy typical of most optical data. In a previous work we presented a compressed imaging approach that uses a linear rotating sensor to indirectly capture polar strips of the Fourier transform of the image. Here we present further developments of this technique and new results. The advantages of our technique, compared to other optically compressed imaging techniques, are that its optical implementation is relatively easy, that it does not require complicated calibrations, and that it can be implemented in near-real time.


Edge-Avoiding Wavelets and their Applications

Raanan Fattal - HUJI

We propose a new family of second-generation wavelets constructed using a robust data-prediction lifting scheme. The support of these new wavelets is constructed based on the edge content of the image and avoids having pixels from both sides of an edge. Multi-resolution analysis, based on these new edge-avoiding wavelets, shows a better decorrelation of the data compared to common linear translation-invariant multi-resolution analyses. The reduced inter-scale correlation allows us to avoid halo artifacts in band-independent multi-scale processing without taking any special precautions. We thus achieve nonlinear data-dependent multi-scale edge-preserving image filtering and processing at computation times which are linear in the number of image pixels. The new wavelets encode, in their shape, the smoothness information of the image at every scale. We use this to derive a new edge-aware interpolation scheme that achieves results, previously computed by an inhomogeneous Laplace equation, through an explicit computation. We thus avoid the difficulties in solving large and poorly-conditioned systems of equations.


We demonstrate the effectiveness of the new wavelet basis for various computational photography applications such as multi-scale dynamic-range compression, edge-preserving smoothing and detail enhancement, and image colorization.


Energy-Based Shape Deformation

Daniel Freedman, Zachi Karni, Craig Gotsman - HP, Technion


The talk will present a general approach to energy-based shape deformation, and applications of this approach to the problems of 2D shape deformation and image resizing. The expression for the deformation energy generalizes that found in the prior art, while still admitting an efficient "local-global" algorithm for its optimization. The key advantage of the energy function is the flexibility with which the set of "legal transformations" may be expressed; these transformations are the ones which are not considered to be distorting. This flexibility allows us to pose the problem of 2D shape deformation (possibly within an image), as well as image resizing, in sensible ways, and to generate minimally distorted results. Results of both algorithms show the effectiveness of this approach.