Karteek Alahari
Department of Computing, Oxford Brookes University, Wheatley, Oxford OX33 1HX, UK.
PHONE: +44 1865 485 605
FAX: +44 1865 484 545
. (AT) brookes.ac.uk
http://cms.brookes.ac.uk/staff/Karteek/

EDUCATION

Doctor of Philosophy (Ph.D.)
Oxford Brookes University (July 2010)
Adviser: Prof. Philip H. S. Torr

MS by Research, Computer Science
IIIT, Hyderabad, India (July 2005)
"Modelling and Recognition of Dynamic Events in Video"
Advisers: Prof. C. V. Jawahar, Prof. P. J. Narayanan

B. Tech. (Honours), Computer Science and Engineering (CGPA: 9.86/10.00)
IIIT, Hyderabad, India (July 2004)

PUBLICATIONS

L. Ladicky, P. Sturgess, K. Alahari, C. Russell, and P. H. S. Torr
"What, Where & How Many? Combining Object Detectors and CRFs"
Proc. European Conference on Computer Vision (ECCV), Sep 2010 (Oral)

K. Alahari, C. Russell, and P. H. S. Torr
"Efficient Piecewise Learning for Conditional Random Fields"
Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2010

K. Alahari, P. Kohli, and P. H. S. Torr
"Dynamic Hybrid Algorithms for MAP Inference in Discrete MRFs"
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2010

P. Sturgess, K. Alahari, L. Ladicky, and P. H. S. Torr
"Combining Appearance and Structure from Motion Features for Road Scene Understanding"
Proc. British Machine Vision Conference (BMVC), Sep 2009

K. Alahari, P. Kohli, and P. H. S. Torr
"Reduce, Reuse & Recycle: Efficiently Solving Multi-Label MRFs"
Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2008

S. Ramalingam, P. Kohli, K. Alahari, and P. H. S. Torr
"Exact Inference in Multi-label CRFs with Higher Order Cliques"
Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2008

K. Alahari and C. V. Jawahar
"Dynamic Events as Mixtures of Spatial and Temporal Features"
Proc. Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), LNCS 4338, Dec 2006

K. Alahari and C. V. Jawahar
"Discriminative Actions for Recognising Events"
Proc. Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), LNCS 4338, Dec 2006

K. Alahari, S. L. Putrevu, and C. V. Jawahar
"Learning Mixtures of Offline and Online features for Handwritten Stroke Recognition"
Proc. IEEE International Conference on Pattern Recognition (ICPR), Aug 2006

K. Alahari, S. L. Putrevu, and C. V. Jawahar
"Discriminant Substrokes for Online Handwriting Recognition"
Proc. IEEE International Conference on Document Analysis and Recognition (ICDAR), Aug 2005 (Oral)

S. S. Ravi Kiran, K. Alahari, and C. V. Jawahar
"Recognizing Human Activities from Constituent Actions"
Proc. National Conference on Communications (NCC), Feb 2005 (Oral)

K. Alahari, S. Kuthirummal, C. V. Jawahar, and P. J. Narayanan
"Geometric and Stochastic Error Minimisation in Motion Tracking"
Proc. Sixth Asian Conference on Computer Vision (ACCV), Jan 2004

AWARDS

EPSRC Studentship (December 2005 - December 2008)
IPAM UCLA Travel Grant (February 2008)
GE Foundation Scholar (2004 - 2005) - One of 73 graduate students in India to receive this scholarship, awarded by the GE Foundation
Ranked First in the graduating class of 2004 at IIIT, Hyderabad

EXPERIENCE

Postdoctoral Research Assistant, Vision Lab, Oxford Brookes University (April - August 2010)
Research Assistant, Vision Lab, Oxford Brookes University (December 2008 - March 2009)
Research Intern, Interactive Visual Media Group, Microsoft Research, Redmond, USA (May - Aug 2004)
    Mentor: Dr. Nebojsa Jojic
Teaching Assistant for two semesters: Maths II (Prof. C. N. Kaul, Spring 2004), Computer Organization (Prof. P. J. Narayanan, Spring 2002)

INVITED TALKS

Scene Understanding in an Energy Minimization Framework
    Oxford Vision Workshop, Oxford, UK, July 2010
    Mitsubishi Electric Research Labs, Boston, USA, June 2010
    KTH Royal Institute of Technology, Stockholm, Sweden, May 2010
    Ecole Centrale Paris, Chatenay-Malabry, France, April 2010

Reduce, Reuse & Recycle
    ETH Zurich, Zurich, Switzerland, November 2009
    Oxford-Willow Mini Vision Workshop, Oxford, UK, April 2008

PROFESSIONAL ACTIVITIES

Program Committee: ACCV 2007, 2009, 2010; ECCV 2010; ICPR 2010; Online Learning for Classification Workshop 2008
Reviewer: NIPS 2009; ICVGIP 2008, 2010; The Visual Computer Journal; Image and Vision Computing Journal; IPSJ Trans. Computer Vision and Applications
Additional Reviewer: ICCV 2005

SOFTWARE SKILLS

Programming Languages: C, C++, Lisp, Perl
Operating Systems: Linux, Mac OS X, Windows
Others: Matlab, OpenGL, Qt

SELECTED PROJECTS

Parameter Learning for CRFs
Conditional Random Field models have proved effective for several low-level computer vision problems. Inference in these models involves solving a combinatorial optimization problem, with methods such as graph cuts and belief propagation. Although several methods have been proposed to learn the model parameters from training data, they suffer from various drawbacks. Learning these parameters involves computing the partition function, which is intractable. To overcome this, state-of-the-art structured learning methods frame the problem as one of large margin estimation. Iterative solutions have been proposed to solve the resulting convex optimization problem. Each iteration involves solving an inference problem over all the labels, which limits the efficiency of these structured methods. In this paper we present an efficient large margin piecewise learning method which is widely applicable.
We show how the resulting optimization problem can be reduced to an equivalent convex problem with a small number of constraints, and solve it using an efficient scheme. Our method is both memory and computationally efficient.

Scene Understanding
Computer vision algorithms for individual tasks such as object recognition, detection and segmentation have shown impressive results in the recent past. The next challenge is to integrate all these algorithms and address the problem of scene understanding. This paper is a step towards this goal. We present a probabilistic framework for reasoning about regions, objects, and their attributes such as object class, location, and spatial extent. Our model is a Conditional Random Field defined on pixels, segments and objects. We define a global energy function for the model, which combines results from sliding window detectors, and low-level pixel-based unary and pairwise relations. One of our primary contributions is to show that this energy function can be solved efficiently. Experimental results show that our model achieves significant improvement over the baseline methods on the Cambridge-driving Labeled Video and PASCAL VOC datasets.

Efficient Energy Minimization
We present novel techniques that improve the computational and memory efficiency of algorithms for solving multi-label energy functions arising from discrete MRFs or CRFs. These methods are motivated by the observations that the performance of minimization algorithms depends on: (a) the initialization used for the primal and dual variables; and (b) the number of primal variables involved in the energy function. Our first method (dynamic alpha-expansion) works by 'recycling' results from previous problem instances. The second method simplifies the energy function by 'reducing' the number of unknown variables present in the problem.
Further, we show that it can also be used to generate a good initialization for the dynamic alpha-expansion algorithm by 'reusing' dual variables. We test the performance of our methods on energy functions encountered in the problems of colour- and object-based segmentation, and stereo matching. Experimental results show that our methods achieve a substantial improvement in the performance of alpha-expansion, as well as other popular algorithms such as sequential tree-reweighted message passing and max-product belief propagation. We also demonstrate the applicability of our schemes for certain higher order energy functions for interactive texture-based image and video segmentation. In most cases we achieve a 10-15 times speed-up in the computation time. Our modified alpha-expansion algorithm provides similar performance to Fast-PD.

Higher Order Cliques for Scene Reconstruction
We address the problem of exactly inferring the maximum a posteriori solutions of discrete multi-label MRFs or CRFs with higher order cliques. We present a framework to transform special classes of multi-label higher order functions to submodular second order boolean functions (referred to as ${\cal F}_s^2$), which can be minimized exactly using graph cuts, and we characterize those classes. The basic idea is to use two or more boolean variables to encode the states of a single multi-label variable. There are many ways in which this can be done, and much interesting research lies in finding ways which are optimal or minimal in some sense. We study the space of possible encodings and find the ones that can transform the most general class of functions to ${\cal F}_s^2$. Our main contributions are two-fold. First, we extend the subclass of submodular energy functions that can be minimized exactly using graph cuts. Second, we show how higher order potentials can be used to improve single view 3D reconstruction results.
We believe that our work on exact minimization of higher order energy functions will lead to similar improvements in solutions of other labelling problems.