Research Theme: Machine Learning

Live Projects
SMADA - Smart Data Analytics
Machine Learning 'in the wild'
Machine Reasoning
Generalised Max-entropy Classifiers

The goal of SMADA is to develop a next-generation, platform-independent, collaborative big data environment. In this environment, existing and newly developed analytical tools are treated as modules that can be interconnected to build new tools, allowing small and large companies to share their knowledge pool safely.

Our goal in Machine Learning 'in the wild' is a blue-sky rethinking of machine learning, laying the foundations for an entirely new, inherently robust theory of learning. Statistical learning theory is generalised to allow test and training data to come from distinct probability distributions. We move away from the selection of single models to that of convex sets of models, and employ the resulting theory to lay solid theoretical foundations for deep learning.

Despite the amazing success of pattern recognition via deep learning, such methods already show significant limitations. A new paradigm of machine reasoning, with machine learning as just one aspect, is gaining traction.

We are working on a generalised maximum-entropy classification framework in which the empirical expectation of each feature function is bounded by the lower and upper expectations induced by a belief measure, that is, the expectations associated with its lower and upper probabilities. This generalised setting permits a more cautious appreciation of the information content of a training set.
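To make the bounds concrete, the following minimal sketch computes the lower and upper expectations of a feature function under a belief measure given by a mass assignment on subsets of outcomes. It uses the standard result that these expectations are the mass-weighted minimum and maximum of the function over each focal set. The outcomes, feature values and masses are made up for illustration, and the function name `lower_upper_expectation` is hypothetical; how the framework derives the belief measure from a training set is not covered here.

```python
def lower_upper_expectation(mass, phi):
    """Lower and upper expectations of phi under a belief measure.

    mass: dict mapping frozenset of outcomes -> mass (masses sum to 1)
    phi:  dict mapping outcome -> feature value
    """
    # Each focal set contributes its mass times the worst (best) value
    # the feature can take inside that set.
    lower = sum(m * min(phi[x] for x in A) for A, m in mass.items())
    upper = sum(m * max(phi[x] for x in A) for A, m in mass.items())
    return lower, upper


# Illustrative (made-up) feature values and mass assignment.
phi = {"a": 0.0, "b": 1.0, "c": 2.0}
mass = {
    frozenset({"a"}): 0.5,
    frozenset({"b", "c"}): 0.3,
    frozenset({"a", "b", "c"}): 0.2,
}

lower, upper = lower_upper_expectation(mass, phi)
print(lower, upper)
```

When all the mass sits on singletons, the belief measure is an ordinary probability distribution and the two bounds coincide with the usual expectation, which recovers the equality constraint of the classical maximum-entropy setting.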

Past Projects
Metric learning
Tensor classification
Vehicle classification from inductive loop signature

We devised a general framework for learning distance functions for generative dynamical models, given a training set of labelled videos. The optimal distance function is selected from a family of pullback distances, each induced by a parameterised automorphism of the space of models. We focus on hidden Markov models and the manifold they form, and design a suitable automorphism on it. Experimental results show that pullback learning greatly improves action recognition performance with respect to the base distances.
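In symbols, the pullback construction can be sketched as follows. The notation is our own assumption for this summary: $d$ is a base distance on the space of models, $F_{\lambda}$ is the automorphism with parameter $\lambda \in \Lambda$, $\mathcal{D}$ is the labelled training set, and $J$ stands for whatever objective is optimised on it (for example, classification performance).

```latex
d_{\lambda}(m_1, m_2) = d\big(F_{\lambda}(m_1),\, F_{\lambda}(m_2)\big),
\qquad
\lambda^{*} = \arg\max_{\lambda \in \Lambda} \; J\big(d_{\lambda};\, \mathcal{D}\big)
```

Because $F_{\lambda}$ is a bijection, $d_{\lambda}$ inherits symmetry, the triangle inequality and $d_{\lambda}(m_1, m_2) = 0 \iff m_1 = m_2$ from $d$, so every member of the family is itself a distance.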

In most real-world problems, observations are influenced by a number of nuisance factors. To tackle their influence, it is natural to resort to multilinear or "tensorial" decompositions. We show how the higher-order singular value decomposition (HOSVD) can be exploited to formulate a natural generalisation of Tenenbaum's bilinear classifiers, which we call 'multilinear classifiers', able to classify observations that depend on one content label and several style labels.
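For readers unfamiliar with HOSVD, here is a minimal NumPy sketch of the decomposition itself, not of the multilinear classifier built on top of it. It computes one orthonormal factor matrix per tensor mode from the SVD of the mode unfolding, then a core tensor. The helper names `unfold`, `mode_product` and `hosvd` are hypothetical, and the input is random data standing in for a tensor whose axes would index, for instance, content, style and feature dimensions.

```python
import numpy as np


def unfold(T, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)


def mode_product(T, M, mode):
    """Multiply tensor T by matrix M along axis `mode`."""
    out = np.tensordot(M, T, axes=(1, mode))  # contracted axis comes out first
    return np.moveaxis(out, 0, mode)


def hosvd(T):
    """Return (core, factors) such that T = core x_0 U_0 x_1 U_1 ... ."""
    # Left singular vectors of each unfolding give that mode's factor matrix.
    factors = [
        np.linalg.svd(unfold(T, n), full_matrices=False)[0]
        for n in range(T.ndim)
    ]
    # Project the tensor onto the factor bases to obtain the core.
    core = T
    for n, U in enumerate(factors):
        core = mode_product(core, U.T, n)
    return core, factors


# Random stand-in data (assumption): 4 x 3 x 5 tensor.
rng = np.random.default_rng(0)
T = rng.standard_normal((4, 3, 5))
core, factors = hosvd(T)

# Rebuild the tensor from the core and the factors.
recon = core
for n, U in enumerate(factors):
    recon = mode_product(recon, U, n)
print(np.allclose(recon, T))
```

Each factor matrix captures the variation along one mode, which is what lets content and style factors be separated before classification.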

Inductive loops are sensors that are widely deployed on road networks for traffic data collection. Our aim is to classify vehicles from inductive loop signals according to a 10-category scheme such as SWISS10. We looked at two machine learning algorithms: support vector machines and adaptive boosting (AdaBoost) with decision stumps. We used the two most common strategies for multiclass classification, one-versus-one and one-versus-rest, and addressed class imbalance with under-sampling, over-sampling and SMOTE (Synthetic Minority Over-sampling Technique).
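Of the three rebalancing options, SMOTE is the least self-explanatory, so a small NumPy sketch of its core idea may help: a synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. This is an illustration written for this page, not the code used in the project. The function name `smote_oversample`, its parameters and the toy data are all assumptions.

```python
import numpy as np


def smote_oversample(X_min, n_new, k=5, rng=None):
    """Create n_new synthetic minority samples by SMOTE-style interpolation."""
    rng = np.random.default_rng(rng)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)

    # Pairwise squared distances within the minority class.
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(d2, axis=1)[:, :k]  # shape (n, k)

    # Pick a base sample, one of its k neighbours, and a random gap in [0, 1).
    base = rng.integers(0, n, size=n_new)
    nb = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[nb] - X_min[base])


# Toy minority class (assumption): four points at the corners of a unit square.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_oversample(X_min, n_new=20, k=2, rng=0)
print(synthetic.shape)
```

The synthetic samples would be appended to the training set of the under-represented vehicle classes before fitting the classifiers; the test set is left untouched.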