In federated learning, the training data remains distributed across the mobile devices, and the model is learned
by aggregating locally computed updates. In this way, different hospital sites may achieve better diagnostic performance than they would by training machine learning
models only on their own proprietary data. The parameters (but not the data) of the individual models trained by each partner on
its own data are shared to obtain a more robust "global" model, which is then shared among the partners.
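
The aggregation step described above can be illustrated with a minimal sketch. It assumes that each partner's model is a flat list of parameters and that the global model is an average weighted by local dataset size (a FedAvg-style choice that the text does not specify). The function name `federated_average` and the example values are hypothetical.

```python
from typing import List, Sequence


def federated_average(
    client_params: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average per-client parameter vectors, weighted by local dataset size."""
    if len(client_params) != len(client_sizes) or not client_params:
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_params[0])
    global_params = [0.0] * dim
    for params, size in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all clients must share the same parameter shape")
        weight = size / total  # assumption: weight by number of local samples
        for i, value in enumerate(params):
            global_params[i] += weight * value
    return global_params


# Three hypothetical hospital sites; only parameters leave each site, never data.
site_params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
site_sizes = [100, 100, 200]
global_model = federated_average(site_params, site_sizes)
```

In a full training loop this step would be repeated: the global parameters are sent back to the partners, refined locally, and averaged again.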


While interesting work has recently been directed at continual learning from streaming data in a fully supervised setting, with a particular focus on avoiding catastrophic
forgetting, continual learning in a semi-supervised setting remains a wide-open research question.
In our approach, the problem is reduced to a continual supervised learning setting under a 'multiple worlds' assumption,
in which we first seek, in an incremental fashion, the most likely labelling(s) of the current data stream.


Self-supervised learning allows us to exploit the variety of labels that usually come with the data for free. Technically, the idea is to define an auxiliary task for which
we already have labels, 'hidden' within the structure of the data itself.
This is done by defining a self-supervised task, also known as a pretext task, which leads to a supervised loss function.
However, no theoretical foundations for self-supervised learning yet exist; the purpose of our work is to provide a theoretical justification for self-supervised learning
through a combination of functional analysis and optimisation theory.
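
As an illustration of labels 'hidden' in the data, the sketch below builds a rotation-prediction pretext dataset: each unlabelled image is rotated by 0, 90, 180 and 270 degrees, and the rotation index serves as a free label. Rotation prediction is only one well-known pretext task, chosen here as an assumption; the helper names and the toy image are hypothetical.

```python
def rotate90(grid):
    """Rotate a 2D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def rotation_pretext_dataset(images):
    """Turn unlabelled images into (rotated_image, rotation_index) pairs."""
    examples = []
    for image in images:
        rotated = [list(row) for row in image]  # copy, so the input is untouched
        for k in range(4):
            # The label k is known for free: it is the number of quarter turns applied.
            examples.append((rotated, k))
            rotated = rotate90(rotated)
    return examples


# One toy 2x2 "image" with no human-provided label.
unlabelled = [[[1, 2], [3, 4]]]
examples = rotation_pretext_dataset(unlabelled)
```

Any classifier trained to predict the rotation index from the rotated image is then optimising an ordinary supervised loss, although no manual annotation was used.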


Our goal is a blue-sky rethinking of machine learning, laying the foundations for an entirely new, inherently robust
theory of learning. Statistical learning theory is generalised to allow for test and training data to come from distinct
probability distributions. We move away from the selection of single models to that of convex sets of models, and
employ the resulting theory to lay solid theoretical foundations for deep learning. 

We devised a general framework for learning
distance functions for generative dynamical models, given a training set of labelled
videos. The optimal distance function is selected among a family of pullback ones,
induced by a parameterised automorphism of the space of models. We focus here on
hidden Markov models and their manifold, and design an appropriate automorphism
there. Experimental results show that pullback learning greatly
improves action recognition performance with respect to the base distances.
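
The idea of a pullback distance can be sketched on a toy problem. The sketch makes several simplifying assumptions that are not in the text: models are represented as plain vectors rather than hidden Markov models, the parameterised automorphism is a diagonal scaling with non-zero entries, the base distance is Euclidean, and the "optimal" parameter is the one that maximises leave-one-out nearest-neighbour accuracy over a small candidate set. All names and values are hypothetical.

```python
import math


def pullback_distance(x, y, scales):
    """Base (Euclidean) distance between F(x) and F(y), where F(v) = scales * v."""
    return math.sqrt(sum((s * (a - b)) ** 2 for s, a, b in zip(scales, x, y)))


def leave_one_out_accuracy(points, labels, scales):
    """1-nearest-neighbour accuracy under the pullback distance, leaving each point out."""
    correct = 0
    for i, p in enumerate(points):
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: pullback_distance(p, points[j], scales),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(points)


def select_scales(points, labels, candidates):
    """Pick the automorphism parameter whose pullback distance classifies best."""
    return max(candidates, key=lambda s: leave_one_out_accuracy(points, labels, s))


# Toy labelled "models": the first coordinate carries the class, the second is a nuisance.
points = [(0.0, 0.0), (0.1, 10.0), (1.0, 0.2), (1.1, 10.1)]
labels = [0, 0, 1, 1]
candidates = [(1.0, 1.0), (1.0, 0.01), (0.01, 1.0)]

baseline = leave_one_out_accuracy(points, labels, (1.0, 1.0))
best = select_scales(points, labels, candidates)
improved = leave_one_out_accuracy(points, labels, best)
```

On this toy data the base distance is dominated by the nuisance coordinate, whereas the selected pullback distance down-weights it.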


In most real-world problems, however, observations are influenced by a number of nuisance factors. To tackle their influence,
it is natural to resort to multilinear or "tensorial" decompositions.
We show how the higher-order singular value decomposition (HOSVD) can be exploited to formulate a natural generalization of Tenenbaum's bilinear classifiers, which we
call 'multilinear classifiers', able to classify observations depending on one content label and several style labels.
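
For readers unfamiliar with HOSVD, the following NumPy sketch computes the decomposition of a small tensor: one orthogonal factor matrix per mode, obtained from the SVD of each mode unfolding, plus a core tensor. It covers only the decomposition step, not the multilinear classifier itself. The interpretation of the three axes as content, style and feature, and all function names, are assumptions made for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def hosvd(tensor):
    """Return the core tensor and one factor matrix (left singular vectors) per mode."""
    factors = []
    for mode in range(tensor.ndim):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)  # project each mode onto its basis
    return core, factors


def reconstruct(core, factors):
    """Rebuild the tensor from its core and factor matrices."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out


# Hypothetical observation tensor: 4 content classes x 3 styles x 5 features.
rng = np.random.default_rng(0)
data = rng.standard_normal((4, 3, 5))
core, factors = hosvd(data)
approx = reconstruct(core, factors)
```

Without truncation the reconstruction matches the original tensor up to floating-point error; keeping only the leading columns of each factor matrix would give a low-rank approximation.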


Inductive loops are sensors that are widely deployed on road networks for the purpose
of traffic data collection. Our aim is to classify vehicles into a 10-category scheme, such as SWISS10, from
inductive loop signals.
We looked at two machine-learning algorithms: Support Vector Machines and Adaptive
Boosting with decision stumps. We used the two most common strategies for multiclass
classification, One-versus-One and One-versus-Rest, and we looked at addressing
class imbalance with undersampling, oversampling and SMOTE.
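
Of the three rebalancing techniques, SMOTE is the least self-explanatory, so a minimal pure-Python sketch is given here. It creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The feature vectors, the class names "car" and "truck", and the parameter defaults are hypothetical; this is not the configuration used in the study.

```python
import random
from collections import Counter


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (squared Euclidean distance).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance_with_smote(features, labels, k=2, seed=0):
    """Oversample every class up to the size of the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    new_x, new_y = list(features), list(labels)
    for cls, count in counts.items():
        if count < target:
            minority = [x for x, y in zip(features, labels) if y == cls]
            new_x.extend(smote(minority, target - count, k=k, seed=seed))
            new_y.extend([cls] * (target - count))
    return new_x, new_y


# Hypothetical two-feature loop signatures with an under-represented class.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [5.0, 5.0], [5.2, 5.1]]
y = ["car", "car", "car", "car", "truck", "truck"]
balanced_X, balanced_y = balance_with_smote(X, y)
```

Rebalancing of this kind is normally applied to the training split only, so that the test data keep the original class distribution.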

