Research Project: Autonomous Driving

In this new project we consider the problem of vision-based autonomous driving, i.e., that of enabling cars to drive themselves based on streaming video captured by cameras mounted on them. In such a setting, which closely mimics how human drivers ‘work’, the car needs to reconstruct and understand the surrounding environment from the incoming video sequence(s). A crucial task in video understanding is to recognise and localise (in space and time) the different actions or events appearing in the video: for instance, the vehicle needs to perceive the behaviour of pedestrians by identifying which kind of activity (e.g., ‘moving’ versus ‘stopping’) they are performing, and when and where this is happening.

In the computer vision literature this problem is termed spatio-temporal action localisation or, in short, action detection.
Unlike current human action detection datasets such as J-HMDB, UCF-101, LIRIS-HARL, DALY or AVA, the Road Event and Activity Detection (READ) dataset we introduce here is specifically designed from the perspective of self-driving cars, and includes spatio-temporal actions performed not just by humans but by all road users, including cyclists, motorcyclists, drivers of vehicles large and small, and, of course, pedestrians.
We strongly believe that an awareness of all the actions and events taking place, and of their location within the road scene, is essential for inherently safe self-driving cars.
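
To make concrete what localising an action in space and time means in practice, the short Python sketch below shows one possible way to represent a single detection as an ‘action tube’, i.e. a temporal sequence of per-frame bounding boxes sharing one action label and confidence score. The class, field names and label strings are purely illustrative assumptions and do not reflect the actual READ annotation format.

  from dataclasses import dataclass
  from typing import List, Tuple

  # Illustrative sketch only: names and fields are hypothetical,
  # not the actual READ annotation schema.

  @dataclass
  class ActionTube:
      """One detected action instance, localised in space and time."""
      label: str        # action class, e.g. 'pedestrian-moving' vs 'pedestrian-stopping'
      score: float      # detector confidence in [0, 1]
      start_frame: int  # first video frame in which the action appears
      boxes: List[Tuple[float, float, float, float]]  # per-frame (x1, y1, x2, y2) boxes

      @property
      def end_frame(self) -> int:
          # Boxes cover consecutive frames starting at start_frame.
          return self.start_frame + len(self.boxes) - 1

  # Example: a pedestrian detected as 'moving' over three consecutive frames.
  tube = ActionTube(
      label='pedestrian-moving',
      score=0.87,
      start_frame=120,
      boxes=[(310.0, 200.0, 355.0, 330.0),
             (318.0, 201.0, 362.0, 331.0),
             (327.0, 202.0, 370.0, 332.0)],
  )
  print(f"{tube.label} ({tube.score:.2f}): frames {tube.start_frame}-{tube.end_frame}")
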
Relevant papers:
  • Valentina Fontana, Gurkirt Singh, Stephen Akrigg, Manuele Di Maio, Suman Saha, Fabio Cuzzolin
    Action detection from a robot-car perspective
    arXiv:1807.11332, 30 July 2018
Lab Members: Valentina Fontana, Stephen Akrigg, Manuele Di Maio, Gurkirt Singh