Grants and Funding

*NEW* SARAS - Smart Autonomous Robotic Assistant Surgeon

January 2018 - December 2020
Coordinator: Dr Riccardo Muradore, University of Verona, Italy
Budget: €4,315,640 (Oxford Brookes' share: €596,073)
Own role: Scientific Officer (SO) for the whole project, as well as WP Leader

In surgical operations, many people crowd the area around the operating table, and the introduction of robotics in surgery has not decreased this number. During a laparoscopic intervention with the da Vinci robot, for example, the presence of an assistant surgeon, two nurses and an anaesthetist is required, together with that of the main surgeon teleoperating the robot. The assistant surgeon must always be present to take care of simple surgical tasks that the main surgeon cannot perform with the robotic tools s/he is teleoperating (e.g. suction and aspiration during dissection, moving or holding organs in place to make room for cutting or suturing, or using standard laparoscopic tools). Another expert surgeon is thus required to play the role of the assistant and properly support the main surgeon using traditional laparoscopic tools, as shown in Figure 1.

The goal of SARAS is to develop a next-generation surgical robotic platform that allows a single surgeon (i.e., without the need for an expert assistant surgeon) to execute robotic minimally invasive surgery (R-MIS), thereby increasing the social and economic efficiency of a hospital while guaranteeing the same level of safety for patients. This platform is called the 'solo-surgeon' system.

Material and resources:

Meta Vision Knowledge Transfer Partnership

September 2015 - August 2017
Budget: £160,000
Own role: Academic supervisor

In the welding industry there is an increasing need for automated inspection, both in conjunction with automated seam tracking and as a completely separate function.
The aim of this project is to develop computer vision algorithms capable of analysing 3D scans of robotic welds: extracting the underlying geometry, identifying a range of standard defects, and classifying each weld as acceptable or not according to geometrical definitions.

Three key stages of the project can be identified:
  • Performing automatic analysis of 3D data requires an in-depth understanding and application of the underlying mathematics. This knowledge will be used to define the basis on which the algorithms operate.

  • The second step will be to turn this mathematical development into a set of algorithms that match the 3D datasets of the actual parts to be inspected against either theoretical models of good and bad welds or stored, processed 3D models of good and bad welds, thereby determining the overall quality of the weld in question and identifying any particular defects.

  • To support the first two items above, it may be necessary to build a database that holds key geometric information about good and bad shapes and makes it available to the inspection algorithms themselves. The database will also store basic 3D representations of complete parts.
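The matching-and-classification step above can be sketched as follows. This is a minimal illustration only: the function names, defect labels, tolerance value and data are all hypothetical, not part of the project's actual algorithms.

```python
import numpy as np

def classify_weld(profile, nominal, tol=0.5):
    """Compare a scanned weld cross-section (heights in mm) against a nominal
    model and flag simple geometric defects. Illustrative sketch only: the
    defect names and the +/-tol tolerance band are assumptions."""
    deviation = profile - nominal
    defects = []
    if deviation.max() > tol:
        defects.append("excess reinforcement")   # bead rises too far above nominal
    if deviation.min() < -tol:
        defects.append("undercut")               # material dips below nominal surface
    return ("acceptable" if not defects else "defective"), defects

# Hypothetical data: nominal flat seam, scan with a slight crown
nominal = np.zeros(100)
good = nominal + 0.1 * np.exp(-np.linspace(-3, 3, 100) ** 2)
bad = good.copy()
bad[40:45] -= 1.2  # simulated undercut notch

print(classify_weld(good, nominal))  # ('acceptable', [])
print(classify_weld(bad, nominal))   # ('defective', ['undercut'])
```

In practice the nominal model would come from the theoretical or stored 3D weld models described above, and the geometric definitions of acceptability would replace the single tolerance band used here.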

News and project website:

Online action recognition for human-robot interaction

Oxford Brookes University: 150th Anniversary Scholarship

September 2015 - March 2019
Budget: 1 PhD studentship
Own role: Director of studies
Personnel: Gurkirt Singh

Action recognition is a fast-growing area of research in computer vision. The problem consists in detecting and recognising, given a video captured by one or more cameras, the category of the action performed by the person(s) appearing in the video. The problem is very challenging for a number of reasons: labelling videos is ambiguous, as the same sequence can be assigned different verbal descriptions by different human observers; different motions can carry the same meaning (inherent variability); and nuisance factors such as viewpoint, illumination variations and occlusion (parts of the moving person can be hidden behind objects or other people) further complicate recognition. In addition, traditional action recognition benchmarks are based on a ‘batch’ philosophy: it is assumed that a single action is present within each video, and videos are processed as a whole, typically via algorithms that take entire days to complete. This may be acceptable for tasks such as video browsing and retrieval over the internet (although speed is a serious issue there too), but it is completely unacceptable for the many real-world applications that require a prompt, real-time interpretation of what is going on.

Consequently, a new paradigm of ‘online’, ‘real-time’ action recognition is rapidly emerging, and is likely to shape the field in the coming years. The AI and Vision group is already building on its multi-year experience in batch action recognition to expand towards online recognition, based on two distinct approaches: one applies novel ‘deep learning’ neural networks to automatically segmented video regions; the other rests on continually updating an approximation of the space of ‘feature’ measurements extracted from images, via a set of balls whose radii depend on how difficult classification is within that region of the space. Investing in online action recognition is crucial to maintain and further improve Brookes’ reputation in human action classification and to face the fierce international competition on the topic.
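As a rough illustration of the ‘online’ constraint (only past and present frames may be used at each time step), here is a minimal causal classifier sketch. The moving-average smoothing over per-frame class scores is an assumption made for illustration, not the group's actual method.

```python
from collections import deque
import numpy as np

def online_action_scores(frame_scores, window=5):
    """Causal, online action labelling: the prediction at time t uses only
    frames <= t, via a moving average over the last `window` frames.
    Illustrative sketch; the smoothing scheme is an assumption."""
    buf = deque(maxlen=window)
    preds = []
    for s in frame_scores:            # s: score vector over action classes
        buf.append(s)                 # only present and past frames enter the buffer
        avg = np.mean(buf, axis=0)
        preds.append(int(np.argmax(avg)))
    return preds

# Hypothetical stream over 3 classes: 3 frames of class 0, then 6 of class 1
stream = [[0.9, 0.1, 0.0]] * 3 + [[0.1, 0.8, 0.1]] * 6
print(online_action_scores(stream, window=5))
# [0, 0, 0, 0, 0, 1, 1, 1, 1] -- the label switches with a short causal lag
```

Note the short lag after the true action changes: a batch algorithm with access to the whole video would not incur it, which is precisely the trade-off online methods must manage.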

Papers and Posters:

Code and resources:

Uncertainty in Computer Vision

Faculty of Technology, Design and Environment: Next 10 Award

September 2014 - February 2018
Budget: 1 PhD studentship
Own role: Director of studies
Personnel: Suman Saha

In recent years, "online action detection" has attracted much attention in the computer vision community due to its far-reaching real-world applications, such as human-robot interaction, autonomous surveillance, computer gaming and virtual environments, automated vehicle driving, and biometric gait identification. Here, the goal is to detect multiple human actions in streaming video. In offline action detection the system has full access to the whole video, whereas in the online case the detection model has access only to the present and previous frames, which makes the detection task more challenging. Current state-of-the-art detection systems demonstrate promising results for offline applications such as video indexing. However, the computer vision community is still striving to build a robust online action recognition system that can perform in real time.
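To make the online constraint concrete, the sketch below incrementally links per-frame detections into action 'tubes' by greedy IoU matching, using only present and past frames. This is a simplified illustration of online tube building in general, not the exact algorithm developed in the project; all names and the 0.3 threshold are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def link_online(tubes, frame_boxes, thr=0.3):
    """Extend existing action tubes with the current frame's detections,
    greedily matching each tube's last box by IoU; unmatched detections
    start new tubes. Simplified illustrative sketch."""
    unmatched = list(frame_boxes)
    for tube in tubes:
        best = max(unmatched, key=lambda b: iou(tube[-1], b), default=None)
        if best is not None and iou(tube[-1], best) >= thr:
            tube.append(best)        # continue this tube
            unmatched.remove(best)
    for b in unmatched:
        tubes.append([b])            # start a new tube
    return tubes

# Hypothetical two-frame stream: one person persists, a second appears
tubes = []
link_online(tubes, [(0, 0, 10, 10)])
link_online(tubes, [(1, 1, 11, 11), (50, 50, 60, 60)])
print(len(tubes), len(tubes[0]))  # 2 1-frame-and-2-frame tubes: prints "2 2"
```

Because each frame is processed as it arrives, such a detector can emit (and revise) tube hypotheses in real time, whereas an offline linker would optimise over the whole video at once.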

Papers and Posters:
Code and resources:

Recognising and Localising Human Actions

Oxford Brookes University: Doctoral School on "Intelligent Transport Systems" (ITS)

October 2011 - October 2014
Budget: 1 PhD studentship
Own role: Director of studies
Personnel: Michael Sapienza

Human action recognition in challenging video data is becoming an increasingly important research area, given the huge amounts of user-generated content uploaded to the Internet each day. The detection of human actions will facilitate automatic video description and organisation, as well as online search and retrieval. Furthermore, for an Intelligent Transport Systems (ITS) autonomous vehicle to drive safely in urban environments, it must learn to recognise and quickly react to human actions.
Giving machines the capability to recognise human actions from videos poses considerable challenges. The captured videos are often of low quality, and contain unconstrained camera motion, zoom, and shake. In addition, human actions, interpreted as space-time sequences, exhibit a highly flexible structure, with variations in viewpoint, pose, scale, and illumination. Apart from these nuisance factors, actions inherently possess a high degree of within-class variability: for example, a walking motion may vary in stride, pace and style, yet remain the same action. Creating action models that can cope with this variability, while being able to discriminate between a significant number of action classes, is a serious challenge. In the first 15 months of this research project, entitled "Recognising and localising human actions", we have taken significant steps towards tackling this challenge.

Papers and Posters:
Code and resources:

Tensorial modeling of dynamical systems for gait and activity recognition

August 2011 - January 2014
Budget: £122,000
Own role: Principal Investigator (PI)
Personnel: Dr Wenjuan Gong

Case for Support

Biometrics such as face, iris, or fingerprint recognition for surveillance and security have received growing attention in the last decade. They suffer, however, from two major limitations: they cannot be used at a distance, and they require user cooperation. For these reasons, originally driven by an initiative of the US’s DARPA, identity recognition from gait has been proposed as a novel behavioural biometric, based on people’s distinctive gait patterns.
Despite its attractive features, though, gait identification is still far from ready to be deployed in practice, as in real-world scenarios recognition is made extremely difficult by nuisance factors such as viewpoint, illumination and clothing. Similar issues are shared by other applications such as action and activity recognition.
This proposal concerns the problem of classifying video sequences by attributing to each sequence a label, such as the type of event recorded or the identity of the person performing a certain action. It proposes a novel framework for motion recognition capable of dealing in a principled way with nuisance factors in both gait and activity recognition. The goal is to push towards a more widespread adoption of gait identification, as a concrete contribution to enhancing security levels in the country in the current, uncertain climate. However, as the techniques devised in this proposal extend to action and identity recognition, their commercial exploitation potential in, for instance, video indexing or interactive video games is also enormous.

Code and resources: