Research groups

Scene understanding & Artificial Intelligence (SUnAI)

In recent years the amount of information available online has grown exponentially. As a result, one of the biggest problems we face is semantic search. Textual data can now be searched quickly and easily, but the same cannot be said for audiovisual data.


The SUnAI group develops computer vision and artificial intelligence algorithms to extract information from images and videos. In particular, our work focuses on:

  • Algorithms for automatic object recognition in natural images, for subsequent use in natural environments.
  • Algorithms for recognition of gestures and non-verbal language using images and videos of people, to build user-friendly interfaces for human-computer interaction and to analyse social interactions between people.

Main projects

  • Visual recognition using end-to-end learning methodology: theory and applications. Human information processing mechanisms suggest the need for deep architectures to extract complex structures and build internal representations of rich sensory inputs. With increases in storage and computation capacity (for example, through the use of GPUs), end-to-end learning systems and big data have attracted the attention of the computer vision community. Deep convolutional neural networks trained on very large datasets of millions of images, for instance ImageNet or Places, have shown remarkable improvements in several visual recognition tasks. The success of these approaches is mainly attributed to the simultaneous learning of the feature representation and the classification rule. End-to-end learning algorithms model the problem from pixels to outputs, and the descriptors are adjusted to the decision rule at learning time. This project explores the development of novel deep learning algorithms using both tagged and untagged data, and their application to several computer vision tasks (scene understanding, emotion recognition and ADHD, egocentric vision and medical image analysis).
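The idea that the feature representation and the classification rule are learned simultaneously can be illustrated with a toy sketch. This is not the group's method and not a convolutional network. It is a minimal pure-Python example under stated assumptions: the 4-"pixel" inputs, the labelling rule, the layer sizes and all names (`forward`, `mean_loss`, `train`) are hypothetical and chosen only for illustration. A single loss drives gradient updates to both the feature layer `W` and the classifier `v`, so the features are adjusted to the decision rule at learning time.

```python
import math
import random


def forward(x, W, v, b):
    """Pixels -> learned features (tanh layer) -> class probability (sigmoid)."""
    h = [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]
    z = sum(vi * hi for vi, hi in zip(v, h)) + b
    return h, 1.0 / (1.0 + math.exp(-z))


def mean_loss(data, W, v, b):
    """Average binary cross-entropy over the dataset."""
    total = 0.0
    for x, y in data:
        _, p = forward(x, W, v, b)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(data)


def train(data, W, v, b, lr=0.1, epochs=200):
    """Full-batch gradient descent on the feature layer and classifier together."""
    n = len(data)
    for _ in range(epochs):
        gW = [[0.0] * len(row) for row in W]
        gv = [0.0] * len(v)
        gb = 0.0
        for x, y in data:
            h, p = forward(x, W, v, b)
            dz = p - y  # d(loss)/d(z) for sigmoid + cross-entropy
            gb += dz
            for i in range(len(v)):
                gv[i] += dz * h[i]
                dh = dz * v[i] * (1.0 - h[i] ** 2)  # backpropagate through tanh
                for j in range(len(x)):
                    gW[i][j] += dh * x[j]
        # One update step changes the features and the decision rule jointly.
        W = [[w - lr * g / n for w, g in zip(row, grow)] for row, grow in zip(W, gW)]
        v = [vi - lr * g / n for vi, g in zip(v, gv)]
        b -= lr * gb / n
    return W, v, b


# Hypothetical toy data: label is 1 when the two left "pixels" are brighter.
rng = random.Random(0)
data = []
for _ in range(40):
    x = [rng.random() for _ in range(4)]
    y = 1.0 if x[0] + x[1] > x[2] + x[3] else 0.0
    data.append((x, y))

W0 = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(2)]
v0 = [rng.uniform(-0.5, 0.5) for _ in range(2)]

loss_before = mean_loss(data, W0, v0, 0.0)
W, v, b = train(data, W0, v0, 0.0)
loss_after = mean_loss(data, W, v, b)
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

In a pipeline with hand-crafted descriptors, `W` would be fixed in advance and only `v` and `b` would be trained. Here the gradient of the classification loss flows back into `W` as well, which is the sense in which the model is learned from pixels to outputs.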

Bioinformatics services offered

  • Computer vision applications

  • Machine learning applications to complex data

  • Object recognition and context

  • Image segmentation

  • Deep learning algorithms applied to large tagged and untagged datasets