• End of Registration
  • -
  • Classes Start
  • May 11, 2015
  • Classes End
  • Jun 12, 2015
  • Estimated Effort
  • 02:00 h/week
  • Language
  • English
"This course is available in 'Archived' mode: there is no facilitation by the teaching team (no forum and no graded exercises such as quizzes), and the course delivers neither a certificate of successful completion nor any other certificate. However, you can access the videos and textual resources without restriction. This delivery mode therefore allows you to learn with access to the main contents while waiting for a future 'facilitated session' to open."

About This Course

Robots have gradually moved from factory floors to populated areas. Therefore, there is a crucial need to endow robots with perceptual and interaction skills enabling them to communicate with people in the most natural way. With auditory signals distinctively characterizing physical environments and speech being the most effective means of communication among people, robots must be able to fully extract the rich auditory information from their environment.

This course will address fundamental issues in robot hearing; it will describe methodologies requiring two or more microphones embedded into a robot head, thus enabling sound-source localization, sound-source separation, and fusion of auditory and visual information.

The course will start by briefly describing the role of hearing in human-robot interaction, overviewing the human binaural system, and introducing the computational auditory scene analysis paradigm. Then, it will describe in detail sound propagation models, audio signal processing techniques, geometric models for source localization, and unsupervised and supervised machine learning techniques for characterizing binaural hearing, fusing acoustic and visual data, and designing practical algorithms. The course will be illustrated with numerous videos shot in the author’s laboratory.
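As a small taste of the kind of method the course covers, the sketch below illustrates the simplest geometric model for two-microphone (binaural) sound-source localization: estimate the time difference of arrival (TDOA) between the two channels by cross-correlation, then invert the far-field model tdoa = D·sin(azimuth)/C. This is an illustrative assumption-laden demo, not material from the course; the sampling rate, microphone spacing, and function names are all made up for the example.

```python
# Illustrative sketch (not from the course): direction of a sound source
# from the time difference of arrival (TDOA) between two microphones,
# estimated by plain cross-correlation. All parameters are assumptions.
import numpy as np

FS = 16000   # sampling rate (Hz)
C = 343.0    # speed of sound (m/s)
D = 0.2      # microphone spacing (m), e.g. the two ears of a robot head

def estimate_azimuth(left, right):
    """Estimate source azimuth (radians) from two microphone signals.

    Far-field model: tdoa = D * sin(azimuth) / C, where tdoa is the
    delay of the right channel relative to the left one.
    """
    corr = np.correlate(right, left, mode="full")
    lag = np.argmax(corr) - (len(left) - 1)   # delay of right vs. left, in samples
    tdoa = lag / FS                           # delay in seconds
    # Clip to the physically possible range before inverting the model.
    s = np.clip(tdoa * C / D, -1.0, 1.0)
    return np.arcsin(s)

# Simulate a source at 30 degrees: the right channel is a delayed copy.
rng = np.random.default_rng(0)
true_az = np.deg2rad(30.0)
delay = int(round(D * np.sin(true_az) / C * FS))  # TDOA in samples
sig = rng.standard_normal(FS)                     # 1 s of white noise
left = sig
right = np.roll(sig, delay)                       # right mic hears the source later

est = estimate_azimuth(left, right)
print(f"estimated azimuth: {np.rad2deg(est):.1f} deg")
```

Because the delay is quantized to whole samples, the recovered angle is close to, but not exactly, 30 degrees; the course goes far beyond this toy model, covering reverberant sound propagation, multiple sources, and learned mappings from binaural cues to source positions.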

Who can attend this course?

The course is intended for Master of Science students with a good background in signal processing and machine learning. The course is also valuable to PhD students, researchers, and practitioners who work in signal and image processing, machine learning, robotics, or human-machine interaction, and who wish to acquire competences in binaural hearing methodologies.

The course material will allow the attendants to design and develop robot and machine hearing algorithms.

Recommended Background

Introductory courses in digital signal processing, probability and statistics, computer science.

Course Syllabus

  • Week 1: Introduction to Robot Hearing
  • Week 2: Methodological Foundations
  • Week 3: Sound-Source Localization
  • Week 4: Machine Learning and Binaural Hearing
  • Week 5: Fusion of Audio and Vision

Course teacher

Radu Horaud

About the instructor Radu Horaud

Radu Patrice Horaud is a research director at INRIA Grenoble Rhône-Alpes. He is the founder and leader of the PERCEPTION team.

Radu’s research interests cover computational vision, audio signal processing, audio-visual scene analysis, machine learning, and robotics. He has authored over 160 scientific publications.

Radu has pioneered work in computer vision using range data (or depth images) and has developed a number of principles and methods at the cross-roads of computer vision and robotics. In 2006, he started to develop audio-visual fusion and recognition techniques in conjunction with human-robot interaction.

Radu Horaud was the scientific coordinator of the European Marie Curie network VISIONTRAIN (2005-2009), STREP projects POP (2006-2008) and HUMAVIPS (2010-2013), and the principal investigator of a collaborative project between INRIA and Samsung’s Advanced Institute of Technology (SAIT) on computer vision algorithms for 3D television (2010-2013). In 2013 he was awarded an ERC Advanced Grant for his five year project VHIA (2014-2019).

Radu Horaud's webpage.

Course Organisation

The course content is structured over 5 weeks; however, all content will be available from the opening of the MOOC. Each week consists of approximately 10 sessions, each containing a video of about 6 minutes and quizzes.

This session is an archived MOOC that is permanently open.


No attestation of achievement will be delivered for this course.

Conditions of Use

Of the course:

Creative Commons BY-NC-ND licence: the name of the author must always be mentioned, the material may not be used for commercial purposes, and no changes may be made to the original work.

Of the content produced by users:

Creative Commons BY-NC-ND licence: the name of the author must always be mentioned, the material may not be used for commercial purposes, and no changes may be made to the original work.



This course is provided by Inria through the IDEFI uTOP project (Open Multi-Partner University of Technology), contract PIA ANR-11-IDFI-0037 (http://utop.fr - http://utop.inria.fr/).

Photo credit: © Inria / Photo H. Raguet / modified by V. Peregrin