About This Course
Robots have gradually moved from factory floors to populated areas. Therefore, there is a crucial need to endow robots with perceptual and interaction skills enabling them to communicate with people in the most natural way. With auditory signals distinctively characterizing physical environments and speech being the most effective means of communication among people, robots must be able to fully extract the rich auditory information from their environment.
This course will address fundamental issues in robot hearing; it will describe methodologies requiring two or more microphones embedded into a robot head, thus enabling sound-source localization, sound-source separation, and fusion of auditory and visual information.
The course will start by briefly describing the role of hearing in human-robot interaction, overviewing the human binaural system, and introducing the computational auditory scene analysis paradigm. Then, it will describe in detail sound propagation models, audio signal processing techniques, geometric models for source localization, and unsupervised and supervised machine learning techniques for characterizing binaural hearing, fusing acoustic and visual data, and designing practical algorithms. The course will be illustrated with numerous videos shot in the author’s laboratory.
Who can attend this course?
The course is intended for Master of Science students with a good background in signal processing and machine learning. It is also valuable to PhD students, researchers, and practitioners who work in signal and image processing, machine learning, robotics, or human-machine interaction, and who wish to acquire competence in binaural hearing methodologies.
The course material will allow attendees to design and develop robot and machine hearing algorithms.
Prerequisites: introductory courses in digital signal processing, probability and statistics, and computer science.
- Week 1: Introduction to Robot Hearing
- Week 2: Methodological Foundations
- Week 3: Sound-Source Localization
- Week 4: Machine Learning and Binaural Hearing
- Week 5: Fusion of Audio and Vision
About the Instructor: Radu Horaud
Radu Patrice Horaud is a research director at INRIA Grenoble Rhône-Alpes. He is the founder and leader of the PERCEPTION team.
Radu’s research interests cover computational vision, audio signal processing, audio-visual scene analysis, machine learning, and robotics. He has authored over 160 scientific publications.
Radu has pioneered work in computer vision using range data (or depth images) and has developed a number of principles and methods at the crossroads of computer vision and robotics. In 2006, he started to develop audio-visual fusion and recognition techniques in conjunction with human-robot interaction.
Radu Horaud was the scientific coordinator of the European Marie Curie network VISIONTRAIN (2005-2009) and of the STREP projects POP (2006-2008) and HUMAVIPS (2010-2013), and the principal investigator of a collaborative project between INRIA and Samsung’s Advanced Institute of Technology (SAIT) on computer vision algorithms for 3D television (2010-2013). In 2013, he was awarded an ERC Advanced Grant for his five-year project VHIA (2014-2019).
Radu Horaud's webpage.
The course content is organized into 5 weeks; however, all of it will be available from the opening of the MOOC. Each week consists of approximately 10 sessions, each containing a video of about 6 minutes and quizzes.
This session is an archived MOOC that remains permanently open.
No certificate of achievement will be issued for this course.
Conditions of Use
Of the course:
Creative Commons BY-NC-ND licence: the name of the author must always be mentioned. The user may use the work except in a commercial context and may not make changes to the original work.
Of the content produced by users:
Creative Commons BY-NC-ND licence: the name of the author must always be mentioned. The user may use the work except in a commercial context and may not make changes to the original work.
This course is provided by Inria through the project IDEFI uTOP (Open MultiPartner University of Technology) - contract PIA ANR-11-IDFI-0037 (http://utop.fr - http://utop.inria.fr/)
Photo credit: © Inria / Photo H. Raguet / modified by V. Peregrin