Machine Learning with Feature Extractions for Regression Estimation of Binaural Sound Source Localization

Creator: 

Massicotte, Philippe Yvan

Date: 

2022

Abstract: 

Binaural sound source localization is the determination of the position of a sound source from two sensors (microphones) that mimic the human auditory system. Many audio processing systems in daily work and life rely on sound source localization, such as speech enhancement, speech recognition, and human-robot interaction. However, accurate sound source localization under adverse acoustic conditions remains difficult to ensure. This thesis proposes machine learning with feature extraction to estimate the location of a sound source by manipulating and analyzing data from public head-related transfer function (HRTF) databases. The two proposed methods are a wavelet scattering long short-term memory (LSTM) network and a wavelet scattering convolutional neural network (CNN). Both methods are studied under classification and regression approaches in different scenarios. The results demonstrate that the proposed methods achieve excellent performance in multiple noisy environments compared to recent literature, especially for regression-based binaural sound source localization.

Subject: 

Engineering - Electronics and Electrical
Artificial Intelligence
Computer Science

Language: 

English

Publisher: 

Carleton University

Thesis Degree Name: 

Master of Applied Science (M.App.Sc.)

Thesis Degree Level: 

Master's

Thesis Degree Discipline: 

Engineering, Electrical and Computer

Parent Collection: 

Theses and Dissertations

Items in CURVE are protected by copyright, with all rights reserved, unless otherwise indicated. They are made available with permission from the author(s).