Neural Network Augmented Kalman Filter for Robust Acoustic Howling Suppression

Yixuan Zhang1, Hao Zhang2, Meng Yu2, Dong Yu2

1. The Ohio State University, Columbus, OH, USA

2. Tencent AI Lab, Bellevue, WA, USA

Abstract: Acoustic howling suppression (AHS) is a critical challenge in audio communication systems. In this study, we propose a novel approach that uses neural networks (NN) to enhance the traditional Kalman filter for AHS. Specifically, we integrate NN modules into the Kalman filter to refine the reference signal, a key factor in effective adaptive filtering, and to estimate the filter's covariance matrices, which are crucial for adaptability in dynamic conditions. As a result, the proposed method achieves better AHS performance than both standalone NN and Kalman filter methods. Experimental evaluations validate the effectiveness of our approach.
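To make the roles of the reference signal and the covariances concrete, the following is a minimal sketch of a scalar Kalman update for a single frequency bin. It is an illustration under stated assumptions, not the implementation from the paper: the function names, the transition factor `a`, and the constant values passed for `psi_vv` (observation-noise variance) and `psi_dd` (process-noise variance) are all hypothetical. In the approach described above, NN modules would supply these quantities and a refined reference; here fixed constants and the raw reference stand in for them.

```python
import random


def kalman_step(w, p, r, y, psi_vv, psi_dd, a=0.999):
    """One scalar Kalman update for a single frequency bin (illustrative only).

    w      : current estimate of the feedback-path coefficient (complex)
    p      : current state-error variance (real, positive)
    r      : reference sample (complex); an NN could refine this
    y      : microphone sample (complex)
    psi_vv : observation-noise variance; an NN could estimate this
    psi_dd : process-noise variance; an NN could estimate this
    a      : assumed state-transition factor, close to 1
    """
    # Prediction: the feedback path is modeled as slowly varying.
    w_pred = a * w
    p_pred = a * a * p + psi_dd
    # Innovation: microphone signal minus the estimated feedback.
    e = y - r * w_pred
    # Kalman gain for the scalar observation y = r * w + v.
    k = p_pred * r.conjugate() / (abs(r) ** 2 * p_pred + psi_vv)
    # Correction of the state estimate and its error variance.
    w_new = w_pred + k * e
    p_new = (1 - k * r).real * p_pred
    return w_new, p_new, e


def run_demo(true_w=0.5 + 0.2j, steps=500, seed=0):
    """Track a fixed, hypothetical feedback coefficient from noisy data."""
    rng = random.Random(seed)
    w, p = 0j, 1.0
    for _ in range(steps):
        r = complex(rng.gauss(0, 1), rng.gauss(0, 1))
        v = complex(rng.gauss(0, 0.05), rng.gauss(0, 0.05))
        y = r * true_w + v
        # Fixed constants stand in for NN-estimated variances.
        w, p, _ = kalman_step(w, p, r, y, psi_vv=0.005, psi_dd=1e-4)
    return w
```

Calling `run_demo()` returns an estimate close to the assumed true coefficient. The innovation `e` is the part of the microphone signal left after the estimated feedback is removed, so a more accurate reference and better-matched variances translate directly into a cleaner output.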

This page provides sound demos for the titled paper. The titled paper can be accessed through [Link will be provided later.].



Fig. Spectrograms of (a) target near-end signal and outputs of (b) NeuralKalmanAHS, (c) NeuralKalmanAHS without \( \mathbf{\Psi}_{vv}(k) \), \( \mathbf{\Psi}_{\Delta\Delta}(k) \), (d) NeuralKalmanAHS without \( \mathbf{R}(k) \), (e) Kalman filter.

Although the output from the Kalman filter has been included, listening to the entire audio is not recommended.

Waveforms
G = 2
Target signal
NeuralKalmanAHS
NeuralKalmanAHS without \( \mathbf{\Psi}_{vv}(k) \), \( \mathbf{\Psi}_{\Delta\Delta}(k) \)
NeuralKalmanAHS without \( \mathbf{R}(k) \)
Kalman filter




Fig. Spectrograms of (a) target signal, (b) no AHS, (c) Kalman filter, (d) Deep MFC, (e) Hybrid AHS, (f) Neural-KG, and (g) the proposed NeuralKalmanAHS.

Because it could harm listeners' hearing, we have omitted the 'no AHS' audio. Although we have included the outputs from the Kalman filter, listening to the entire audio is not recommended.

Waveforms
Moderate Howling (G = 1.5) Severe Howling (G = 3)
Target signal
no AHS
Kalman filter
Deep MFC
Hybrid AHS
Neural-KG
NeuralKalmanAHS