Biography

Enrollment Date: 2013

Graduation Date: 2016

Degree: M.S.

Defense Date: 2016.05.25

Advisors: Dongmei Li

Department: Graduate School at Shenzhen, Tsinghua University

Title of Dissertation/Thesis: Research and Implementation of a Realtime Analysis/Synthesis Gammatone Filterbank

Abstract:
In everyday life we are surrounded by many types of noise. Machines used for sound enhancement and recognition perform poorly in complex acoustic environments, which limits their application. The human ear, by contrast, still works robustly in hostile environments, with strong noise immunity and sensitivity. Extensive research has therefore focused on integrating characteristics of the human ear into sound signal processing. The Gammatone filter (GTF) is widely used to simulate the auditory system, and more specifically the basilar membrane. Compared with traditional signal analysis models such as the fast Fourier transform (FFT), however, signal analysis based on the GTF has higher computational complexity and lacks a solid inverse transform. To address these problems, a hardware design of a realtime analysis/synthesis Gammatone filterbank (GTFB) is proposed. This work first investigates the characteristics of the human ear and its simulation models, especially auditory filters and their research status, and then focuses on analyzing the magnitude-frequency and phase-frequency characteristics of the GTF. A digital implementation of the GTF based on an infinite impulse response (IIR) filter is introduced, with an accurate theoretical description of its filter tap coefficients, gain, and group delay. It is then simulated and verified in MATLAB. Based on the cochlear nerve delay curve, a realtime synthesis stage using phase compensation is proposed. This method reduces the system's complexity by eliminating the need to construct a synthesis filterbank. The maximum delay of the filterbank is limited to 16 ms. The relationship between system performance and the number of filters is further studied to determine the optimal number of GTFs. The synthesis stage is evaluated by signal-to-noise ratio (SNR), perceptual evaluation of speech quality (PESQ), and short-time objective intelligibility (STOI).
The system is compared with other research, and the results show that it has low computational complexity and good performance. A fixed-point model of the GTFB is then proposed and its quantization error is examined. Next, a hardware design of the GTFB is proposed using pipelining and a clock control method. It supports realtime analysis/synthesis processing of dual-channel sound. Its correctness and feasibility are validated through implementation and testing on an FPGA chip. Finally, the GTFB chip is implemented in 0.18 μm CMOS technology with a total area of 4.1 mm². Evaluation of the synthesized speech shows a PESQ score of 4.3, an STOI score above 0.9, and a total time delay of only 15.72 ms. Experiments demonstrate that the chip correctly and efficiently performs realtime analysis/synthesis of signals.
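
For readers unfamiliar with the Gammatone filter named in the abstract, the following is a minimal sketch of its textbook continuous-time impulse response, sampled directly in the time domain. It is not the IIR implementation, the phase-compensated synthesis stage, or the hardware design described in the thesis. The function names, the filter order of 4, the bandwidth factor 1.019, the Glasberg and Moore ERB formula, and the 16 ms window length are common illustrative choices assumed here, not parameters taken from the thesis.

```python
import math


def erb_hz(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def gammatone_ir(fc_hz, fs_hz, duration_s=0.016, order=4, bw_factor=1.019):
    """Sampled gammatone impulse response, normalized to a peak of 1.

    g(t) = t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), with b = bw_factor * ERB(fc).
    All default parameter values are illustrative assumptions.
    """
    b = bw_factor * erb_hz(fc_hz)
    n_samples = int(round(duration_s * fs_hz))
    ir = []
    for k in range(n_samples):
        t = k / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    peak = max(abs(x) for x in ir)
    return [x / peak for x in ir]
```

Calling `gammatone_ir(1000.0, 16000.0)` returns the response of a single channel centered at 1 kHz for a 16 kHz sampling rate. A filterbank would repeat this for a set of center frequencies, typically spaced evenly on the ERB scale.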

Publications

Papers:

[1] Youwei Yang, Yi Jiang, Runsheng Liu, Dongmei Li, A Realtime Analysis/Synthesis Gammatone Filterbank, ICSPCC 2015, pp. 416-421, 2015.

Patents:

[1] Dongmei Li, Youwei Yang, Rui Jia, Runsheng Liu. A Real-time Decomposition/Synthesis Method Based on Auditory Perception: China, 201611026399.6[P]. 2020-06-05.

[2] Dongmei Li, Youwei Yang, Rui Jia, Runsheng Liu. A Gamma-pass Filter Bank System for Voice Real-Time Decomposition/Synthesis: China, 201610921435.9[P]. 2019-11-08.