dc.contributor.advisor: Ruili, Wang
dc.contributor.author: Qiu, Yuanhang
dc.date.accessioned: 2022-01-24T04:42:38Z
dc.date.accessioned: 2022-06-30T01:48:19Z
dc.date.available: 2022-01-24T04:42:38Z
dc.date.available: 2022-06-30T01:48:19Z
dc.date.issued: 2022
dc.identifier.uri: http://hdl.handle.net/10179/17243
dc.description.abstract: Speech enhancement, which aims to improve the intelligibility and overall perceptual quality of a contaminated speech signal, is an effective way to improve speech communications. In this thesis, we propose three novel deep learning methods to improve speech enhancement performance. Firstly, we propose an adversarial latent representation learning method for latent space exploration in generative adversarial network-based speech enhancement. Based on adversarial feature learning, this method employs an extra encoder to learn an inverse mapping from the generated data distribution to the latent space. The encoder establishes an inner connection with the generator and contributes to latent information learning. Secondly, we propose an adversarial multi-task learning method with inverse mappings for effective speech representation. This speech enhancement method focuses on enhancing the generator's capability for speech information capture and representation learning. To implement this method, two extra networks are developed to learn the inverse mappings from the generated distribution to the input data domains. Thirdly, we propose a self-supervised learning-based phone-fortified method to improve the learning of specific speech characteristics for speech enhancement. This method explicitly imports phonetic characteristics into a deep complex convolutional network via a contrastive predictive coding model pre-trained with self-supervised learning. The experimental results demonstrate that the proposed methods outperform previous speech enhancement methods and achieve state-of-the-art performance in terms of speech intelligibility and overall perceptual quality. [en_US]
dc.publisher: Massey University [en_US]
dc.rights: The Author [en_US]
dc.subject: Speech processing systems [en]
dc.subject: Machine learning [en]
dc.title: Deep learning for speech enhancement : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Albany, New Zealand [en_US]
dc.type: Thesis [en_US]
thesis.degree.discipline: Computer Science [en_US]
thesis.degree.grantor: Massey University [en_US]
thesis.degree.level: Doctoral [en_US]
thesis.degree.name: Doctor of Philosophy (PhD) [en_US]
dc.confidential: Embargo : No [en_US]
dc.subject.anzsrc: 460212 Speech recognition [en]
dc.subject.anzsrc: 461103 Deep learning [en]

