ADA Library Digital Repository

Text-Dependent Speaker Identification

dc.contributor.author Akhundova, Natavan
dc.date.accessioned 2023-10-16T11:25:39Z
dc.date.available 2023-10-16T11:25:39Z
dc.date.issued 2022-04
dc.identifier.uri http://hdl.handle.net/20.500.12181/727
dc.description.abstract Speaker identification is the process of determining who is speaking, and it is useful in applications such as customer service, investigations, and the reporting of forensic evidence. This study examines how x-vectors, the current state-of-the-art technique in speaker recognition, relate to the text uttered in audio signals and to the duration of those signals. Three datasets are used: two relatively small digit datasets, one in English and one in Azerbaijani, and one larger Azerbaijani dataset of digits and commands. The hypotheses tested are as follows: 1) x-vectors hold information about the text in audio recordings, and the accuracy of the model changes as the text changes; 2) x-vectors achieve better accuracy with longer audio recordings than with shorter ones. Models were trained on all three datasets to test the first hypothesis, and the findings show that accuracy decreases drastically when the models are given audio samples in which new, unseen text is uttered. The last dataset was used to test the second hypothesis. Indeed, x-vectors are data-hungry: more speech samples together with longer recordings gave the best results. Although most of the experiments were conducted in Azerbaijani, the results are believed not to depend on the specific language. Moreover, testing these hypotheses with a dataset in another language would yield the same results, as shown with the English dataset in this study. en_US
dc.language.iso en en_US
dc.publisher ADA University en_US
dc.rights Attribution-NonCommercial-NoDerivs 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/us/ *
dc.subject Speaker recognition. en_US
dc.subject Speaker identification. en_US
dc.subject TDNN. en_US
dc.title Text-Dependent Speaker Identification en_US
dc.type Thesis en_US


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States
