Text-Dependent Speaker Identification

Akhundova, Natavan

dc.contributor.author	Akhundova, Natavan
dc.date.accessioned	2023-10-16T11:25:39Z
dc.date.available	2023-10-16T11:25:39Z
dc.date.issued	2022-04
dc.identifier.uri	http://hdl.handle.net/20.500.12181/727
dc.description.abstract	Speaker identification is the process of determining who is speaking. It is useful in applications such as customer service, as well as in investigations and the reporting of forensic evidence. This study examines the relationship between x-vectors, the current state-of-the-art technology in speaker recognition, and both the text uttered in audio signals and the duration of those signals. To accomplish this, three datasets are used: two relatively small digit datasets in English and Azerbaijani, and one larger dataset of digits and commands in Azerbaijani. The hypotheses tested in this research are as follows: 1) x-vectors hold information about the text in audio recordings, and the accuracy of the model changes as the text is changed; 2) x-vectors show better accuracy with longer audio recordings than with shorter ones. Models were trained on all three datasets to test the first hypothesis, and the findings show that when the models are given audio samples in which new, unseen text is uttered, the accuracy decreases drastically. The last dataset was used to test the second hypothesis. Indeed, x-vectors are data-hungry: more speech samples together with longer recordings gave the best results. Although most of the experiments were conducted in the Azerbaijani language, it is believed that the results are not tied to a specific language, and that testing these hypotheses with a dataset in another language would yield the same results, as observed with the English dataset in this study.	en_US
dc.language.iso	en	en_US
dc.publisher	ADA University	en_US
dc.relation	School of IT and Engineering	en_US
dc.relation	Graduate program	en_US
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.subject	Speaker recognition.	en_US
dc.subject	Speaker identification.	en_US
dc.subject	TDNN.	en_US
dc.subject	IT and Engineering	en_US
dc.title	Text-Dependent Speaker Identification	en_US
dc.type	Thesis	en_US