ADA Library Digital Repository

Text to speech system for Azerbaijani language

dc.contributor.author Aghalarli, Yusif
dc.date.accessioned 2025-02-10T06:29:58Z
dc.date.available 2025-02-10T06:29:58Z
dc.date.issued 2023-04
dc.identifier.uri http://hdl.handle.net/20.500.12181/941
dc.description.abstract This Master's thesis focuses on the development of a Text-to-Speech (TTS) system for the Azerbaijani language. TTS technology has been gaining popularity due to its ability to generate human-like speech from written text, making it beneficial for people with disabilities, language learners, and those who prefer auditory learning. The thesis starts with an introduction to TTS, its significance, and its history. The literature review provides an overview of related studies, including recent advancements in TTS systems. It covers the techniques and models used in TTS systems, the evaluation metrics used to assess their performance, and the challenges and limitations of developing TTS systems for low-resource languages. The main focus of the study is the Tacotron-2 architecture, which is known for its high-quality, natural-sounding speech. This architecture consists of two parts: a mel spectrogram generator and a neural vocoder. The mel spectrogram is a representation of the speech signal that captures its spectral information, while the neural vocoder generates the actual speech waveform. The study also explains the data collection process, which is a crucial component of developing a TTS system. The first data collection attempt produced poor-quality data, which prompted the researchers to refine the process by using an audiobook with speech alignment. This process resulted in approximately 19 hours of high-quality data, which was used to train the Tacotron-2 architecture. To evaluate the performance of the TTS system, a survey was conducted in which participants rated the system using the Mean Opinion Score (MOS). The system received a MOS of 3.3, indicating that it produced acceptable speech quality.
In conclusion, this Master's thesis provides a comprehensive overview of developing a TTS system for the Azerbaijani language using the Tacotron-2 architecture. The study presents the different components of the TTS system, the data collection process, and the evaluation metrics used to assess the system's performance. It also highlights the challenges and limitations of developing TTS systems for low-resource languages and suggests future directions for improving the system's performance. en_US
dc.language.iso en en_US
dc.publisher ADA University en_US
dc.rights Attribution-NonCommercial-NoDerivs 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/us/ *
dc.subject Speech synthesis -- Neural networks en_US
dc.subject Text-to-speech systems -- Azerbaijani language. en_US
dc.subject Tacotron-2 architecture -- Applications in speech generation. en_US
dc.subject Data collection -- Speech data. en_US
dc.subject Speech quality evaluation -- Mean Opinion Score (MOS) en_US
dc.subject Low-resource languages -- Speech synthesis en_US
dc.subject Language technology -- Applications for disabilities en_US
dc.title Text to speech system for Azerbaijani language en_US
dc.type Thesis en_US
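
The abstract reports a Mean Opinion Score from a listener survey. As a rough illustration of how such a score is typically computed, the sketch below averages listener ratings and adds a normal-approximation 95% confidence interval. It assumes the conventional 1 to 5 rating scale; the function name `mean_opinion_score` and the sample ratings are hypothetical and are not taken from the thesis.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for listener ratings.

    Assumes the conventional 1-5 scale; the thesis record does not
    state the scale or the survey size.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie in the range 1..5")
    mos = mean(ratings)
    # Normal-approximation interval; undefined for a single rating,
    # so the half-width is reported as 0.0 in that case.
    if len(ratings) > 1:
        half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    else:
        half_width = 0.0
    return mos, half_width


# Hypothetical ratings invented for illustration (not the thesis data).
example_ratings = [4, 3, 5, 2, 4, 3]
mos, ci = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval alongside the mean helps readers judge how much a single figure depends on the number of survey participants.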


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States.