ADA Library Digital Repository

Spelling Correction for Azerbaijani Language Using Sequence to Sequence Model


dc.contributor.author Dashdamirli, Asad
dc.date.accessioned 2024-12-19T23:46:43Z
dc.date.available 2024-12-19T23:46:43Z
dc.date.issued 2023-04
dc.identifier.uri http://hdl.handle.net/20.500.12181/930
dc.description.abstract In natural language processing (NLP), spelling correction is an essential task that seeks to automatically correct misspelled words in text documents. This thesis focuses on Azerbaijani language spelling correction, which presents unique challenges due to the language's rich morphology and complex orthographic norms. The thesis begins with a comprehensive literature review of existing approaches and techniques for spelling correction in various languages and then proceeds to its methodology. We identify the limitations of existing methods and propose a novel approach to Azerbaijani orthography correction based on a sequence-to-sequence (seq2seq) deep neural network. Our proposed method uses seq2seq models, which have demonstrated great success in a variety of NLP tasks, to learn the mapping between misspelled words and their correct counterparts. We also introduce techniques for generating artificial noise to augment the training data and improve the model's ability to handle various types of misspellings. We conduct extensive experiments on a large corpus of Azerbaijani text data to evaluate the performance of our approach. We report character error rate, word error rate, and sequence error rate, and compare our method with several other methods. Our experiments demonstrate that our seq2seq-based approach achieves a 5.3% character error rate and a 25.77% word error rate on news text, which shows its potential to improve the accuracy of Azerbaijani text spelling correction. Furthermore, we analyze the effect of the artificial noise generation techniques on the performance of our model and provide insights into how effectively they handle various misspelling types. Finally, we discuss the limitations of our methodology and possible directions for further development.
In conclusion, this thesis presents a novel approach to Azerbaijani spelling correction using deep neural networks, specifically the seq2seq model, along with artificial noise generation techniques. The experimental results demonstrate the viability of our method for improving the spelling accuracy of Azerbaijani text documents in real-world settings. en_US
dc.language.iso en en_US
dc.publisher ADA University en_US
dc.rights Attribution-NonCommercial-NoDerivs 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/us/ *
dc.subject Natural language processing -- Azerbaijani language en_US
dc.subject Spelling correction -- Computational methods en_US
dc.subject Azerbaijani language -- Orthography and spelling en_US
dc.title Spelling Correction for Azerbaijani Language Using Sequence to Sequence Model en_US
dc.type Thesis en_US
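
The abstract reports character error rate, word error rate, and sequence error rate without stating how they are computed. The sketch below shows one common way to define these metrics using Levenshtein edit distance. It is an illustration only, not the thesis's evaluation code: the function names, the whitespace tokenization, and the exact-match definition of sequence error rate are all assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_error_rate(ref, hyp):
    """Character-level edits divided by the reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def word_error_rate(ref, hyp):
    """Word-level edits divided by the reference word count (whitespace tokens assumed)."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def sequence_error_rate(refs, hyps):
    """Fraction of sequences that are not an exact match (assumed definition)."""
    if not refs or len(refs) != len(hyps):
        raise ValueError("refs and hyps must be non-empty and of equal length")
    wrong = sum(1 for r, h in zip(refs, hyps) if r != h)
    return wrong / len(refs)


if __name__ == "__main__":
    reference = "yeni kitab"
    corrected = "yeni kitap"  # hypothetical model output with one wrong character
    print(char_error_rate(reference, corrected))
    print(word_error_rate(reference, corrected))
    print(sequence_error_rate([reference], [corrected]))
```

Under these definitions a single wrong character costs one edit at the character level but makes the whole word count as an error, which is one reason a word error rate can be much higher than the character error rate on the same output.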



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States
