ADA Library Digital Repository

Detecting Fake News Using Natural Language Processing Techniques in Azerbaijani Language

dc.contributor.author Hasanzada, Azar
dc.date.accessioned 2025-10-27T11:05:03Z
dc.date.available 2025-10-27T11:05:03Z
dc.date.issued 2025
dc.identifier.uri http://hdl.handle.net/20.500.12181/1504
dc.description.abstract The detection of fake news has become an increasingly critical challenge in the digital information age, particularly for low-resource languages like Azerbaijani, where few tools exist for automated misinformation detection. This thesis explores the adaptation of large multilingual transformer models, specifically BERT and RoBERTa, for the task of Azerbaijani fake news classification. A translated version of the LIAR dataset was used to construct a labeled corpus, where statements were categorized into binary classes representing "Real" and "Fake" news. Extensive preprocessing, including metadata integration and length analysis for tokenization, was conducted to prepare the data for model training. Both models were fine-tuned and evaluated on a held-out test set of 1,267 examples. The multilingual BERT model achieved an overall accuracy of 68%, outperforming the RoBERTa model, which reached 65%. However, a deeper analysis revealed that RoBERTa exhibited better class balance, achieving relatively higher recall for the minority "Real" class, whereas BERT demonstrated a stronger bias toward the majority "Fake" class. Despite BERT’s superior overall metrics, RoBERTa’s more equitable performance across classes suggests its potential suitability for applications where balanced prediction is critical. The results validate the feasibility of adapting transformer models for Azerbaijani fake news detection but also highlight persistent challenges, particularly in achieving high accuracy on minority classes. Future directions are suggested, including dataset expansion with native Azerbaijani sources, domain-adaptive pretraining, and multimodal approaches. This work lays a foundation for advancing automated misinformation detection tools for underrepresented languages. en_US
dc.language.iso en en_US
dc.publisher ADA University en_US
dc.rights Attribution-NonCommercial-NoDerivs 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/us/ *
dc.subject Fake news -- Detection -- Data processing. en_US
dc.subject Natural language processing (Computer science) en_US
dc.subject Machine learning -- Applications in linguistics. en_US
dc.subject Artificial intelligence -- Misinformation detection. en_US
dc.subject Computational linguistics -- Azerbaijani language. en_US
dc.title Detecting Fake News Using Natural Language Processing Techniques in Azerbaijani Language en_US
dc.type Thesis en_US
dcterms.accessRights Absolute Embargo (No access without the author's permission)
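
Illustrative note on the abstract's accuracy versus class-balance comparison: the sketch below shows how one classifier can have higher overall accuracy while another has better recall on the minority class. The confusion counts are invented toy numbers (100 examples, 70 "Fake" and 30 "Real"); they are not the thesis's results, and the names `model_a`, `model_b`, and the helper functions are hypothetical.

```python
from typing import Dict, Tuple

# Confusion counts keyed by (true_label, predicted_label).
Confusion = Dict[Tuple[str, str], int]


def accuracy(cm: Confusion) -> float:
    """Fraction of all examples whose predicted label equals the true label."""
    correct = sum(n for (true, pred), n in cm.items() if true == pred)
    return correct / sum(cm.values())


def recall(cm: Confusion, label: str) -> float:
    """Fraction of examples with true label `label` that were predicted as `label`."""
    total = sum(n for (true, _), n in cm.items() if true == label)
    return cm.get((label, label), 0) / total


def balanced_accuracy(cm: Confusion) -> float:
    """Unweighted mean of per-class recall, so each class counts equally."""
    labels = {true for true, _ in cm}
    return sum(recall(cm, label) for label in labels) / len(labels)


# Toy data only (assumption, not taken from the thesis).
# model_a leans toward the majority "Fake" class.
model_a: Confusion = {
    ("Fake", "Fake"): 63, ("Fake", "Real"): 7,
    ("Real", "Real"): 9, ("Real", "Fake"): 21,
}
# model_b spreads its errors more evenly across both classes.
model_b: Confusion = {
    ("Fake", "Fake"): 49, ("Fake", "Real"): 21,
    ("Real", "Real"): 18, ("Real", "Fake"): 12,
}

for name, cm in (("model_a", model_a), ("model_b", model_b)):
    print(
        f"{name}: accuracy={accuracy(cm):.2f} "
        f"recall_real={recall(cm, 'Real'):.2f} "
        f"balanced_accuracy={balanced_accuracy(cm):.2f}"
    )
```

With these toy counts, model_a wins on plain accuracy while model_b wins on minority-class recall and on balanced accuracy, which is the kind of trade-off the abstract describes between the two fine-tuned models.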


Files in this item

There are no files associated with this item.

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States.
