Abstract:
The detection of fake news has become an increasingly critical challenge in the digital information age,
particularly for low-resource languages such as Azerbaijani, for which few datasets and tools for
automated misinformation detection are available. This thesis explores the adaptation of large multilingual transformer models,
specifically BERT and RoBERTa, for Azerbaijani fake news classification. A labeled corpus was
constructed from a version of the LIAR dataset translated into Azerbaijani, with each statement assigned
to one of two classes, "Real" or "Fake." Extensive preprocessing, including metadata integration and
statement-length analysis to inform tokenization, was conducted to prepare the data for model training.
Both models were fine-tuned and evaluated on a held-out test set of 1,267 examples. The multilingual
BERT model achieved an overall accuracy of 68%, outperforming the RoBERTa model, which
reached 65%. However, class-level analysis revealed that RoBERTa was better balanced,
achieving higher recall on the minority "Real" class, whereas BERT was more strongly
biased toward the majority "Fake" class. Despite BERT’s higher overall accuracy, RoBERTa’s more
equitable performance across classes suggests it may be better suited to applications where balanced
prediction is critical.
The results validate the feasibility of adapting transformer models for Azerbaijani fake news detection
but also highlight persistent challenges, particularly in achieving high accuracy on minority classes.
Future directions are suggested, including dataset expansion with native Azerbaijani sources, domain-adaptive pretraining, and multimodal approaches. This work lays a foundation for advancing automated
misinformation detection tools for underrepresented languages.