Detecting Fake News Using Natural Language Processing Techniques in Azerbaijani Language

Hasanzada, Azar

dc.contributor.author	Hasanzada, Azar
dc.date.accessioned	2025-10-27T11:05:03Z
dc.date.available	2025-10-27T11:05:03Z
dc.date.issued	2025
dc.identifier.uri	http://hdl.handle.net/20.500.12181/1504
dc.description.abstract	The detection of fake news has become an increasingly critical challenge in the digital information age, particularly for low-resource languages like Azerbaijani, where limited resources exist for automated misinformation detection. This thesis explores the adaptation of large multilingual transformer models, specifically BERT and RoBERTa, for the task of Azerbaijani fake news classification. A translated version of the LIAR dataset was used to construct a labeled corpus, where statements were categorized into binary classes representing "Real" and "Fake" news. Extensive preprocessing, including metadata integration and length analysis for tokenization, was conducted to prepare the data for model training. Both models were fine-tuned and evaluated on a held-out test set of 1,267 examples. The multilingual BERT model achieved an overall accuracy of 68%, outperforming the RoBERTa model, which reached 65%. However, a deeper analysis revealed that RoBERTa exhibited better class balance, achieving relatively higher recall for the minority "Real" class, whereas BERT demonstrated a stronger bias toward the majority "Fake" class. Despite BERT’s superior overall metrics, RoBERTa’s more equitable performance across classes suggests its potential suitability for applications where balanced prediction is critical. The results validate the feasibility of adapting transformer models for Azerbaijani fake news detection but also highlight persistent challenges, particularly in achieving high accuracy on minority classes. Future directions are suggested, including dataset expansion with native Azerbaijani sources, domain-adaptive pretraining, and multimodal approaches. This work lays a foundation for advancing automated misinformation detection tools for underrepresented languages.	en_US
dc.language.iso	en	en_US
dc.publisher	ADA University	en_US
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.subject	Fake news -- Detection -- Data processing.	en_US
dc.subject	Natural language processing (Computer science)	en_US
dc.subject	Machine learning -- Applications in linguistics.	en_US
dc.subject	Artificial intelligence -- Misinformation detection.	en_US
dc.subject	Computational linguistics -- Azerbaijani language.	en_US
dc.title	Detecting Fake News Using Natural Language Processing Techniques in Azerbaijani Language	en_US
dc.type	Thesis	en_US
dcterms.accessRights	Absolute Embargo (No access without the author's permission)
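
Note on the metrics named in the abstract: the abstract contrasts overall accuracy with per-class recall on an imbalanced "Real"/"Fake" split. The sketch below may help show how a model can score higher on accuracy while doing worse on the minority class. It is an illustration only: the labels and predictions are invented toy data (10 examples), not the thesis test set, and the function names and the `model_a`/`model_b` variables are hypothetical, not taken from the thesis.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall for each class: correct predictions / true examples of that class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy data (assumption): 7 majority "Fake" examples, 3 minority "Real" examples.
y_true = ["Fake"] * 7 + ["Real"] * 3

# Hypothetical model A always predicts the majority class.
model_a = ["Fake"] * 10

# Hypothetical model B gets 4 of 7 "Fake" and 2 of 3 "Real" right.
model_b = ["Fake"] * 4 + ["Real"] * 3 + ["Real"] * 2 + ["Fake"] * 1

for name, y_pred in (("A", model_a), ("B", model_b)):
    print(
        name,
        "accuracy:", round(accuracy(y_true, y_pred), 3),
        "recall:", per_class_recall(y_true, y_pred),
        "balanced accuracy:", round(balanced_accuracy(y_true, y_pred), 3),
    )
```

On this toy data, model A has the higher accuracy but zero recall on "Real", while model B has lower accuracy and a higher balanced accuracy. That is the kind of trade-off the abstract describes between the two fine-tuned models, though the actual figures in the thesis differ.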

Files in this item

Files	Size	Format	View
There are no files associated with this item.

The following license files are associated with this item:

Creative Commons

This item appears in the following Collection(s)

School of Information Technologies and Engineering

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States
