| dc.contributor.author | Hasanzada, Azar | |
| dc.date.accessioned | 2025-10-27T11:05:03Z | |
| dc.date.available | 2025-10-27T11:05:03Z | |
| dc.date.issued | 2025 | |
| dc.identifier.uri | http://hdl.handle.net/20.500.12181/1504 | |
| dc.description.abstract | The detection of fake news has become an increasingly critical challenge in the digital information age, particularly for low-resource languages like Azerbaijani, where limited resources exist for automated misinformation detection. This thesis explores the adaptation of large multilingual transformer models, specifically BERT and RoBERTa, for the task of Azerbaijani fake news classification. A translated version of the LIAR dataset was used to construct a labeled corpus, where statements were categorized into binary classes representing "Real" and "Fake" news. Extensive preprocessing, including metadata integration and length analysis for tokenization, was conducted to prepare the data for model training. Both models were fine-tuned and evaluated on a held-out test set of 1,267 examples. The multilingual BERT model achieved an overall accuracy of 68%, outperforming the RoBERTa model, which reached 65%. However, a deeper analysis revealed that RoBERTa exhibited better class balance, achieving relatively higher recall for the minority "Real" class, whereas BERT demonstrated a stronger bias toward the majority "Fake" class. Despite BERT’s superior overall metrics, RoBERTa’s more equitable performance across classes suggests its potential suitability for applications where balanced prediction is critical. The results validate the feasibility of adapting transformer models for Azerbaijani fake news detection but also highlight persistent challenges, particularly in achieving high accuracy on minority classes. Future directions are suggested, including dataset expansion with native Azerbaijani sources, domain-adaptive pretraining, and multimodal approaches. This work lays a foundation for advancing automated misinformation detection tools for underrepresented languages. | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | ADA University | en_US |
| dc.rights | Attribution-NonCommercial-NoDerivs 3.0 United States | * |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/us/ | * |
| dc.subject | Fake news -- Detection -- Data processing. | en_US |
| dc.subject | Natural language processing (Computer science) | en_US |
| dc.subject | Machine learning -- Applications in linguistics. | en_US |
| dc.subject | Artificial intelligence -- Misinformation detection. | en_US |
| dc.subject | Computational linguistics -- Azerbaijani language. | en_US |
| dc.title | Detecting Fake News Using Natural Language Processing Techniques in Azerbaijani Language | en_US |
| dc.type | Thesis | en_US |
| dcterms.accessRights | Absolute Embargo (No access without the author's permission) | |
| Files | Size | Format | View |
|---|---|---|---|
| There are no files associated with this item. | | | |