Azerbaijan Text Clustering using Machine Learning Methods

Bashirov, Sokrat


URI: http://hdl.handle.net/20.500.12181/1525

Date: 2024

Abstract:

In the digital era, the explosion of textual data calls for sophisticated text mining and clustering methods. Although the state of the art has improved for most well-resourced languages, relatively little research has been carried out on lower-resourced languages such as Azerbaijani. In this thesis I investigated the use of clustering algorithms to improve access to information and communication in the Azerbaijani-speaking community. The dataset consisted of 15,500 news articles compiled from oxu.az. K-means, Fuzzy K-means, Agglomerative Hierarchical Clustering, Spectral Clustering, Gaussian Mixture Models (GMM), and Latent Dirichlet Allocation (LDA) were applied. They were evaluated on the basis of the Silhouette Score (SS) and the Davies-Bouldin Index. Word2Vec embeddings yielded a higher Adjusted Rand Index (ARI) than TF-IDF, while Spectral Clustering and LDA reported superior scores owing to their ability to capture complex structure in the data. Future work will focus on improved pre-processing, hybrid clustering, and deep learning embeddings. Applications to real-world problems, such as recommendation systems and content categorization, will provide further practical experience with the models.
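The two internal evaluation metrics named in the abstract can be illustrated with a small, self-contained sketch. This is not the thesis's implementation: the function names `silhouette_score` and `davies_bouldin_index` are hypothetical, Euclidean distance is assumed, and the toy 2D points merely stand in for document vectors such as TF-IDF or Word2Vec representations.

```python
from math import dist
from statistics import mean


def group_by_label(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def silhouette_score(points, labels):
    """Mean silhouette coefficient; higher means better-separated clusters."""
    clusters = group_by_label(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself does not affect the sum)
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(point, q) for q in other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


def davies_bouldin_index(points, labels):
    """Davies-Bouldin index; lower means more compact, better-separated clusters."""
    clusters = group_by_label(points, labels)
    centroids = {
        key: tuple(mean(coord) for coord in zip(*pts))
        for key, pts in clusters.items()
    }
    # scatter: mean distance of a cluster's points to its centroid
    scatter = {
        key: mean(dist(p, centroids[key]) for p in pts)
        for key, pts in clusters.items()
    }
    worst_ratios = [
        max(
            (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return mean(worst_ratios)


# Toy data: two tight groups, labelled once sensibly and once badly.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good_labels = [0, 0, 0, 1, 1, 1]
bad_labels = [0, 1, 0, 1, 0, 1]

for name, labels in (("good", good_labels), ("bad", bad_labels)):
    print(name,
          "silhouette:", round(silhouette_score(points, labels), 3),
          "davies-bouldin:", round(davies_bouldin_index(points, labels), 3))
```

Both metrics need no ground-truth labels, which is why they suit unlabelled corpora; the sensible labelling should score a higher silhouette and a lower Davies-Bouldin index than the scrambled one.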

This item appears in the following Collection(s)

School of Information Technologies and Engineering

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States
