Music Genre Detection using Deep Learning Techniques

Karimov, Ilyas

URI: http://hdl.handle.net/20.500.12181/733

Date: 2022-04

Abstract:

This study compares several methods for detecting the genre of a piece of music using machine learning algorithms. Results are obtained with Neural Network, Convolutional Neural Network (CNN), Recurrent Neural Network with Long Short-Term Memory (RNN-LSTM), and Multi-Layer Perceptron (MLP) models on two distinct datasets. The first is the GTZAN dataset, which is well known in the field of music classification and contains 1000 records. The second was compiled by me and added to the GTZAN collection; the combined set currently holds over 2200 songs, with vocal and instrumental tracks together. Only MFCC features are used in the experiments. I wrote a script that selects the first 30 seconds of each track, changes the bit rate from 41000 to 20500, converts the file from .mp3 to .wav, and then extracts MFCC features with different segment settings: 1, 3, 6, 10, 15, and 30. To avoid confusing the model, I removed the first section of the MFCC features from the second dataset. In the first part of the experiments, which used only the GTZAN data, the best result was 85% training accuracy and 75% validation accuracy. In the second part, I obtained a 62% F1 score, with 80% training accuracy and 62% validation accuracy, using MLP and CNN. The best results overall were obtained with CNN: segments 15 and 30 achieved 72% and 71% accuracy, with 80% and 75% training and validation accuracy, respectively.
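
The clip-and-segment step described in the abstract can be illustrated with a minimal sketch. It is not the author's script: the function name `split_into_segments`, its parameters, and the toy signal are hypothetical, and it assumes that a "segment" setting of N means cutting the 30-second clip into N equal-length pieces. In a real pipeline each piece would then be passed to an MFCC extractor, which is left out here.

```python
def split_into_segments(samples, sample_rate, clip_seconds=30, num_segments=10):
    """Keep the first clip_seconds of audio and cut it into equal-length segments.

    Assumption: a segment setting of N means N equal pieces per clip.
    """
    clip_len = sample_rate * clip_seconds
    clip = samples[:clip_len]
    if len(clip) < clip_len:
        raise ValueError("track is shorter than the requested clip length")
    seg_len = clip_len // num_segments
    # Any remainder samples at the end of the clip are dropped.
    return [clip[i * seg_len:(i + 1) * seg_len] for i in range(num_segments)]


# Toy signal: 35 seconds at a made-up rate of 100 samples per second.
toy_rate = 100
toy_samples = list(range(35 * toy_rate))

# The segment settings listed in the abstract.
for n in (1, 3, 6, 10, 15, 30):
    segments = split_into_segments(toy_samples, toy_rate, num_segments=n)
    print(n, len(segments), len(segments[0]))
```

Under this assumption, a higher segment setting gives more, shorter training examples per track, so the choice trades dataset size against the amount of audio context each example carries.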
