Abstract:
The purpose of this study is to compare various methods for detecting the genre of music using
machine learning algorithms. The findings are obtained using Neural Network, Convolutional
Neural Network (CNN), Recurrent Neural Network with Long Short-Term Memory (RNN-LSTM),
and Multi-Layer Perceptron (MLP) algorithms on two distinct datasets. The first dataset is
well-known in the field of music
classification; it is named the GTZAN dataset and contains 1000 records. The second dataset was
compiled by me and added to the GTZAN collection, which now has over 2200 songs with
vocals and instrumentals combined. For the experiments, only MFCC features are used. I wrote a
script that selects the first 30 seconds of each track, changes the bit rate from 41000 to 20500
and the extension from .mp3 to .wav, and then extracts MFCC features with different segment
values such as 1, 3, 6, 10, 15
and 30. To avoid confusing the model, I removed the first section of the MFCC features from the
second dataset. In the first part of the study, the best result was 85% training and 75%
validation accuracy, obtained using only the GTZAN data. In the second part, I obtained a 62%
F1 score, with 80% training and 62% validation accuracy, using MLP and CNN. However, the best
results were obtained with CNN, with segments 15 and 30 achieving 72% and 71% accuracy, and
80% and 75% training and validation accuracy, respectively.
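
The preprocessing described above (take the first 30 seconds, resample, then extract MFCCs per segment) can be sketched roughly as follows. This is a minimal illustration and not the script used in the study. The use of librosa, the function names, the choice of 13 coefficients, and the reading of 20500 as a sampling rate in Hz are all assumptions.

```python
def segment_bounds(sample_rate, clip_seconds, num_segments):
    """Return (start, end) sample indices splitting a clip into equal segments."""
    total_samples = int(sample_rate * clip_seconds)
    per_segment = total_samples // num_segments
    return [(i * per_segment, (i + 1) * per_segment) for i in range(num_segments)]


def extract_segment_mfccs(path, num_segments, sample_rate=20500,
                          clip_seconds=30.0, n_mfcc=13):
    """Hypothetical helper: MFCC matrix for each segment of one audio file."""
    # Assumption: librosa is used for loading and feature extraction;
    # the abstract does not name a library.
    import librosa

    # Load only the first `clip_seconds` seconds, resampled to `sample_rate`.
    signal, sr = librosa.load(path, sr=sample_rate, duration=clip_seconds)

    features = []
    for start, end in segment_bounds(sr, clip_seconds, num_segments):
        if end > len(signal):
            break  # the track is shorter than the requested clip length
        mfcc = librosa.feature.mfcc(y=signal[start:end], sr=sr, n_mfcc=n_mfcc)
        features.append(mfcc.T)  # shape: (frames, n_mfcc)
    return features
```

With these assumed values, a 30-second clip at 20500 samples per second holds 615000 samples, so 10 segments give 61500 samples each and 15 segments give 41000 each. More segments produce more, shorter training examples per track.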