Introduction
This paper is based on my dissertation on the same topic. It was submitted to the 2020 Joint Conference on AI Music Creativity but was ultimately not published. Even so, I gained a great deal of experience in the process of writing and submitting an academic paper, thanks to the guidance of my mentor, Dr Jeremie Clos.
Background
We set out to predict the predominant music in a polyphonic mixture. The techniques employed in this research were:
- Background separation
- Feature extraction using MFCCs and various other audio features (a sketch of this step follows the list)
- Hyperparameter selection and comparison of different machine learning models
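To make the feature-extraction step concrete, here is a minimal sketch assuming librosa is used; the exact parameters and feature set in the paper may differ.

```python
# Minimal sketch of the feature-extraction step, assuming librosa;
# the parameters here are illustrative, not the paper's exact setup.
import numpy as np
import librosa

def extract_features(path, sr=22050, n_mfcc=13):
    """Load an audio file and return a fixed-length vector of
    cepstral (MFCC) and spectral descriptors."""
    y, sr = librosa.load(path, sr=sr)

    # Cepstral features
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

    # A few common spectral-domain features
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)
    rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr)
    zcr = librosa.feature.zero_crossing_rate(y)

    # Summarise each frame-level feature by its mean and standard deviation
    def stats(m):
        return np.hstack([m.mean(axis=1), m.std(axis=1)])

    return np.hstack([stats(mfcc), stats(centroid), stats(rolloff), stats(zcr)])
```

Summarising frame-level features with simple statistics is one common way to get a fixed-length vector for classical ML models; other aggregations would work equally well.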
Findings
This research led to several conclusions that will hopefully guide future researchers in the field.
- MFCC features give the best performance across the models, which is consistent with the literature.
- Features from the spectral domain are necessary, in addition to cepstral features, to produce the best-performing models.
- The random forest achieved the best results of the three classical ML models (SVM, random forest, and neural network); a sketch of the comparison setup follows this list.
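For illustration, here is a hedged sketch of the hyperparameter-selection and model-comparison step using scikit-learn; the grids and model settings are assumptions, not the exact configuration used in the paper.

```python
# Illustrative sketch of hyperparameter selection and model comparison;
# the grids below are assumptions, not the paper's exact configuration.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

models = {
    "svm": (SVC(), {"svc__C": [1, 10], "svc__gamma": ["scale", 0.01]}),
    "random_forest": (RandomForestClassifier(),
                      {"randomforestclassifier__n_estimators": [100, 300]}),
    "neural_network": (MLPClassifier(max_iter=1000),
                       {"mlpclassifier__hidden_layer_sizes": [(64,), (128, 64)]}),
}

def compare_models(X, y):
    """Grid-search each model with 5-fold cross-validation and
    return the best score and parameters per model."""
    results = {}
    for name, (estimator, grid) in models.items():
        pipe = make_pipeline(StandardScaler(), estimator)
        search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
        search.fit(X, y)
        results[name] = (search.best_score_, search.best_params_)
    return results
```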
Learn more
Read more at 👉 link