This project addresses a task proposed for a Kaggle dataset.
The provided dataset consisted of student marks in several subjects and socioeconomic information. The main goal was to build a machine learning model for gender prediction.
The data was checked, explored, and then prepared for modelling. Two different models were applied, namely Random Forest and Support Vector Machine (SVM). A model accuracy of 89% was achieved.
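A minimal sketch of how two such classifiers could be trained and compared with scikit-learn is shown here. The data is synthetic and the column meanings, shifts, and hyperparameters are assumptions made for illustration only; none of them come from the project, and the printed accuracies are not the project's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the Kaggle data: three score columns and a binary label.
rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n)
X = rng.normal(loc=65.0, scale=10.0, size=(n, 3))
# Hypothetical class difference so that the label is learnable from the scores.
X[y == 1] += np.array([8.0, -6.0, -6.0])

# Hold out a stratified test set for an unbiased accuracy estimate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    # SVMs are sensitive to feature scale, so the inputs are standardised first.
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)  # mean accuracy on the test set
    print(f"{name}: accuracy = {accuracies[name]:.2f}")
```

Wrapping the SVM in a pipeline with a scaler keeps the scaling parameters fitted on the training split only, which avoids leaking test information into the model.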
Afterwards, Shapley values were computed for the model to identify the features most relevant to the prediction. It was concluded that the socioeconomic information added no relevant benefit to the model.
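To illustrate what a Shapley value measures, the sketch below computes exact Shapley values for a small hand-written value function. The summary does not say which implementation the project used, so this is not its code; the feature names and the toy payoffs are hypothetical. In practice a library would estimate these values against the fitted model rather than a hand-written function.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, features):
    """Exact Shapley values for a set function `value` over `features`."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_value(subset):
    """Hypothetical payoff: two scores help and interact, 'socio' adds nothing."""
    v = 0.0
    if "score_a" in subset:
        v += 2.0
    if "score_b" in subset:
        v += 1.0
    if "score_a" in subset and "score_b" in subset:
        v += 1.0
    return v


features = ["score_a", "score_b", "socio"]
phi = shapley_values(toy_value, features)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.2f}")
```

A feature whose presence never changes the payoff receives a Shapley value of zero, which is the pattern that would support a conclusion like the one drawn for the socioeconomic information.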
Finally, a model was trained to predict the students' math scores. This last model showed that the same approach can also be applied to predict numerical variables.
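Switching from a categorical to a numerical target mainly means using a regressor and an error metric instead of a classifier and accuracy. The sketch below assumes a Random Forest regressor and synthetic data; the feature names, coefficients, and noise levels are hypothetical, and the summary does not state which model the project used for this step.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in: two correlated scores and a numeric target derived from them.
rng = np.random.default_rng(1)
n = 400
reading = rng.normal(65.0, 12.0, size=n)
writing = reading + rng.normal(0.0, 5.0, size=n)
# Hypothetical relationship between the target and the other scores.
math_score = 0.5 * reading + 0.4 * writing + rng.normal(0.0, 6.0, size=n)

X = np.column_stack([reading, writing])
X_train, X_test, y_train, y_test = train_test_split(
    X, math_score, test_size=0.25, random_state=1
)

reg = RandomForestRegressor(n_estimators=200, random_state=1)
reg.fit(X_train, y_train)
predictions = reg.predict(X_test)

# Mean absolute error is expressed in the same units as the score itself.
mae = mean_absolute_error(y_test, predictions)
print(f"MAE: {mae:.1f} points")
```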