
Student Score Card Analysis
Machine Learning Exercise

This project is an exercise based on a Kaggle dataset.

The provided dataset contains student marks in several subjects together with socioeconomic information. The main goal was to build a machine learning model that predicts a student's gender.
The data was checked, explored and then prepared for modelling. Two models were trained: a Random Forest and a Support Vector Machine (SVM). An accuracy of about 89% was achieved.
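
As a rough illustration of this step, the sketch below trains both model types with scikit-learn and compares their test accuracy. It is not the project's actual code: the data is synthetic, the feature columns (math, reading, writing, lunch flag) are hypothetical stand-ins for the Kaggle columns, and the hyperparameters are arbitrary defaults.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the dataset; every column here is hypothetical.
rng = np.random.default_rng(0)
n = 600
gender = rng.integers(0, 2, size=n)        # 0/1 encoded target
ability = rng.normal(65, 12, size=n)       # shared latent score level
math = ability + 5 * gender + rng.normal(0, 4, size=n)
reading = ability - 5 * gender + rng.normal(0, 4, size=n)
writing = ability - 6 * gender + rng.normal(0, 4, size=n)
lunch = rng.integers(0, 2, size=n)         # socioeconomic flag, pure noise here

X = np.column_stack([math, reading, writing, lunch])
X_train, X_test, y_train, y_test = train_test_split(
    X, gender, test_size=0.25, random_state=0, stratify=gender
)

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    # SVMs are sensitive to feature scale, so standardise before fitting.
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on held-out data
    print(f"{name}: accuracy = {scores[name]:.3f}")
```

The accuracies printed by this sketch come from the synthetic data and say nothing about the figures reported for the real dataset.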

Afterwards, Shapley values were computed for the model to identify which features contributed most to its predictions. The conclusion was that the socioeconomic information added no relevant benefit to the model.
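
To make the idea concrete, here is a minimal sketch of an exact Shapley value computation for a single prediction. It uses plain Python rather than any particular explanation library (the text does not say which tool was used; a package such as shap is a common choice). The scoring function, feature names, student and baseline values are all hypothetical, and the toy model ignores the "lunch" feature by construction, so it illustrates how an uninformative feature receives zero credit rather than reproducing the project's result.

```python
from itertools import combinations
from math import factorial

FEATURES = ["math", "reading", "writing", "lunch"]  # hypothetical names


def toy_model(x):
    """Hypothetical scoring function standing in for a trained classifier.

    It ignores "lunch" by construction, so that feature should get no credit.
    """
    return 0.05 * x["math"] - 0.03 * x["reading"] - 0.02 * x["writing"]


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(FEATURES)

    def value(subset):
        # Features in `subset` take the student's values; the rest stay at baseline.
        mixed = {f: (x[f] if f in subset else baseline[f]) for f in FEATURES}
        return model(mixed)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            # Weight of a coalition of size k in the Shapley formula.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


student = {"math": 80, "reading": 60, "writing": 58, "lunch": 1}
baseline = {"math": 66, "reading": 69, "writing": 68, "lunch": 0}

phi = shapley_values(toy_model, student, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the student's score and the baseline score, which is the property that makes Shapley values convenient for attributing a prediction to individual features. Exact enumeration grows exponentially with the number of features, so real tools approximate it.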

Finally, a separate model was trained to predict the students' math score. This last model showed that the same approach can also be applied to numerical targets (regression).
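
A regression variant of the workflow could look like the sketch below. The text does not state which algorithm was used for this final model, so a scikit-learn RandomForestRegressor is assumed here; the data is again synthetic and the feature columns (reading, writing, gender) are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; all columns are hypothetical.
rng = np.random.default_rng(1)
n = 600
reading = rng.normal(69, 14, size=n)
writing = reading + rng.normal(0, 5, size=n)
gender = rng.integers(0, 2, size=n)
math_score = 0.85 * reading + 6 * gender + rng.normal(0, 6, size=n)

X = np.column_stack([reading, writing, gender])
X_train, X_test, y_train, y_test = train_test_split(
    X, math_score, test_size=0.25, random_state=0
)

# Assumed model choice: a random forest regressor instead of a classifier.
reg = RandomForestRegressor(n_estimators=200, random_state=0)
reg.fit(X_train, y_train)
pred = reg.predict(X_test)

# Regression is judged by error size and explained variance, not accuracy.
mae = mean_absolute_error(y_test, pred)
r2 = r2_score(y_test, pred)
print(f"MAE = {mae:.2f} points, R^2 = {r2:.3f}")
```

The main change from the classification case is the target type and the evaluation metric; the split, fit and predict steps stay the same.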

GOAL

Develop a machine learning model to predict the gender of students based on their scorecards and socioeconomic information.

RESULTS

Two different models were built. An accuracy of 89.5% was achieved.
It was also concluded that only the subject marks contributed meaningfully to the model.