Student Score Card Analysis
Machine Learning Exercise

This project works through a task proposed for a Kaggle dataset.

The provided dataset consists of student marks in several subjects together with socioeconomic information. The main goal was to build a machine learning model that predicts gender.
The data was checked, explored, and then prepared for modelling. Two models were applied: Random Forest and Support Vector Machine (SVM). A model accuracy of 89% was achieved.
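As a rough illustration of the two-model comparison described above, the following sketch trains a Random Forest and an SVM with scikit-learn. It does not use the Kaggle data: the scores are synthetic, and the 0/1 target encoding, the three score columns, and all hyperparameters are assumptions made only for this example. The accuracy it prints is therefore unrelated to the 89% reported for the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the score columns (NOT the Kaggle data).
rng = np.random.default_rng(0)
n = 400
gender = rng.integers(0, 2, size=n)   # hypothetical 0/1 encoding of the target
base = rng.normal(65, 12, size=n)     # shared ability level per student
math_score = base + 6 * gender + rng.normal(0, 3, size=n)
reading_score = base - 6 * gender + rng.normal(0, 3, size=n)
writing_score = base - 8 * gender + rng.normal(0, 3, size=n)
X = np.column_stack([math_score, reading_score, writing_score])

# Hold out a stratified test set so both models are scored on unseen rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, gender, test_size=0.25, random_state=0, stratify=gender
)

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    # SVMs are sensitive to feature scale, so standardise inside a pipeline.
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {accuracies[name]:.2f}")
```

Wrapping the SVM in a pipeline keeps the scaling step fitted on the training split only, which avoids leaking test-set statistics into the model.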

Afterwards, Shapley values were applied to the model to identify the features most relevant to the prediction. It was concluded that the socioeconomic information added no relevant benefit to the model.
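To make the idea behind Shapley values concrete, here is a minimal from-scratch sketch that computes them exactly by enumerating feature coalitions. It is not the tooling used in the project (which is not specified here); the toy linear `predict` function, the input row, and the single background row are all hypothetical. A feature the model ignores receives a Shapley value of zero, which is the kind of evidence behind a statement that a group of features adds no benefit.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values for one input row x.

    Features outside a coalition are replaced by the background row.
    The cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else background[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model: uses features 0 and 1 and ignores feature 2.
def predict(z):
    return 2.0 * z[0] - 1.0 * z[1] + 0.0 * z[2]


x = [70.0, 60.0, 1.0]           # row to explain (made-up values)
background = [65.0, 65.0, 0.0]  # reference row (made-up values)

phi = shapley_values(predict, x, background)
print([round(v, 6) for v in phi])  # → [10.0, 5.0, 0.0]
```

The values sum to `predict(x) - predict(background)`, which is 15 here, so the attribution accounts for the whole difference between the explained row and the reference row.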

Finally, a model was trained to predict the students' math score. This last model showed that the same approach can also be applied to numerical target variables.
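The switch from a categorical to a numerical target can be sketched as follows. The text does not say which regressor was used, so `RandomForestRegressor` is an assumption, as are the synthetic reading and writing scores used as inputs and the choice of metrics.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic scores (NOT the Kaggle data): three noisy views of one ability level.
rng = np.random.default_rng(1)
n = 400
base = rng.normal(65, 12, size=n)
reading_score = base + rng.normal(0, 4, size=n)
writing_score = base + rng.normal(0, 4, size=n)
math_score = base + rng.normal(0, 4, size=n)  # numerical target

X = np.column_stack([reading_score, writing_score])
X_train, X_test, y_train, y_test = train_test_split(
    X, math_score, test_size=0.25, random_state=1
)

# Same workflow as for classification, with a regressor and regression metrics.
regressor = RandomForestRegressor(n_estimators=200, random_state=1)
regressor.fit(X_train, y_train)
predictions = regressor.predict(X_test)

mae = mean_absolute_error(y_test, predictions)
r2 = r2_score(y_test, predictions)
print(f"MAE = {mae:.2f} points, R^2 = {r2:.2f}")
```

Accuracy no longer applies to a continuous target, so the sketch reports mean absolute error (in score points) and R² instead.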

Figure: Dataset overview
Figure: Numerical variables overview
Figure: Categorical variables overview
Figure: Categorical vs. numerical variable visualization
Figure: Distribution of test scores per gender
Figure: Shapley values technique for feature importance