Eugene S. Perez

Work place: Bulacan State University, Philippines



Research Interests: Software Development Process, Software Engineering, Information Security, Information Systems


Eugene S. Perez is a faculty member of Bulacan State University-Sarmiento Campus. He took up his Master in Information Technology at Angeles University Foundation. Mr. Perez is also a web developer; his research interests include web development, software engineering, cybersecurity, and information systems. He is a member of the Philippine Society of IT Educators Central Luzon Chapter (PSITE R3).

Author Articles
Predicting Student Program Completion Using Naïve Bayes Classification Algorithm

By Joann Galopo Perez, Eugene S. Perez

DOI:, Pub. Date: 8 Jun. 2021

Data mining approaches give educational institutions opportunities to find hidden patterns in the data stored in their databases. Many researchers have used such data to develop models that assist institution administrators in decision-making. This study was performed to predict student program completion using the Naïve Bayes classifier technique. The dataset was obtained from Bulacan State University – Sarmiento Campus in the Philippines and consists of five years of graduates' data from the BS Information Technology program for Academic Year 2012-2016. The dataset was pre-processed, cleansed, transformed, and balanced before the model was constructed. Ten candidate predictors of student completion were considered. A feature selection technique (Weight by Correlation) was used to filter the factors and evaluate the significance of each one by its weight relative to the label attribute, and the significant variables became the final parameters of the model. Of the 10 possible predictor variables, only four (4) were selected after correlation analysis. In order of significance, the attributes affecting program completion are parents' monthly income, mother's and father's educational attainment, and high school GPA. The Naïve Bayes classifier was then applied to predict students' completion using a 70:30 ratio for the training and testing datasets: the data, restricted to the significant attributes, were split into 70% training data (447 records) and 30% testing data (191 records). Of the 191 testing samples, 84 (44%) were predicted to complete the program, while 107 (56%) were predicted not to complete it. The model achieved 84% accuracy, with 80.46% class precision and 83.33% class recall, on the testing dataset (n=191). The outcomes of this study have a significant impact on HEIs, particularly on college completion rates. The study should be especially beneficial to university administrators, who can use the model as a tool to identify students likely to complete college based on the variables it includes.
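The two mechanical steps named in the abstract, a 70:30 split and a Naïve Bayes classifier, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the Laplace smoothing, the random seed, and the toy records are all assumptions made here, and the Weight by Correlation step is omitted. The only figures taken from the abstract are the split sizes (447 training and 191 testing records, which implies 638 records in total).

```python
import math
import random
from collections import Counter, defaultdict


def split_70_30(records, seed=0):
    """Shuffle a copy of the records and split it 70% / 30%."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-feature value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, row):
    """Return the label with the highest log-posterior for one record."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # class prior
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            # Laplace (add-one) smoothing; an assumption, not from the abstract.
            score += math.log((count + 1) / (n + len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records: (income bracket, high school GPA band).
rows = [("low", "low"), ("low", "high"), ("high", "high"),
        ("high", "low"), ("high", "high"), ("low", "low")]
labels = ["no", "no", "yes", "yes", "yes", "no"]
model = train_naive_bayes(rows, labels)

# 638 placeholder records reproduce the split sizes reported in the abstract.
train, test = split_70_30(list(range(638)))
print(len(train), len(test))
print(predict(model, ("high", "high")))
```

Because Naïve Bayes treats the predictors as conditionally independent given the class, training reduces to counting, which is why the sketch needs no numerical library.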

Other Articles