Md. Shamsur Rahim

Work place: Department of Computer Science, Faculty of Information Technology, American International University-Bangladesh



Research Interests: Computational Science and Engineering, Software Construction, Software Engineering, Data Mining, Data Structures and Algorithms


Md. Shamsur Rahim completed his B.Sc. in Computer Science and Software Engineering and his M.Sc. in Computer Science at American International University-Bangladesh in 2014 and 2016, respectively. He currently works as an Assistant Professor in the Department of Computer Science at the same institution. His research interests include data mining, data science, and software engineering.

Author Articles
An Empirical Comparison of Missing Value Imputation Techniques on APS Failure Prediction

By Siam Rafsunjani, Rifat Sultana Safa, Abdullah Al Imran, Md. Shamsur Rahim, Dip Nandi

DOI:, Pub. Date: 8 Feb. 2019

The Air Pressure System (APS) is a function used in heavy vehicles to assist braking and gear changing. The APS failure dataset consists of daily operational sensor data from failed Scania trucks. The dataset is crucial to the manufacturer because it makes it possible to isolate the components that caused a failure. However, missing values and class imbalance are the two most challenging limitations of this dataset when predicting the cause of a failure, and the prediction results can be affected by how these two problems are handled. In this paper, we examine and present the impact of five missing value imputation techniques, namely Expectation Maximization, Mean Imputation, Soft Impute, MICE, and Iterative SVD, in producing significantly better results. We also perform an empirical comparison of their performance by applying five classifiers, namely Naive Bayes, KNN, SVM, Random Forest, and Gradient Boosted Tree, to this highly imbalanced dataset. The primary aim of this study is to observe how these imputation techniques improve the prediction results and, through empirical comparison, to identify the best classification model and imputation technique. We found that MICE imputation and random under-sampling are the most influential techniques for improving prediction performance and the false negative rate.
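Two of the techniques named in this abstract, mean imputation and random under-sampling, can be illustrated with a minimal sketch. This is not the authors' code or pipeline: the helper names `mean_impute` and `random_under_sample`, the toy data, and the choice to balance every class down to the size of the smallest one are all assumptions made for illustration.

```python
import random
from statistics import mean


def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 for a column with no observed values (an assumption).
        col_means.append(mean(observed) if observed else 0.0)
    return [
        [col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until all classes match the smallest."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    target = min(len(indices) for indices in by_class.values())
    keep = []
    for indices in by_class.values():
        keep.extend(rng.sample(indices, target))
    keep.sort()  # preserve the original row order
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data (made up): two sensor readings per row, None marks a missing value.
X = [[1.0, None], [3.0, 4.0], [None, 8.0], [5.0, 6.0], [2.0, 2.0], [4.0, None]]
y = [0, 0, 0, 0, 1, 1]  # class 1 is the minority (failure) class

X_imputed = mean_impute(X)
X_balanced, y_balanced = random_under_sample(X_imputed, y)
```

In this sketch imputation runs before under-sampling so that the column means are computed from all available rows; the abstract does not state which order the study used.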

Other Articles