Rifat Sultana Safa

Work place: Department of Computer Science, Faculty of Information Technology, American International University-Bangladesh

E-mail: rifatsultana96@gmail.com


Research Interests: Computer systems and computational processes, Data Mining, Data Structures and Algorithms


Rifat Sultana Safa is currently pursuing her B.Sc. in Computer Science and Engineering at American International University-Bangladesh. Her research interests include data mining, health and biomedical analytics, and data science.

Author Articles
An Empirical Comparison of Missing Value Imputation Techniques on APS Failure Prediction

By Siam Rafsunjani, Rifat Sultana Safa, Abdullah Al Imran, Md. Shamsur Rahim, Dip Nandi

DOI: https://doi.org/10.5815/ijitcs.2019.02.03, Pub. Date: 8 Feb. 2019

The Air Pressure System (APS) is a function used in heavy vehicles to assist braking and gear changing. The APS failure dataset consists of daily operational sensor data from failed Scania trucks. The dataset is crucial to the manufacturer because it allows the components that caused a failure to be isolated. However, missing values and class imbalance are the two most challenging limitations of this dataset when predicting the cause of a failure, and the prediction results depend on how both problems are handled. In this paper, we examine the impact of five missing value imputation techniques, namely Expectation Maximization, Mean Imputation, Soft Impute, MICE, and Iterative SVD, on the prediction results. We also perform an empirical comparison of their performance by applying five classifiers, namely Naive Bayes, KNN, SVM, Random Forest, and Gradient Boosted Tree, to this highly imbalanced dataset. The primary aim of this study is to observe how these imputation techniques affect the prediction results and to identify the best classification model and imputation technique. We found that MICE imputation and random under-sampling are the most influential techniques for improving prediction performance and the false negative rate.
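As a rough illustration of two of the techniques named in the abstract, the sketch below applies mean imputation followed by random under-sampling to a tiny made-up table. It is not the authors' code: the function names `mean_impute` and `random_under_sample`, the toy values, and the use of `None` to mark a missing reading are all assumptions chosen for this example, and the stronger methods the paper compares (MICE, Soft Impute, and so on) are not shown.

```python
import random
from statistics import mean


def mean_impute(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 if a column has no observed values at all.
        col_means.append(mean(observed) if observed else 0.0)
    return [[col_means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class matches the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    keep = sorted(i for idx in by_class.values() for i in rng.sample(idx, n_min))
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data (hypothetical, not taken from the APS dataset); None marks a missing reading.
X = [[1.0, None], [3.0, 4.0], [None, 8.0], [5.0, 6.0], [2.0, 2.0]]
y = [0, 0, 0, 0, 1]  # class 1 is the rare "failure" class

X_imputed = mean_impute(X)
X_bal, y_bal = random_under_sample(X_imputed, y)
```

After both steps the table has no missing entries and each class contributes the same number of rows, so a classifier trained on `X_bal` and `y_bal` is not dominated by the majority class.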

Other Articles