IJISA Vol. 18, No. 2, 8 Apr. 2026
PP. 85-97
Public Transport (PT), Machine Learning Models, Return Items, Lost Items, Prediction
Public transport (PT) users often leave items behind in the public transport system. Finders who come across these items may keep them maliciously or, out of goodwill, return them. This paper uses six machine learning models, namely logistic regression (LR), support vector machine (SVM), decision tree (DT), random forest (RF), Gaussian naive Bayes (GNB), and k-nearest neighbors (KNN), to predict whether finders return found items. Nine features, including four demographic parameters (age, gender, income, and education), were used as model inputs. The study involved a total of 603 PT users in the Accra cosmopolitan area of Ghana to assess finders' decisions about returning found items. The classification success rates, computed in Python, were 86.740% (LR), 87.293% (SVM), 82.873% (DT), 85.083% (RF), 85.083% (GNB), and 87.845% (KNN). The RF model also performed well when overall performance is weighed against the desired precision and recall. RF, GNB, and LR achieved the highest AUC values (0.78), demonstrating strong discriminative ability in predicting user honesty.
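The abstract compares models by classification success rate, precision, recall, and AUC. As a minimal sketch of how such metrics can be computed, assuming binary labels where 1 means the item was returned, the following Python uses only the standard library. The function names, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions and are not taken from the study's data or code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels (1 = item returned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Share of all cases that were classified correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def precision(y_true, y_pred):
    """Share of predicted returns that were actual returns."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred):
    """Share of actual returns that the model identified."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values, NOT the study's data).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.3, 0.65, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("accuracy :", accuracy(y_true, y_pred))
print("precision:", precision(y_true, y_pred))
print("recall   :", recall(y_true, y_pred))
print("AUC      :", roc_auc(y_true, scores))
```

Accuracy, precision, and recall depend on the chosen threshold, whereas AUC is computed from the raw scores. This is one reason two models can have similar success rates yet differ in AUC.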
Simon A. Ocansey, Makafui Agboyi, Gideon L. Sackitey, AKM K. Islam, "Predicting Public Transport User Honesty: A Machine Learning Approach to Lost Item Returns", International Journal of Intelligent Systems and Applications (IJISA), Vol.18, No.2, pp.85-97, 2026. DOI:10.5815/ijisa.2026.02.06