Profit Forecasting for Daily Pharmaceutical Sales Using Traditional, Shallow, and Deep Neural Networks: A Case Study from Sabha City, Libya

pp. 126-144


Author(s)

Mansour Essgaer 1,*, Asma Agaal 2, Amna Abbas 3, Rabia Al Mamlook 4

1. Artificial Intelligence Department, Faculty of Information Technology, Sebha University, Sebha, Libya

2. Computer Sciences Department, Faculty of Technical Sciences, Sabha, Libya

3. Computer Science Department, Faculty of Science, Sebha, Libya

4. Business Administration, Trine University, Angola, Indiana, United States

* Corresponding author.

DOI: https://doi.org/10.5815/ijieeb.2026.01.08

Received: 20 Jul. 2025 / Revised: 27 Oct. 2025 / Accepted: 20 Dec. 2025 / Published: 8 Feb. 2026

Index Terms

Cost of Sales, Public Pharmacies, Time Series Modeling, Sales Forecasting, Machine Learning, Deep Learning, Comparison of Predictive Models, Sabha, Libya

Abstract

Accurate profit forecasting is critical for small-scale pharmacies, particularly in resource-constrained environments where financial decisions must be both timely and data-informed. This study investigates the predictive performance of sixteen regression models for daily profit forecasting using transactional data collected from a single local pharmacy in Sabha, Libya, over a 14-month period. An exploratory data analysis revealed strongly right-skewed distributions in sales, cost, and profit, as well as pronounced temporal patterns, including seasonal peaks during spring and early summer and weekly profit clustering around weekends. After outlier treatment using the interquartile range method, sixteen regression models were developed and evaluated, encompassing linear models (Linear, Ridge, Lasso, ElasticNet), tree-based models (Decision Tree, Random Forest, Extra Trees, Gradient Boosting, AdaBoost, LightGBM), proximity-based models (K-Nearest Neighbors), kernel-based models (Support Vector Regression), and neural architectures (Multi-Layer Perceptron, Convolutional Neural Network, Long Short-Term Memory, Gated Recurrent Unit). The models were assessed using Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, and the R-squared score. The results consistently showed that tree-based ensemble models, particularly Extra Trees and LightGBM, achieved the highest accuracy, with R² values of 0.978 and 0.975 respectively, significantly outperforming neural and linear models. Learning curves and residual plots further confirmed the superior generalization and robustness of these models. We acknowledge that the dataset size (424 records) and the deterministic relationship between sales, costs, and profit influence these metrics. The study highlights the importance of model selection tailored to domain-specific data characteristics and suggests that well-tuned ensemble methods may offer reliable, interpretable, and scalable solutions for profit forecasting in similar low-resource retail environments. However, broad claims of usefulness for all low-resource settings should be tempered by the limited scope of this dataset. Future work should consider longer-term data and external economic indicators to further improve model reliability, and should focus on operational deployment strategies, investigating how these models can be integrated into daily pharmacy workflows despite real-time data constraints.
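
To make the pipeline described above concrete, the following is a minimal Python sketch, assuming pandas and scikit-learn, that applies the interquartile range rule to the profit column and then trains and scores an Extra Trees regressor with the four reported metrics (MAE, MSE, RMSE, R²). The file name and column names ("pharmacy_daily.csv", "sales", "cost", "profit") are hypothetical placeholders, not the authors' actual data schema or tuned configuration.

import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical daily transaction data with sales, cost, and profit columns.
df = pd.read_csv("pharmacy_daily.csv")

# IQR rule: keep rows whose profit lies within [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
q1, q3 = df["profit"].quantile([0.25, 0.75])
iqr = q3 - q1
df = df[df["profit"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

# Illustrative feature set; the paper notes the deterministic link between
# sales, costs, and profit, so these metrics are optimistic by construction.
X = df[["sales", "cost"]]
y = df["profit"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = ExtraTreesRegressor(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# The four evaluation metrics reported in the study.
mae = mean_absolute_error(y_test, pred)
mse = mean_squared_error(y_test, pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, pred)
print(f"MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")

Swapping ExtraTreesRegressor for any of the other fifteen estimators named in the abstract leaves the outlier treatment and metric computation unchanged, which is how a like-for-like model comparison of this kind is typically organized.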

Cite This Paper

Mansour Essgaer, Asma Agaal, Amna Abbas, Rabia Al Mamlook, "Profit Forecasting for Daily Pharmaceutical Sales Using Traditional, Shallow, and Deep Neural Networks: A Case Study from Sabha City, Libya", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol.18, No.1, pp. 126-144, 2026. DOI: 10.5815/ijieeb.2026.01.08
