A Hybrid Machine Learning Model Optimized with Reinforcement Learning–Enhanced Spider Wasp Optimizer for Customer Value Prediction

Document Type : Research Paper

Authors

1 Professor, Department of Information Technology and Operations Management, Faculty of Management and Accounting, Allameh Tabataba'i University, Tehran, Iran.

2 Ph.D. Candidate, Department of Industrial Management, Faculty of Management, University of Tehran, Tehran, Iran.

3 Department of Management, Faculty of Administrative Sciences and Economics, University of Isfahan, Isfahan, Iran.

4 Department of Management, Allameh Askari International University, Tehran, Iran.

DOI: 10.22111/ijbds.2026.53633.2294

Abstract

Accurately estimating a customer's value is one of the more important tasks banks face today, given the large number of customers and the complexity and high variance of their transactions. To address this need, we develop a multi-layer stacked ensemble model designed to improve customer value prediction for banking customers in Iran. The first layer consists of four base learners (XGBoost, CatBoost, Random Forest, and Gradient Boosting). Each has its own learning capacity and captures customer behavior and financial characteristics in a complementary way. The second layer is a LightGBM classifier that acts as the meta-model, fusing the outputs of the first-layer learners into the final prediction. The model hyperparameters were then optimized with a Reinforcement Learning (RL)-enhanced Spider Wasp Optimizer (SWO), which searches a high-dimensional hyperparameter space that classical optimization strategies typically explore poorly. Under repeated 5-fold stratified cross-validation, the model achieved Accuracy = 89.70%, Precision = 92.84%, Recall = 92.46%, F-Score = 92.61%, and ROC AUC = 0.9632, all of which surpass the corresponding results of the single models. These results support the use of a multi-layer ensemble with metaheuristic hyperparameter optimization as a viable and powerful customer valuation tool for banks.
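The evaluation protocol named in the abstract (repeated 5-fold stratified cross-validation, scored with accuracy, precision, recall, and F-score) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the number of repeats (3), the random seed, and the toy labels are assumptions made for illustration only, and a majority-class predictor stands in for the stacked ensemble.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=5, n_repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated stratified k-fold CV.

    n_repeats and seed are illustrative assumptions; the abstract only
    states that the cross-validation is repeated, 5-fold, and stratified.
    """
    rng = random.Random(seed)
    for _ in range(n_repeats):
        # Group sample indices by class so each class is split separately.
        by_class = defaultdict(list)
        for i, y in enumerate(labels):
            by_class[y].append(i)
        folds = [[] for _ in range(n_splits)]
        for idxs in by_class.values():
            rng.shuffle(idxs)
            # Deal each class round-robin so folds keep similar class ratios.
            for pos, i in enumerate(idxs):
                folds[pos % n_splits].append(i)
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
            yield train, test


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy labels (hypothetical): 40 positive and 60 negative customers.
    labels = [1] * 40 + [0] * 60
    scores = []
    for train_idx, test_idx in repeated_stratified_kfold(labels):
        # Placeholder model: always predict the majority class of the
        # training fold. A real run would fit the ensemble here instead.
        train_labels = [labels[i] for i in train_idx]
        majority = max(set(train_labels), key=train_labels.count)
        y_true = [labels[i] for i in test_idx]
        y_pred = [majority] * len(test_idx)
        scores.append(binary_metrics(y_true, y_pred)["accuracy"])
    print("mean accuracy over all folds:", sum(scores) / len(scores))
```

Averaging each metric over all folds and repeats, as in the last line, is one common way to report a single figure per metric; the abstract does not state how the authors aggregated their scores.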

Keywords

