Novel hybrid firefly algorithm: an application to enhance XGBoost tuning for intrusion detection classification

. 2022 ; 8 () : e956. [epub] 20220429

Status: PubMed-not-MEDLINE; Language: English; Country: United States; Medium: electronic-ecollection

Document type: journal articles

Persistent link: https://www.medvik.cz/link/pmid35634110

This article presents a novel improved version of the widely adopted firefly algorithm and applies it to tuning and optimising XGBoost classifier hyper-parameters for network intrusion detection. One of the greatest issues in the domain of network intrusion detection systems is the relatively high rate of false positives and false negatives. The proposed study addresses this challenge by using an XGBoost classifier optimised with the improved firefly algorithm. Following established practice in the modern literature, the proposed improved firefly algorithm was first validated on 28 well-known CEC2013 benchmark instances, and a comparative analysis with the original firefly algorithm and other state-of-the-art metaheuristics was conducted. Afterwards, the devised method was adopted for XGBoost hyper-parameter optimisation, and the tuned classifier was tested on the widely used NSL-KDD benchmark dataset and the more recent UNSW-NB15 dataset for network intrusion detection. The experimental results obtained show that the proposed metaheuristic has significant potential in tackling the machine learning hyper-parameter optimisation challenge and that it can be used to improve the classification accuracy and average precision of network intrusion detection systems.
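For orientation, the basic firefly algorithm that the abstract builds on can be sketched as follows. This is a minimal illustration of the original, unmodified scheme (brighter fireflies attract dimmer ones, with attractiveness decaying with distance), not the improved hybrid variant proposed in the article, whose details are not given in this record. The function name `firefly_minimize`, the `sphere` test objective, and all parameter defaults are assumptions chosen for the example.

```python
import math
import random


def sphere(x):
    """Toy objective (assumed for illustration): sum of squares, minimum at 0."""
    return sum(v * v for v in x)


def firefly_minimize(objective, bounds, n_fireflies=15, n_iterations=100,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Minimise `objective` inside the box `bounds` with a basic firefly search.

    bounds is a list of (low, high) pairs, one per dimension.
    alpha scales the random step, beta0 is the attractiveness at distance 0,
    gamma controls how fast attractiveness decays with squared distance.
    """
    rng = random.Random(seed)
    # Random initial population inside the search box.
    swarm = [[rng.uniform(lo, hi) for lo, hi in bounds]
             for _ in range(n_fireflies)]
    cost = [objective(x) for x in swarm]

    for _ in range(n_iterations):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter (lower cost)
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    for d, (lo, hi) in enumerate(bounds):
                        noise = alpha * (rng.random() - 0.5) * (hi - lo)
                        moved = swarm[i][d] + beta * (swarm[j][d] - swarm[i][d]) + noise
                        swarm[i][d] = min(max(moved, lo), hi)  # stay in the box
                    cost[i] = objective(swarm[i])
        alpha *= 0.97  # shrink the random step over time

    best = min(range(n_fireflies), key=cost.__getitem__)
    return swarm[best], cost[best]


if __name__ == "__main__":
    best_x, best_cost = firefly_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
    print(best_x, best_cost)
```

For hyper-parameter tuning, one would presumably replace `sphere` with a function that maps a candidate vector (for example learning rate and tree depth) to a validation error of the classifier; that mapping is an assumption here and is not described in this record.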


