Novel hybrid firefly algorithm: an application to enhance XGBoost tuning for intrusion detection classification
Status: PubMed-not-MEDLINE; Language: English; Country: United States; Medium: electronic-ecollection
Document type: journal article
PubMed: 35634110
PubMed Central: PMC9137854
DOI: 10.7717/peerj-cs.956
PII: cs-956
- Keywords
- Benchmark, Firefly algorithm, Intrusion detection, Machine learning, Optimisation
- Publication type
- journal articles
This article presents a novel improved version of the widely adopted firefly algorithm and applies it to tuning and optimising XGBoost classifier hyper-parameters for network intrusion detection. One of the greatest issues in the domain of network intrusion detection systems is the relatively high rate of false positives and false negatives. The proposed study addresses this challenge by using an XGBoost classifier optimised with the improved firefly algorithm. Following established practice in the recent literature, the proposed improved firefly algorithm was first validated on 28 well-known CEC2013 benchmark instances, and a comparative analysis with the original firefly algorithm and other state-of-the-art metaheuristics was conducted. Afterwards, the devised method was adopted for XGBoost hyper-parameter optimisation, and the tuned classifier was tested on the widely used NSL-KDD benchmark dataset and the more recent UNSW-NB15 dataset for network intrusion detection. The experimental results show that the proposed metaheuristic has significant potential for tackling the machine learning hyper-parameter optimisation challenge and that it can be used to improve the classification accuracy and average precision of network intrusion detection systems.
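For orientation, the kind of search loop the abstract refers to can be sketched as follows. This is a minimal illustration of the basic firefly algorithm only, not the improved variant proposed in the article. The function name `firefly_minimize`, the parameter values (`alpha`, `beta0`, `gamma`, the 0.97 decay factor) and the toy `sphere` objective are all assumptions chosen for the example. In a hyper-parameter tuning setting, the objective would instead return something like a cross-validated classification error for a candidate hyper-parameter vector.

```python
import math
import random


def sphere(x):
    """Toy objective (sum of squares); a stand-in for a validation-error measure."""
    return sum(v * v for v in x)


def firefly_minimize(objective, bounds, n_fireflies=15, n_iter=100,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Basic firefly algorithm for minimisation inside box bounds (illustrative)."""
    rng = random.Random(seed)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_fireflies)]
    cost = [objective(p) for p in pop]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter, so i moves towards j
                    r2 = sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                    # Attractiveness decays with squared distance.
                    beta = beta0 * math.exp(-gamma * r2)
                    for d, (lo, hi) in enumerate(bounds):
                        step = alpha * (rng.random() - 0.5) * (hi - lo)
                        moved = pop[i][d] + beta * (pop[j][d] - pop[i][d]) + step
                        pop[i][d] = min(max(moved, lo), hi)  # clamp to the bounds
                    cost[i] = objective(pop[i])
        alpha *= 0.97  # assumed schedule: gradually shrink the random step

    best = min(range(n_fireflies), key=cost.__getitem__)
    return pop[best], cost[best]


best_x, best_cost = firefly_minimize(sphere, [(-5.0, 5.0)] * 2)
```

Because the currently brightest firefly never moves, the best cost in the population cannot get worse from one iteration to the next. Each dimension of `bounds` would correspond to one tuned hyper-parameter (for example a learning rate or a maximum tree depth), with integer-valued parameters rounded inside the objective.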