CESNET-TLS-Year22: A year-spanning TLS network traffic dataset from backbone lines
Status: PubMed-not-MEDLINE; Language: English; Country: United Kingdom, England; Medium: electronic
Document type: journal articles
Grant support
VJ02010024
Ministerstvo Vnitra České Republiky (Ministry of the Interior of the Czech Republic)
PubMed: 39424641
PubMed Central: PMC11489426
DOI: 10.1038/s41597-024-03927-4
PII: 10.1038/s41597-024-03927-4
- Publication type
- journal articles (MeSH)
The modern approach for network traffic classification (TC), which is an important part of operating and securing networks, is to use machine learning (ML) models that are able to learn intricate relationships between traffic characteristics and communicating applications. A crucial prerequisite is having representative datasets. However, datasets collected from real production networks are not being published in sufficient numbers. Thus, this paper presents a novel dataset, CESNET-TLS-Year22, that captures the evolution of TLS traffic in an ISP network over a year. The dataset contains 180 web service labels and standard TC features, such as packet sequences. The unique year-long time span enables comprehensive evaluation of TC models and assessment of their robustness in the face of the ever-changing environment of production networks.