CESNET-TLS-Year22: A year-spanning TLS network traffic dataset from backbone lines

2024 Oct 18; 11(1): 1156. [epub] 20241018

Status: PubMed-not-MEDLINE. Language: English. Country: Great Britain, England. Medium: electronic.

Document type: journal article

Persistent link: https://www.medvik.cz/link/pmid39424641

Grant support
VJ02010024 Ministerstvo Vnitra České Republiky (Ministry of the Interior of the Czech Republic)

Links

PubMed 39424641
PubMed Central PMC11489426
DOI 10.1038/s41597-024-03927-4
PII: 10.1038/s41597-024-03927-4

The modern approach for network traffic classification (TC), which is an important part of operating and securing networks, is to use machine learning (ML) models that are able to learn intricate relationships between traffic characteristics and communicating applications. A crucial prerequisite is having representative datasets. However, datasets collected from real production networks are not being published in sufficient numbers. Thus, this paper presents a novel dataset, CESNET-TLS-Year22, that captures the evolution of TLS traffic in an ISP network over a year. The dataset contains 180 web service labels and standard TC features, such as packet sequences. The unique year-long time span enables comprehensive evaluation of TC models and assessment of their robustness in the face of the ever-changing environment of production networks.

Erratum in

PubMed


Wang, C., Finamore, A., Yang, L., Fauvel, K. & Rossi, D. AppClassNet: A commercial-grade dataset for application identification research. SIGCOMM Comput. Commun. Rev. 52, 19–27, 10.1145/3561954.3561958 (2022).

Luxemburk, J. & Čejka, T. Fine-grained TLS services classification with reject option. Computer Networks 220, 109467, 10.1016/j.comnet.2022.109467 (2023).

Luxemburk, J., Hynek, K., Čejka, T., Lukačovič, A. & Šiška, P. CESNET-QUIC22: A large one-month QUIC network traffic dataset from backbone lines. Data in Brief 46, 108888, 10.1016/j.dib.2023.108888 (2023).

Hynek, K., Luxemburk, J., Pešek, J., Čejka, T. & Šiška, P. CESNET-TLS-Year22: A year-spanning TLS network traffic dataset from backbone lines. Zenodo, 10.5281/zenodo.10608607 (2024).

Pendlebury, F., Pierazzi, F., Jordaney, R., Kinder, J. & Cavallaro, L. TESSERACT: Eliminating experimental bias in malware classification across space and time. In 28th USENIX Security Symposium (USENIX Security 19), 729–746 (USENIX Association, Santa Clara, CA, 2019).

Bovenzi, G. et al. Benchmarking class incremental learning in deep learning traffic classification. IEEE Transactions on Network and Service Management 1–1, 10.1109/TNSM.2023.3287430 (2023).

Bovenzi, G., Monda, D. D., Montieri, A., Persico, V. & Pescape, A. META MIMETIC: Few-shot classification of mobile-app encrypted traffic via multimodal meta-learning. In 35th International Teletraffic Congress (ITC-35), 1–9 (Torino, Italy, 2023).

Guarino, I., Wang, C., Finamore, A., Pescapè, A. & Rossi, D. Many or few samples?: Comparing transfer, contrastive and meta-learning in encrypted traffic classification. In 2023 7th Network Traffic Measurement and Analysis Conference (TMA), 1–10, 10.23919/TMA58422.2023.10198965 (2023).

Hofstede, R. et al. Flow monitoring explained: From packet capture to data analysis with NetFlow and IPFIX. IEEE Communications Surveys & Tutorials 16, 2037–2064, 10.1109/COMST.2014.2321898 (2014).

Aitken, P., Claise, B. & Trammell, B. Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of Flow Information. RFC 7011, 10.17487/RFC7011 (2013).

Čejka, T., Bartoš, V., Švepeš, M., Rosa, Z. & Kubátová, H. NEMEA: A framework for network traffic analysis. In 2016 12th International Conference on Network and Service Management (CNSM), 195–201, 10.1109/CNSM.2016.7818417 (2016).

Claise, B., Quittek, J., Meyer, J., Bryant, S. & Aitken, P. Information Model for IP Flow Information Export. RFC 5102, 10.17487/RFC5102 (2008).

Beneš, T., Pešek, J. & Čejka, T. Look at my network: An insight into the ISP backbone traffic. In 2023 19th International Conference on Network and Service Management (CNSM), 1–7, 10.23919/CNSM59352.2023.10327823 (2023).

Luxemburk, J., Hynek, K. & Čejka, T. Encrypted traffic classification: the QUIC case. In 2023 7th Network Traffic Measurement and Analysis Conference (TMA), 1–10, 10.23919/TMA58422.2023.10199052 (2023).

Husák, M., Laštovička, M. & Plesník, T. Handling internet activism during the Russian invasion of Ukraine: A campus network perspective. Digital Threats 3, 10.1145/3534566 (2022).

Luxemburk, J. & Hynek, K. DataZoo: Streamlining traffic classification experiments. In Proceedings of the 2023 on Explainable and Safety Bounded, Fidelitous, Machine Learning for Networking, SAFE ’23, 3–7, 10.1145/3630050.3630176 (Association for Computing Machinery, New York, NY, USA, 2023).
