An encrypted network video stream dataset
Status PubMed-not-MEDLINE Language English Country Netherlands Medium electronic-ecollection
Document type journal article
PubMed
37456120
PubMed Central
PMC10338327
DOI
10.1016/j.dib.2023.109335
PII: S2352-3409(23)00453-5
- Keywords
- Encrypted, Identification, Machine learning, Video stream
- Publication type
- journal article MeSH
Most video content on the Internet today is distributed through online streaming platforms. To protect user privacy, data transmissions are often encrypted using cryptographic protocols. In previous research, we experimentally validated the idea that the amount of transmitted data belonging to a particular video stream is not constant over time, or that it changes periodically, and that it forms a specific fingerprint. Once the fingerprint of a specific video stream is known, that stream can subsequently be identified. Over several months of intensive work, our team created a large dataset of video streams captured by network traffic probes during playback by end users. The video streams were deliberately chosen to fall thematically into pre-selected categories. We selected two primary streaming platforms, PeerTube and YouTube. The first was chosen because it allows any streaming parameter to be modified, the second because it is used by many people worldwide. Our dataset can be used to create and train machine learning models or heuristic algorithms that identify encrypted video streams by content category or type category, or as specific individual streams.
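The fingerprint idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method and does not reflect the dataset's actual file format: the `(timestamp, size)` packet tuples, the one-second interval, the function names, and the use of Pearson correlation as the matching score are all assumptions chosen for illustration.

```python
from math import sqrt


def bytes_per_interval(packets, interval=1.0):
    """Sum packet sizes into fixed-width time bins.

    `packets` is an iterable of (timestamp_seconds, size_bytes) tuples;
    this input shape is an assumption, not the dataset's real format.
    """
    packets = sorted(packets)
    if not packets:
        return []
    start = packets[0][0]
    n_bins = int((packets[-1][0] - start) // interval) + 1
    series = [0] * n_bins
    for ts, size in packets:
        series[int((ts - start) // interval)] += size
    return series


def pearson(a, b):
    """Pearson correlation of two series, truncated to the shorter length."""
    n = min(len(a), len(b))
    if n < 2:
        return 0.0
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant series carries no usable pattern
    return cov / sqrt(var_a * var_b)


def identify(observed, known_fingerprints):
    """Return the name of the known fingerprint most correlated with `observed`."""
    return max(known_fingerprints,
               key=lambda name: pearson(observed, known_fingerprints[name]))


# Hypothetical fingerprints (bytes per second) for two made-up streams.
known = {
    "video_a": [10, 200, 10, 200],
    "video_b": [100, 100, 120, 90],
}
observed = [12, 190, 15, 210]
best_match = identify(observed, known)
```

Correlation is only one possible similarity measure; a trained classifier or a distance tolerant of time shifts could take its place without changing the binning step.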
Faculty of Information Technology, Department of Computer Systems, Czech Technical University, Prague
Faculty of Science, Department of Informatics, University of South Bohemia in České Budějovice
Wu H., Yu Z., Cheng G., Guo S. IEEE INFOCOM 2020 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Toronto, ON, Canada. 2020. Identification of encrypted video streaming based on differential fingerprints; pp. 74–79.
Afandi W., Bukhari S.M.A.H., Khan M.U.S., Maqsood T., Khan S.U. Fingerprinting technique for YouTube videos identification in network traffic. IEEE Access. 2022;10:76731–76741. doi: 10.1109/ACCESS.2022.3192458.
Schuster R., Shmatikov V., Tromer E. USENIX Security Symposium. 2017. Beauty and the Burst: remote identification of encrypted video streams; pp. 1357–1374.
Khan Muhammad Usman Shahid, Maqsood Tahir, Bukhari Syed Muhammad Ammar Hassan, Hassan Syed Shouzeb, Afandi Waleed, Kamal Ali Sher. Video identification in encrypted network traffic dataset (VPN). IEEE Dataport. 2022. doi: 10.21227/tzc8-1f71.
Draper-Gil G., Lashkari A.H., Mamun M., Ghorbani A.A. Proceedings of the 2nd International Conference on Information Systems Security and Privacy (ICISSP 2016), Rome, Italy. 2016. Characterization of encrypted and VPN traffic using time-related features; pp. 407–414.
Loh F., Wamser F., Poignée F., et al. YouTube dataset on mobile streaming for internet traffic modeling and streaming analysis. Sci. Data. 2022;9:293. doi: 10.1038/s41597-022-01418-y.
Apache Software Foundation, Apache Kafka, 2023 https://kafka.apache.org/, available online.
yt-dlp team, yt-dlp, 2023 https://github.com/yt-dlp/yt-dlp, available online.
Selenium team, Selenium, 2023 https://www.selenium.dev/, available online.
Chromium and WebDriver teams, ChromeDriver, 2023 https://chromedriver.chromium.org/, available online.
Framasoft team, PeerTube platform, 2023 https://joinpeertube.org/, available online.