Fully Automated DCNN-Based Thermal Images Annotation Using Neural Network Pretrained on RGB Data
Status: PubMed-not-MEDLINE; Language: English; Country: Switzerland; Medium: electronic
Document type: journal articles
PubMed: 33672344
PubMed Central: PMC7926581
DOI: 10.3390/s21041552
PII: s21041552
- Keywords: IR, RGB, YOLO, data annotation, deep convolutional neural networks, object detector, thermal, transfer learning
- Publication type: journal articles (MeSH)
One of the biggest challenges in training deep neural networks is the need for massive data annotation. Training a neural network for object detection requires millions of annotated training images. However, there are currently no large-scale thermal image datasets that could be used to train state-of-the-art neural networks, while voluminous RGB image datasets are available. This paper presents a method for creating hundreds of thousands of annotated thermal images using an object detector pre-trained on RGB data. A dataset created in this way can be used to train object detectors with improved performance. The main contribution of this work is a novel method for fully automatic thermal image labeling. The proposed system uses an RGB camera, a thermal camera, a 3D LiDAR, and a neural network pre-trained to detect objects in the RGB domain. With this setup, a fully automated process can annotate the thermal images and build an automatically annotated thermal training dataset. As a result, we created a dataset containing hundreds of thousands of annotated objects. This approach allows deep learning models to be trained with performance similar to that of common human-annotation-based methods. The paper also proposes several improvements for fine-tuning the results with minimal human intervention. Finally, the evaluation of the proposed solution shows that the method gives significantly better results than training the neural network on standard small-scale hand-annotated thermal image datasets.
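For illustration, the sketch below shows the core geometric step such a pipeline relies on: bounding boxes produced by an RGB-pretrained detector (e.g., a COCO-trained YOLO model) are transferred into the thermal image by projecting the same LiDAR points into both cameras. This is a minimal sketch under assumed inputs; the calibration matrices (K_RGB, K_THERM, T_LIDAR_TO_RGB, T_LIDAR_TO_THERM), the box format, and the point-count threshold are hypothetical placeholders and do not reproduce the authors' actual calibration or Atlas Fusion code.

```python
# Minimal sketch: transfer RGB detections to the thermal frame via LiDAR depth.
# All calibration values below are placeholders, not the authors' calibration.
import numpy as np

# Hypothetical intrinsics (3x3) and LiDAR-to-camera extrinsics (3x4).
K_RGB = np.array([[1200.0, 0.0, 960.0], [0.0, 1200.0, 600.0], [0.0, 0.0, 1.0]])
K_THERM = np.array([[735.0, 0.0, 320.0], [0.0, 735.0, 256.0], [0.0, 0.0, 1.0]])
T_LIDAR_TO_RGB = np.eye(4)[:3, :]      # placeholder rigid transform
T_LIDAR_TO_THERM = np.eye(4)[:3, :]    # placeholder rigid transform


def project(points_lidar, K, T):
    """Project Nx3 LiDAR points into a camera; return (N,2) pixels and (N,) depths."""
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    cam = (T @ pts_h.T).T                              # LiDAR frame -> camera frame
    depth = cam[:, 2]
    pix = (K @ cam.T).T
    pix = pix[:, :2] / np.clip(pix[:, 2:3], 1e-6, None)
    return pix, depth


def rgb_boxes_to_thermal(boxes_rgb, points_lidar):
    """Transfer RGB boxes (x1, y1, x2, y2, class_id) to thermal-image boxes."""
    pix_rgb, d_rgb = project(points_lidar, K_RGB, T_LIDAR_TO_RGB)
    pix_th, d_th = project(points_lidar, K_THERM, T_LIDAR_TO_THERM)
    valid = (d_rgb > 0.1) & (d_th > 0.1)   # keep points in front of both cameras

    thermal_boxes = []
    for x1, y1, x2, y2, cls in boxes_rgb:
        # LiDAR points whose RGB projection falls inside the detected box
        inside = (valid &
                  (pix_rgb[:, 0] >= x1) & (pix_rgb[:, 0] <= x2) &
                  (pix_rgb[:, 1] >= y1) & (pix_rgb[:, 1] <= y2))
        if inside.sum() < 5:               # too few LiDAR hits to localize the object
            continue
        p = pix_th[inside]
        thermal_boxes.append((p[:, 0].min(), p[:, 1].min(),
                              p[:, 0].max(), p[:, 1].max(), cls))
    return thermal_boxes
```

The transferred boxes can then be written out in a standard annotation format (e.g., YOLO text files) and used to train a thermal-domain detector without manual labeling.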