Contrastive Learning for Image Registration in Visual Teach and Repeat Navigation
Language English Country Switzerland Medium electronic
Document type journal articles
Grant support
20-27034J
Czech Science Foundation
PubMed
35458959
PubMed Central
PMC9030179
DOI
10.3390/s22082975
PII: s22082975
Knihovny.cz E-resources
- Keywords
- contrastive learning, image representations, long-term autonomy, machine learning, visual teach and repeat navigation
- MeSH
- neural networks * MeSH
- robotics * methods MeSH
- Publication type
- journal articles MeSH
Visual teach and repeat navigation (VT&R) is popular in robotics thanks to its simplicity and versatility. It enables mobile robots equipped with a camera to traverse learned paths without the need to create globally consistent metric maps. Although teach and repeat frameworks have been reported to be relatively robust to changing environments, they still struggle with day-to-night and seasonal changes. This paper aims to find the horizontal displacement between prerecorded and currently perceived images required to steer a robot towards the previously traversed path. We employ a fully convolutional neural network to obtain dense representations of the images that are robust to changes in the environment and variations in illumination. The proposed model achieves state-of-the-art performance on multiple datasets with seasonal and day/night variations. In addition, our experiments show that it is possible to use the model to generate additional training examples that can be used to further improve the original model's robustness. We also conducted a real-world experiment on a mobile robot to demonstrate the suitability of our method for VT&R.
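As a rough illustration of the displacement search described in the abstract, the sketch below slides per-column descriptors of the currently perceived image across those of the prerecorded image and keeps the horizontal shift with the highest mean similarity. It is not the paper's network: the descriptors here are plain lists of floats standing in for the learned dense representations, and the names `column_similarity`, `estimate_horizontal_shift`, and `max_shift` are hypothetical.

```python
from typing import List, Sequence


def column_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two per-column descriptors."""
    return sum(x * y for x, y in zip(a, b))


def estimate_horizontal_shift(
    map_cols: List[Sequence[float]],
    live_cols: List[Sequence[float]],
    max_shift: int,
) -> int:
    """Return the shift s (in columns) for which live_cols[i + s] best
    matches map_cols[i], scored by mean dot product over the overlap.

    Assumption: each image has already been reduced to one descriptor per
    column (in the paper this role is played by a learned representation).
    """
    best_shift, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        total, count = 0.0, 0
        for i, map_desc in enumerate(map_cols):
            j = i + shift
            if 0 <= j < len(live_cols):
                total += column_similarity(map_desc, live_cols[j])
                count += 1
        if count == 0:
            continue  # no overlap at this shift
        score = total / count  # normalise so small overlaps are not penalised
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


def demo() -> int:
    """Toy example: the live view is the map view moved two columns right."""
    width = 8
    # Hypothetical descriptors: a distinct one-hot vector per map column.
    map_cols = [[1.0 if k == i else 0.0 for k in range(width)] for i in range(width)]
    blank = [0.0] * width
    live_cols = [blank, blank] + map_cols[: width - 2]
    return estimate_horizontal_shift(map_cols, live_cols, max_shift=4)
```

In this toy setup `demo()` recovers a shift of two columns. In a teach-and-repeat loop, a shift of this kind would be the quantity used to steer the robot back towards the taught path; the sign convention and any gain applied to it are left open here.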