Plant recognition by AI: Deep neural nets, transformers, and kNN in deep embeddings
Status: PubMed-not-MEDLINE. Language: English. Country: Switzerland. Medium: electronic-ecollection
Document type: journal article
PubMed
36237508
PubMed Central
PMC9551576
DOI
10.3389/fpls.2022.787527
- Keywords
- classification, computer vision, fine-grained, machine learning, plant, recognition, species, species recognition
- Publication type
- journal article (MeSH)
The article reviews and benchmarks machine learning methods for automatic image-based plant species recognition and proposes a novel retrieval-based method: recognition by nearest-neighbor classification in a deep embedding space. The retrieval method relies on a model trained with the Recall@k surrogate loss. State-of-the-art image classification approaches based on Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) are benchmarked and compared with the proposed retrieval-based method, and the impact of performance-enhancing techniques, e.g., class prior adaptation, image augmentations, learning rate scheduling, and loss functions, is studied. The evaluation is carried out on PlantCLEF 2017, ExpertLifeCLEF 2018, and iNaturalist 2018, the largest publicly available datasets for plant recognition. The evaluation of CNN and ViT classifiers shows a gradual improvement in classification accuracy. The current state-of-the-art Vision Transformer, ViT-Large/16, achieves 91.15% and 83.54% accuracy on the PlantCLEF 2017 and ExpertLifeCLEF 2018 test sets, respectively, reducing the error rate of the best CNN model (ResNeSt-269e) by 22.91% and 28.34%. In addition, further training tricks improved ViT-Base/32 performance by 3.72% on ExpertLifeCLEF 2018 and by 4.67% on PlantCLEF 2017. The retrieval approach achieved superior performance in all measured scenarios, with accuracy margins of 0.28%, 4.13%, and 10.25% on ExpertLifeCLEF 2018, PlantCLEF 2017, and iNat2018-Plantae, respectively.
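The retrieval-based recognition the abstract describes reduces to nearest-neighbor classification over embedding vectors. A minimal sketch of that step is below; the embedding model itself (trained with the Recall@k surrogate loss in the article) is abstracted away, and random vectors stand in for real embeddings, so the species labels here are purely hypothetical.

```python
import numpy as np

def knn_classify(query_emb, gallery_embs, gallery_labels, k=5):
    """Classify a query embedding by majority vote among its k nearest
    gallery embeddings under cosine similarity."""
    # L2-normalize so a dot product equals cosine similarity
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    sims = g @ q                    # similarity of the query to every gallery item
    top_k = np.argsort(-sims)[:k]   # indices of the k most similar gallery items
    classes, counts = np.unique(gallery_labels[top_k], return_counts=True)
    return classes[np.argmax(counts)]

# Stand-in data: 100 gallery "embeddings" of 5 hypothetical species
rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 8))
labels = rng.integers(0, 5, size=100)
pred = knn_classify(gallery[0] + 0.01 * rng.normal(size=8), gallery, labels)
```

In the article's setting the gallery would hold embeddings of the training images and the query an embedding of a test image; approximate-nearest-neighbor indexing would replace the brute-force similarity scan at scale.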
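One of the performance-enhancing techniques studied, class prior adaptation, follows the EM procedure of Saerens et al. (2002), cited below: test-time class priors are re-estimated from the classifier's own posteriors and the predictions rescaled accordingly. A sketch under the assumption that the classifier outputs calibrated posteriors (the toy posteriors here are random, for illustration only):

```python
import numpy as np

def em_prior_adaptation(posteriors, train_priors, n_iter=50):
    """EM re-estimation of test-time class priors (Saerens et al., 2002).

    posteriors:   (N, C) classifier outputs p(c|x) on the test set
    train_priors: (C,)   class priors of the training set
    Returns the adjusted posteriors and the estimated test-time priors.
    """
    test_priors = train_priors.copy()
    for _ in range(n_iter):
        # E-step: reweight each posterior by the ratio of prior estimates
        adj = posteriors * (test_priors / train_priors)
        adj /= adj.sum(axis=1, keepdims=True)
        # M-step: new prior estimate = average adjusted posterior
        test_priors = adj.mean(axis=0)
    return adj, test_priors

# Toy usage with random 3-class posteriors and uniform training priors
rng = np.random.default_rng(1)
post = rng.dirichlet(np.ones(3), size=200)
adj, priors = em_prior_adaptation(post, np.full(3, 1.0 / 3.0))
```

This matters for plant recognition because species frequencies at test time (e.g., in a mobile app's query stream) rarely match the long-tailed training distribution.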
References
Belhumeur P. N., Chen D., Feiner S., Jacobs D. W., Kress W. J., Ling H., et al. (2008). "Searching the world's Herbaria: a system for visual identification of plant species," in Computer Vision - ECCV 2008 (Berlin; Heidelberg: Springer), 116–129. doi: 10.1007/978-3-540-88693-8_9
Bonnet P., Goëau H., Hang S. T., Lasseck M., Šulc M., Malécot V., et al. (2018). "Plant identification: experts vs. machines in the era of deep learning," in Multimedia Tools and Applications for Environmental & Biodiversity Informatics (Cham: Springer International Publishing), 131–149. doi: 10.1007/978-3-319-76445-0_8
Buslaev A., Iglovikov V. I., Khvedchenya E., Parinov A., Druzhinin M., Kalinin A. A. (2020). Albumentations: fast and flexible image augmentations. Information 11, 125. doi: 10.3390/info11020125
Caglayan A., Guclu O., Can A. B. (2013). "A plant recognition approach using shape and color features in leaf images," in International Conference on Image Analysis and Processing (Berlin; Heidelberg: Springer), 161–170. doi: 10.1007/978-3-642-41184-7_17
Cui Y., Song Y., Sun C., Howard A., Belongie S. (2018). "Large scale fine-grained categorization and domain-specific transfer learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (Salt Lake City, UT). doi: 10.1109/CVPR.2018.00432
Dosovitskiy A., Beyer L., Kolesnikov A., Weissenborn D., Zhai X., Unterthiner T., et al. (2021). "An image is worth 16x16 words: transformers for image recognition at scale," in International Conference on Learning Representations (Vienna).
Garcin C., Joly A., Bonnet P., Lombardo J.-C., Affouard A., Chouet M., et al. (2021). "Pl@ntnet-300k: a plant image dataset with high label ambiguity and a long-tailed distribution," in NeurIPS 2021 - 35th Conference on Neural Information Processing Systems, eds J. Vanschoren and S. Yeung. Available online at: https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/file/7e7757b1e12abcb736ab9a754ffb617a-Paper-round2.pdf
Gaston K. J., O'Neill M. A. (2004). Automated species identification: why not? Philos. Trans. R. Soc. Lond. Ser. B Biol. Sci. 359, 655–667. doi: 10.1098/rstb.2003.1442
Ghazi M. M., Yanikoglu B., Aptoula E. (2017). Plant identification using deep neural networks via optimization of transfer learning parameters. Neurocomputing 235, 228–235. doi: 10.1016/j.neucom.2017.01.018
Goëau H., Bonnet P., Joly A. (2016). "Plant identification in an open-world (LifeCLEF 2016)," in CLEF Working Notes 2016 (Évora).
Goëau H., Bonnet P., Joly A. (2017). "Plant identification based on noisy web data: the amazing performance of deep learning (LifeCLEF 2017)," in CEUR Workshop Proceedings (Dublin).
Goëau H., Bonnet P., Joly A. (2018). "Overview of ExpertLifeCLEF 2018: how far automated identification systems are from the best experts?" in CLEF Working Notes 2018 (Avignon).
Goëau H., Bonnet P., Joly A. (2019). "Overview of LifeCLEF plant identification task 2019: diving into data deficient tropical countries," in CLEF 2019 - Conference and Labs of the Evaluation Forum (Lugano: CEUR), 1–13.
Goëau H., Bonnet P., Joly A. (2020). "Overview of LifeCLEF plant identification task 2020," in CLEF Task Overview 2020, CLEF: Conference and Labs of the Evaluation Forum (Thessaloniki).
Goëau H., Bonnet P., Joly A. (2021). "Overview of PlantCLEF 2021: cross-domain plant identification," in Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum (Bucharest).
Goodfellow I., Bengio Y., Courville A. (2016). Deep Learning. MIT Press. Available online at: http://www.deeplearningbook.org
He K., Zhang X., Ren S., Sun J. (2016). "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Las Vegas, NV), 770–778. doi: 10.1109/CVPR.2016.90
Hu J., Shen L., Sun G. (2018). "Squeeze-and-excitation networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Salt Lake City, UT), 7132–7141. doi: 10.1109/CVPR.2018.00745
Joly A., Goëau H., Botella C., Glotin H., Bonnet P., Planqué R., et al. (2018). "Overview of LifeCLEF 2018: a large-scale evaluation of species identification and recommendation algorithms in the era of AI," in Proceedings of CLEF 2018 (Cham: Springer International Publishing), 247–266. doi: 10.1007/978-3-319-98932-7_24
Joly A., Goëau H., Botella C., Kahl S., Servajean M., Glotin H., et al. (2019). "Overview of LifeCLEF 2019: identification of Amazonian plants, South & North American birds, and niche prediction," in International Conference of the Cross-Language Evaluation Forum for European Languages (Berlin; Heidelberg: Springer), 387–401. doi: 10.1007/978-3-030-28577-7_29
Joly A., Goëau H., Kahl S., Deneu B., Servajean M., Cole E., et al. (2020). "Overview of LifeCLEF 2020: a system-oriented evaluation of automated species identification and species distribution prediction," in International Conference of the Cross-Language Evaluation Forum for European Languages (Cham: Springer), 342–363. doi: 10.1007/978-3-030-58219-7_23
Joly A., Goëau H., Kahl S., Picek L., Lorieul T., Cole E., et al. (2021). "Overview of LifeCLEF 2021: an evaluation of machine-learning based species identification and species distribution prediction," in International Conference of the Cross-Language Evaluation Forum for European Languages (Cham: Springer), 371–393. doi: 10.1007/978-3-030-85251-1_24
Keaton M. R., Zaveri R. J., Kovur M., Henderson C., Adjeroh D. A., Doretto G. (2021). Fine-grained visual classification of plant species in the wild: object detection as a reinforced means of attention. arXiv preprint arXiv:2106.02141. doi: 10.48550/ARXIV.2106.02141
Khosla P., Teterwak P., Wang C., Sarna A., Tian Y., Isola P., et al. (2020). "Supervised contrastive learning," in Advances in Neural Information Processing Systems, Vol. 33, eds H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Curran Associates, Inc.), 18661–18673. Available online at: https://proceedings.neurips.cc/paper/2020/file/d89a66c7c80a29b1bdbab0f2a1a94af8-Paper.pdf
Krause J., Sapp B., Howard A., Zhou H., Toshev A., Duerig T., et al. (2016). "The unreasonable effectiveness of noisy data for fine-grained recognition," in European Conference on Computer Vision (Cham: Springer), 301–320. doi: 10.1007/978-3-319-46487-9_19
Lasseck M. (2017). "Image-based plant species identification with deep convolutional neural networks," in CLEF (Working Notes) (Dublin).
Lee S. H., Chan C. S., Remagnino P. (2018). Multi-organ plant classification based on convolutional and recurrent neural networks. IEEE Trans. Image Process. 27, 4287–4301. doi: 10.1109/TIP.2018.2836321
Lin T.-Y., Goyal P., Girshick R., He K., Dollár P. (2017). "Focal loss for dense object detection," in Proceedings of the IEEE International Conference on Computer Vision (Venice), 2980–2988. doi: 10.1109/ICCV.2017.324
Loshchilov I., Hutter F. (2019). "Decoupled weight decay regularization," in International Conference on Learning Representations (New Orleans, LA).
Malik O. A., Faisal M., Hussein B. R. (2021). "Ensemble deep learning models for fine-grained plant species identification," in 2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE) (IEEE), 1–6. doi: 10.1109/CSDE53843.2021.9718387
Munisami T., Ramsurn M., Kishnah S., Pudaruth S. (2015). Plant leaf recognition using shape features and colour histogram with k-nearest neighbour classifiers. Proc. Comput. Sci. 58, 740–747. doi: 10.1016/j.procs.2015.08.095
Patel Y., Tolias G., Matas J. (2021). "Recall@k surrogate loss with large batches and similarity mixup," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (New Orleans, LA), 7502–7511.
Picek L., Sulc M., Matas J. (2019). "Recognition of the Amazonian flora by Inception networks with test-time class prior estimation," in CLEF (Working Notes) (Lugano).
Picek L., Šulc M., Matas J., Jeppesen T. S., Heilmann-Clausen J., Læssøe T., et al. (2022). "Danish fungi 2020 - not just another image recognition dataset," in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (Waikoloa), 1525–1535. doi: 10.1109/WACV51458.2022.00334
Polyak B. T., Juditsky A. B. (1992). Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30, 838–855. doi: 10.1137/0330046
Prasad S., Kudiri K. M., Tripathi R. (2011). "Relative sub-image based features for leaf recognition using support vector machine," in Proceedings of the 2011 International Conference on Communication, Computing & Security (Rourkela, Odisha), 343–346. doi: 10.1145/1947940.1948012
Priya C. A., Balasaravanan T., Thanamani A. S. (2012). "An efficient leaf recognition algorithm for plant classification using support vector machine," in International Conference on Pattern Recognition, Informatics and Medical Engineering (PRIME-2012) (Tamilnadu: IEEE), 428–432. doi: 10.1109/ICPRIME.2012.6208384
Saerens M., Latinne P., Decaestecker C. (2002). Adjusting the outputs of a classifier to new a priori probabilities: a simple procedure. Neural Comput. 14, 21–41. doi: 10.1162/089976602753284446
Sipka T., Sulc M., Matas J. (2022). "The hitchhiker's guide to prior-shift adaptation," in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (IEEE), 1516–1524. doi: 10.1109/WACV51458.2022.00209
Šulc M. (2020). Fine-grained recognition of plants and fungi from images (Ph.D. thesis). Czech Technical University in Prague, Prague, Czechia.
Šulc M., Matas J. (2017). Fine-grained recognition of plants from images. Plant Methods 13, 115. doi: 10.1186/s13007-017-0265-4
Šulc M., Matas J. (2019). "Improving CNN classifiers by estimating test-time priors," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops (Seoul). doi: 10.1109/ICCVW.2019.00402
Šulc M., Picek L., Matas J. (2018). "Plant recognition by Inception networks with test-time class prior estimation," in CLEF (Working Notes) (Avignon).
Szegedy C., Ioffe S., Vanhoucke V., Alemi A. A. (2017). "Inception-v4, Inception-ResNet and the impact of residual connections on learning," in Thirty-First AAAI Conference on Artificial Intelligence (AAAI).
Tan M., Le Q. V. (2021). "EfficientNetV2: smaller models and faster training," in Proceedings of the 38th International Conference on Machine Learning, eds M. Meila and T. Zhang (PMLR), 10096–10106. Available online at: http://proceedings.mlr.press/v139/tan21a/tan21a.pdf
Touvron H., Sablayrolles A., Douze M., Cord M., Jégou H. (2021). "Grafit: learning fine-grained image representations with coarse labels," in Proceedings of the IEEE/CVF International Conference on Computer Vision (Montreal), 874–884. doi: 10.1109/ICCV48922.2021.00091
Van Horn G., Mac Aodha O., Song Y., Cui Y., Sun C., Shepard A., et al. (2018). "The iNaturalist species classification and detection dataset," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Salt Lake City, UT), 8769–8778. doi: 10.1109/CVPR.2018.00914
Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A. N., et al. (2017). "Attention is all you need," in Advances in Neural Information Processing Systems, Vol. 30, eds I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Curran Associates, Inc.), 5998–6008. Available online at: https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
Wah C., Branson S., Welinder P., Perona P., Belongie S. (2011). The Caltech-UCSD Birds-200-2011 Dataset. Technical Report CNS-TR-2011-001, California Institute of Technology.
Wäldchen J., Mäder P. (2018). Machine learning for image based species identification. Methods Ecol. Evol. 9, 2216–2225. doi: 10.1111/2041-210X.13075
Wightman R. (2019). PyTorch Image Models. Available online at: https://github.com/rwightman/pytorch-image-models
Wu D., Han X., Wang G., Sun Y., Zhang H., Fu H. (2019). Deep learning with taxonomic loss for plant identification. Comput. Intell. Neurosci. 2019, 2015017. doi: 10.1155/2019/2015017
Wu Q., Zhou C., Wang C. (2006). Feature extraction and automatic recognition of plant leaf using artificial neural network. Adv. Artif. Intell. 3, 5–12.
Wu S. G., Bao F. S., Xu E. Y., Wang Y.-X., Chang Y.-F., Xiang Q.-L. (2007). "A leaf recognition algorithm for plant classification using probabilistic neural network," in 2007 IEEE International Symposium on Signal Processing and Information Technology (IEEE), 11–16. doi: 10.1109/ISSPIT.2007.4458016
Xie S., Girshick R., Dollár P., Tu Z., He K. (2017). "Aggregated residual transformations for deep neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Honolulu), 1492–1500. doi: 10.1109/CVPR.2017.634
Zhang H., Wu C., Zhang Z., Zhu Y., Lin H., Zhang Z., et al. (2020). "ResNeSt: split-attention networks," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (New Orleans, LA), 2736–2746.
Zheng H., Fu J., Zha Z.-J., Luo J. (2019). "Looking for the devil in the details: learning trilinear attention sampling network for fine-grained image recognition," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (Long Beach, CA), 5012–5021. doi: 10.1109/CVPR.2019.00515