Wall segmentation in 2D images using convolutional neural networks
Status: PubMed-not-MEDLINE; Language: English; Country: United States; Medium: electronic-ecollection
Document type: journal article
PubMed: 37810356
PubMed Central: PMC10557507
DOI: 10.7717/peerj-cs.1565
PII: cs-1565
- Keywords
- ADE20K, Encoder-decoder, PSPNet, Semantic segmentation, Wall segmentation
- Publication type
- journal articles (MeSH)
Wall segmentation is a special case of semantic segmentation in which each pixel is classified into one of two classes: wall and no-wall. The segmentation model returns a mask that separates walls from other objects in the scene, such as windows and furniture. This article proposes a module structure for semantic segmentation of walls in 2D images that can effectively address the wall segmentation problem. The proposed model achieved higher accuracy and faster execution than other solutions. The segmentation module uses an encoder-decoder architecture. The encoder is a dilated ResNet50/101 network, that is, a ResNet50/101 network in which the last convolutional layers are replaced by dilated convolutional layers. A subset of the ADE20K dataset containing only interior images was used for model training, while only a subset of it was used for model evaluation. Three different approaches to model training were analyzed. On the validation dataset, the best approach, based on the proposed structure with the ResNet101 network, achieved an average pixel-level accuracy of 92.13% and an intersection over union (IoU) of 72.58%. Moreover, all proposed approaches can be applied to recognizing other objects in an image to solve specific tasks.
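The two reported metrics, pixel-level accuracy and IoU, can be illustrated on a toy binary mask. The sketch below is not the authors' evaluation code: the function name `wall_metrics`, the nested-list mask representation, and the convention of returning an IoU of 1.0 for two empty masks are assumptions made for illustration, and the abstract does not state how the metrics were averaged over images.

```python
def wall_metrics(pred, truth):
    """Return (pixel_accuracy, wall_iou) for two equally sized binary masks.

    Masks are nested lists holding 0 (no-wall) or 1 (wall) per pixel.
    Hypothetical helper for illustration only.
    """
    correct = total = intersection = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += (p == t)                   # pixel labelled correctly
            intersection += (p == 1 and t == 1)   # wall in both masks
            union += (p == 1 or t == 1)           # wall in at least one mask
    accuracy = correct / total
    # Assumed convention: IoU is 1.0 when neither mask contains wall pixels.
    iou = intersection / union if union else 1.0
    return accuracy, iou


# Toy 2x4 prediction and ground truth (1 = wall, 0 = no-wall).
pred = [[1, 1, 0, 0],
        [1, 1, 0, 0]]
truth = [[1, 1, 1, 0],
         [1, 0, 0, 0]]

accuracy, iou = wall_metrics(pred, truth)
print(f"pixel accuracy = {accuracy:.2%}, wall IoU = {iou:.2%}")
```

In this toy case 6 of 8 pixels are correct (75% accuracy), while the wall IoU is 3/5 (60%). IoU ignores pixels that both masks label as no-wall, which is one reason it can sit well below pixel accuracy for the same prediction.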
Department of Informatics and Computing, Singidunum University, Belgrade, Serbia
University of Belgrade, School of Electrical Engineering, Belgrade, Serbia