Wall segmentation in 2D images using convolutional neural networks

2023;9:e1565. Epub 2023 Sep 11.

Status: PubMed-not-MEDLINE. Language: English. Country: United States. Medium: electronic-eCollection.

Document type: journal article

Persistent link: https://www.medvik.cz/link/pmid37810356

Wall segmentation is a special case of semantic segmentation in which each pixel is classified into one of two classes: wall and no-wall. The segmentation model returns a mask that separates walls from other objects, such as windows and furniture. This article proposes a module structure for semantic segmentation of walls in 2D images. The proposed model achieved higher accuracy and faster execution than other solutions. The segmentation module uses an encoder-decoder architecture. The encoder is a dilated ResNet50/101 network, that is, a ResNet50/101 network in which dilated convolutional layers replace the last convolutional layers. A subset of the ADE20K dataset containing only interior images was used for model training, and only a subset of it was used for model evaluation. Three approaches to model training were analyzed. On the validation dataset, the best approach, based on the proposed structure with the ResNet101 network, reached an average pixel-level accuracy of 92.13% and an intersection over union (IoU) of 72.58%. Moreover, all proposed approaches can be applied to recognize other objects in an image to solve specific tasks.
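The two reported metrics can be illustrated with a minimal sketch for a single binary mask. It assumes masks are nested lists with 1 for wall and 0 for no-wall. The function names `pixel_accuracy` and `wall_iou` are hypothetical, and the abstract does not state how the authors average the scores over the validation images, so this is not their evaluation code.

```python
from typing import Sequence

# Assumed mask encoding (not from the article): 1 = wall, 0 = no-wall.
Mask = Sequence[Sequence[int]]


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels whose predicted class matches the ground truth."""
    total = 0
    correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += int(p == t)
    return correct / total if total else 0.0


def wall_iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of the wall class (label 1)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += int(p == 1 and t == 1)
            union += int(p == 1 or t == 1)
    # Convention chosen here: two empty wall masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 3x4 example: the prediction misses one wall pixel.
pred = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
truth = [[1, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]]

print(f"pixel accuracy: {pixel_accuracy(pred, truth):.4f}")  # 0.9167
print(f"wall IoU:       {wall_iou(pred, truth):.4f}")        # 0.8000
```

In the toy example a single missed pixel lowers accuracy to 11/12 but IoU to 4/5. IoU ignores correctly predicted no-wall pixels, so the same error weighs more heavily, which is consistent with the reported IoU (72.58%) being lower than the reported pixel accuracy (92.13%).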


