Using EfficientNet-B7 (CNN), Variational Auto Encoder (VAE) and Siamese Twins' Networks to Evaluate Human Exercises as Super Objects in a TSSCI Images

. 2023 May 22 ; 13 (5) : . [epub] 20230522

Status PubMed-not-MEDLINE Language English Country Switzerland Medium electronic

Document type journal articles

Persistent link   https://www.medvik.cz/link/pmid37241044

Grant support
8773451 Ministry of Science Technology, Israel
8773451 The Ministry of Education, Youth and Sports of the Czech Republic

In this article, we introduce a new approach to analyzing human movement: we define a movement as a static super object represented by a single two-dimensional image. The method is applicable in remote healthcare, for example in physiotherapeutic exercises. It allows researchers to label and describe an entire exercise as a standalone object, isolated from the reference video. This representation supports various tasks, including detecting similar movements in a video, measuring and comparing movements, generating new similar movements, and defining choreography by controlling specific parameters of the human body skeleton. As a result, we can eliminate the need to label images manually, sidestep the problem of finding the start and end of an exercise, overcome synchronization issues between movements, and apply any deep learning network-based operation that processes super objects in images. We demonstrate two use cases: the first shows how to verify and score a fitness exercise; the second shows how to generate similar movements in the human skeleton space, addressing the challenge of supplying sufficient training data for deep learning (DL) applications. To demonstrate them, we present a variational autoencoder (VAE) simulator and an EfficientNet-B7 classifier architecture embedded within a Siamese twin neural network. These use cases demonstrate the versatility of the concept in measuring, categorizing, and inferring human behavior, and in generating gestures for other researchers.
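The core idea of representing a whole movement as one two-dimensional image can be illustrated with a minimal sketch. The abstract does not specify the encoding, so everything below is an assumption made for illustration: each frame is taken to be a list of `(x, y, confidence)` keypoints with the same joint count in every frame, rows of the image correspond to frames, columns to joints, and the three channels hold the rescaled x, y, and confidence values. The name `sequence_to_image` is hypothetical and is not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical layout: one keypoint is (x, y, confidence).
Keypoint = Tuple[float, float, float]
Frame = List[Keypoint]
Pixel = Tuple[int, int, int]


def sequence_to_image(frames: List[Frame]) -> List[List[Pixel]]:
    """Pack a keypoint sequence into one image.

    Rows are frames (time), columns are joints, and the three channels
    are x, y, and confidence, each rescaled to the 0..255 range.
    Assumes every frame contains the same number of joints.
    """
    if not frames:
        return []

    # Global min/max over the whole movement, so the image encodes
    # positions relative to the movement's own bounding box.
    xs = [kp[0] for frame in frames for kp in frame]
    ys = [kp[1] for frame in frames for kp in frame]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    def scale(value: float, low: float, high: float) -> int:
        # Avoid division by zero when a coordinate never changes.
        if high == low:
            return 0
        return round(255 * (value - low) / (high - low))

    image: List[List[Pixel]] = []
    for frame in frames:
        row: List[Pixel] = []
        for x, y, conf in frame:
            conf = max(0.0, min(1.0, conf))  # clamp confidence to [0, 1]
            row.append((scale(x, x_min, x_max),
                        scale(y, y_min, y_max),
                        round(255 * conf)))
        image.append(row)
    return image


# Two frames, two joints each.
demo = sequence_to_image([
    [(0.0, 0.0, 1.0), (10.0, 5.0, 0.5)],
    [(5.0, 10.0, 0.0), (10.0, 0.0, 1.0)],
])
```

Once a movement is a fixed-layout image like this, it can in principle be passed to any image-based network, which is the property the abstract relies on for its classifier and generator use cases.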




