Using EfficientNet-B7 (CNN), Variational Auto Encoder (VAE) and Siamese Twins' Networks to Evaluate Human Exercises as Super Objects in a TSSCI Images
Status: PubMed-not-MEDLINE; Language: English; Country: Switzerland; Medium: electronic
Document type: journal article
Grant support
8773451
Ministry of Science Technology, Israel
8773451
The Ministry of Education, Youth and Sports of the Czech Republic
PubMed
37241044
PubMed Central
PMC10221908
DOI
10.3390/jpm13050874
PII: jpm13050874
- Keywords
- MediaPipe (MP), OpenPose (OP), Siamese twin neural network, computational creativity, computational imagination, human body movements, human pose estimation (HPE), rehabilitation, simulator, tree structure skeleton color image (TSSCI), tree structure skeleton image (TSSI), variational auto encoder (VAE)
- Publication type
- journal article
In this article, we introduce a new approach to analyzing human movement: the movement is defined as a static super object represented by a single two-dimensional image. The method is applicable to remote healthcare applications, such as physiotherapeutic exercises. It allows researchers to label and describe an entire exercise as a standalone object, isolated from the reference video. This supports a range of tasks, including detecting similar movements in a video, measuring and comparing movements, generating new similar movements, and defining choreography by controlling specific parameters in the human body skeleton. With this approach, we can eliminate the need to label images manually, sidestep the problem of finding the start and end of an exercise, overcome synchronization issues between movements, and apply any deep learning network-based operation that processes super objects in images. We demonstrate two use cases. The first shows how to verify and score a fitness exercise. The second shows how to generate similar movements in the human skeleton space, which addresses the challenge of supplying sufficient training data for deep learning (DL) applications. To demonstrate them, we present a variational auto encoder (VAE) simulator and an EfficientNet-B7 classifier architecture embedded within a Siamese twin neural network. These use cases demonstrate the versatility of the concept in measuring, categorizing, and inferring human behavior and in generating gestures for other researchers.
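The idea of packing a whole movement into one two-dimensional image can be illustrated with a minimal sketch. It assumes that each frame yields (x, y, confidence) values per joint, that joints are laid out along the columns in a tree-traversal order, and that the three values map to the three color channels. The toy five-joint skeleton, the traversal order, and the function name `sequence_to_image` are hypothetical and chosen for illustration only; the article's actual TSSCI construction may differ.

```python
import numpy as np

# Hypothetical depth-first traversal over a toy 5-joint skeleton
# (0 = neck, 1 = right hand, 2 = left hand, 3 = right foot, 4 = left foot).
# The root joint is revisited between limbs, so it appears in several columns.
TRAVERSAL = [0, 1, 0, 2, 0, 3, 0, 4, 0]


def sequence_to_image(keypoints, order=TRAVERSAL):
    """Pack a (frames, joints, 3) array of (x, y, confidence) into a uint8 image.

    Rows are frames, columns follow the traversal order, and the three
    values per joint become the three color channels.
    """
    kp = np.asarray(keypoints, dtype=np.float64)[:, order, :]
    lo = kp.min(axis=(0, 1), keepdims=True)   # per-channel minimum
    hi = kp.max(axis=(0, 1), keepdims=True)   # per-channel maximum
    span = np.where(hi > lo, hi - lo, 1.0)    # avoid division by zero
    scaled = (kp - lo) / span                 # min-max scale to [0, 1]
    return np.round(scaled * 255).astype(np.uint8)


rng = np.random.default_rng(0)
frames = rng.random((30, 5, 3))               # 30 frames of synthetic keypoints
image = sequence_to_image(frames)
print(image.shape)  # (30, 9, 3)
```

Once a movement is stored this way, the whole exercise is a single array that any image-based network can consume, which is what makes start/end detection and frame-by-frame labeling unnecessary in this formulation.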
Selected Papers from the pHealth 2022 Conference, Oslo, Norway, 8-10 November 2022