Proceedings of International Conference on Applied Innovation in IT
2026/03/31, Volume 14, Issue 1, pp.651-660
Recognizing Gesture images with ViT and Spatial Attention Regularization
Zahraa Thamer, Noor S. Sagheer, Ashwan A. Abdulmunem, Hawraa Thamer and Oğuz Ata
Abstract: Image-based gesture recognition is an important area of Human-Computer Interaction (HCI). Despite considerable progress, reliable and accurate gesture recognition in unrestricted, real-world settings remains difficult. Conventional techniques often struggle with changes in lighting, background noise, occlusions, size variations, and the inherent similarity between different gestures. To enhance the discriminative ability of the Vision Transformer (ViT) model for intricate hand gestures, this work presents a carefully designed fine-tuning methodology. To encourage ViT to concentrate on salient gesture regions while remaining resilient to environmental noise, the proposed method combines an adaptive learning rate scheduling scheme with a novel spatial attention regularizer during fine-tuning. Experiments on a challenging and varied gesture dataset show that the proposed approach significantly outperforms state-of-the-art methods, reaching an accuracy of up to 100% and demonstrating its generalization capability. This study opens the door to more user-friendly human-computer interaction systems by providing an effective and flexible framework for image-based gesture recognition.
Keywords: Computer Vision, ViT, Gesture Images, Spatial Attention Regularization, Image Analysis, Hand Gesture Recognition.
DOI: Under indexing
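The abstract names two ingredients, a spatial attention regularizer and an adaptive learning rate schedule, but does not give their exact form. The sketch below is one plausible reading and not the authors' implementation. It assumes the regularizer penalizes the entropy of the CLS-to-patch attention (so that attention concentrates on fewer patches) and that the schedule reduces the learning rate when the validation loss plateaus. The function names, the weight `lam`, and the `patience` and `factor` values are all hypothetical.

```python
import math


def attention_entropy_penalty(cls_attention, eps=1e-12):
    """Mean Shannon entropy of attention rows (assumed form of the regularizer).

    cls_attention: list of rows; each row holds non-negative CLS-to-patch
    attention weights that sum to 1. Lower entropy means more focused attention.
    """
    total = 0.0
    for row in cls_attention:
        total += -sum(p * math.log(p + eps) for p in row)
    return total / len(cls_attention)


def regularized_loss(ce_loss, cls_attention, lam=0.01):
    """Classification loss plus the weighted attention penalty (lam is hypothetical)."""
    return ce_loss + lam * attention_entropy_penalty(cls_attention)


def plateau_lr(lr, val_losses, patience=2, factor=0.5, min_lr=1e-6):
    """Reduce lr by `factor` if the last `patience` epochs did not improve
    on the best earlier validation loss (assumed form of the schedule)."""
    if len(val_losses) <= patience:
        return lr
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    if recent_best >= best_before:
        return max(lr * factor, min_lr)
    return lr


if __name__ == "__main__":
    uniform = [[0.25, 0.25, 0.25, 0.25]]   # attention spread over all patches
    focused = [[0.97, 0.01, 0.01, 0.01]]   # attention concentrated on one patch
    print(regularized_loss(1.0, uniform, lam=0.1))
    print(regularized_loss(1.0, focused, lam=0.1))
    print(plateau_lr(1e-3, [1.0, 0.9, 0.95, 0.92]))
```

Under these assumptions, focused attention incurs a smaller penalty than uniform attention, and the learning rate is halved only when the recent validation losses fail to beat the earlier best.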
References:
- A. Osman Hashi, S. Zaiton Mohd Hashim, and A. Bte Asamah, “A systematic review of hand gesture recognition: An update from 2018 to 2024,” IEEE Access, vol. 12, pp. 143599-143626, 2024, doi: 10.1109/ACCESS.2024.3421992.
- A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Commun. ACM, vol. 60, no. 6, pp. 84-90, May 2017, doi: 10.1145/3065386.
- P. Molchanov, S. Gupta, K. Kim, and J. Kautz, “Hand gesture recognition with 3D convolutional neural networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Boston, MA, USA, 2015, pp. 1-7, doi: 10.1109/CVPRW.2015.7301342.
- P. Mittal, B. Sharma, and D. P. Yadav, “Comparative analysis between CNN and ViT using brain MRI dataset,” in Proc. 8th Int. Conf. Parallel, Distributed and Grid Comput. (PDGC), Solan, India, 2024, pp. 290-295, doi: 10.1109/PDGC64653.2024.10984339.
- I. Pacal, B. Ozdemir, J. Zeynalov, H. Gasimov, and N. Pacal, “A novel CNN-ViT-based deep learning model for early skin cancer diagnosis,” Biomed. Signal Process. Control, vol. 104, p. 107627, 2025, doi: 10.1016/j.bspc.2025.107627.
- A. Al-Zebari, N. Omar, and A. Sengur, “Vision transformers-based hand gesture classification,” in Proc. 3rd Int. Informatics and Software Eng. Conf. (IISEC), Ankara, Turkey, 2022, pp. 1-3, doi: 10.1109/IISEC56263.2022.9998295.
- T. Kaggle, “Hand gesture recognition dataset,” Kaggle, 2022. [Online]. Available: https://www.kaggle.com/datasets/tapakah68/hand-gesture-recognition-dataset
- T.-H. Nguyen, B.-V. Ngo, and T.-N. Nguyen, “Vision-based hand gesture recognition using a YOLOv8n model for the navigation of a smart wheelchair,” Electronics, vol. 14, no. 4, p. 734, 2025, doi: 10.3390/electronics14040734.
- Shivani and S. B. Gupta, “A comprehensive analysis of recognition of hand gestures using machine learning,” Makara J. Technol., vol. 29, no. 1, Art. no. 5, 2025, doi: 10.7454/mst.v29i1.1679.
- C. K. Tan, K. M. Lim, R. K. Y. Chang, C. P. Lee, and A. Alqahtani, “HGR-ViT: Hand gesture recognition with vision transformer,” Sensors, vol. 23, no. 12, p. 5555, 2023, doi: 10.3390/s23125555.
- Y. Altaf, “Efficient hand sign recognition with fine-tuned faster vision transformers: A comparative study on benchmark image datasets,” J. Electr. Syst., vol. 20, no. 3, pp. 8082-8098, 2024.
- A. R. Asif et al., “Performance evaluation of convolutional neural network for hand gesture recognition using EMG,” Sensors, vol. 20, no. 6, p. 1642, 2020, doi: 10.3390/s20061642.
- H. Hellara, R. Barioul, S. Sahnoun, A. Fakhfakh, and O. Kanoun, “Comparative study of sEMG feature evaluation methods based on the hand gesture classification performance,” Sensors, vol. 24, no. 11, p. 3638, 2024, doi: 10.3390/s24113638.
- V.-D. Do, V.-H. Le, H.-S. Do, V.-N. Phan, and T.-H. Te, “TQU-HG dataset and comparative study for hand gesture recognition of RGB-based images using deep learning,” Indones. J. Electr. Eng. Comput. Sci., vol. 34, no. 3, pp. 1603-1617, 2024.
- K. Myagila and H. Kilavo, “A comparative study on performance of SVM and CNN in Tanzania sign language translation using image recognition,” Appl. Artif. Intell., vol. 36, no. 1, p. 2005297, 2021, doi: 10.1080/08839514.2021.2005297.
- S. Bhushan, M. Alshehri, I. Keshta, A. K. Chakraverti, J. Rajpurohit, and A. Abugabah, “An experimental analysis of various machine learning algorithms for hand gesture recognition,” Electronics, vol. 11, no. 6, p. 968, 2022, doi: 10.3390/electronics11060968.
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Adv. Neural Inf. Process. Syst. (NeurIPS), vol. 30, 2017, pp. 5998-6008.
- A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020, doi: 10.48550/arXiv.2010.11929.
- K. Gupta, A. Singh, S. R. Yeduri, M. B. Srinivas, and L. R. Cenkeramaddi, “Hand gestures recognition using edge computing system based on vision transformer and lightweight CNN,” J. Ambient Intell. Humanized Comput., vol. 14, no. 3, pp. 2601-2615, 2023, doi: 10.1007/s12652-022-04506-4.