SignVerse: Turning Hand Signs into Seamless Conversation with Gen AI

International Journal of Innovative Research in Computer and Communication Engineering

ISSN Approved Journal | Impact factor: 8.771 | ESTD: 2013 | Follows UGC CARE Journal Norms and Guidelines

TITLE	SignVerse: Turning Hand Signs into Seamless Conversation with Gen AI
ABSTRACT	Despite considerable progress in computer-mediated communication and in bridging the gap between hearing people and deaf people who use sign language, many barriers to communication between the two groups remain. Traditional sign language recognition (SLR) systems translate only one letter at a time: signing the letter A yields the output “A” rather than a complete word such as “Hello.” This process is slow and does not reflect how people communicate naturally. SignVerse is a new application intended primarily as a digital dictionary and a smart sentence builder. It works with a standard RGB camera and handles live video of both American Sign Language (ASL) and Indian Sign Language (ISL). SignVerse uses Google’s MediaPipe to extract 3D spatial features from the live video feed and a TensorFlow/Keras classifier to recognize signs in real time with minimal delay. The recognized signs are then passed to a Large Language Model (LLM), Google Gemini, which predicts contextually relevant full-sentence suggestions. SignVerse processes video locally on the user’s computer and transmits only lightweight text to the cloud, which preserves the user’s privacy and speeds up digital communication for the hearing impaired.
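The pipeline described in the abstract (landmark features, per-frame classification, text-only LLM request) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `normalize_landmarks`, `SignBuffer`, and `build_prompt`, the wrist-relative normalization, the five-frame voting window, and the prompt wording are all assumptions made for illustration. The classifier itself is omitted, so the per-frame labels are simulated.

```python
from collections import Counter, deque
from typing import Deque, List, Optional, Sequence, Tuple

Landmark = Tuple[float, float, float]  # (x, y, z) for one hand keypoint


def normalize_landmarks(landmarks: Sequence[Landmark]) -> List[float]:
    """Flatten keypoints into a translation- and scale-invariant vector.

    Assumption: the first landmark is the wrist, as in MediaPipe's
    21-point hand model. The normalization scheme itself is hypothetical.
    """
    wx, wy, wz = landmarks[0]
    shifted = [(x - wx, y - wy, z - wz) for x, y, z in landmarks]
    # Largest absolute coordinate; fall back to 1.0 for a degenerate hand.
    scale = max(max(abs(c) for c in point) for point in shifted) or 1.0
    return [c / scale for point in shifted for c in point]


class SignBuffer:
    """Commit a sign only after it dominates a sliding window of frames."""

    def __init__(self, window: int = 5, min_votes: int = 4) -> None:
        self.frames: Deque[str] = deque(maxlen=window)
        self.min_votes = min_votes
        self.tokens: List[str] = []

    def push(self, label: str) -> Optional[str]:
        """Add one per-frame label; return the sign if newly committed."""
        self.frames.append(label)
        top, votes = Counter(self.frames).most_common(1)[0]
        is_new = not self.tokens or self.tokens[-1] != top
        if votes >= self.min_votes and is_new:
            self.tokens.append(top)
            return top
        return None


def build_prompt(tokens: Sequence[str], n_suggestions: int = 3) -> str:
    """Build the text-only request; no image or video data is included."""
    return (
        f"Recognized sign tokens: {' '.join(tokens)}. "
        f"Suggest {n_suggestions} natural English sentences "
        "that use them in order."
    )


if __name__ == "__main__":
    # Simulated per-frame classifier output, including two noisy frames.
    stream = ["HELLO"] * 5 + ["HOW"] * 2 + ["HELLO"] + ["HOW"] * 4 + ["YOU"] * 5
    buffer = SignBuffer()
    for frame_label in stream:
        buffer.push(frame_label)
    print(buffer.tokens)  # ['HELLO', 'HOW', 'YOU']
    print(build_prompt(buffer.tokens))
```

Only the string returned by `build_prompt` would leave the device in this sketch, which mirrors the abstract's point that video stays local and only lightweight text is sent to the cloud. One limitation of the simple voting rule is that it cannot commit the same sign twice in a row.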
AUTHOR	KHUSHI N. KUMAWAT, AASTHA NITIN AMBAVKAR, SIDDHANT GANESH MANDLIK, SARVESH PRAMOD RAGHATATE, DR. MUKESH ISRANI. UG Student, Dept. of I.T., Thadomal Sahani College of Engineering, Mumbai, Maharashtra, India; Associate Professor, Dept. of I.T., Thadomal Sahani College of Engineering, Mumbai, Maharashtra, India
VOLUME	183
DOI	10.15680/IJIRCCE.2026.1404116
PDF	pdf/116_SignVerse Turning Hand Signs into Seamless Conversation with Gen AI.pdf
KEYWORDS
References
1. World Health Organization, “Deafness and hearing loss,” Fact Sheet, Mar. 2023. [Online].
2. C. Lugaresi et al., “MediaPipe: A Framework for Building Perception Pipelines,” arXiv preprint arXiv:1906.08172, 2019.
3. M. Abadi et al., “TensorFlow: A System for Large-Scale Machine Learning,” in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA, 2016, pp. 265–283.
4. G. Bradski, “The OpenCV Library,” Dr. Dobb’s Journal of Software Tools, vol. 25, no. 11, pp. 120–123, 2000.
5. Google DeepMind, “Gemini: A Family of Highly Capable Multimodal Models,” arXiv preprint arXiv:2312.11805, 2023.
6. A. Vaswani et al., “Attention Is All You Need,” in Advances in Neural Information Processing Systems, vol. 30, Curran Associates, Inc., 2017.
7. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A Simple Way to Prevent Neural Networks from Overfitting,” The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.
8. V. I. Levenshtein, “Binary codes capable of correcting deletions, insertions, and reversals,” Soviet Physics Doklady, vol. 10, no. 8, pp. 707–710, 1966.
9. H. R. Vaezi Joze and O. Koller, “MS-ASL: A Large-Scale Data Set and Benchmark for Understanding American Sign Language,” The British Machine Vision Conference (BMVC), 2019.
10. A. Sridhar, R. Ganesan, P. Kumar, and M. Khapra, “INCLUDE: A Large Scale Dataset for Indian Sign Language Recognition,” in Proceedings of the 28th ACM International Conference on Multimedia, 2020, pp. 1366–1375.
11. I. S. MacKenzie and R. W. Soukoreff, “Text Entry for Mobile Computing: Models and Methods, Theory and Practice,” Human-Computer Interaction, vol. 17, no. 2, pp. 147–198, 2002.
12. P. Ekman and W. V. Friesen, “Facial Action Coding System: A Technique for the Measurement of Facial Movement,” Consulting Psychologists Press, Palo Alto, 1978.
13. W. Z. Khan, E. Ahmed, S. Hakak, I. Yaqoob, and A. Ahmed, “Edge computing: A survey,” Future Generation Computer Systems, vol. 97, pp. 219–235, 2019.
14. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-Efficient Learning of Deep Networks from Decentralized Data,” in Artificial Intelligence and Statistics, PMLR, 2017, pp. 1273–1282.
15. Khushi N. Kumawat, “Sign Verse: Real-Time Bilingual Sign Language Translation and Context-Aware Sentence Construction,” Project Report / GitHub Repository, 2026.

About Us

The primary objective of IJIRCCE is to serve as an international scholarly platform that enables researchers, innovators, students, and research scholars to disseminate their research findings and technological advancements to a global academic audience.
