Sathya Nikethan R. V.

Work place: School of Computer Science Engineering and Information Systems (SCORE), Vellore Institute of Technology (VIT), Vellore, 632 014, India

E-mail: sathyanikethan.rv2021@vitstudent.ac.in

Research Interests: Machine Learning

Biography

Sathya Nikethan R. V. holds a Bachelor’s degree in Computer Applications (BCA) from Vellore Institute of Technology (VIT), Vellore, India. He is currently pursuing a Master of Computer Applications (MCA) at VIT Vellore, and his research interests include machine learning, computer vision, and software development.

Author Articles
Hand Gesture-controlled 2D Virtual Piano with Volume Control

By Vijayan R., Mareeswari V., Sarathi G., Sathya Nikethan R. V.

DOI: https://doi.org/10.5815/ijitcs.2025.05.02, Pub. Date: 8 Oct. 2025

The rise of virtual instruments has revolutionized music production, providing new avenues for creating music without physical instruments. However, these systems typically rely on costly hardware, such as MIDI controllers, which limits accessibility. As an alternative, 3D gesture-based virtual instruments have been explored to emulate the immersive experience of MIDI controllers, yet they introduce accessibility challenges of their own by requiring specialized hardware such as depth-sensing cameras and motion sensors. In contrast, 2D gesture systems using standard RGB cameras are more affordable but often lack extended functionality. To address these challenges, this study presents a 2D virtual piano system based on hand gesture recognition. The system enables accurate gesture-based control, real-time volume adjustment, control over multiple octaves and instruments, and automatic sheet music generation. OpenCV, an open-source computer vision library, and Google’s MediaPipe are employed for real-time hand tracking. The extracted hand landmark coordinates are normalized relative to the wrist and scaled for consistent performance across varied RGB camera setups. A bidirectional long short-term memory (Bi-LSTM) network is used for gesture classification and to evaluate the approach. Experimental results show 95% accuracy on a public Kaggle dynamic gesture dataset and 97% on a custom dataset designed for virtual piano gestures. Future work will focus on integrating the system with Digital Audio Workstations (DAWs), adding advanced musical features, and improving scalability for multi-player use.
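
To illustrate the wrist-based normalization step described in the abstract, the following is a minimal sketch, not the authors' implementation: it tracks hand landmarks with MediaPipe and OpenCV and re-expresses them relative to the wrist. Using the wrist-to-middle-finger-MCP distance as the scale reference, and names such as normalize_landmarks, are assumptions made for illustration only.

# Illustrative sketch (assumed details): wrist-anchored normalization of
# MediaPipe hand landmarks. Landmark 0 is the wrist; landmark 9 (middle-finger
# MCP) is assumed here as the scale reference.
import numpy as np
import cv2
import mediapipe as mp

def normalize_landmarks(hand_landmarks):
    # Collect the 21 landmarks as a (21, 3) array of image-normalized coordinates.
    pts = np.array([[lm.x, lm.y, lm.z] for lm in hand_landmarks.landmark])
    pts -= pts[0]                   # translate so the wrist becomes the origin
    scale = np.linalg.norm(pts[9])  # wrist-to-middle-MCP distance (assumed scale factor)
    return pts / scale if scale > 0 else pts

hands = mp.solutions.hands.Hands(max_num_hands=2, min_detection_confidence=0.7)
cap = cv2.VideoCapture(0)           # any standard RGB webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        features = [normalize_landmarks(h) for h in results.multi_hand_landmarks]
        # Per-frame features like these could be stacked into fixed-length
        # sequences and fed to a Bi-LSTM gesture classifier.
cap.release()

Because the features are expressed relative to the wrist and rescaled, they stay comparable across cameras with different resolutions and hand-to-camera distances, which is the consistency property the abstract refers to.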
