Workplace: Department of Computer Applications, College of Smart Computing, COER University, Roorkee, India
E-mail: vijayranasrb25@gmail.com
Website:
Research Interests: Artificial Intelligence
Biography
Vijay Singh Rana is a research scholar in the Department of Computer Applications, College of Smart Computing, COER University, Roorkee. He is an active researcher with experience in the field of Computer Science and Engineering. He holds an MCA and an M.Tech. (Computer Engineering) and is currently serving as an Assistant Professor at J.V. Jain College, Saharanpur, India. His research areas include Artificial Intelligence (AI), Machine Learning (ML), and Gesture and Activity Recognition.
By Vijay Singh Rana, Ankush Joshi, Kamal Kant Verma
DOI: https://doi.org/10.5815/ijitcs.2025.06.10, Pub. Date: 8 Dec. 2025
Over the past two decades, the automatic recognition of human activities has been a prominent research field. The task becomes more challenging when dealing with multiple modalities, diverse activities, and varied scenarios. This paper therefore addresses activity recognition by fusing two modalities: RGB and depth maps. To achieve this, two distinct lightweight 3D Convolutional Neural Networks (3DCNNs) are employed to extract spatiotemporal features from the RGB and depth sequences separately. Subsequently, a Bidirectional LSTM (Bi-LSTM) network is trained on the extracted spatiotemporal features, generating an activity score for each sequence in both the RGB and depth streams. Decision-level fusion is then applied to combine the scores obtained in the previous step. The novelty of the proposed work lies in introducing a lightweight 3DCNN feature extractor designed to capture both spatial and temporal features from RGB-D video sequences, which improves overall efficiency while reducing computational complexity. Finally, activities are recognized based on the fused scores. To assess the overall efficiency of the proposed lightweight 3DCNN and Bi-LSTM method, it is validated on the 3D benchmark dataset UTKinect-Action3D, achieving an accuracy of 96.72%. The experimental findings confirm the effectiveness of the proposed representation over existing methods.
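To make the described pipeline concrete, the sketch below outlines a minimal two-stream setup in PyTorch: a lightweight 3D-CNN per modality, a Bi-LSTM over the extracted per-frame features, and decision-level fusion of the per-modality class scores. This is an illustrative assumption of how such a system could be wired, not the authors' published implementation; layer sizes, clip dimensions, and the fusion weight are placeholders.

```python
# Illustrative sketch of a two-stream RGB + depth activity recognizer:
# lightweight 3D-CNN -> Bi-LSTM -> per-modality scores -> decision-level fusion.
# All hyperparameters here are assumptions, not the paper's configuration.
import torch
import torch.nn as nn

class Lightweight3DCNN(nn.Module):
    """Small 3D-CNN that pools only spatially, keeping the temporal axis for the Bi-LSTM."""
    def __init__(self, in_channels, feat_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 32, kernel_size=3, padding=1),
            nn.BatchNorm3d(32), nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),       # downsample H, W only
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm3d(64), nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
            nn.AdaptiveAvgPool3d((None, 1, 1)),        # keep T, squash spatial dims
        )
        self.proj = nn.Linear(64, feat_dim)

    def forward(self, clip):                           # clip: (B, C, T, H, W)
        x = self.features(clip)                        # (B, 64, T, 1, 1)
        x = x.squeeze(-1).squeeze(-1).transpose(1, 2)  # (B, T, 64)
        return self.proj(x)                            # (B, T, feat_dim)

class StreamClassifier(nn.Module):
    """One modality stream: 3D-CNN features, Bi-LSTM, and a class-score head."""
    def __init__(self, in_channels, num_classes, feat_dim=128, hidden=128):
        super().__init__()
        self.backbone = Lightweight3DCNN(in_channels, feat_dim)
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, clip):
        seq = self.backbone(clip)                      # (B, T, feat_dim)
        out, _ = self.bilstm(seq)                      # (B, T, 2 * hidden)
        return self.head(out[:, -1])                   # class logits per clip

def fuse_scores(rgb_logits, depth_logits, w=0.5):
    """Decision-level fusion: weighted average of per-modality softmax scores."""
    return w * torch.softmax(rgb_logits, dim=1) + (1.0 - w) * torch.softmax(depth_logits, dim=1)

if __name__ == "__main__":
    num_classes = 10                                   # UTKinect-Action3D defines 10 actions
    rgb_net = StreamClassifier(in_channels=3, num_classes=num_classes)
    depth_net = StreamClassifier(in_channels=1, num_classes=num_classes)
    rgb_clip = torch.randn(2, 3, 16, 112, 112)         # (batch, channels, frames, H, W)
    depth_clip = torch.randn(2, 1, 16, 112, 112)
    fused = fuse_scores(rgb_net(rgb_clip), depth_net(depth_clip))
    print(fused.argmax(dim=1))                         # predicted activity per clip
```

In this sketch each stream is trained independently and only their softmax scores are combined, which mirrors the decision-level fusion described in the abstract; the fusion weight `w` could equally be tuned on a validation split.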