Work place: Computer Science and Engineering Discipline, Khulna University, Khulna, Bangladesh
E-mail: farhan@cse.ku.ac.bd
Website: https://orcid.org/0000-0002-7454-0638
Research Interests: Data Structures and Algorithms, Computer systems and computational processes, Computational Science and Engineering
Biography
Md. Farhan Sadique is a Lecturer in the Computer Science and Engineering Discipline at Khulna University, Khulna, Bangladesh. He earned both his Bachelor and Master of Science degrees in Computer Science and Engineering from Khulna University.
By Md. Mahmudul Hasan, Md. Titas Ahmmed, Md. Farhan Sadique
DOI: https://doi.org/10.5815/ijem.2026.03.21, Pub. Date: 8 Jun. 2026
Video clip extraction is the process of generating shorter, focused video segments by identifying and retaining frames that contain a user-specified object of interest (UOoI). Such targeted extraction allows users to access relevant portions of a video without watching the entire recording, with practical use in surveillance review, content management, and educational settings. In this work, we present an object-conditioned video clip extraction framework that uses the pretrained YOLOv7 detector to perform frame-level analysis of an input video. For each frame, the detector produces a set of class labels, which are matched against the user-selected UOoI to produce a binary per-frame detection signal. A one-dimensional temporal-window voting filter is then applied to this signal to suppress isolated false positives and recover isolated false negatives, addressing the single-frame detection noise that produces visible discontinuities in naive frame-by-frame approaches. The voted-positive frame indices are mapped back to source timestamps, and the corresponding audio segments are extracted directly from the source video using ffmpeg, preserving the original audio track in the output clip. The framework uses a dictionary of 80 object categories drawn from the MS COCO label set, and a graphical user interface allows users to select an input video, choose a target object, preview the input, and obtain the extracted clip with audio. We evaluate the framework on the SumMe benchmark, which we re-annotated at the frame level for object presence, and on a newly annotated set of 39 videos collected from public sources. Both datasets were independently labelled by two annotators, with Cohen’s kappa of 0.85 and 0.83, respectively, and disagreements resolved by a third adjudicator. At the default voting configuration of W=5, K=3, the framework attains an F1-score of 70.88% with 90.12% accuracy on SumMe and an F1-score of 69.89% with 85.13% accuracy on the custom dataset. 
An ablation over voting parameters shows monotonic gains on SumMe across the full sweep, and a smaller, dataset-dependent gain on the custom set. We discuss the remaining limitations of the pipeline, including single-UOoI conditioning, dependence on the MS COCO label vocabulary, and abrupt transitions between non-adjacent extracted segments, and outline directions for addressing them.
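The temporal-window voting step described in this abstract can be illustrated with a minimal sketch. The abstract gives only the default parameters (W=5, K=3), so the details here are assumptions: a centered window of W frames, a frame kept when at least K frames in its window are positive, and windows truncated at the video boundaries. The function names `vote_filter` and `frames_to_segments` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def vote_filter(signal: List[int], w: int = 5, k: int = 3) -> List[int]:
    """Smooth a binary per-frame detection signal by window voting.

    Assumption: the window is centered on each frame and truncated at the
    ends of the video; a frame is positive if >= k frames in it are positive.
    """
    half = w // 2
    voted = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        voted.append(1 if sum(signal[lo:hi]) >= k else 0)
    return voted


def frames_to_segments(voted: List[int], fps: float) -> List[Tuple[float, float]]:
    """Group consecutive positive frames into (start_sec, end_sec) segments."""
    segments = []
    start = None
    for i, v in enumerate(voted):
        if v and start is None:
            start = i  # a run of positive frames begins
        elif not v and start is not None:
            segments.append((start / fps, i / fps))
            start = None
    if start is not None:  # run extends to the last frame
        segments.append((start / fps, len(voted) / fps))
    return segments


if __name__ == "__main__":
    # Frame 2 is an isolated positive; frame 8 is an isolated miss in a run.
    raw = [0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0]
    voted = vote_filter(raw)
    print(voted)
    print(frames_to_segments(voted, fps=10.0))
```

In this toy signal the isolated positive at frame 2 is suppressed and the gap at frame 8 is filled. The resulting time ranges are what a tool such as ffmpeg would then cut from the source so that the audio stays aligned with the retained frames.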
By Md. Farhan Sadique, S M Rafizul Haque
DOI: https://doi.org/10.5815/ijitcs.2020.03.03, Pub. Date: 8 Jun. 2020
Content-based image retrieval (CBIR) is the process of retrieving images similar to a query image from an image collection based on image content. In this paper, color and texture features are used to represent image content. The color layout descriptor (CLD) and the gray-level co-occurrence matrix (GLCM) are used as the color and texture features, respectively. CLD and GLCM are efficient for representing images with local dominant regions. To retrieve images similar to a query image, the features of the query image are matched against those of the images in the source. We use city block distance for this feature matching. The k nearest images under city block distance are taken as the images similar to the query. Our CBIR approach is scale invariant because CLD is scale invariant. The other feature set, GLCM, defines color patterns, which makes the system efficient at retrieving similar images based on spatial relationships between colors. We also measure the efficiency of our approach using the k-nearest neighbors algorithm. In terms of precision and recall, the performance of our proposed method is promising and better than that of some recent related works.
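The matching step in this abstract, ranking source images by city block (L1) distance to the query and keeping the k nearest, can be sketched as follows. This is only an illustration: the feature vectors are assumed to be precomputed (for example, CLD and GLCM values concatenated into one list per image, which the abstract does not specify), and the names `cityblock` and `retrieve` are hypothetical.

```python
from typing import Dict, List, Sequence


def cityblock(a: Sequence[float], b: Sequence[float]) -> float:
    """City block (L1) distance: sum of absolute coordinate differences."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def retrieve(query: Sequence[float],
             database: Dict[str, Sequence[float]],
             k: int) -> List[str]:
    """Return the names of the k database images nearest to the query."""
    ranked = sorted(database.items(),
                    key=lambda item: cityblock(query, item[1]))
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real color and texture features.
    features = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
    print(retrieve([0.2, 0.0], features, k=2))
```

Real feature vectors would need their components scaled to comparable ranges before the distances are summed, since an unscaled component with a large range would dominate the ranking.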