Md. Farhan Sadique

Work place: Computer Science and Engineering Discipline, Khulna University, Khulna, Bangladesh

E-mail: farhan@cse.ku.ac.bd

Website: https://orcid.org/0000-0002-7454-0638

Research Interests: Data Structures and Algorithms, Computer systems and computational processes, Computational Science and Engineering

Biography

Md. Farhan Sadique is a Lecturer in the Computer Science and Engineering Discipline at Khulna University, Khulna, Bangladesh. He earned both his Bachelor and Master of Science degrees in Computer Science and Engineering from Khulna University.

Author Articles

User Object of Interest Based Video Clip Extraction using Pretrained YOLOv7

By Md. Mahmudul Hasan, Md. Titas Ahmmed, Md. Farhan Sadique

DOI: https://doi.org/10.5815/ijem.2026.03.21, Pub. Date: 8 Jun. 2026

Video clip extraction is the process of generating shorter, focused video segments by identifying and retaining frames that contain a user-specified object of interest (UOoI). Such targeted extraction allows users to access relevant portions of a video without watching the entire recording, with practical use in surveillance review, content management, and educational settings. In this work, we present an object-conditioned video clip extraction framework that uses the pretrained YOLOv7 detector to perform frame-level analysis of an input video. For each frame, the detector produces a set of class labels, which are matched against the user-selected UOoI to produce a binary per-frame detection signal. A one-dimensional temporal-window voting filter is then applied to this signal to suppress isolated false positives and recover isolated false negatives, addressing the single-frame detection noise that produces visible discontinuities in naive frame-by-frame approaches. The voted-positive frame indices are mapped back to source timestamps, and the corresponding audio segments are extracted directly from the source video using ffmpeg, preserving the original audio track in the output clip. The framework uses a dictionary of 80 object categories drawn from the MS COCO label set, and a graphical user interface allows users to select an input video, choose a target object, preview the input, and obtain the extracted clip with audio. We evaluate the framework on the SumMe benchmark, which we re-annotated at the frame level for object presence, and on a newly annotated set of 39 videos collected from public sources. Both datasets were independently labelled by two annotators, with Cohen’s kappa of 0.85 and 0.83, respectively, and disagreements resolved by a third adjudicator. At the default voting configuration of W=5, K=3, the framework attains an F1-score of 70.88% with 90.12% accuracy on SumMe and an F1-score of 69.89% with 85.13% accuracy on the custom dataset. 
An ablation over voting parameters shows monotonic gains on SumMe across the full sweep, and a smaller, dataset-dependent gain on the custom set. We discuss the remaining limitations of the pipeline, including single-UOoI conditioning, dependence on the MS COCO label vocabulary, and abrupt transitions between non-adjacent extracted segments, and outline directions for addressing them.
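
The temporal-window voting step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract gives only the parameters W and K, so the centered window, the truncation at the video boundaries, and the function names `vote_filter` and `frames_to_segments` are all assumptions made for this example.

```python
from typing import List, Tuple


def vote_filter(signal: List[int], w: int = 5, k: int = 3) -> List[int]:
    """Mark frame i positive if at least k of the frames in a window of
    width w centered on i are positive.

    Assumption: the window is centered and simply truncated at the start
    and end of the video; the source does not specify edge handling.
    """
    half = w // 2
    voted = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        voted.append(1 if sum(signal[lo:hi]) >= k else 0)
    return voted


def frames_to_segments(voted: List[int], fps: float) -> List[Tuple[float, float]]:
    """Group consecutive positive frames into (start, end) times in seconds."""
    segments = []
    start = None
    for i, v in enumerate(voted):
        if v and start is None:
            start = i  # a run of positive frames begins here
        elif not v and start is not None:
            segments.append((start / fps, i / fps))
            start = None
    if start is not None:
        # the last run extends to the end of the video
        segments.append((start / fps, len(voted) / fps))
    return segments


# An isolated positive is suppressed; an isolated negative is recovered.
isolated_positive = vote_filter([0, 0, 1, 0, 0, 0, 0])
isolated_negative = vote_filter([1, 1, 1, 0, 1, 1, 1])
segments = frames_to_segments([0, 0, 1, 1, 1, 0, 0, 1, 1, 0], fps=2.0)
```

Each resulting (start, end) pair could then be handed to ffmpeg to cut the matching video and audio segment from the source file.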

Content-Based Image Retrieval Using Color Layout Descriptor, Gray-Level Co-Occurrence Matrix and K-Nearest Neighbors

By Md. Farhan Sadique, S M Rafizul Haque

DOI: https://doi.org/10.5815/ijitcs.2020.03.03, Pub. Date: 8 Jun. 2020

Content-based image retrieval (CBIR) is the process of retrieving images similar to a query image from a source of images based on the image contents. In this paper, color and texture features are used to represent image contents. Color layout descriptor (CLD) and gray-level co-occurrence matrix (GLCM) are used as the color and texture features, respectively. CLD and GLCM are efficient for representing images with local dominant regions. To retrieve images similar to a query image, the features of the query image are matched against those of the images in the source. We use cityblock distance for this feature matching. The k-nearest images under cityblock distance are taken as the images similar to the query image. Our CBIR approach is scale invariant because CLD is scale invariant. The other set of features, GLCM, defines color patterns, which makes the system efficient for retrieving similar images based on spatial relationships between colors. We also measure the efficiency of our approach using the k-nearest neighbors algorithm. The performance of our proposed method, in terms of precision and recall, is promising and better than that of some recent related works.
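
The matching step described above, ranking source images by cityblock (L1) distance and keeping the k nearest, can be sketched as follows. This is a minimal illustration under stated assumptions: the feature vectors are taken as already computed (for example, CLD and GLCM values concatenated per image), and the function names `cityblock` and `k_nearest` are hypothetical, not from the paper.

```python
from typing import List, Sequence, Tuple


def cityblock(a: Sequence[float], b: Sequence[float]) -> float:
    """Cityblock (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_nearest(
    query: Sequence[float], features: List[Sequence[float]], k: int
) -> List[Tuple[int, float]]:
    """Return (image index, distance) for the k source images closest to
    the query, nearest first. Ties are broken by the lower index."""
    ranked = sorted((cityblock(query, f), i) for i, f in enumerate(features))
    return [(i, d) for d, i in ranked[:k]]


# Toy 2-D feature vectors standing in for real CLD + GLCM features.
source_features = [[0, 0], [1, 1], [5, 5], [3, 0]]
result = k_nearest([0, 0], source_features, k=2)
```

In practice the color and texture features would usually be normalized before being combined, so that neither dominates the distance; the abstract does not say how the paper handles this.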
