Geoffrey Mariga Wambugu

Work place: Department of Information Technology, School of Computing & Information Technology, Murang’a University of Technology, Murang’a, Kenya



Research Interests: Data Structures and Algorithms, Natural Language Processing, Computational Learning Theory


Dr. Geoffrey Mariga Wambugu received his B.Sc. degree in Mathematics and Computer Science from Jomo Kenyatta University of Agriculture and Technology (JKUAT), Kenya, in 2000, the M.Sc. degree in Information Systems from The University of Nairobi, Nairobi, Kenya, in 2012, and the Ph.D. degree in Information Technology from JKUAT in 2019. He has served for over ten years as head of department in higher education institutions in Kenya and has been involved in the design, development, review, and implementation of computing curricula in different universities in Kenya. Currently, he is a Senior Lecturer and Dean of the School of Computing and Information Technology at Murang’a University of Technology. His research interests include Probabilistic Machine Learning, Text Analytics, Natural Language Processing, Data Mining, and Big Data Analytics.

Author Articles
Evaluating Linear and Non-linear Dimensionality Reduction Approaches for Deep Learning-based Network Intrusion Detection Systems

By Stephen Kahara Wanjau, Geoffrey Mariga Wambugu, Aaron Mogeni Oirere

DOI:, Pub. Date: 8 Aug. 2023

Dimensionality reduction is an essential ingredient of machine learning modelling that seeks to improve the performance of such models by extracting better quality features from data while removing irrelevant and redundant ones. The technique helps reduce computational load, avoid over-fitting, and increase model interpretability. Recent studies have revealed that dimensionality reduction can benefit from labeled information, through joint approximation of predictors and target variables from a low-rank representation. Many linear and non-linear dimensionality reduction techniques have been proposed in the literature, depending on the nature of the domain of interest. This paper presents an evaluation of the performance of a hybrid deep learning model using feature extraction techniques when applied to a benchmark network intrusion detection dataset. We compare the performance of a linear and a non-linear feature extraction method, namely Principal Component Analysis (PCA) and Isometric Feature Mapping, respectively. Principal Component Analysis is a non-parametric classical method normally used to extract a smaller representative dataset from high-dimensional data and classifies data that is linear in nature while preserving spatial characteristics. In contrast, Isometric Feature Mapping is a representative method in manifold learning that maps high-dimensional information into a lower-dimensional feature space while endeavouring to maintain the neighborhood of each data point as well as the geodesic distances between all pairs of data points. These two approaches were applied to the CICIDS 2017 network intrusion detection benchmark dataset to extract features. The extracted features were then used to train a hybrid deep learning-based intrusion detection model that combines convolutional and bidirectional long short-term memory architectures, and the model performance results were compared.
The empirical results demonstrated the dominance of Principal Component Analysis over Isometric Feature Mapping in improving the performance of the hybrid deep learning model in classifying network intrusions. The suggested model attained 96.97% and 96.81% in overall accuracy and F1-score, respectively, when the PCA method was used for dimensionality reduction. The hybrid model further achieved a detection rate of 97.91%, whereas the false alarm rate was reduced to 0.012, with the discriminative features reduced to 48. Thus, the model based on Principal Component Analysis extracted salient features that improved the detection rate and reduced the false alarm rate.
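
For readers unfamiliar with the metrics quoted in this abstract, the following is a minimal sketch of how accuracy, F1-score, detection rate, and false alarm rate are commonly computed from binary confusion-matrix counts. It assumes the usual textbook definitions (detection rate as recall on the attack class, false alarm rate as the fraction of benign traffic flagged as an attack); the paper may define them differently. The function name `detection_metrics` and the counts in the example are hypothetical and are not taken from the paper.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(tp, fp, tn, fn):
    """Compute common intrusion-detection metrics from confusion counts.

    tp: attacks correctly flagged      fp: benign records flagged as attacks
    tn: benign records passed          fn: attacks that were missed
    """
    precision = _ratio(tp, tp + fp)
    detection_rate = _ratio(tp, tp + fn)      # also called recall
    false_alarm_rate = _ratio(fp, fp + tn)    # a fraction, not a percentage
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    f1 = _ratio(2 * precision * detection_rate, precision + detection_rate)
    return {
        "accuracy": accuracy,
        "f1": f1,
        "detection_rate": detection_rate,
        "false_alarm_rate": false_alarm_rate,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    for name, value in detection_metrics(tp=90, fp=5, tn=895, fn=10).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions a false alarm rate such as 0.012 is a proportion of benign records, whereas accuracy and detection rate are usually reported as percentages.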

A Comparative Analysis of Bat and Genetic Algorithms for Test Case Prioritization in Regression Testing

By Anthony Wambua Wambua, Geoffrey Mariga Wambugu

DOI:, Pub. Date: 8 Feb. 2023

Regression testing is carried out to ensure that software modifications do not introduce new bugs into the existing software. Existing test cases are reused in this testing; they can run into the thousands, and there is rarely enough time to execute all of them. Test Case Prioritization (TCP) is a technique for ordering test cases so that those likely to reveal more faults are executed first. Because TCP is treated as an optimization problem, several nature-inspired metaheuristic algorithms, such as the Bat, Genetic, Ant Colony, and Firefly algorithms, have been proposed for it. These algorithms have been compared theoretically or based on a single metric. This study employed an experimental design to offer an in-depth comparison of the Bat and Genetic algorithms for TCP. Unprioritized test cases and a brute-force approach were used as baselines. Average Percentage Fault Detection (APFD), a popular metric, together with execution time and memory usage, was used to evaluate the algorithms’ performance. The study underscored the importance of test case prioritization and established the superiority of the Genetic algorithm over the Bat algorithm for TCP in terms of APFD. No stark differences were recorded between the two algorithms regarding memory usage and execution time. Both algorithms seemed to scale well as the number of test cases grew.
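
As an illustration of the APFD metric mentioned in this abstract, the sketch below uses the commonly cited formula APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m is the number of faults, and TF_i is the position of the first test in the ordering that reveals fault i. The function name `apfd`, the test identifiers, and the fault data are hypothetical; the study's own implementation and data are not shown on this page.

```python
def apfd(order, faults_by_test):
    """Compute APFD for one ordering of a test suite.

    order: list of test ids in execution order.
    faults_by_test: dict mapping a test id to the set of fault ids it reveals.
    Assumes every known fault is revealed by at least one test in `order`.
    """
    n = len(order)
    all_faults = set().union(*faults_by_test.values())
    m = len(all_faults)
    if n == 0 or m == 0:
        raise ValueError("need at least one test and one fault")

    # Record the 1-based position of the first test that reveals each fault.
    first_position = {}
    for position, test in enumerate(order, start=1):
        for fault in faults_by_test.get(test, ()):
            first_position.setdefault(fault, position)

    if all_faults - first_position.keys():
        raise ValueError("some faults are never revealed by this ordering")

    return 1 - sum(first_position.values()) / (n * m) + 1 / (2 * n)


if __name__ == "__main__":
    # Hypothetical fault matrix used only to illustrate the calculation.
    faults = {"t1": {"f1"}, "t2": set(), "t3": {"f1", "f2", "f3"}, "t4": {"f2"}}
    print("unprioritized:", apfd(["t1", "t2", "t3", "t4"], faults))
    print("prioritized:  ", apfd(["t3", "t1", "t4", "t2"], faults))
```

An ordering that runs the most fault-revealing test first yields a higher APFD than the unprioritized order, which is the property prioritization algorithms such as the Bat and Genetic algorithms try to optimize.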

Other Articles