IJISA Vol. 13, No. 3, Jun. 2021
Social media has become a crucial part of daily life and is now regarded as a more important source of information than traditional media. Twitter, in particular, is one of the most popular platforms for exchanging viewpoints and feelings. This work proposes a supervised machine learning system for detecting false news. A central problem in credibility detection is finding new features that are predictive enough to yield better-performing classifiers. Both content-based and user-based features are used; their importance and their impact on performance are examined, and the rationale for the final feature set, selected with the k-best method, is explained. Seven supervised classifiers are evaluated: Naïve Bayes (NB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Logistic Regression (LR), Random Forest (RF), Maximum Entropy (ME), and Conditional Random Fields (CRF). Models were trained and tested on the PHEME dataset. An analysis of the features is presented, comparing user-based features against content-based features as the decisive factors in determining validity. Random Forest performed best both with user-based features alone (accuracy 82.2%) and with the combination of content-based and user-based features, which achieved the highest overall accuracy (83.4%); Logistic Regression was best when only content-based features were used. Performance is measured with accuracy, precision, recall, and F1-score. Compared with the feature sets of earlier studies, the proposed features yield a clear improvement in detecting and verifying false news.
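The pipeline this abstract describes, k-best feature selection followed by a random-forest classifier, can be sketched as below. The feature matrix here is synthetic stand-in data, not the PHEME user-based and content-based features, and the parameter values are illustrative assumptions.

```python
# Hypothetical sketch: k-best feature selection + random forest,
# mirroring the abstract's pipeline on synthetic data.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=400, n_features=20,
                           n_informative=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0)

# keep the 10 most predictive features (k chosen arbitrarily here)
selector = SelectKBest(f_classif, k=10).fit(X_tr, y_tr)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(selector.transform(X_tr), y_tr)
acc = accuracy_score(y_te, clf.predict(selector.transform(X_te)))
```

The same selector can be refit on different feature subsets (user-based only, content-based only, both) to reproduce the kind of comparison the paper reports.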
Traffic is the medium by which vehicles move from one point to another, so its smooth flow is essential to mobility. When congestion occurs, mobility is hampered, which in turn affects other areas such as finance, air pollution, and traffic violations. This study builds a model to predict the vehicle queue at a traffic light while its signal is red. Prediction uses a neural network trained with the Extreme Learning Machine (ELM) method to estimate the length of the vehicle queue, and correlation analysis measures the relationship between connected roads. The experiments use queue-length data recorded at traffic lights, obtained from DISHUB (the Transportation Bureau) of DI Yogyakarta. Several experiments were carried out to determine the optimal queue-length prediction model; the best model achieved an average MAPE of 15.5882% and an average running time of 5.2226 seconds.
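The core of an Extreme Learning Machine is a single hidden layer whose input weights are random and fixed, with only the output weights solved in closed form by least squares. A minimal regression sketch, using a synthetic series in place of the DISHUB queue-length data:

```python
# Minimal Extreme Learning Machine (ELM) regression sketch.
# Inputs, target, and hidden-layer size are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))   # e.g. recent queue lengths on linked roads
y = 2.0 + np.sin(X.sum(axis=1))         # stand-in target: next queue length

n_hidden = 50
W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
b = rng.normal(size=n_hidden)
H = np.tanh(X @ W + b)                       # hidden-layer activations

# output weights solved in one shot by least squares (the "extreme" step)
beta, *_ = np.linalg.lstsq(H, y, rcond=None)

pred = np.tanh(X @ W + b) @ beta
mape = np.mean(np.abs((y - pred) / y)) * 100
```

Because training reduces to one linear solve, ELMs are fast to fit, which is consistent with the short running times the study reports.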
With the large quantity of information available online, it is essential to retrieve accurate information for a user query. A large amount of data exists in digital form across multiple languages. Various approaches aim to increase the effectiveness of online information retrieval, but the standard approach, searching the documents in the corpus word by word for the given query, is very time-consuming and may miss many related documents that are equally important. Stemming has therefore been used extensively in Information Retrieval Systems (IRS) to improve retrieval accuracy across languages. This paper addresses stemming for web page categorization in the Gujarati language, deriving stem words with the GUJSTER algorithm. GUJSTER is based on morphological rules that derive the root or stem word from inflected words of the same class. In particular, we examine how the extracted stem or root words affect the accuracy of web page classification using supervised machine learning algorithms. This work focuses on Web Page Categorization (WPC) for Gujarati and verifies the influence of the stemming algorithm in a WPC application: with supervised machine learning models and a standard 80%/20% train/test split, accuracy improves from 63% to 98%.
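The idea of a rule-based stemmer feeding a categorization pipeline can be illustrated as follows. The suffix list below is a hypothetical English stand-in, not the GUJSTER morphological rule set, which operates on Gujarati inflections.

```python
# Illustrative rule-based suffix stemmer (placeholder rules, NOT GUJSTER's
# Gujarati morphology), applied before vectorizing pages for categorization.
SUFFIXES = ["ing", "ed", "es", "s"]  # hypothetical rules for demonstration

def stem(word: str) -> str:
    # strip the longest matching suffix, keeping a minimal stem length
    for suf in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suf) and len(word) - len(suf) >= 3:
            return word[: -len(suf)]
    return word

docs = ["playing played plays", "runner running runs"]
stemmed = [" ".join(stem(w) for w in d.split()) for d in docs]
print(stemmed[0])  # "play play play"
```

Conflating inflected forms onto one stem is what shrinks the feature space and lets the downstream supervised classifier generalize better, the effect the paper measures for Gujarati.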
Achieving an adaptive, prognostic regression-learning process for physical and empirical phenomena is a complex task and an open problem in radio-frequency telecommunication engineering. One key approach to such problems is numerical optimisation. The Levenberg-Marquardt algorithm (LMA) is an efficient nonlinear parametric machine-learning modelling algorithm with optimal, fast, and accurate convergence. This paper proposes and demonstrates a real-time application of the LMA: developing a log-distance-style propagation loss model from received-signal-strength measurements taken around deployed long-term evolution (LTE) eNodeB antennas in three different propagation areas, selected to represent typical urban, suburban, and rural terrains. The three eNodeBs are 30, 28, and 32 m tall, respectively, and each operates at a 2.6 GHz carrier frequency with a 10 MHz channel bandwidth. The propagation loss model produced with the LMA approximates the measurements markedly better than one built with the popular Gauss-Newton algorithm (GNA), used here as a benchmark. Specifically, the LMA-based model attained lower maximum absolute errors (MABE) of 7.73, 14.57, and 10.53 for the urban, suburban, and rural terrains, compared with 15.19, 16.59, and 13.05 for the GNA-based models. The LMA's improved approximation performance can be ascribed to its capacity to handle multiple free parameters and reach an optimal solution regardless of the initial guess values.
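Fitting a log-distance path-loss model with Levenberg-Marquardt can be sketched with SciPy, whose `curve_fit` uses LM as its default solver for unconstrained problems. The measurements, reference distance, and true parameter values below are synthetic assumptions, not the paper's LTE eNodeB data.

```python
# Sketch: Levenberg-Marquardt fit of a log-distance path-loss model,
# PL(d) = PL(d0) + 10 n log10(d/d0), on synthetic measurements.
import numpy as np
from scipy.optimize import curve_fit

D0 = 100.0  # reference distance in metres (assumed)

def path_loss(d, pl0, n):
    return pl0 + 10.0 * n * np.log10(d / D0)

rng = np.random.default_rng(1)
d = np.linspace(100, 2000, 60)                  # distances in metres
pl_meas = path_loss(d, 70.0, 3.2) + rng.normal(0, 2, d.size)  # noisy "measurements"

# method="lm" selects the Levenberg-Marquardt solver
(pl0_hat, n_hat), _ = curve_fit(path_loss, d, pl_meas,
                                p0=[60.0, 2.0], method="lm")
```

LM blends gradient descent with Gauss-Newton steps via an adaptive damping term, which is why it tolerates poor initial guesses better than pure Gauss-Newton, the behaviour the paper observes.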
A complete and accurate cross-language clone detection tool can support the software forking process, reusing the more reliable algorithms of legacy systems from one language code base in another, and also helps in building code recommendation systems. This paper proposes a new technique to detect and classify cross-language clones between C and C++ programs by filtering the nodes of an ANTLR-generated parse tree built from a common grammar file, CPP14.g4. Parsing the input files with CPP14.g4 yields all the lexical and semantic information of the source code, and selective filtering of nodes serializes the two parse trees. A term frequency-inverse document frequency (TF-IDF) vector representation of the resulting trees is fed to a cosine-similarity measure to classify the clone types. Filtering the C and C++ parse trees raises precision from 51% to 61%, and matching based on renaming the input/output expressions yields average precisions of 91.97% and 95.37% for small-scale and large-scale repositories, respectively. The proposed method achieves its highest precision, 95.37%, in finding all clone types (1, 2, 3, and 4) across 16,032 semantically similar C/C++ clone pairs.
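The similarity step of that pipeline, TF-IDF vectors compared by cosine similarity, can be sketched as below. The token strings stand in for serialized, filtered parse trees; the real system derives them from ANTLR trees built with CPP14.g4, and the 0.8 threshold is an illustrative assumption.

```python
# Sketch of the clone-classification step: serialized parse-tree token
# streams (plain strings here) -> TF-IDF vectors -> cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

tokens_a = "funcdef id lparen param rparen block return id plus id"
tokens_b = "funcdef id lparen param rparen block return id plus id"
tokens_c = "classdef id lbrace field field method rbrace"

vecs = TfidfVectorizer().fit_transform([tokens_a, tokens_b, tokens_c])
sim = cosine_similarity(vecs)

# identical serialized trees -> similarity 1.0; an assumed threshold
# separates clone pairs from unrelated code
is_clone = sim[0, 1] > 0.8
```

Because the vectors are built from structural tokens rather than raw identifiers, renamed variables (Type-2 clones) map to the same representation, which is what lets similarity-based matching cover clone types beyond exact copies.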