Classification and similarity detection of Indonesian scientific journal articles

Authors

Keywords:

Classification, Cosine similarity, GARUDA, Naïve Bayes, Similarity

Abstract

Advances in technology are making it faster to find scientific articles or journals relevant to a research topic. One national aggregator service for finding such references is Garba Rujukan Digital (GARUDA), developed by the Ministry of Education, Culture, Research, and Technology (Kemendikbudristek) of the Republic of Indonesia. In this study, the naïve Bayes method classifies articles into several categories based on their titles and abstracts. The system achieves an F1-score of 98%, indicating high classification performance, and the classification process takes less than 60 minutes. Article similarity is detected with the cosine similarity method, computed over the concatenated title and abstract; a score of 0.071, for example, reflects the degree of similarity between the compared texts, while a score close to 1 indicates higher similarity. The system searches for similar scientific articles based on title and abstract, ranks them so that the articles with the highest similarity scores appear first as the most similar, and generates article categories. The results show that the proposed method significantly improves the classification and search processes in GARUDA and provides accurate and efficient similarity detection.
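The two techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the add-one smoothing, the use of raw term counts (rather than TF-IDF or another weighting, which the abstract does not specify), the helper names, and the four toy articles are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens (assumed preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        self.total_docs = len(labels)
        return self

    def predict(self, doc):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            total = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            for t in tokens:
                score += math.log((counts[t] + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cosine_similarity(a, b):
    """Cosine of the angle between the term-count vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_similar(query, docs):
    """Return (score, index) pairs sorted from most to least similar."""
    return sorted(((cosine_similarity(query, d), i) for i, d in enumerate(docs)),
                  reverse=True)


# Hypothetical toy articles: (title, abstract, category).
ARTICLES = [
    ("Deep learning for image recognition", "Neural networks classify images.", "computer science"),
    ("Rice yield under drought stress", "Field trials measure rice crop yield.", "agriculture"),
    ("Text classification with neural networks", "Neural models classify documents.", "computer science"),
    ("Soil nutrients and maize crop growth", "Fertilizer improves crop yield in soil.", "agriculture"),
]

# Title and abstract are concatenated before classification and comparison.
TEXTS = [f"{title} {abstract}" for title, abstract, _ in ARTICLES]
LABELS = [category for _, _, category in ARTICLES]

model = NaiveBayes().fit(TEXTS, LABELS)
predicted = model.predict("Neural networks for document classification")
ranking = rank_similar("neural networks classify text", TEXTS)
```

In this sketch, `predicted` holds the category assigned to a new title, and `ranking` lists the toy articles from most to least similar to the query, so the first entry plays the role of the "most similar article" described above. Scores lie between 0 (no shared terms) and 1 (identical term distribution).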

Published

2025-06-24

How to Cite

[1]
Nyimas Sabilina Cahyani, Deris Stiawan, Abdiansah Abdiansah, Nurul Afifah, and Dendi Renaldo Permana, “Classification and similarity detection of Indonesian scientific journal articles”, Comput Sci Inf Technol, vol. 6, no. 2, pp. 147–158, Jun. 2025.

Section

Articles
