Mykola Gorbatenko

Workplace: Department of Mathematical Modeling, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, 58000, Ukraine

E-mail: m.gorbatenko@chnu.edu.ua

Biography

Mykola Gorbatenko was born in the city of Chernivtsi, located in the Chernivtsi region of Ukraine. He received his Master’s degree in Computer Science from the Faculty of Applied Mathematics at Chernivtsi National University, Ukraine, in 2005. In 2013, he was awarded the scientific degree of Candidate of Physical and Mathematical Sciences (specialty 01.05.04 – System Analysis and Theory of Optimal Solutions) by Taras Shevchenko National University of Kyiv, Ukraine.
He is an Associate Professor in the Department of Mathematical Modeling at Chernivtsi National University, Ukraine. His areas of interest include analysis, optimization, architecture, and high-load applications.

Author Articles

Data Optimization through Compression Methods Using Information Technology

By Igor V. Malyk, Yevhen Kyrychenko, Mykola Gorbatenko, Taras Lukashiv

DOI: https://doi.org/10.5815/ijitcs.2025.05.07, Pub. Date: 8 Oct. 2025

Efficient comparison of heterogeneous tabular datasets is difficult when sources are unknown or weakly documented. We address this problem by introducing a unified, type-aware framework that builds compact data representations (CDRs), concise summaries sufficient for downstream analysis, and a corresponding similarity graph (and tree) over a data corpus. Our novelty is threefold: (i) a principled vocabulary and procedure for constructing CDRs per variable type (factor, time, numeric, string), (ii) a weighted, type-specific similarity metric we call Data Information Structural Similarity (DISS) that aggregates distances across heterogeneous summaries, and (iii) an end-to-end, cloud-scalable realization that supports large corpora. Methodologically, factor variables are summarized by frequency tables; time variables by fixed-bin histograms; numeric variables by moment vectors (up to the fourth order); and string variables by TF–IDF vectors. Pairwise similarities use Hellinger, Wasserstein (p=1), total variation, and L1/L2 distances, with MAE/MAPE for numeric summaries; the DISS score combines these via learned or user-set weights to form an adjacency graph whose minimum-spanning tree yields a similarity tree. In experiments on multi-source CSVs, the approach enables accurate retrieval of closest datasets and robust corpus-level structuring while reducing storage and I/O. This contributes a reproducible pathway from raw tables to a similarity tree, clarifying terminology and providing algorithms that practitioners can deploy at scale.
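
The pipeline the abstract outlines (per-column summaries, pairwise distances, weighted aggregation, minimum spanning tree) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it covers factor variables only, uses the Hellinger distance for every column, and all names (frequency_table, hellinger, weighted_distance, minimum_spanning_tree), the toy datasets, and the equal weights are hypothetical choices made for illustration. The tree is built with Prim's algorithm, which is one standard way to obtain a minimum spanning tree; the abstract does not say which algorithm the paper uses.

```python
import math
from collections import Counter
from itertools import combinations


def frequency_table(values):
    """Summarize a factor (categorical) column as relative frequencies."""
    counts = Counter(values)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


def hellinger(p, q):
    """Hellinger distance between two discrete distributions stored as dicts."""
    keys = set(p) | set(q)
    squared = sum(
        (math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2 for k in keys
    )
    return math.sqrt(squared / 2.0)


def weighted_distance(summary_a, summary_b, weights):
    """Weighted average of per-column distances (assumed aggregation rule)."""
    shared = [col for col in weights if col in summary_a and col in summary_b]
    total_weight = sum(weights[col] for col in shared)
    return sum(
        weights[col] * hellinger(summary_a[col], summary_b[col]) for col in shared
    ) / total_weight


def minimum_spanning_tree(nodes, dist):
    """Prim's algorithm on a complete graph; dist is keyed by frozenset({a, b})."""
    in_tree = {nodes[0]}
    edges = []
    while len(in_tree) < len(nodes):
        candidates = [(a, b) for a in in_tree for b in nodes if b not in in_tree]
        best = min(candidates, key=lambda edge: dist[frozenset(edge)])
        edges.append(best)
        in_tree.add(best[1])
    return edges


# Hypothetical toy corpus: three tables with two factor columns each.
datasets = {
    "A": {"color": ["red", "red", "blue", "blue"], "size": ["S", "S", "M", "M"]},
    "B": {"color": ["red", "red", "red", "blue"], "size": ["S", "S", "M", "M"]},
    "C": {"color": ["red", "green", "green", "green"], "size": ["L", "L", "L", "L"]},
}
weights = {"color": 0.5, "size": 0.5}  # user-set weights, chosen arbitrarily

# Build one compact summary per dataset, then all pairwise distances.
summaries = {
    name: {col: frequency_table(vals) for col, vals in cols.items()}
    for name, cols in datasets.items()
}
dist = {
    frozenset((a, b)): weighted_distance(summaries[a], summaries[b], weights)
    for a, b in combinations(summaries, 2)
}
tree = minimum_spanning_tree(list(summaries), dist)
print(tree)
```

In this toy corpus, A and B differ only slightly in one column, so they end up adjacent in the tree, while C attaches through whichever of the two it is closer to. Time, numeric, and string variables would need their own summaries and distances, which the sketch leaves out.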
