• Abstract

    Evaluating the Impact of Data Imputation on Knowledge Discovery in Smart Grid Data

    Publication date: 09/06/2026

    Missing data are common in smart grid environments, especially in underground substations where operational constraints lead to incomplete time-series measurements. Because the Knowledge Discovery in Databases (KDD) process depends on consistent datasets, imputation is essential to preserve analytical reliability. This study evaluates how data reconstruction affects a previously validated hybrid KDD framework applied to real substation measurements. Missingness levels of 5%, 10%, 20%, and 30% were simulated under a Missing Completely at Random (MCAR) mechanism, and the Modified Akima Interpolation Method (MAKIMA) was used to restore the affected series. The reconstructed datasets were then processed through EM clustering and Apriori association rule mining and compared with the original data. Error metrics (MAE, RMSE, R²) showed high reconstruction accuracy, with R² above 0.999. Clustering deviations remained below 0.5%, and association rules retained their structure with minimal changes. The findings indicate that the KDD framework remains stable with up to 30% MCAR
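The evaluation loop described in the abstract (simulate MCAR gaps, reconstruct the series, score the reconstruction) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the series is synthetic, the helper names `mcar_mask`, `linear_fill`, and `error_metrics` are hypothetical, the metrics are computed over the whole series, and plain linear interpolation stands in for MAKIMA to keep the sketch dependency-free. In practice a MAKIMA routine (for example, the `makima` option that recent SciPy releases offer on `Akima1DInterpolator`) would replace `linear_fill`.

```python
import math
import random


def mcar_mask(n, rate, seed=0):
    """Pick indices to drop uniformly at random (MCAR).

    The first and last points are always kept so that every gap
    has an observed neighbour on both sides.
    """
    rng = random.Random(seed)
    k = min(round(rate * n), n - 2)
    return sorted(rng.sample(range(1, n - 1), k))


def linear_fill(values, missing):
    """Stand-in for MAKIMA: fill gaps by linear interpolation
    between the nearest observed neighbours."""
    missing_set = set(missing)
    filled = list(values)
    observed = [i for i in range(len(values)) if i not in missing_set]
    for left, right in zip(observed, observed[1:]):
        for i in range(left + 1, right):
            t = (i - left) / (right - left)
            filled[i] = values[left] + t * (values[right] - values[left])
    return filled


def error_metrics(truth, estimate):
    """Return (MAE, RMSE, R^2) of the estimate against the truth."""
    n = len(truth)
    errors = [e - t for t, e in zip(truth, estimate)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean = sum(truth) / n
    ss_tot = sum((t - mean) ** 2 for t in truth)
    return mae, rmse, 1.0 - ss_res / ss_tot


# Synthetic load-like signal (assumption): 10 cycles of 96 samples each.
series = [100 + 20 * math.sin(2 * math.pi * i / 96) for i in range(960)]

results = {}
for rate in (0.05, 0.10, 0.20, 0.30):
    missing = mcar_mask(len(series), rate, seed=42)
    restored = linear_fill(series, missing)
    results[rate] = error_metrics(series, restored)
    mae, rmse, r2 = results[rate]
    print(f"{rate:.0%} missing: MAE={mae:.4f} RMSE={rmse:.4f} R2={r2:.6f}")
```

The figures this prints describe only the synthetic signal and say nothing about the substation data in the study; the sketch shows the shape of the procedure, with the clustering and association-rule stages left out.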

Anais do Computer on the Beach

Computer on the Beach is a technical-scientific event that brings together professionals, researchers, and academics in Computing to discuss research and market trends across the field's many areas.