Abstract
Financial malware is an increasing threat due to its potential for
profit for cybercriminals. Year after year, the arms race between
malware developers and security professionals fosters the release of
millions of malware variants. Although machine learning techniques
have been successfully applied to malware classification tasks,
the occurrence of concept drift requires constant updates (or even
retraining) of these models. In this work, we evaluate how distinct
machine learning training approaches perform for malware classification
when we consider the arrival flow of samples as a data stream. We
also observe the manifestation of concept drift in an actual dataset
composed of thousands of financial malware samples, discussing
different incremental learning approaches to deal with a custom
dataset and how concept drift can arise in malware data.
Computer on the Beach is a technical-scientific event that aims to bring together professionals, researchers, and academics in the field of Computing to discuss research and market trends in computing across its many areas.