• Summary

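    Detection and Classification of Documents Containing Malicious Macros Based on Natural Language Processing (original title: Detecção e Classificação de Documentos contendo Macros Maliciosas com base em Processamento de Linguagem Natural)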

    Publication date: 27 May 2025

    Abstract
    Macros are functions written in Visual Basic to automate tasks within
    MS Office documents. On the one hand, macros offer many conveniences
    to home users and organizations. On the other hand, they have
    also driven the rise of macro viruses (malware that exploits
    macros to infect users who load compromised documents, spreadsheets,
    and presentations). These viruses may delete data and steal
    information, and have caused losses of billions of dollars in
    global attacks. Although more prevalent in the 1990s and 2000s,
    macro viruses resurged in the last decade and continue to threaten
    current MS Windows/Office users. In this article, we present a natural
    language processing-based pipeline to detect macros in MS
    Office documents and classify them as malicious or benign. Using
    byte2vec as the document representation, we outperform the state of
    the art in macro detection, reaching a Precision-Recall Area Under
    the Curve (PRAUC) above 99% for four of the seven evaluated
    classifiers (and above 98% for the remaining three).

Anais do Computer on the Beach (Proceedings of Computer on the Beach)

Computer on the Beach is a technical and scientific event that brings together professionals, researchers, and academics in Computing to discuss research and market trends across the field's many areas.
