• Summary

    Natural Language Processing in Special Accounting (Tomada de Contas Especial) Proceedings

    Publication date: 28/05/2024

    ABSTRACT
    In recent years, the sheer volume of text documents being produced
    has made searching and analyzing content harder, prompting the
    development of techniques for extracting useful information. In
    Law, where most information resides in legal texts, information
    extraction has become crucial for discovering knowledge in
    unstructured data. Named entity recognition, driven by advances in
    deep learning models, stands out as the main technique for this
    task. This project explored whether natural language processing
    applied to Special Accounting (Tomada de Contas Especial)
    proceedings could expand the set of explanatory variables beyond
    those available on the institutional website of the Tribunal de
    Contas da União. The work comprised web scraping for data
    collection, preprocessing of the case documents, entity annotation,
    fine-tuning of models pre-trained on the legal domain and on the
    named entity recognition task, and entity extraction. From the
    selected texts, 388,201 records (tokens and/or phrases) were
    extracted: 286,781 from the Instruction documents and 101,420 from
    the Judgment documents. This confirms the research hypothesis and
    demonstrates the feasibility of expanding the set of variables
    using natural language processing.

Computer on the Beach is a technical-scientific event that brings together professionals, researchers, and academics in Computing to discuss research and market trends across the field's many areas.

Anais do Computer on the Beach
