The growth of open data, combined with the rapid development of Large Language Models (LLMs), has enabled significant advances in science and technology. In capital markets, this environment favors large-scale quantitative and qualitative analysis of disclosed information, which is currently limited by the semi-structured nature of many source documents. Manual data extraction is labor-intensive and error-prone, which hinders systematic analysis. This extended abstract reports ongoing work on a solution for the automated collection and extraction of data from audit reports, using the CVM Open Data Portal as the source. Through LLMs and prompt engineering, the algorithm interprets textual and numerical information and converts it into structured JSON data, streamlining market analysis and offering potential contributions for researchers, investors, regulators, and other stakeholders.
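The extraction step described above can be illustrated with a minimal sketch, which is not the implementation used in this work. It assumes a hypothetical set of target fields (`company`, `auditor`, `opinion_type`, `fiscal_year`) and treats the LLM as an arbitrary callable that maps a prompt string to a response string. The function names `build_prompt`, `parse_response`, and `extract` are illustrative only.

```python
import json
from typing import Callable

# Hypothetical target schema; the fields actually extracted may differ.
REQUIRED_FIELDS = ("company", "auditor", "opinion_type", "fiscal_year")


def build_prompt(report_text: str) -> str:
    """Compose an instruction asking the model for one JSON object."""
    fields = ", ".join(REQUIRED_FIELDS)
    return (
        "Extract the following fields from the audit report below and "
        f"answer with a single JSON object with the keys: {fields}. "
        "Use null for any field not stated in the report.\n\n"
        f"Report:\n{report_text}"
    )


def parse_response(raw: str) -> dict:
    """Locate the JSON object in the model output and check its keys."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model response")
    record = json.loads(raw[start:end + 1])
    missing = [f for f in REQUIRED_FIELDS if f not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Keep only the expected keys so every record has the same shape.
    return {f: record[f] for f in REQUIRED_FIELDS}


def extract(report_text: str, llm: Callable[[str], str]) -> dict:
    """Run one report through the prompt -> model -> validation steps."""
    return parse_response(llm(build_prompt(report_text)))
```

Passing the model as a callable keeps the sketch independent of any particular LLM provider, and rejecting responses with missing keys makes malformed outputs visible instead of silently entering the dataset.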
Computer on the Beach is a technical-scientific event that brings together professionals, researchers, and academics in the field of Computing to discuss research and market trends across the many areas of computing.