Projects devoted to preserving, researching, and democratizing access to historical documents benefit heavily from Document Layout Analysis (DLA) and from downstream tasks such as Optical Character Recognition. In recent years, neural networks developed for general-purpose object detection have attracted immense interest and developed rapidly, resulting in improved accuracy and efficiency. Given the similarity between layout analysis and object detection, this work presents a comparative study of latest-generation object detection algorithms on DLA tasks. Specifically, we evaluate and compare the performance of YOLOv9, YOLOv10, YOLOv12, and a custom-designed hybrid model on both a large-scale modern dataset (PubLayNet) and a specialized historical collection (OCR-D). Furthermore, this work investigates the impact of transfer learning, analyzing how well models generalize from clean, contemporary documents to noisy, complex historical ones. The results demonstrate that these state-of-the-art models are highly effective for DLA and that fine-tuning a model pre-trained on a modern dataset significantly improves performance on historical documents, with YOLOv9 and YOLOv12 emerging as the top-performing architectures.
Computer on the Beach is a technical-scientific event that aims to bring together professionals, researchers, and academics in the field of Computing to discuss research and market trends in computing across its many areas.