ABSTRACT
This paper evaluates the ability of seven Large Language Models
(LLMs) to solve logic challenges. The models tested are GPT-4,
Claude 3.5 (Sonnet and Haiku), Gemini 1.5, Llama 3.1, Grok,
and Mistral 7B. Four challenges were posed to each model, and the
results show that none of the LLMs solved all of them correctly.
The study highlights the current limitations of LLMs in logical
reasoning tasks, despite advances in other areas of natural
language processing.