Morphological normalization is an important part of natural language processing systems and aids the detection of similar terms in texts. This paper presents an approach for comparing morphological normalization techniques in the context of text similarity analysis. The proposal considers similarity analysis at two levels: lexical and semantic. We suggest using a recent workshop dataset of 10,000 sentence pairs, each with a human-assigned similarity score, to analyze the performance of these techniques. The proposal may be adapted to help natural language processing projects choose a morphological normalization approach.
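As a rough illustration of the lexical level of such a comparison, the sketch below scores sentence pairs with Jaccard overlap under two normalizers and correlates the scores with human-assigned similarity values. Everything in it is an assumption made for illustration: the `strip_suffix` normalizer is a hypothetical toy rule, Jaccard overlap and Pearson correlation are stand-ins for whatever measures a real study would select, and the three sentence pairs are invented placeholders rather than items from the workshop dataset.

```python
from math import sqrt


def identity(token):
    """Baseline: no morphological normalization."""
    return token


def strip_suffix(token):
    """Hypothetical toy normalizer: strips a few English suffixes.

    A real study would plug in an actual stemmer or lemmatizer here.
    """
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def jaccard(sentence_a, sentence_b, normalize):
    """Lexical similarity: overlap between the normalized token sets."""
    set_a = {normalize(t) for t in sentence_a.lower().split()}
    set_b = {normalize(t) for t in sentence_b.lower().split()}
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def evaluate(pairs, normalize):
    """Correlate predicted similarities with human-assigned scores."""
    predicted = [jaccard(a, b, normalize) for a, b, _ in pairs]
    gold = [score for _, _, score in pairs]
    return pearson(predicted, gold)


# Invented placeholder pairs (sentence, sentence, human score); not real data.
toy_pairs = [
    ("the cats are running", "the cat runs", 4.0),
    ("she opened the doors", "he closed the window", 2.0),
    ("dogs bark loudly", "the market fell today", 1.0),
]

for name, normalizer in (("identity", identity), ("strip_suffix", strip_suffix)):
    print(name, round(evaluate(toy_pairs, normalizer), 3))
```

Under this setup, the normalizer whose similarity scores correlate more strongly with the human scores would be preferred for the lexical level. A semantic-level comparison would replace `jaccard` with a meaning-based measure while keeping the same evaluation loop.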
Computer on the Beach is a technical and scientific event that aims to bring together professionals, researchers, and academics in the field of Computing to discuss research and market trends across the many areas of computing.