Análisis de sentimiento en tweets de fútbol argentino (Sentiment Analysis of Argentine Football Tweets)

Ferreyra, Mario Ezequiel

Date

2021

Advisor

Luque, Franco Martín

Abstract

The amount of data generated on social networks today is enormous. This is where Sentiment Analysis systems prove very useful, since their main objective is to identify positive or negative opinions about a product or brand in users' texts. Sentiment Analysis systems are built using datasets annotated with polarity. However, the resources available for Spanish are limited, particularly for Argentine Spanish, for which they are practically nonexistent. In this work we build a corpus of tweets in Argentine Spanish on the topic of Argentine football. To do so, we collected a large number of tweets, which then went through filtering stages and were annotated by volunteers applying clear and explicit criteria that we defined. We then designed and implemented several sentiment classification systems, using standard preprocessing techniques, linguistic resources, and different representations of the tweets. We ran experiments using our own corpus, as well as previously existing Spanish corpora, for training and evaluation. Finally, we analyzed the models and the evaluation results.

URI

http://hdl.handle.net/11086/18384

Collections

Trabajos Especiales de Licenciatura en Ciencias de la Computación

The following license files are associated with this item:

Creative Commons

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)