Sentiment Analysis Comparison

of tweets that contain #python, #rstats, or both hashtags

This project compares sentiment analysis conducted with a lexicon- and rule-based dictionary against sentiment analysis conducted with state-of-the-art pre-trained language models.

More specifically, VADER compound sentiment scores are compared with the classifications obtained from Hugging Face's cardiffnlp/twitter-xlm-roberta-base-sentiment transformer model.
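To put a continuous VADER compound score next to a three-way transformer label, the score first has to be turned into a label. The sketch below uses the cut-offs conventionally recommended for VADER (0.05 and -0.05); this is an assumption, not necessarily what the scripts in this repository do. The function names and the example pairs are hypothetical, and the transformer label is lower-cased on the assumption that the model's label casing may vary.

```python
# Conventional VADER cut-offs; an assumption, not necessarily this repository's choice.
POS_THRESHOLD = 0.05
NEG_THRESHOLD = -0.05


def vader_label(compound: float) -> str:
    """Map a VADER compound score in [-1, 1] to a three-way label."""
    if compound >= POS_THRESHOLD:
        return "positive"
    if compound <= NEG_THRESHOLD:
        return "negative"
    return "neutral"


def agrees(compound: float, model_label: str) -> bool:
    """True when the thresholded VADER label matches the transformer label."""
    return vader_label(compound) == model_label.strip().lower()


# Hypothetical (compound score, transformer label) pairs, for illustration only.
examples = [(0.62, "positive"), (-0.40, "neutral"), (0.01, "Neutral")]
for compound, label in examples:
    print(compound, label, agrees(compound, label))
```

Tweets for which `agrees` returns `False` are the candidates worth inspecting by hand.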

The comparison reveals many outliers in the following two categories:

- Bots that use only #rstats and are classified as neutral by the Hugging Face (HF) transformer model
- Non-bots that use both #rstats and #python and are classified as neutral by the HF transformer model
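The README does not state how outliers were identified. One simple reading, sketched here as an assumption, is to count tweets that the transformer labels neutral while the absolute VADER compound score is large, grouped by bot status and hashtag group. The dictionary keys, the 0.5 cut-off, and the rows are all hypothetical.

```python
from collections import Counter


def outlier_counts(rows, min_abs_compound=0.5):
    """Count tweets labelled neutral by the transformer despite a strong VADER score.

    rows: iterable of dicts with the hypothetical keys
          'is_bot', 'hashtags', 'compound', and 'hf_label'.
    Returns a Counter keyed by (is_bot, hashtags).
    """
    counts = Counter()
    for row in rows:
        is_neutral = row["hf_label"].lower() == "neutral"
        is_strong = abs(row["compound"]) >= min_abs_compound
        if is_neutral and is_strong:
            counts[(row["is_bot"], row["hashtags"])] += 1
    return counts


# Hypothetical rows, for illustration only.
rows = [
    {"is_bot": True, "hashtags": "rstats", "compound": 0.81, "hf_label": "neutral"},
    {"is_bot": True, "hashtags": "rstats", "compound": 0.10, "hf_label": "neutral"},
    {"is_bot": False, "hashtags": "both", "compound": -0.72, "hf_label": "neutral"},
    {"is_bot": False, "hashtags": "python", "compound": 0.90, "hf_label": "positive"},
]
counts = outlier_counts(rows)
print(counts)
```

A distribution-based rule, such as flagging scores beyond 1.5 times the interquartile range within each group, would be an equally reasonable alternative.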

The tweets with the minimum and maximum VADER compound scores are further analyzed with an interpretable AI framework.

The overall analyses and results highlight the importance of tokenization and show how different tokenization algorithms can dramatically affect NLP tasks.
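As a toy illustration of why tokenization matters, the sketch below contrasts whitespace tokens, which is roughly the unit a lexicon lookup works with, against subword pieces produced by a greedy longest-match split. The vocabulary is hypothetical and the splitting rule is a simplification: XLM-RoBERTa uses a SentencePiece tokenizer, which this code does not reproduce.

```python
def whitespace_tokens(text):
    """Split on whitespace, keeping hashtags attached to their words."""
    return text.split()


def greedy_subword_tokens(word, vocab):
    """Split one word into the longest pieces found in vocab, left to right.

    Characters not covered by any vocab entry become single-character pieces.
    """
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])
            i += 1
    return pieces


# Hypothetical toy vocabulary, for illustration only.
toy_vocab = {"#", "r", "stat", "s", "py", "thon", "love"}
tweet = "love #rstats #python"

word_level = whitespace_tokens(tweet)
subword_level = [p for w in word_level for p in greedy_subword_tokens(w, toy_vocab)]

print(word_level)
print(subword_level)
```

A lexicon keyed on whole words sees "#rstats" as a single unknown token, while a subword model receives several pieces for the same hashtag, so the two approaches can start from quite different inputs for the same tweet.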

Folders and files:

- plots
- .gitignore
- LICENSE
- README.md
- s0_collect_prep_data.R
- s1_Sentiments.R
- s2_transformers_explain_evaluate.ipynb

mmuratardag/DS_DictTrans_Sentiment
