[SentiSteem #4] Twitter popularity analysis of the word "messi" between 2009-01-01 and 2018-12-31

in #technology, 5 years ago (edited)

sentiment.png

Hello world! Welcome to a report where I use machine learning to analyze tweets about a specified topic and present the results in a variety of easy-to-understand charts. The sentiment analysis algorithm was developed as part of my Master's thesis in 2017/2018.

This report is currently being published exclusively here on Steemit.

Power House Creatives Logos FINAL.png

Parameters

Today's analysis was run on tweets which contain the word "messi" and were published between 2009-01-01 and 2018-12-31. The detailed specification of the data is shown in the following list:

  • Keyword: messi
  • From: 2009-01-01
  • To: 2018-12-31
  • Number of analyzed tweets: 60000
  • Tweets per week: 114
  • Language: en
  • Geographical location: Not specified

text16.png

Results

Sentiment

After downloading 60000 tweets published between the specified dates, I ran sentiment analysis on every one of them. The sentiment scores were then aggregated over weeks and months to make the time axis coarser, and plotted as the following line chart.
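The aggregation step can be sketched roughly as follows. This is a minimal illustration, not the code used for the report: the sample tweets, the score range and the helper names (`aggregate`, `month_key`, `week_key`) are assumptions, and averaging is assumed as the aggregation function.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical sample: (publication date, sentiment score). Values are made up.
scored_tweets = [
    (date(2015, 6, 1), 0.80),
    (date(2015, 6, 3), 0.90),
    (date(2015, 6, 8), 0.70),
    (date(2015, 7, 1), 0.50),
]

def aggregate(tweets, key):
    """Average the sentiment of all tweets that fall into the same time bucket."""
    buckets = defaultdict(list)
    for day, score in tweets:
        buckets[key(day)].append(score)
    return {bucket: mean(scores) for bucket, scores in sorted(buckets.items())}

def month_key(day):
    """Bucket by calendar month."""
    return (day.year, day.month)

def week_key(day):
    """Bucket by ISO year and ISO week number."""
    iso = day.isocalendar()
    return (iso[0], iso[1])

monthly = aggregate(scored_tweets, month_key)
weekly = aggregate(scored_tweets, week_key)

for bucket, value in monthly.items():
    print(bucket, round(value, 2))
```

Using the ISO week as the weekly bucket is one possible choice; any consistent week definition would work the same way.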

sentiment.png

Sentiment of tweets for keyword "messi"

My subjective comment on the chart: The steady sentiment only shows what a true legend he really is, as he managed to stay on top for 10 years straight. Also, we can see that in 2013 there are several months with lower scores, caused by the many injuries he had that year.

Aggregation using heatmaps

A line chart works great for showing the general trend or pattern in the sentiment. We can see the bigger timeframe and estimate the long-term direction. But if you're interested in a particular month or week, the change is hard to see, and in the case of weeks actually impossible. Has an athlete put in a great performance in a particular match? Has a brand or company released a new product line? To see such low-level changes, the following two heatmaps are used.
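A heatmap like this needs the averages arranged as a grid. Here is a small sketch of one way to do that, with one row per year and one cell per month. The input values and the `build_grid` helper are made up for illustration and are not taken from the report's data.

```python
# Hypothetical monthly averages keyed by (year, month); values are illustrative only.
monthly_avg = {
    (2014, 1): 0.55,
    (2014, 2): 0.60,
    (2015, 5): 0.62,
    (2015, 6): 0.70,
}

def build_grid(averages):
    """Arrange (year, month) averages into one row per year with 12 month cells."""
    years = sorted({year for year, _ in averages})
    # None marks a month without data, so a plotting library can leave it blank.
    return {
        year: [averages.get((year, month)) for month in range(1, 13)]
        for year in years
    }

grid = build_grid(monthly_avg)

# The extremes define the ends of the colour scale mentioned in the caption.
worst = min(monthly_avg.values())
best = max(monthly_avg.values())
best_cell = max(monthly_avg, key=monthly_avg.get)

print("worst:", worst, "best:", best, "best month:", best_cell)
```

The same idea applies to the weekly heatmap, with week numbers instead of months as columns.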

heatMap.png

The chart shows the average sentiment per month, where 0.38 is the worst and 0.78 the best achieved score

My subjective comment on the chart: OMG!! How could you not love this?? Why is there a huge boost in June 2015?? Barca won the TREBLE! Plus he scored one of the nicest goals of his career on the 30th of May. Maaaan, I love when this works! Such a beautiful popularity spike in June when they won the treble! I'm speechless :)

heatMapWeekly.png

The chart shows the average sentiment per week, where 0.38 is the worst and 0.91 the best achieved score

My subjective comment on the chart: I'm not sure how to interpret this chart, so if you guys see some explanation or pattern in there, please let me know in the comments. I'll reply to every comment (as I always do anyway).

Most frequently used words

Another very interesting aspect to look into is which words are used repeatedly, shown here as wordclouds. Even more interesting is comparing two wordclouds generated from different periods, usually before and after some event or change. If you give this a second thought, the problem is that many short words (like "and", "or", "with" and so on) are used in almost every sentence and would also show up in the wordclouds. To mitigate this, I've removed a list of 153 so-called stopwords. Additionally, I've removed words typical for this topic, listed at the end of the report*.
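The filtering and counting can be sketched like this. It is only an illustration under assumptions: the stopword set here is a tiny stand-in for the 153-word list (which is not reproduced in this post), the exclusion set is a subset of the footnote, and the tokenizer and the `top_words` helper are hypothetical.

```python
import re
from collections import Counter

# Tiny stand-in for the 153-word stopword list, which is not reproduced here.
STOPWORDS = {"and", "or", "with", "the", "a", "is", "of"}
# Subset of the topic-specific exclusions from the footnote. URL fragments such
# as "bit.ly" would need separate handling with this simple tokenizer.
EXCLUDED = {"yii", "messi", "lionel"}

def top_words(tweets, limit=10):
    """Count words across tweets, skipping stopwords and excluded words."""
    counts = Counter()
    for text in tweets:
        # Lowercase and keep runs of letters and apostrophes as words.
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS and word not in EXCLUDED:
                counts[word] += 1
    return counts.most_common(limit)

# Made-up example tweets.
tweets = [
    "Messi and Ronaldo is the rivalry of a generation",
    "Ronaldo or Messi? Messi with the ball is magic",
]

print(top_words(tweets, limit=3))
```

The resulting (word, count) pairs are the kind of input a wordcloud generator typically expects.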

CommonWords.png

Most often used words in tweets containing the word "messi" before and after 2013-12-31.

My subjective comment on the chart: Omg.. I knew Ronaldo would be there... but I wouldn't have guessed sooooo much. According to Twitter, Messi's career is more connected with Ronaldo than with Barca. This is amazing, what a rivalry!

Most frequently used UNIQUE words

As we can see in the previous wordclouds, many words are actually shared between the two. That makes perfect sense, as there are many topics which will be forever connected with Messi. But I went one step further and created wordclouds which contain only unique words, that is, words which don't appear in the opposite wordcloud.
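Keeping only the unique words boils down to a set difference between the two word lists. A minimal sketch, assuming each period is represented by a ranked list of its top words (the lists and the `unique_words` helper below are hypothetical):

```python
def unique_words(before, after):
    """Return the words that appear in only one of the two top-word lists."""
    before_set, after_set = set(before), set(after)
    # List comprehensions keep the original ranking order of each list.
    only_before = [word for word in before if word not in after_set]
    only_after = [word for word in after if word not in before_set]
    return only_before, only_after

# Made-up top-word lists for the two periods.
before = ["ronaldo", "barca", "goal", "record"]
after = ["ronaldo", "barca", "final", "trio"]

only_before, only_after = unique_words(before, after)
print(only_before)
print(only_after)
```

Note that the result depends on how many top words each list contains: a word counts as unique only if it is missing from the other list entirely.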

UniqueWords.png

Most often used UNIQUE words in tweets containing the word "messi" before and after 2013-12-31.

My subjective comment on the chart: Interesting to see that "d'Or" hasn't been mentioned as often after 2015, mainly because Ronaldo took over. You really gotta love this algorithm of mine :) Names like Suarez and Messi also got a huuuge boost as these three created an unbeatable attacking trio.

* Words excluded from all 4 wordclouds: yii, bit.ly, .ly, messi, lionel, messi messi

BONUS - shaped wordcloud from all words!

This one is just for fun :) It's generated from the 1000 most popular words in all tweets, not divided into before and after groups. Click it to open!

shapedWords.png


About project

This series of posts shows the power of machine learning and its application in real life. It also makes a kind of symbolic point by analyzing Twitter and publishing the results here on Steemit. Technology of the future is being used on the social media of the future ;)

Get your report

Twitter sentiment analysis reports sell for quite some dollars in the world outside of Steemit. In our tiny world of Steemit, such a price would be way too much. That's why I'm offering to generate and send you a report with your chosen keyword and dates for a laughable price: 5 STEEM. Order 3 and get the fourth one for free :)

Interested in how your favorite coin is doing on Twitter? Or your favorite athlete? A politician, actor or clothing company? Just DM me and you'll get the full report within 48 hours :)


Hope you enjoyed! Matko.

Just here to support because I have no idea about any of this. Cheers!

haha thanks! :) seems noone understands it and therefore noone upvotes it haha :D I enjoyed it a lot tho! :D

Well, ain't we all here to write what we enjoy the most?! Just go for it!!!

Wow! Are you sure you did a Masters and not a Doctorate! This is all way beyond my fireplace and I admire your intellect. Blessings!

hah thanks! It's deffo not worth PhD, believe me :D

Presenting data in attractive way, good job. Especially the silhouette word cloud at the end! At the very least, it is a cool visual product :)
