Machine learning to find naughty steemians

in #programming, 6 years ago (edited)

The last few days I have been learning a bit about machine learning, so I thought it might be a nice idea to develop a machine learning Python script to locate comment abuse. It is still a work in progress, but I think I have it mostly thought out.

Photo taken in Santorini

I made a function which, given some text, checks for sentence similarity, change in vocabulary and a couple of other things. All of this information is summarised in a couple of numbers, which work something like scores. Having low scores does not mean that you are an abuser; similarly, having high scores does not mean that you are not an abuser. So why calculate these scores? I expect that abusers will have similar scores. Abusers cluster together like flies cluster together on shit. :D I can identify these clusters using my scores and some math. In doing so, I get a good guess at whether someone is an abuser or not.
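As a rough illustration of the idea (not the actual script, which has not been published), here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for the example: the function names, the use of `difflib` for sentence similarity, a type-token ratio as a stand-in for the vocabulary score, and a simple distance-threshold grouping in place of whatever clustering math is really used.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def split_sentences(text):
    """Split text into lowercase sentences on ., ! or ? (a deliberately naive rule)."""
    parts = re.split(r"[.!?]+", text)
    return [p.strip().lower() for p in parts if p.strip()]


def sentence_similarity(text):
    """Mean pairwise similarity between the sentences of a text, in [0, 1]."""
    sentences = split_sentences(text)
    if len(sentences) < 2:
        return 0.0
    ratios = [SequenceMatcher(None, a, b).ratio()
              for a, b in combinations(sentences, 2)]
    return sum(ratios) / len(ratios)


def vocabulary_richness(text):
    """Type-token ratio: distinct words divided by total words (assumed score)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return len(set(words)) / len(words)


def score(text):
    """Summarise a text as a small tuple of numbers."""
    return (sentence_similarity(text), vocabulary_richness(text))


def group_by_distance(points, threshold):
    """Group points that are linked by Euclidean distances below threshold."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5
        if dist < threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical comments: two repetitive ones and one ordinary one.
comments = [
    "Nice post! Nice post! Nice post!",
    "Great post! Great post! Great post!",
    "I tried this on Windows and the install failed because of a missing compiler.",
]
points = [score(c) for c in comments]
groups = group_by_distance(points, threshold=0.2)
```

In this toy data the two repetitive comments get nearly identical scores and land in the same group, while the ordinary comment sits on its own. The scores themselves say nothing about abuse; only the grouping does.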

One of the drawbacks of this method is that abusers could actively try to get the right score. But in certain cases you can see that this is happening by tracking how a user's scores change over time.
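Tracking score changes could look something like the sketch below. It is only one possible approach, under assumed names and numbers: each user has a chronological list of one score, and a user is flagged when the mean of their most recent scores differs sharply from the mean of their earlier ones. The `window` and `threshold` values are arbitrary placeholders.

```python
def score_shift(history, window=3):
    """Mean of the last `window` scores minus the mean of all earlier scores."""
    if len(history) <= window:
        return 0.0  # not enough history to compare against
    earlier = history[:-window]
    recent = history[-window:]
    return sum(recent) / len(recent) - sum(earlier) / len(earlier)


def flag_shifted_users(histories, threshold=0.3, window=3):
    """Return the users whose scores moved by more than `threshold`."""
    return sorted(
        user for user, history in histories.items()
        if abs(score_shift(history, window)) > threshold
    )


# Hypothetical score histories, oldest score first.
histories = {
    "alice": [0.20, 0.25, 0.20, 0.22, 0.21, 0.20],
    "bob": [0.90, 0.95, 0.90, 0.30, 0.25, 0.30],
}
flagged = flag_shifted_users(histories)
```

Here `bob` is flagged because his scores drop abruptly, while `alice` stays steady. A sudden shift is not proof of gaming the score, just a reason to look closer.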

Currently, I need to get all the standard Steem Python scripts functioning on Windows. I did have them functioning on my Linux machine, but I had to return that machine to the university since my contract expired. I am also still working on adding some more scores to my program.



"Killer. So alluring!" to quote the spam bots :)

As fascinating as it would be to have a look at the actual code, I imagine you won't be publishing it just yet. But the #utopian-io community would probably love to see an expanded post on the subject at some point.

I suspect that the only thing it will find is highly structured spam D:

Reminded me of you


Didn't hear that one bee-fore

That is why my profile pic has an 8 for hate
