Why I Need Twitter Distillation Tools

The following may not be news to those who regularly hang out in Twitter-land, but the extent of the problem recently became clear to me: there is a bunch of spam in Twitter. More specifically, there appear to be robots that do nothing but scan the web for keywords and create tweets linking back to the pages they find. Some people appear to value this service (judging by the number of followers these Twitter users have), but for me it just adds to the general clutter I find in Twitter.

So, here is the situation. Yesterday I posted a blog message that has my upcoming ALA Midwinter meeting plans. I’ve got a WordPress plugin that injects an announcement of that post into my Twitter stream. Since I like my blog to be the definitive source of discussions surrounding my blog posts, I also run another plug-in (from the Backtype service) that takes commentary found on other social media sites and adds it as comments to my blog posting. I’ve set the latter plug-in to add such comments to my “pending” queue rather than posting them automatically.

When I looked at my pending comment queue this morning, I saw that Backtype found not only my own tweet of the post [1], but also five others from “people” I haven’t encountered before. (No links here because I don’t want to offer any Google juice if there is something nefarious going on.)

In all cases, the Twitter IDs don’t look like those of typical spammers, which I usually associate with names containing a string of numbers. These look like real names.

Three of the five accounts have over 1,000 followers — usually the mark of someone legitimate. Heck…that is more than I have by far!

Three accounts (TechnoTrendz, ddaville, and FrankyConnelly) also add an excerpt of text from deep inside the post: “Next planned event is the discussion mee”. Two of these three use the same bit.ly short link.

The original post did not use a third-party URL shortener. [2] These five posts contain three unique bit.ly short links, with three of the five using the same short link.
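The check I did by eye above, seeing which accounts share a short link, is easy to do mechanically. Here is a minimal sketch, assuming the pending comments can be exported as (account, short link) pairs. The account names and links below are placeholders I made up to mirror the five-tweets, three-links pattern, not the real ones.

```python
from collections import defaultdict


def group_by_link(tweets):
    """Map each short link to the list of accounts that tweeted it."""
    groups = defaultdict(list)
    for account, link in tweets:
        groups[link].append(account)
    return dict(groups)


def shared_links(tweets, min_accounts=2):
    """Keep only the links tweeted by at least `min_accounts` distinct accounts."""
    return {
        link: accounts
        for link, accounts in group_by_link(tweets).items()
        if len(set(accounts)) >= min_accounts
    }


# Placeholder data: five tweets, three unique links, three accounts sharing one.
tweets = [
    ("account_a", "http://short.example/aaa"),
    ("account_b", "http://short.example/aaa"),
    ("account_c", "http://short.example/aaa"),
    ("account_d", "http://short.example/bbb"),
    ("account_e", "http://short.example/ccc"),
]

for link, accounts in shared_links(tweets).items():
    print(link, "shared by", ", ".join(accounts))
```

Accounts that cluster on one short link were probably fed from the same source, which is a hint rather than proof that they are coordinated.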

All told, this looks suspicious. It is also the sort of thing that leads me to use third-party tools to distill Twitter content into something more manageable and less spam-y. Have others noticed the same thing? Do you have any coping strategies for dealing with the Twitter stream?

Evening Update

Okay, something funky is going on. This post generated seven of these title-plus-short-URL tweets from people I’ve never heard of: viral_veronica (97 followers, no profile URL); Phillips_mktgrp (6,620 followers, profile URL to a broken hosted site); ReclinIncomeRSS (1,649 followers, no profile URL); dmeyer11 (1,696 followers, no profile URL); Tweeting4Cash (7,422 followers, broken profile URL); PaulGoldman123 (10,663 followers, spammy profile URL); and glennsnews (1,264 followers, no profile URL). One other thing all of these have in common is that their tweets of my blog post headline are coming from the Twitterfeed service. Twitterfeed seems to take an RSS feed and automate the process of creating tweets, Facebook updates, and posts to other social networking services. So it would seem that someone is grabbing my blog post feed, or some derivative of a ping-back service or something else, and automatically feeding tweets into Twitter.
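To make the mechanism concrete, here is a rough sketch of how a feed-to-tweet service could turn RSS items into title-plus-link tweets. This is my guess at the general shape, not Twitterfeed’s actual code. The sample feed, the link, the `feed_to_tweets` name, and the 140-character limit handling are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-item feed; the link is a placeholder.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>Why I Need Twitter Distillation Tools</title>
      <link>http://blog.example/post</link>
    </item>
  </channel>
</rss>"""


def feed_to_tweets(rss_text, limit=140):
    """Build one 'title link' string per RSS item, trimming long titles."""
    root = ET.fromstring(rss_text)
    tweets = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        room = limit - len(link) - 1  # characters left for the title
        if len(title) > room:
            title = title[: max(room - 3, 0)].rstrip() + "..."
        tweets.append(f"{title} {link}")
    return tweets


for tweet in feed_to_tweets(SAMPLE_RSS):
    print(tweet)
```

Nothing in a loop like this needs a human to read the post, which would explain why the tweets carry only the headline and a link.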

So the question is: for what purpose? As fodder to mask truly spammy tweets? Because the account owner thinks their followers might all be interested in what I’m saying? What I do know is that this practice, at least for my blog posts, has increased dramatically in the past few weeks. I don’t think this was happening earlier this month…

The text was modified to remove a link to http://search.twitter.com/search?q=&ands=&phrase=&ors=Peter+Murray&nots=&tag=&lang=all&from=infopeep&to=&ref=&near=&within=15&units=mi&since=2009-12-27&until=2009-12-29&rpp=15 on August 22nd, 2013.

Footnotes

[1] Oddly, I didn’t get a tweet from InfoPeep, the reposting service based on the Code4Lib Planet.

[2] I’m happy I have an inherently short URL to start with, so am using yet another WordPress plugin to internally direct users from short URLs to canonical ones.

[…] I know this is starting to seem like an obsession, but I can’t figure out why someone(s) would be constructing tweets that consist of my blog post headlines and links back to my …. I’m wondering how widespread this problem is, so I constructed a list of URLs to blog posts […]

[…] of the more interesting discussions I have read since rebooting from a long vacation has been the Twitter weirdness uncovered by the Disruptive Library Technology Jester (a.k.a. Peter Murray). It all started when the […]

Last, we hypothesized that spammers would follow each other and legitimate “regular” users, and regular users would follow each other and celebrities (see Figure 5). To test this, we selected 100 random user IDs off the public timeline and took five backward hops (e.g., picking a random node and clicking on one of her followers). After five hops, we reached a spammer 63 out of 100 times.

The corollary is that clicking on a random node’s “friend” (a node she is following) will lead to a high-profile user, such as a celebrity, athlete, or politician. In other words, popularity and legitimacy are indicated by high indegree and spam is indicated by high outdegree. This link structure is similar to PageRank and is susceptible to many kinds of spam attack.
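The sampling procedure quoted above (start at a random user, repeatedly hop to one of that user’s followers, and see where you land) can be sketched in a few lines. This is only an illustration of the idea, not the paper’s code. The toy follower graph, the account names, and the function names are all made up, and the graph is wired to match the hypothesis that spammers follow regular users and each other.

```python
import random


def backward_hops(followers, start, hops=5, rng=random):
    """Walk from `start` to a random follower, up to `hops` times."""
    node = start
    for _ in range(hops):
        candidates = followers.get(node, [])
        if not candidates:  # nobody follows this account; stop early
            break
        node = rng.choice(candidates)
    return node


def spam_hit_rate(followers, spammers, starts, hops=5, seed=0):
    """Fraction of walks (one per start node) that end on a known spammer."""
    rng = random.Random(seed)
    hits = sum(
        backward_hops(followers, start, hops, rng) in spammers
        for start in starts
    )
    return hits / len(starts)


# Hypothetical toy graph: followers[x] lists the accounts that follow x.
followers = {
    "celebrity": ["alice", "bob"],
    "alice": ["bob", "spam1"],
    "bob": ["alice", "spam2"],
    "spam1": ["spam2"],
    "spam2": ["spam1"],
}
spammers = {"spam1", "spam2"}

starts = ["celebrity", "alice", "bob"] * 10
print("share of walks ending on a spammer:",
      spam_hit_rate(followers, spammers, starts))
```

In this toy graph a walk that reaches a spammer never leaves, because only spammers follow spammers. That is the intuition behind following followers tending to end up in spam territory.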

Side note: I’m a little annoyed by FirstMonday’s lack of structured metadata on their article pages; I had to manually create the entry in Zotero.

From the Disruptive Library Technology Jester (http://dltj.org/), printed on Tuesday the 3rd of March 2015 at 11:32:19 PM UTC (+0000). The URL to this page is

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/us/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.