Identify spam bots by speed of first notice?
| Issue ID: | 2021 |
| Issue Category: | enhancement |
| Component: | uncategorized |
| Priority: | normal |
| Status: | active |
| Assigned: | evan |
| Version: | 0.9 |
| Milestone: | 0.9 |
| Keywords: | antispam, spam, spammers |
The public timeline shows that a spam bot posting a "[name][year].vox.com/library/post/[filename]" link repeatedly creates a new account, and the first notice of each is posted immediately after the welcome notice from @welcomebot. As far as I've seen, no other accounts post their first notices so quickly. Could this proximity to the welcomebot notice be used to catch spam bots?
Examples:
http://identi.ca/harmrutger posted 10 notices.
3 minutes 50 seconds later:
http://identi.ca/blomemunk was greeted by @welcomebot and posted 10 notices.
4 minutes 25 seconds later:
http://identi.ca/ritchiedallmann was greeted by @welcomebot and posted 2 notices.
A Google search for "site:identi.ca +vox.com/library/post" currently returns 897,000 results.
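The timing check suggested above could be sketched roughly as follows. This is only an illustration, not StatusNet code: the function name, the way the two timestamps are obtained, and the 30-second threshold are all assumptions, since the ticket does not propose a specific value.

```python
from datetime import datetime, timedelta

# Hypothetical threshold; the ticket does not say how fast is "too fast".
FAST_FIRST_NOTICE = timedelta(seconds=30)


def is_suspiciously_fast(welcomed_at, first_notice_at, threshold=FAST_FIRST_NOTICE):
    """Return True if the account's first notice follows the welcome
    notice within `threshold`.

    welcomed_at     -- datetime of the @welcomebot greeting (assumed known)
    first_notice_at -- datetime of the account's first notice, or None
    """
    if first_notice_at is None:
        # The account has not posted yet, so there is nothing to flag.
        return False
    delay = first_notice_at - welcomed_at
    # Ignore negative delays (notice recorded before the welcome).
    return timedelta(0) <= delay <= threshold
```

A match would at most be a signal for closer review; a quick human poster could trip the same check.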

Updates
#1
In #1860,
> … time limit somehow - no duplicate link posted within an hour or a day, say.
#2
This will make no difference to the gist of the ticket, but I now realise this number is clearly rubbish:
> A Google search for "site:identi.ca +vox.com/library/post" currently returns 897,000 results.
A search using a specific date range (01/05/2008 to 27/11/2009) returns 22,800 results - 22,400 of them since 15 November 2009.
#3
Replying to [comment:1 grahamperrin]:
> In #1860,
>
> > … time limit somehow - no duplicate link posted within an hour or a day, say.
The problem with that idea here is that these links aren't duplicates; they're presumably going to many separate blog accounts all hosted by vox.com.
#4
Replying to [comment:3 120new]:
> Replying to [comment:1 grahamperrin]:
> > In #1860,
> >
> > > … time limit somehow - no duplicate link posted within an hour or a day, say.
>
> The problem with that idea here is that these links aren't duplicates; they're presumably going to many separate blog accounts all hosted by vox.com.
These are different *accounts* (though conceivably the same spammer) all exploiting vox.com. The Google search mentioned in comment 2 now returns about 92,000 results. Of course there could be many "duplicates" among them: the same pattern showing up in the timelines of many people subscribed (or auto-subscribed) to one or more of these spammy accounts. It's even possible that the number includes some notices that have since been deleted.
BUT, a notice search on identi.ca brings up about 1000 results as well (50 results pages).
I seem to remember a remark from @foucault that he'd taken care of them; if so, whatever it was seems to have cleared up only the then-existing dents, but not prevented any new ones. Yet they are all, invariably, spam. The oldest of the ~1000 was posted on 27 November. Most if not all of these accounts also post multiple times, so speed of first dent is rather irrelevant (in this case!).
Only cleaning up after the spammers but not preventing them from continuing to do the same thing doesn't scale...
#5
Replying to [comment:4 marjoleink]:
> I seem to remember a remark from @foucault he'd taken care of them; if so, whatever it was seems to have cleared up only the then-existing dents, but not prevented any new ones. Yet they are all, invariably, spam. The oldest of the ~1000 was posted on 27 November. Most if not all of these accounts also post multiple times, so speed of first dent is rather irrelevant (in this case!).
I "almost" remembered the phrase @foucault used; with a bit of trial and error I found it again, here:
http://identi.ca/notice/15698236 - posted Saturday, 28-Nov-09 17:55:25 UTC
> @evan gave me a bit of code to block that vox.com spammer. looks to be working nicely.
I'm afraid it looks like it only got rid of what there was before, but hasn't actually *blocked* anything.
Worse, I poked around at vox.com a bit: it looks like a general social media site, quite legit in itself, with "normal" users but (ab)used by spammers, too. That will make it quite difficult to separate the spammers using http://vox.com/ from the normal users posting on the site and announcing their posts here, or at least to do so in any automated way.
One more for my list of actually quite nice social sites being abused by spammers (potentially spoiling things for the rest of their users).
#6
Vox (vox.com) is now closed.
What remains is the principle of blocking spammers (profiles) and individual dents based on the URLs they use. Has that been addressed in any way by now?
I still see series of spammers using the same domains being reported, while new ones are not prevented from using those same domains (or even the exact same URLs).
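For illustration, a URL-based check of the kind asked about here might look something like the sketch below. It is not an existing StatusNet feature or API: the function names and the idea of a set of already-reported domains are assumptions made for the example. As comment 5 points out, a domain can host legitimate users as well, so a match is probably better treated as a reason to hold a notice for review than as proof of spam.

```python
import re
from urllib.parse import urlparse

# Rough URL matcher for notice text; real notices may need stricter parsing.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def notice_domains(text):
    """Return the set of lowercased hostnames linked from a notice."""
    domains = set()
    for url in URL_PATTERN.findall(text):
        try:
            host = urlparse(url).hostname
        except ValueError:
            # Malformed URL (e.g. broken IPv6 literal); skip it.
            continue
        if host:
            domains.add(host.lower())
    return domains


def matches_reported_domain(text, reported_domains):
    """Return True if any link in `text` points at a reported domain
    or at one of its subdomains.

    reported_domains -- hypothetical set of domains already reported as spam
    """
    for host in notice_domains(text):
        for reported in reported_domains:
            if host == reported or host.endswith("." + reported):
                return True
    return False
```

Matching on subdomains matters for the pattern in this ticket, where each spam link used a different "[name][year].vox.com" host.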