Blogspot domains identified as fine purveyors of Spam

Free content based domains are beacons for Spam

I was over at Bill Slawski’s excellent blog earlier today researching text-to-link proximity stuff, and stumbled across his post “Microsoft Follows the Money to Find Spammers”, which refers to an interesting spam research paper from Microsoft entitled “Spam Double-Funnel: Connecting Web Spammers with Advertisers”. For the geekazoids amongst you there are lots of interesting snippets and observations. Bill’s already covered most of the headlines over at his blog, so I won’t regurgitate them.

What stuck out to me was set amongst the conclusions, the main one being that blogspot domains were the biggest culprits when it came to originators of spam.

 

…doorway domains, we showed that the free blog-hosting site blogspot.com had an-order-of-magnitude higher spam appearances in top search results than other hosting domains in both benchmarks, and was responsible for about one in every four spam appearances (22% and 29% in the two respectively, to be exact). In addition, at least three in every four unique blogspot URLs that appeared in top-50 results for commercial queries were spam (77% and 75%). We also showed that over 60% of unique .info URLs in our search results were spam, which was an-order-of-magnitude higher than the spam percentage number for .com URLs.

 

I don’t know if the findings of papers like these bear any weight or consideration in any subsequent re-jigs of search engine algorithms. Only the search engines truly know what is and what isn’t a consideration in any equation. We can certainly say that if a mainstream domain owned and controlled by a party other than the search engines were to be responsible in similar ways, then its tenure in the SERPs (search engine results pages) would be very short-lived. Its authority score would suffer, as would its overall TrustRank. In essence, once identified it would be dead in the water.

Search algorithms aren’t changed on a whim, of course; it’s a relatively safe bet that search models are consistently tested and evaluated internally before any public release. Documents like the one referenced give interesting insights into the minds of the people who look at webspam.

Perhaps it’s for these very reasons that the people behind other platforms that let the public write and create content make such public pronouncements detailing their determination to eliminate, or at least drastically reduce, spam in their indices. After all, failure to do so, in light of the above for example, could quickly lead to a diminution in trust and authority, with the resultant knock-on effect of poor ranking ability and the negative monetisation effects that usually follow significantly reduced traffic levels. By publicly affirming their commitment to tackle it, they may well save themselves from the heavy axe a search engineer can wield.

Jason Calacanis of Mahalo was kinda right when he said

When I had SEOs on the last CalacanisCast they raved about Squidoo and it’s ability to game the system, and if SEOs love your platform you have a HUGE problem.

The fact is that web spammers (not all SEOs are web spammers, Mr C) will indeed game the system. Some see it as their job to take competitive edges and work with them to the max, the rationale being that if they didn’t, then somebody else would.

I guess it’s up to platform owners to ensure that access and effectiveness are reduced. It’s a big reason why WordPress and all the major blogging platforms introduced nofollow into their software. For those who don’t know, nofollow restricts the ability of a link to pass PageRank, or link juice or link love or whatever else you want to call it, to the page to which it points.
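For the curious, nofollow is nothing more than a rel="nofollow" attribute on the link itself. Here's a rough sketch in Python of how a platform might stamp it onto every link in user-submitted HTML. The function name add_nofollow is made up for illustration; it isn't how WordPress or anyone else actually does it, and a real platform would want a proper HTML parser or sanitiser rather than regular expressions.

```python
import re

# Opening <a ...> tags; the \b stops this matching <abbr>, <article>, etc.
_ANCHOR = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# An existing quoted rel attribute, e.g. rel="external"
_REL = re.compile(r"""\brel\s*=\s*(["'])(.*?)\1""", re.IGNORECASE)


def add_nofollow(html: str) -> str:
    """Return html with rel="nofollow" present on every <a> tag.

    Hypothetical helper for illustration only: regex-based, so it
    assumes reasonably tidy markup with quoted attribute values.
    """

    def fix(match: re.Match) -> str:
        attrs = match.group(1)
        rel = _REL.search(attrs)
        if rel is None:
            # No rel attribute yet, so append one.
            return f'<a{attrs} rel="nofollow">'
        values = rel.group(2).split()
        if "nofollow" in (v.lower() for v in values):
            # Already nofollowed; leave the tag untouched.
            return match.group(0)
        values.append("nofollow")
        joined = " ".join(values)
        new_rel = f'rel="{joined}"'
        return "<a" + attrs[: rel.start()] + new_rel + attrs[rel.end():] + ">"

    return _ANCHOR.sub(fix, html)


comment = 'Nice post! Visit <a href="http://example.com/">my site</a>'
print(add_nofollow(comment))
```

Run over a comment before it's published, something like this leaves the link clickable for humans while telling the engines not to count it as a vote.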

Perhaps Mahalo and Squidoo and Blogspot should just ‘nofollow’ everything they link out to. Maybe they should just close it all off to spiders and bots. They haven’t been created for the benefit of search engines after all…

Perhaps serious individual content creators should just go out and buy a domain for $20, grab a WP install, get some cheap blog hosting and just run their own show. It isn’t exactly rocket science after all. It does make you wonder why a person would bother writing content and help make some other guy rich… unless of course you were writing it to funnel people elsewhere and monetise it to your own ends.

I do have some sympathy with what those guys say. It narks me a little though, as it suggests that people like me are scum sucker sleaze buckets. Most of us aren’t; it’s just a small minority of uber spammers who spoil it for everyone else.

Maybe the likes of Mr Godin and Mr Calacanis could help by using the term web spammers instead of SEOs. It’s a far more accurate descriptor.

Meantime, if you are blogging and on a free platform, then perhaps you ought to at least consider moving on.

Rob Watts
Posted on: 12th July 2007, by : Rob Watts

9 thoughts on “Blogspot domains identified as fine purveyors of Spam”

  1. I think improvement in this area is sorely needed.

    I run a small-ish webcomic, but already spammers are taking advantage of my webcomic’s rather unique name to direct potential readers to their spam version of my site on blogspot.

    Fortunately, they haven’t been able to take me on when it comes to google, but I’m sure they must get a fair amount of visitors actually looking for my content.

    I prefer the term t***s rather than SEO for people that do this… Opportunistic parasites, or webspammers is fine though.

  2. Hi Adam, thanks for stopping by

    Sorry to hear about your spammer problems 🙁 It’s a PITA to combat.

    I agree about the need to tackle it, although ultimately it’s down to ‘owner’ moderation; we are all responsible for what we put out there. If the owner refuses to tackle it, then as harsh as it seems they deserve all that comes to them.

    I get around 150 pieces of spam through here per day alone. Thankfully with Akismet and the math spam plugin I can delete it in a heartbeat, but it’s still a pain, as I still have to check the comments just in case a genuine comment gets missed.

    >I prefer the term t***s rather than SEO for people that do this… Opportunistic parasites, or webspammers is fine though.

    Yes I like the opportunistic parasite label too, pieces of sh*t is quite apt too 🙂

  3. Hi David,

    I agree, and if it were owned by anyone other than Google then perhaps it really could be a serious problem. It’s certainly illustrative of what could befall one: guilt by association, bad neighbourhoods etc.

    I’d advise anyone to go the self hosted route, especially if they are serious about this stuff and are in it for the long haul. Just makes sense.

  4. There are just sooooo many reasons not to be on Blogspot even if you don’t fancy the complexities of WordPress. One thing that Blogspot precludes is decent SEO-type control of your title tag and page permalinks etc, which might be deliberate on Google’s part, but I doubt it. I just think they got Blogspot “wrong”

    db

  5. I’d always worried that it was only a matter of time before blogspot blogs got ear marked as “spam” (whether they were or weren’t) that’s why I eventually took the plunge and picked up my own domains. I didn’t see any point in building up a site that essentially I had no control over.

  6. “Spam Double-Funnel: Connecting Web Spammers with Advertisers” is a good paper. It’s a demonstration that search engines need to get much smarter about dealing with redirects and “cloaking” to figure out where links really go. Blogspot is just a symptom of the problem.

    Our system, SiteTruth, accomplishes much the same goal, but from the other direction. We try hard to find the company behind the web page, and if we can’t, we downgrade its search position in our system. This derates most “doorway”, “referrer”, “directory”, and “direct navigation” pages. That covers most web spam.

    SiteTruth is hard on some sites, but all you have to do to get a decent SiteTruth rating is put your business name and address in a format that would work on a mailing label somewhere obvious on the site, like an “about” page. Or get a good SSL cert. Or a BBBonline membership. Or get into Open Directory. Or put your URL in a Yellow Pages ad. If we can find a solid indication of legitimacy from a non-web source, we’re happy. Everybody else in search is endlessly grinding on web pages, but not looking off the web for hard data. We do that, and it works.

    Try it at “sitetruth.com”. We’re in alpha test.

  7. Hi John

    Nice idea, good work on that, and best of luck with it too.

    Seems like you are looking at a few more of those ‘quality signals’ that people like Matt Cutts and Tim Maher allude to.

    Just have to be careful not to throw babies out with bathwater.

    Not every good website has an SSL cert, and not every good website bothers with Yellow Pages or Business.com or the ODP. Much of the ODP data is centuries old or has been bought out and sold on. How many sites admitted to the ODP have since expired, been rebought and filled with dubious content, I wonder?

    Serious spammers spend serious money too in getting to where they need to be. There are varying flavours and degrees of spam, some of better quality than others, and it’s increasingly difficult to tell them apart. Data is bought and sold and exchanged and rehashed. Spammers actively seek out new data sources and aren’t particularly fussy what it’s about either. Monetisation programs like AdBrite, YPN, AdSense, Kontera, Bidvertiser etc. all help make the enterprise a worthwhile venture too.

    Don’t get me wrong. I think yours is a great approach to a problem that affects us all. I think the main players in search might even look at similar signals too, or at least should be testing various filters like you describe.

  8. Ya, this is very useful and valuable information. I’d always worried that it was only a matter of time before blogspot blogs got ear marked as “spam” (whether they were or weren’t) that’s why I eventually took the plunge and picked up my own domains. I didn’t see any point in building up a site that essentially I had no control over.
