Yahoo confesses its algo is poor and needs a little help
Yahoo! announced a new tag today, supposedly aimed at helping webmasters section off parts of their pages so that spiders don’t index content that is superfluous to the meat and gravy of the page. The ‘what a great way to flag SEO’d pages’ factor aside, let’s look at what they are saying with regard to usefulness and the webmaster.
The “robots-nocontent” tag is a useful tool for webmasters.
- It can improve our focus on the main content of your pages.
- It helps target your pages in search results by making sure the appropriate deep page in your site can surface for the right queries.
- It helps improve the abstracts for your pages in results by identifying unrelated text on the page and thus omitting it from consideration for the search result summaries.
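To make the idea concrete, here is a rough sketch of what honouring such a marker could look like on the spider's side. It assumes the marker is applied as a class value, as in `<div class="robots-nocontent">`, and it is in no way Yahoo's actual implementation: the names `NoContentFilter` and `indexable_text` are made up for illustration, and the only library used is Python's standard `html.parser`.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class NoContentFilter(HTMLParser):
    """Collects page text, skipping anything inside a robots-nocontent element.

    Hypothetical helper for illustration only; assumes well-nested markup.
    """

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a flagged element
        self.chunks = []      # text fragments considered indexable

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            # Already inside a flagged block: track nesting so we know when it ends.
            self.skip_depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if "robots-nocontent" in classes:
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def indexable_text(html):
    """Return the text a spider honouring the marker might keep."""
    parser = NoContentFilter()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = (
    '<div><h1>Widgets</h1><p>Main article text.</p>'
    '<div class="nav robots-nocontent"><ul><li>Home</li><li>About<br></li></ul></div>'
    '<p>More text.</p></div>'
)
print(indexable_text(page))
```

In this toy page the navigation block is dropped and only the heading and paragraphs survive, which is all the tag asks a spider to do. Note how little it takes, and that the webmaster, not the engine, is the one deciding what counts as boilerplate.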
Those bullet points are interesting. Let’s have a look at them in reverse. Nope, not reverse order, but reverse logic. Let’s see what can be determined by flipping the logic around.
It can improve our focus on the main content of your pages.
(Our algo needs help, help us see what your page is about)
Most of us have a general understanding that when a page is parsed by a search engine spider, it is broken down into its parts and weighted against a set of criteria used for document classification, relevancy and ultimately ranking in the SERPs.
Going by that statement above, we could infer that Slurp (Yahoo’s spider) finds it difficult to classify content or weight documents correctly, and that the use of this tag will help them correct that failing.
Meta description tags, title tags, heading tags, word frequency, positioning, context and semantic relationships are obviously inadequate in this regard. Block-level link analysis and word positioning too, it would seem, but more on that later.
It helps target your pages in search results by making sure the appropriate deep page in your site can surface for the right queries.
(Our algo needs help, our serps are poor, help us see what your page is about)
It does, really? Wow, that’s a pretty cool set of words that says very little. You know, I’ve read that four times now and it still hasn’t sunk in. Hmm, so let me get this right. If a page is sunk because its real meat and potatoes is obscured by poor design and poor use of the layout options, tags and the whole myriad of tools that already exist to tell a bot what a page is about and why (i.e. classical SEO techniques), then by use of this special tag we, the webmasters, can give our poorly designed, shitty SEO’ed pages a little lift for the right queries... ah, now I get it. Could it be that inefficient algo/scoring system again? Too many strings of disconnected words, coupled with an inability to identify repeated boilerplate stuff.
It helps improve the abstracts for your pages in results by identifying unrelated text on the page and thus omitting it from consideration for the search result summaries.
(Our algo needs help, help us see what your page is about, existing snippet systems just aren’t working)
Are the existing snippets really so bad? Are the noodp and noydir tags a failure? Are meta description tags inadequate? Are parsed page content snippets with highlighted keywords and phrases not doing the job, then?
Anytime I see the rule of three being applied when one sentence could suffice, I get a little suspicious, especially when it comes from a big corp more intent on selling advertising than helping old Joe Webmaster. My spidey sense is cranked up to the max on this one; I smell another pincer movement in the offing.
Remember the ides of nofollow
Those who remember the launch of rel=nofollow will recall that it was vaunted as a tool for bloggers and other interested webmasters to stop those evil link spammers and content manipulators who were busy dropping their dirty little links all over the blogosphere, manipulating search engine rankings in the process, polluting the internet for all and... well, you heard the drill then, no need for me to repeat it. If we look at nofollow today, especially in the context of how the search engines are dictating it should be used, then in my view it’s only fair that we cast a watchful eye over any new initiatives, especially those designed to ‘help us’.
Is it mere coincidence that this thing was first suggested at a so-called web spam squishing conference back in Feb 2005?
Yahoo first proposed this type of attribute way back in February 2005, at the Web Spam Squashing Summit that Niall Kennedy organized.
The web, as we all know, is made up of billions of documents, some of which have sat around unchanged for years. Good solid content full of relevant stuff that people are finding daily via queries and SERPs across all engines: established docs with lots of authority. It isn’t so difficult to find something, provided you know how to use the tool properly, which is an additional reason why I’d ask whether a company that has worked in the search space for so long would really need to go to the content creators and say, “Look, we are a little stuffed. We have tried this search algo thing for some time now, and despite the legion of documents and indexes and the time we’ve all had to look at the whole document classification thing, we’ve now decided that the game’s up, and we need you guys to tell us what the page is really about, cos we can’t, we suck.” On the face of it, that may seem a little rude, cruel even, but it really isn’t so far from what they are saying, surely? Or am I missing something huge here?
Do they really expect webmasters to open up their docs and add these little tags and classes all over the shop, just so that they can (theoretically) rank better? I can picture the average webmaster now thinking, “I know, sod SEO, I don’t need those overpriced, secret squirrel, greedy link spamming sods anymore, oh no, I’ll just stick these around everything but my title tags, meta description tags, h1 tags, link tags and anything else I wish to place a little emphasis on. That’ll fix it, job done.”
Exactly. Preposterous; it isn’t going to happen. Which is why I’m asking in all seriousness what the use of this thing is, and why it is even necessary. Is the algo really that awful? And if it isn’t now, then (assuming it’s not some kind of Trojan horse) what about when web spammers get hold of it and play about with it on a few hundred thousand new domains or pages? Make no bones about it, if anyone is going to be seriously proactive and exploitative on this stuff, it will be the hardcore web spammers who will be first in line, probably cranking up pages right as I type this. It is they who will look to see if this stuff actually works for ranking purposes; it is they who will be seeking to get a run off the starting blocks.
Yahoo aren’t idiots, surely not. They know this as well as I or anyone else who is into this sort of thing. And what about when Goog, MSN or Ask get on board, what then? How long before we are dictated to about how we should build our pages and mask out content that may be duplicated elsewhere or is affiliate in nature?
I think it really is pretty incredible that they would admit that their system of scoring and ranking documents is in need of help like this, because when you boil it all down, this is exactly what they are saying.
Read about this elsewhere
Andy Beal, Andy Beard, Threadwatch, WebProNews, FinalTag, John Andrews, Mashable, Techmeme, Christian
Posted on: 3rd May 2007, by: Rob Watts