nofollow on the StatusNet Cloud
Today we rolled out code on the StatusNet Cloud to set the "nofollow" relationship for certain links on public sites. I wanted to take a few minutes to describe what this will do, why we did it, and what users can expect.
Many search engines and other automated Web software use the incoming links to a Web page from other parts of the Web as a way to "rank" the page. Higher-ranked pages appear above lower-ranked pages in search results. Google's PageRank is the canonical example of this kind of system.
These algorithms date from ancient times on the Web, when HTML was hand-coded and content carefully screened by the publisher. In the age of user-generated content, people use Web-based communications systems to share links with each other without the consent or approval of the site publisher: on wikis, in blog comments, and on social messaging systems like StatusNet. Although the vast majority of these links are well-intentioned, some people abuse these content systems to get more PageRank for their sites.
Search engines haven't kept up with this change in the Web; instead, they've put the onus on service providers to keep their search results accurate. nofollow is a value of the rel attribute that Web site operators can add to a link in their HTML output to say that the link is user-generated and hasn't been screened or validated by the publisher. Algorithms like PageRank will skip links marked nofollow.
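To make the markup concrete, here is a minimal sketch in Python of how a site might emit a link with or without the nofollow relationship. The function name render_link and the user_generated flag are made up for illustration; this is not StatusNet's actual code.

```python
from html import escape


def render_link(url: str, text: str, user_generated: bool) -> str:
    """Return an HTML anchor; unscreened links get rel="nofollow".

    Hypothetical helper for illustration only.
    """
    # Only links the publisher has not vetted carry the nofollow relationship.
    rel = ' rel="nofollow"' if user_generated else ""
    # Escape both the URL and the link text so user input cannot break the markup.
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>'
```

Called with user_generated=True, the helper adds rel="nofollow" to the anchor; called with False, it leaves the link as an ordinary one that ranking algorithms may count.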
With some systems, this is relatively easy. On a blog, for example, comment links may be "nofollowed" but the blog author's links are not. On some wikis, internal links and links to trusted sites are left alone, but all other links to external sites have the nofollow attribute. On a microblogging system like StatusNet, this is considerably harder. Who is responsible for your personal inbox? For a tag page? For a personal profile? Everything in a microblogging site is complex and intertwined.
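The simpler wiki-style rule described above (leave internal links and links to trusted sites alone, nofollow everything else) could be sketched roughly like this. The TRUSTED_HOSTS set and the needs_nofollow name are assumptions for the example, not part of any particular wiki package.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of external hosts the site operator trusts.
TRUSTED_HOSTS = frozenset({"example.org"})


def needs_nofollow(url: str, site_host: str) -> bool:
    """Decide whether a link should be marked nofollow under a wiki-style policy."""
    host = urlsplit(url).hostname  # lowercased by urlsplit; None for relative URLs
    if host is None:
        # Relative links point back into the site, so they are internal.
        return False
    if host == site_host.lower():
        return False
    return host not in TRUSTED_HOSTS
```

A rule like this works when "internal" and "trusted" are easy to define per link, which is exactly what is hard to do on a microblogging site.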
In the changes we released today (which will be an optional part of the upcoming 0.9.2 version of StatusNet), we've tried to make reasonable compromises that discourage abuse of the system without unnecessarily disconnecting StatusNet sites from the rest of the Web. Our guideline was that users and groups could and should share links out to other sites, but that no one should be able to elect themselves to get "Google juice" from anyone else. So, these are the kinds of links we've added the "nofollow" relationship to:
- Subscribers. If I subscribe to your stream, that means I find you and your stream interesting. If you subscribe to my stream, however, I have no idea who you are.
- Group members. Joining a group doesn't mean that the group has chosen you.
- People tags. People who tag themselves "ubuntu" or "php" don't necessarily deserve the votes of thousands of other people with the same tags.
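The guideline behind the list above can be sketched as a small lookup: links that someone else can create on your pages without your say-so get nofollow, and links you chose yourself do not. The kind names below ("subscriber", "group_member", "people_tag", "subscription") and both function names are invented for this illustration and are not taken from the StatusNet code base.

```python
from html import escape
from typing import Optional

# Hypothetical labels for links a user can add unilaterally to someone else's pages.
NOFOLLOW_KINDS = frozenset({"subscriber", "group_member", "people_tag"})


def rel_for(kind: str) -> Optional[str]:
    """Return "nofollow" for self-elected link kinds, otherwise None."""
    return "nofollow" if kind in NOFOLLOW_KINDS else None


def profile_link(url: str, nickname: str, kind: str) -> str:
    """Render a link to a profile, applying the policy for the given link kind."""
    rel = rel_for(kind)
    rel_attr = f' rel="{rel}"' if rel else ""
    return f'<a href="{escape(url, quote=True)}"{rel_attr}>{escape(nickname)}</a>'
```

Under this sketch, a link to someone I subscribe to (kind "subscription") is rendered as a normal link, while a link to one of my subscribers (kind "subscriber") carries rel="nofollow".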
We will continue fine-tuning our HTML output to strike a fair balance between Web presence and abuse discouragement. Any feedback or suggestions are very welcome.