给互联网加标签

Tagging the Internet

Wouldn't it be great if you could actually find stuff on the Internet? Sure, Google is a wonderful tool for searching for some things -- say the home page of a company, or how to make Battenberg cake. But more often than not, you'll get way too many hits for what you're looking for, and end up frustrated.

It isn't surprising, really: Google is now indexing more than 8 billion Web pages, against 2 billion three years ago and 3 billion two years ago. That's a lot of pages. As David Weinberger of Harvard University's Berkman Center puts it: "We've been struggling for several years with the Internet's size and complexity." So is there a better way of finding stuff?

Well, not exactly. I think that Google is still the best search engine. But some believe that we might be looking at the problem the wrong way. What if there were a better way of organizing, and not just indexing, the Web? Google, after all, merely indexes the words it finds on a Web page, and those on pages linked to it. So if you're looking for recipes for Battenberg cake, that's easy enough. But Google doesn't try to figure out what the words actually mean, or what the pages are about.
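To make the idea of "merely indexing words" concrete, here is a minimal sketch in Python. It is not how Google works internally: it ignores links and ranking entirely, and the pages and URLs are invented for illustration. It only shows that a word index finds pages by the words they contain, not by what they are about.

```python
from collections import defaultdict

# Hypothetical pages: the URLs and text are made up for illustration.
pages = {
    "example.com/cake": "Battenberg cake recipe with marzipan",
    "example.com/history": "The Battenberg family and its history",
    "example.com/sponge": "A pink and yellow sponge wrapped in almond paste",
}


def build_word_index(pages):
    """Map each lowercase word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


word_index = build_word_index(pages)

# Every page containing the word matches, whatever the page is about.
battenberg_hits = sorted(word_index["battenberg"])
print(battenberg_hits)
```

In this toy data the search returns the family-history page along with the recipe, and misses the third page, which describes the cake without ever naming it.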

In short, using Google is like going into a library and hiring a very fast runner, who isn't smart but happens to be a very fast reader, to sprint around finding all the books that have the word "Battenberg" in them. Wouldn't it be better to just wander over to the catalog and look up the subject of cooking marzipan-wrapped yellow-and-pink-square sponge cakes?

It would, but so far there's no catalog like this. But there's an idea of one. It's called the "semantic Web," and it's simple enough: To categorize information on the Web by adding tags -- cake, marzipan, recipes, whatever -- to Web pages. But with billions of pages out there, and thousands more added every day, this is not a task that anyone is volunteering to do. Until recently.

Free Services

Last year a couple of free Internet services started doing something interesting, entirely independently of each other. Flickr (www.flickr.com) is a Web site for storing your photographs; del.icio.us (simply http://del.icio.us) lets you store bookmarks to your favorite Web pages. They share two features: Both let users add tags to what they are storing, and by default share that data with any other user.

So, if you upload a photo to Flickr, you might add a word or two to categorize it -- say, scuba or marzipan. The same applies if you add a Web page to your del.icio.us bookmarks. But because both of these tools are public, you can also see what other pictures, in the case of Flickr, or Web page links, in the case of del.icio.us, have the same tags.

LOOSE CONNECTIONS: MARK THAT SPOT


You don't have to be overly excited about tags to use del.icio.us: Just think of it as a good way to keep your bookmarks (what Microsoft calls "favorites") in a place you can find them. And there are alternative sites that offer this service. Check out Simpy (www.simpy.com), Powermarks (www.kaylon.com/power.html), and Spurl (www.spurl.com). For a more complete list, and some more thoughts on tags, visit my blog at loosewireblog.com. All of these solve two basic problems: how to keep tabs on your bookmarks if you use more than one browser, or more than one computer, and, second, how to find them again easily.

Still, tagging is the future and once you see it in one place you see it, and its potential, everywhere. If you have a Gmail e-mail account (the free Web mail service offered by Google) you'll notice how you can add what Google calls "labels" to e-mails to find them later. So, if you wanted to keep all your family e-mails in one place, you could add the label "family" to those e-mails. If you were an organized sort of person, you would put these e-mails in a separate folder called "family" anyway, but the beauty of labels, or tags, is that you can assign more than one. So if Auntie Joan happened to be your boss, an e-mail from her could get a "family" tag as well as a "work" tag. Beginning to see the benefits yet?



This wasn't intentional: Joshua Schachter, a 30-year-old New Yorker who set up del.icio.us, did it primarily because he wanted to keep track of his bookmarks. But suddenly you could see not only what you are gathering, but also what other people are gathering. "The motivation was mostly because I was solving a problem I had, and then I solved it for everyone," Mr. Schachter says. Social tagging was born.

Others realized that this was a grass-roots kind of classification that could be extended. Instead of someone hiring dozens of drones to sit at a computer and surf the Internet categorizing Web pages and photos so that people could find them more easily, people were doing it on their own, voluntarily, just by adding whatever key words came to mind when they added a Web page or photo.

Instead of a committee sitting down and deciding on some hierarchical system of categorizing stuff, it was ordinary people adding whatever tags sprang to mind, on the fly. A sort of egalitarian taxonomy -- which is why some people are calling it "folksonomy," a name that may or may not catch on. It's not perfect, but it works. As Gen Kanai, a Japanese-American based in Tokyo who has been working on tagging, puts it: "The user does a bit more work tagging, but it results in a wealth of information once the tagged information is cataloged and associated with other data that has the same tag."

Easy Searching

So what does all this mean for you and me? Well, imagine that you're interested in scuba diving. You add a few relevant Web sites to del.icio.us and tag them "scuba." Suddenly, on your del.icio.us bookmark page, you can see not only all your tags, but how many others have tagged the same pages. And you can see what other pages have also been tagged "scuba."
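The scuba example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not del.icio.us's actual design: the users, URLs and function names are all hypothetical. It shows the two lookups described above, how many others saved the same page, and which other pages carry the same tag.

```python
from collections import defaultdict

# Hypothetical bookmarks as (user, url, tag) triples; all names are made up.
bookmarks = [
    ("alice", "example.com/dive-sites", "scuba"),
    ("bob", "example.com/dive-sites", "scuba"),
    ("bob", "example.com/gear-guide", "scuba"),
    ("carol", "example.com/cake", "marzipan"),
]

pages_by_tag = defaultdict(set)   # tag -> pages carrying that tag
users_by_page = defaultdict(set)  # page -> users who bookmarked it

for user, url, tag in bookmarks:
    pages_by_tag[tag].add(url)
    users_by_page[url].add(user)


def others_who_saved(user, url):
    """Count the other users who bookmarked the same page."""
    return len(users_by_page[url] - {user})


def more_pages_for(user, tag):
    """Pages carrying the tag that this user has not bookmarked yet."""
    mine = {url for u, url, _ in bookmarks if u == user}
    return sorted(pages_by_tag[tag] - mine)


print(others_who_saved("alice", "example.com/dive-sites"))
print(more_pages_for("alice", "scuba"))
```

Nobody in this sketch sets out to build a catalog. Each user just saves a bookmark with a tag, and the shared lookups fall out of the pooled data.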

You've not only stored your bookmark somewhere you can find later, but you've helped point others to the same page. And, most important, you can then see a whole library of pages others have considered worth bookmarking. Suddenly tagging becomes something simple, social -- and useful. Says San Francisco programmer Bowen Dwelle: "It gives people a comprehensible way to link things together. And, most important, it gives people a way to link to other people, and -- potentially -- to be grouped together."

Now, all this remains small-scale, and fragile. First off, how can we be sure everyone is adding the same tags to things -- marzipan, and not almond paste, say? Second: This is just two Web sites, a tiny fraction of the whole Web. True, but this is just the beginning. This month, a search engine called Technorati started using tags from Flickr and del.icio.us to categorize the millions of blogs, or online journals, that it indexes. That turns Technorati into a kind of homepage of every conceivable topic you can imagine people writing about: Check out, for example, its Web page on the notebooks I wrote about in the "Loose Wire" column a few weeks back, at www.technorati.com/tag/moleskine.
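The marzipan-versus-almond-paste worry is easy to show in code. The Python sketch below uses made-up pages, and its alias table is a hand-written assumption rather than a feature of Flickr, del.icio.us or Technorati. It simply shows how free-form tags scatter related pages across separate groups, and how agreeing on one spelling would pull them back together.

```python
from collections import defaultdict

# Hypothetical tagged pages; URLs and tags are made up.
tagged = [
    ("example.com/cake", "marzipan"),
    ("example.com/sweets", "Marzipan"),
    ("example.com/baking", "almond paste"),
]

# Hand-written alias table: an assumption, not part of any real service.
ALIASES = {"almond paste": "marzipan"}


def normalize(tag):
    """Lowercase and trim the tag, then map known aliases to one spelling."""
    key = tag.strip().lower()
    return ALIASES.get(key, key)


def group(tagged, normalizer=lambda t: t):
    """Group page URLs under each tag, after applying the normalizer."""
    groups = defaultdict(set)
    for url, tag in tagged:
        groups[normalizer(tag)].add(url)
    return dict(groups)


raw = group(tagged)                # tags exactly as users typed them
merged = group(tagged, normalize)  # tags after normalization
print(len(raw), len(merged))
```

Someone still has to write the alias table, which is exactly the kind of central cataloging work that social tagging tries to avoid.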

Most important, this social tagging thing, if it takes off, could make finding information much easier. Instead of relying on search engines, we can rely on other surfers submitting interesting sites as they find them. A bit like having some seriously fast, smart speed-readers running around the Internet on our behalf armed with piles of index cards.

给互联网加标签

能在互联网上找到想要的东西是不是很棒的事呢?当然!Google就是一个出色的查找工具--例如查找一个公司的主页或是制作巴腾堡蛋糕(Battenberg cake)的方法。不过,经常发生的情况是:你查找一点东西却有无数个选择,让人沮丧不堪。

这一点都不奇怪:Google目前索引的网页超过80亿个,而3年前还只有20亿,两年前只有30亿,数目的确庞大。正如哈佛大学伯克曼中心(Berkman Center)的大卫·温伯格(David Weinberger)所说:“这些年我们一直在庞大而复杂的互联网中挣扎”。有没有更好的方法查找东西呢?

严格来说还没有。我认为Google仍然是最好的搜索引擎。不过有人认为这样看待问题的方法有误。如果有一个更好的、组织这个网络而非索引网络的方法会怎么样呢?Google毕竟只是索引它在一个网页上搜索到的词,以及与这个网页链结的其他网页上的词。因此,如果你要搜索巴腾堡蛋糕的配方,那非常容易。不过Google不会去分析这些单词的确切意思或这些网页的内容。

简单来说,使用Google就像走进图书馆、雇佣了一个跑得很快的帮手,他可能不大机灵,但阅读速度却很快,他能飞快找到带“巴腾堡”几个字的所有的书。如果只需走到目录旁边,查找“制作一种用杏仁糖浆包裹、黄粉两色的方形海绵蛋糕”,这样是不是更好么?

也许是会好些,但问题是目前还没有这样的目录。幸好这种想法现在已经有了,叫“语义网”,概念很简单:通过给网页加标签对互联网上的资讯进行分类,如蛋糕、杏仁糖浆、配方等等。但是,互联网上的页面浩如烟海,每天又有几千个新增加的网页,可不会有人主动请缨来完成这项任务--直到最近。

去年有两个免费的互联网服务开始做一些有意思的事情,它们相互之间是完全独立的。Flickr网站(www.flickr.com)可以存储照片,del.icio.us (http://del.icio.us)可以让你保存你最喜欢的网页的书签。它们有两个共同特点:让使用者给保存的东西加标签,这些资料默认可以与其他用户共享。

因此,假设你上传一张照片到Flickr,你可能会给它加一两个单词来进行分类,比如:scuba (水中呼吸器)或marzipan(杏仁糖浆)。同样道理,你在del.icio.us的书签上加进一个网页时,也对它分类。不过,这两个工具都是公用的,这意味著你可以浏览到拥有同样标签的其他Flickr图片或del.icio.us的网页链结。

这不是谁特意想出来的。del.icio.us是由30岁的纽约人约书亚·沙克特(Joshua Schachter)创建的。他这样做主要是想保存他的书签。但是,忽然之间,你不仅可以看到自己收集了些什么,还可以看到别人收集了些什么。“做这个网站的主要动机是解决自己遇到的问题,没想到给很多人解决了问题”,沙克特说。社会性标签由此诞生了。

另外有些人意识到这个非主流的分类方法有扩展的空间。不用聘请几十个像蜜蜂一样勤劳的工作人员每天坐在电脑旁边工作、在互联网上对网页和图片进行分类以便使人们找起来更加方便,现在人们自愿在做这件事,在上传网页或图片时想到什么关键字就加什么。

用不著组织一个专家委员会来讨论决定分类的层级制度,现在是普通人在做分类,脑子里想到什么标签就加什么。这是一种人人平等的分类法--因此有人把它称为“通俗分类”,它有可能会流行起来,也可能不会。这种分类不算完美,但有效。正如一直在研究标签、住在东京的日裔美国人Gen Kanai所说的,“用户在加标签上只需多做一点工作,而一旦加了标签的资讯被编进目录并和其他有相同标签的资料链接起来后,这一点点的工作就会带来丰富的资讯资源”。

那么这一切对你和我意味著什么呢?假设你对潜水(scuba diving)感兴趣。你在del.icio.us上加进一些相关的网站并用“scuba”作为标签。突然,在你的del.icio.us书签页面上,你不仅会看到自己所有的标签,还会看到有多少人对相同的页面加了标签,你还能看到其他被加“scuba”标签的页面。

你不仅把书签存在了以后能找到的地方,还把其他人指向了这些页面。此外,最重要的是,你会看到一大批别人认为值得加书签的网页。忽然之间,加标签成了一件简单、富社会性而有用的事情。旧金山市程序员鲍恩·德韦尔(Bowen Dwelle)说:“它给了人们一个容易理解的方法把事物联系起来,最重要的是它成为人们可能与他们建立联系的方式--甚至有可能因此走到一起”。

目前这种方式还是小规模的,并不发达。首先,我们怎么能确定每个人加的是相同的标签呢--比如都是杏仁糖浆而不是杏仁糊?其次,现在还只是两个网站,相比于整个互联网不过是沧海一粟。没错,但这是良好的开头。最近,一个叫Technorati的搜索引擎开始使用Flickr和del.icio.us的标签来对它索引的几百万个网络日志进行分类。这使得Technorati成为一个主题网页,包括任何你能想得到的、可能有人在撰写的主题。例如,你可以在www.technorati.com/tag/moleskine上找到我几个星期以前在“Loose Wire”栏目里讨论的关于笔记本的网页。

最重要的是,社会性标签如果真的发展起来,会使搜索资讯变得容易得多。我们不再需要依赖搜索引擎,而可以借助其他上网者把找到的有趣网站贡献出来。这种情形有点像有一些特别快速、机灵的速读者带著一叠叠索引卡,在网上替我们四处奔跑查找资讯。