Deep Thinking about Weblogs

by Andrew Grumet

Author's note: This article first appeared in May, 2003.

Weblogs are becoming increasingly difficult to ignore for those of us who spend much time reading the Web. Also known by the inscrutable nickname "blogs", weblogs are something of a hard nut to crack. Compounding the difficulty is the fact that a great deal of weblog content today is about weblogs and weblog technology. What are weblogs? What's the big deal? Why should we pay attention? We attempt to answer these questions in the essay that follows.

The pet rock of the 00's

Weblogs are everywhere. No longer the hideout of programmer nerds, weblog authors count among their ranks a Stanford law professor, a cast member of Star Trek: The Next Generation, a woman who works in the adult film industry, and a popular humor columnist (who no doubt would be thrilled to appear in a list next to "a woman who works in the adult film industry").

In the words of weblog pioneer Dave Winer, "A Weblog allows you to easily publish a wide variety of content to the Web. You can publish written essays, annotated links, documents (Word, PDF, and PowerPoint files), graphics, and multimedia." To many this will sound a lot like a Geocities home page. Nothing new here: Geocities has been making it easy to publish to the Web for almost as long as there has been a Web. Scan a few weblogs, either those listed above or others that you may know about, and it should become clear that a weblog is exactly what it sounds like: a log that is published to the Web. The log entries are typically short, informal, and posted daily.

We can think of a weblog as a special kind of home page that has a time element. Or, even better, as a public, online diary. So why all the excitement? Everybody seems to have one and yet a weblog feels more like a pet rock than a revolution. We are particularly reminded of the excitement that accompanied the explosion of home pages in the early days of the Web. We suspect that, like home pages, the appearance of so many weblogs isn't the interesting part. The interesting part is, rather, the pervasive use of a set of technologies. Let's leave that thread for now and pick it up a little later on.

A new use for old technology

Here are two questions for those among you that have a home page. What information did your first home page convey? More importantly, how much additional information has appeared since?

In the early days of the web, it was easy to get distracted by the details of the Hypertext Markup Language (HTML) and the mechanics of transferring files from the authoring environment to a server machine. While Geocities and other authoring tools relieve most of that burden, they still encourage a static, style-over-substance variety of publishing. Even if you start out with a lot to say, you may find that you have little energy left to do so by the time you've settled on the fonts, color scheme, graphics, page layout and site map. Having reached perfection, you will be loath to make changes.

Perhaps the single most important innovation in weblogs is the way they reduce the number of degrees of freedom facing an author. Like Geocities, many weblog authoring products include an integrated hosting component, freeing the author from the need to maintain a web server or to manually transfer content files across the Internet. Radio Userland and Blogger are two examples. But weblogs go further in the simplicity department. A weblog author publishes to her site by creating a series of posts, each of which consists of a body and maybe a title. The thumbnail image to the right illustrates this. Clicking on it will take you to a screen shot of the entry form presented to Movable Type users. Of the four fields shown and several others reachable by scrolling, just one---the Entry Body---is required. Those who want to work with the additional fields, to apply fancy styles, to utilize sophisticated backend machine-to-machine enhancements (below) and other features, are of course free to do so. But fundamentally, weblogs encourage publishing by making it really easy to publish.

And publish they have. Technorati claims to track activity on over 250,000 weblogs at the time of this writing (May, 2003). Another site, Weblogs.Com, tracks recently changed weblogs. The rates reported by Weblogs.Com are regularly better than 10 updated sites per minute.

Weblog subject matter is varied, but most weblogs function as a sort of personal pulpit from which the author broadcasts their thoughts, interests, opinions, daily activities and so forth. Weblog posts often point to stories on news sites or other weblogs. This style leverages the hyperlink structure of the Web, and makes it especially easy to publish by providing a ready source of subject matter.

The steady stream of new content emanating from a well tended weblog conveys a sense of immediacy that is absent from most home pages. A weblog can thereby provide

You can read a weblog by simply pointing your web browser at it. And this brings us to the point that, technologically speaking, there's nothing really new in what we've described. The Movable Type authoring tool, for example, is built in Perl and the Common Gateway Interface (CGI). The first version of Perl was released in 1987. CGI shipped with the very earliest web servers. In summary, you can read and write weblogs with 10-year-old technology.

Weblogs as the New Media

The Web dramatically lowered the barrier to entry into book and magazine publishing. Weblogs dramatically lower the barrier to entry into news publishing. By providing a simple, standard means of relating firsthand accounts of unfolding events, weblogs will provide an important alternative to traditional streams.

But we can go further. Ask your Republican friend what they think of the media and, if nothing else, you are sure to come away with the conclusion that news coverage influences politics. By bringing democracy to the oligopoly of news reporting, weblogs should give the average citizen greater influence over the political process.

Weblogs and commerce

Although the weblogging world still has something of a grassroots feel to it, there are a handful of businesses in evidence. Most of these have an Old Web feeling to them. As we might expect, companies that write weblog authoring software usually charge for copies of the software. As we might also expect, companies that provide hosting services usually charge for these services. We haven't seen much in the way of paid advertisements on weblogs, but many popular authors are now using payment systems like PayPal and the Amazon Honor System to solicit donations from their readers. Producing entertainment every day is work, after all, so why not ask your readers for something in return? Finally, for many individuals and businesses, the self-promotional benefits of weblogging are reward enough.

The cleverer money makers leverage the new technologies that have come out of the weblogging world. Our favorite among these is onfocus' Weblog Bookwatch. This site displays a list of books, each of which is linked to a page at Amazon.com where you can buy the book. For every book you buy, onfocus gets a small kickback through the Amazon.com Associates program. The Associates program has been around for quite a while, and explains most of the "my bookshelf" and "books about Topic X" pages that you find on the Web. What's interesting about Bookwatch is that the site's maintainers, unlike those who came before, don't bother to actually read the books. Their book list is compiled by a computer program that sifts the titles out of weblog posts across the Internet and ranks them by popularity. Once the final line of code for this program was written, the marginal cost for linking to fresh, kickback-generating sales became zero.
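The essay does not describe how Bookwatch is actually implemented, so the following is only a minimal sketch of the general idea: pull book identifiers out of weblog posts and rank them by how many distinct weblogs mention them. The link pattern, the function name rank_books, the example.com weblogs and the book identifiers are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed link shape: a 10-character identifier after "/exec/obidos/ASIN/" or "/dp/".
ASIN_PATTERN = re.compile(r"amazon\.com/(?:exec/obidos/ASIN|dp)/([0-9A-Z]{10})")

def rank_books(posts_by_weblog):
    """Rank book identifiers by the number of distinct weblogs linking to them."""
    counts = Counter()
    for weblog_url, posts in posts_by_weblog.items():
        seen = set()
        for html in posts:
            seen.update(ASIN_PATTERN.findall(html))
        counts.update(seen)  # each weblog counts at most once per book
    return counts.most_common()

# Hypothetical weblogs and placeholder identifiers.
sample = {
    "http://example.com/alice/": [
        '<a href="http://www.amazon.com/exec/obidos/ASIN/0375505784">great read</a>',
        '<a href="http://www.amazon.com/exec/obidos/ASIN/0375505784">again</a>',
    ],
    "http://example.com/bob/": [
        '<a href="http://www.amazon.com/exec/obidos/ASIN/0375505784">agreed</a>',
        '<a href="http://www.amazon.com/exec/obidos/ASIN/0596000278">also good</a>',
    ],
}
ranking = rank_books(sample)
```

Counting each weblog once per book is a design choice in this sketch: it keeps a single enthusiastic author from dominating the list.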

New technologies put to work

Though weblogs can be published and read using old technology, we would be missing the boat if we didn't examine the newer, more useful weblog technologies. Consider the task that confronts a weblog reader who consumes 20 weblogs. Each time he wants to read his favorite authors, he will have to point his browser to each of 20 URLs and scan 20 pages to see if any new stories have appeared since he last checked. If he is efficient, perhaps, he will have the URLs bookmarked, and will have become adept at manipulating his browser using keyboard shortcuts. Sounds pretty painful, doesn't it?

The weblogging community has evolved a couple of interesting solutions to this problem. The first is an extension of the bookmarking idea. Once the reader has specified the URLs they would like to follow, it should be possible to have a computer program periodically scan the subscribed URLs and call to one's attention those that have been updated recently. Such programs, known collectively as news aggregators, are now plentiful. And thanks to widespread use of a standard XML data format called RSS (for "Really Simple Syndication"), most news aggregators can directly display the content of new weblog posts.
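To make the aggregator idea concrete, here is a small sketch of the core step: parse an RSS document and report only the items the reader has not seen yet. It uses Python's standard xml.etree.ElementTree module. The feed contents, the function name new_items, and the choice to identify items by their link are assumptions for illustration, not a description of any particular aggregator.

```python
import xml.etree.ElementTree as ET

# A hypothetical RSS 2.0 feed with two items, newest first.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Weblog</title>
    <link>http://example.com/</link>
    <description>A hypothetical feed</description>
    <item>
      <title>Second post</title>
      <link>http://example.com/2003/05/02.html</link>
      <description>More thoughts.</description>
    </item>
    <item>
      <title>First post</title>
      <link>http://example.com/2003/05/01.html</link>
      <description>Hello, world.</description>
    </item>
  </channel>
</rss>"""

def new_items(feed_xml, seen_links):
    """Return (title, link, description) for items whose link is not in seen_links."""
    channel = ET.fromstring(feed_xml).find("channel")
    fresh = []
    for item in channel.findall("item"):
        link = item.findtext("link")
        if link not in seen_links:
            fresh.append((item.findtext("title"), link, item.findtext("description")))
            seen_links.add(link)  # remember it for the next scan
    return fresh

# The reader has already seen the first post.
seen = {"http://example.com/2003/05/01.html"}
fresh = new_items(SAMPLE_FEED, seen)
again = new_items(SAMPLE_FEED, seen)
```

Because every feed uses the same element names, the same few lines work for any weblog that publishes RSS, which is the practical payoff of a shared format.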

The importance of RSS standardization cannot be overstated. For many years now the World Wide Web Consortium has been working to create conditions in which the next generation of the Web---dubbed the Semantic Web---can emerge. Defining "Semantic Web" is well beyond what we could expect to achieve in a short essay, but we can say this:

Unfortunately, widespread adoption of an XML document type does not guarantee that the semantic web will emerge. This requires agreement not just on document syntax, but, by definition, on the meaning of the data that is exchanged. RSS assigns meaning to XML fields that describe news sources and news items. It is perhaps a first step in the thousand mile journey to a semantic web.

A second solution to the reader's dilemma is provided by Weblogs.Com. This site maintains an up-to-the-minute listing of recently changed weblogs. A reader can scan this list periodically and link off to their favorite sites when they appear. How does Weblogs.Com provide this service? A system programmed in the mid-1990's would maintain a list of weblog URLs and periodically check these for changes, a solution that turns out to be unscalable for tracking weblog activity Internet-wide. First of all, it provides no automated means of expanding the list of known URLs as new weblogs come online. And second, as the list grows, so does the burden of downloading the latest RSS versions and comparing them against the previous versions.
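The cost of the polling approach is easy to see in a sketch. The code below is an illustration under stated assumptions, not a reconstruction of any real tracker: poll_round, fake_fetch and the example.com URLs are hypothetical, and a dictionary stands in for the network so the example runs offline.

```python
import hashlib

def poll_round(known_urls, fetch, last_digests):
    """Fetch every known URL and return the ones whose content changed."""
    changed = []
    for url in known_urls:
        digest = hashlib.sha256(fetch(url)).hexdigest()
        if last_digests.get(url) != digest:
            changed.append(url)
        last_digests[url] = digest
    return changed

# Stand-in for the network: URL -> feed bytes, plus a log of every fetch made.
fake_web = {
    "http://example.com/a/rss.xml": b"post 1",
    "http://example.com/b/rss.xml": b"post 1",
}
fetch_log = []

def fake_fetch(url):
    fetch_log.append(url)
    return fake_web[url]

digests = {}
first = poll_round(list(fake_web), fake_fetch, digests)   # nothing seen before
fake_web["http://example.com/a/rss.xml"] = b"post 1, post 2"
second = poll_round(list(fake_web), fake_fetch, digests)  # one feed has changed
```

Every round downloads every feed, whether or not anything changed, and a weblog that is not already on the list is never discovered. Both problems grow with the number of weblogs.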

Weblogs.Com has adopted a much better solution. Instead of venturing out into the Internet looking for weblogs, Weblogs.Com exposes human- and machine-accessible interfaces that allow weblog publishers to post an alert when new content is published. The greatest mileage is provided by the machine interfaces, since compliant publishing software can automatically ping Weblogs.Com as each new piece of content is posted, without any additional effort by the author. This variety of machine-to-machine communication, known as remote procedure call, is one of the core technologies behind the heavily publicized but still young Web Services initiative.
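A ping of this kind can be sketched with Python's standard xmlrpc.client module. The sketch only builds and decodes the message; it makes no network call. The method name weblogUpdates.ping and its two string parameters (weblog name and URL) reflect our understanding of the historical Weblogs.Com interface and should be treated as an assumption, as should the helper names build_ping and handle_ping.

```python
import xmlrpc.client

def build_ping(weblog_name, weblog_url):
    """Publisher side: serialize an XML-RPC call announcing that a weblog changed."""
    return xmlrpc.client.dumps((weblog_name, weblog_url),
                               methodname="weblogUpdates.ping")

def handle_ping(request_xml, recently_changed):
    """Tracker side: decode the call and record the weblog as recently changed."""
    params, method = xmlrpc.client.loads(request_xml)
    if method != "weblogUpdates.ping":
        raise ValueError("unexpected method: %r" % method)
    name, url = params
    recently_changed.insert(0, (name, url))  # newest first
    return recently_changed

# Hypothetical weblog; in practice the payload would be POSTed to the tracker.
request = build_ping("Example Weblog", "http://example.com/")
changes = handle_ping(request, [])
```

The tracker does no work until a publisher speaks up, and a brand-new weblog announces itself with its first ping, which addresses both scaling problems of the polling design.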

What distinguishes RSS syndication and Weblogs.Com pings from technologically similar efforts is that they evolved to solve specific problems for users, a crucial factor in the success of any software endeavor.

Weblogs and online communities

Most weblog tools have a "comment on this story" feature, so that writers who are willing can engage in a discussion with their readers. Lively discussions can appear in the comments section of a story, but the livelier the discussion, the more likely the system will be abused. That's because most comment systems rely on an honor system: they do not try to authenticate you, instead assuming that you will be honest about who you are. This convention facilitates discussion by relieving the commenter of having to create accounts on every weblog system, and by relieving the weblog hosting software of the burden of maintaining these accounts. The weblogging world settles for this convention because it's simple and is not usually abused.

A second form of dialog has evolved in parallel. Most weblog readers are also weblog writers. A weblog writer usually posts her commentary, naturally enough, to her weblog. If that post contains a hyperlink to the subject of the comment, it should be possible for a third party to follow the chain of commentary. The news aggregators are starting to pick up on this, as the SharpReader screen shot to the right illustrates.
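One way a third party might follow such a chain is to build an index of which posts link to which URLs and then walk it outward from the original story. The sketch below assumes the post bodies have already been collected (for example from RSS feeds); the names build_backlinks and commentary_thread and the example.com URLs are hypothetical, and the regular expression is a deliberately crude way of finding links.

```python
import re
from collections import defaultdict

HREF = re.compile(r'href="([^"]+)"')

def build_backlinks(posts):
    """Map each linked URL to the list of post URLs that link to it."""
    backlinks = defaultdict(list)
    for post_url, html in posts.items():
        for target in HREF.findall(html):
            backlinks[target].append(post_url)
    return backlinks

def commentary_thread(root_url, backlinks):
    """Return (depth, post_url) for every post commenting on root_url, directly or not."""
    result, seen = [], {root_url}
    stack = [(root_url, 0)]
    while stack:
        url, depth = stack.pop()
        for commenter in backlinks.get(url, []):
            if commenter not in seen:  # guard against posts that link in a cycle
                seen.add(commenter)
                result.append((depth + 1, commenter))
                stack.append((commenter, depth + 1))
    return result

story = "http://news.example.com/story.html"
posts = {
    "http://example.com/alice/1.html": '<a href="%s">interesting</a>' % story,
    "http://example.com/bob/7.html": '<a href="http://example.com/alice/1.html">I disagree</a>',
    "http://example.com/carol/3.html": '<a href="%s">worth reading</a>' % story,
}
chain = commentary_thread(story, build_backlinks(posts))
```

Here Alice and Carol comment on the story directly (depth 1), while Bob responds to Alice (depth 2), even though the three posts live on three different sites.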

This second form contrasts markedly with traditional web-based bulletin boards. First and foremost, it is entirely distributed. There is no centralized database-backed web server to maintain, because each post resides on the weblog author's system. Second, there is no requirement that participating systems run on the same software. An author can participate in the discussion as long as their software supports a standard data format. This means that they are free to choose the software that best fits their usability preferences, licensing preferences, operating system and budget. Third, there is a built-in authentication mechanism. A weblog author's posts appear at a specific URL, which can be tracked through the Domain Name System to the maintainer of the machine and ultimately to the author. By "tracked" we don't really mean tracking down everything there is to know about the author, but rather that, if a post appears at http://blogs.law.harvard.edu/philg/, we can be reasonably assured that it was posted by Philip Greenspun.

At a higher level, it is common for weblog writers to advertise a list of sites they read, also known as their "blogroll", directly on the front page of their weblog. This provides a casual reader with a kind of context, a cue about who the author hangs out with.

The future

We seem to be at a tipping point with weblogs. As more and more webloggers come on line, we expect the use models and technologies to evolve in ways that we can only guess at (which we will do shortly). We anticipate leaps of the same magnitude as those that took us from a particle physics laboratory homepage to online communities, megamarts and research tools.

Our optimism is grounded in the "end-to-end" principle, whose significance Lawrence Lessig makes clear in his book The Future of Ideas:

The birth of the Web is an example of the innovation that the end-to-end architecture of the original Internet enabled. Though no one quite got it---this is the most dramatic aspect of the Internet's power---a few people were able to develop and deploy the protocols of the World Wide Web. They could deploy it because they didn't need to convince the owners of the network that this was a good idea or the owners of computer operating systems that this was a good idea.

Lawrence Lessig, The Future of Ideas: The Fate of the Commons in a Connected World (New York: Random House, 2001), p. 44.

Weblog technology, especially the RSS format, embodies this same principle in the domain of online speech and dialog. You do not have to purchase a Microsoft Word license in order to speak. You do not have to specify what your speech should look like in a web browser in order to speak. The only requirement is that your speech be made available in an open format.

We can make some educated guesses about where weblogs will go. Like a good diary, weblogs provide documentation about people, places and events. From the front page of Scripting News, the original weblog, we can easily jump back and read about what was happening on the same day six years ago. Since the news items often contain hyperlinks, we can learn more about their context by reading the linked pages. Many of these links, sadly, have broken over the years (humble suggestion for weblog software writers: make it easy to replace broken hyperlinks with the archival location at the Internet Archive). But we digress. What's interesting is that other forms of weblog-embedded context are starting to appear. If you are listening to music while posting to your weblog, this plug-in software allows one-click inclusion of the name of the artist, album and song in your post. Another example: audblog lets you post voice messages to your weblog through any phone. The related notion of mobile weblogging, or "moblogging", could be an emerging trend. Why wait to get home when you can post photos and GPS coordinates directly from your cell phone?

A second interesting direction is identified by systems like Bookwatch. Such systems use the common RSS format and discovery mechanisms like Weblogs.Com to aggregate and analyze weblog content. Another example, Breaking News, presents a listing of news items viewed through the weblogger's lens. How many webloggers are linking to the story, and what are they saying about it? Individual authors cannot answer these questions, and most wouldn't have dreamed up the idea. This is the end-to-end principle at work, an example of how an uncontrolled environment allows creativity to build on what came before. Widespread adoption of Creative Commons licenses by weblog authors is a good sign that we will see more of this kind of innovation.

We've argued that weblogs mediate online speech, and that weblog authors are authenticated by the Domain Name System. Could weblogs serve as an all-purpose digital identity, a la Passport? We think this is absolutely possible, though it would require significantly more evolution than has taken place so far. At a minimum, weblogs would need to provide a way to pass richer kinds of structured data and would also need to establish adequate security for sensitive data. For better or worse, these changes would add complexity to a software environment that has thrived by making simplicity a virtue. The win is straightforward: no single company owns identity on the Internet.

Finally, we mustn't ignore the role of commerce. Google just recently bought Pyra, creators of the Blogger system. We might expect Google to create tools for searching weblogs. Perhaps they will create a catalog of weblog authors, pigeonholing them according to subject matter and popularity. Perhaps people will pay to search through the catalog to find established subject matter experts and their content. Beyond Google, we suspect that online marketers will eventually figure out how to use weblogs. You can talk amongst yourselves about that -- we're not going to encourage them!

More

Overviews

End-user tools

Programmer tools

Weblogs and society
