Saturday, June 03, 2006

These days, everybody's going Web 2.0, which, as Tim O'Reilly notes, really means providing "a continually-updated service that gets better the more people use it". The Hyper-Concordance has one nice characteristic in this respect: each word has its own page, so if you want to know what people have searched for, you just have to look at the server logs.

So with a little Perl programming (unlike Playing 24, Perl really shines at these tasks), i can

  • filter out Hyper-Concordance pages from the rest of the server hits
  • select the most frequent ones
  • process them along with the other Hyper-Concordance data to produce a "most popular searches" page
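
The first two steps above can be sketched roughly as follows. This is an illustration only, written in Python rather than the Perl actually used, and it rests on assumptions the post does not confirm: that the log is in the common Apache format, and that word pages live at a hypothetical path like /concordance/WORD.html. The function name popular_terms is likewise made up for the example.

```python
import re
from collections import Counter

# ASSUMPTION: one page per word at /concordance/<word>.html (hypothetical
# layout), served with status 200, logged in Apache common log format.
WORD_PAGE = re.compile(r'"GET /concordance/([A-Za-z]+)\.html HTTP/[\d.]+" 200 ')


def popular_terms(log_lines, top_n=100):
    """Return the top_n (term, hits) pairs among successful word-page requests."""
    counts = Counter()
    for line in log_lines:
        match = WORD_PAGE.search(line)
        if match:
            # Fold case so "Jesus" and "jesus" count as the same term.
            counts[match.group(1).lower()] += 1
    return counts.most_common(top_n)
```

Calling popular_terms(open("access.log")) on a downloaded log (the filename is just a placeholder) would give the list that the third step then merges with the other Hyper-Concordance data.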

The result is here, with a little more than 100 of the most commonly visited term pages. Some ("Jesus", "disciple", "angel") are the ones you'd expect, and a few are probably artifacts of what's on the home page. On the other hand, i was somewhat surprised to see terms like "feast" and "gate" turn out to be so popular.

It would be cool to track "buzz" like Yahoo does, and see changes in different words over time. But the Hyper-Concordance doesn't get enough volume to make this very meaningful, and anyway i don't have a whole IT infrastructure to support this: it's just me.

I still fall short on O'Reilly's "continually-updated" criterion: to update this page, i have to download a server log (which i currently do just once a month), run a couple of programs, and upload a new file. But it still seems interesting to me, and i'll try to keep up with it.


3:02:23 PM