One question that came up in our discussion with Mike Perez was, what would it take to create a hyper-concordance for other languages and/or translations? My first response was that it would be pretty simple (programmers are eternal optimists). But even having thought more carefully, i still believe it. The essential elements:
- a Bible text in some structured format (OSIS preferred, of course) where it's possible to identify the verses and the words
- a way to map words back to their dictionary forms. If this is imperfect, or even completely missing, it doesn't stop you: it just means some forms that ought to be grouped together won't be. For my RSV hyper-concordance, i just created a text file by hand, mapping e.g. "brethren" to "brother" and "broken" to "break". For languages with richer morphology than English, you might need a lot more smarts (or a lot more manual effort)
- a program to create an index mapping each unique word (in its dictionary form) to the verses it occurs in
- a program to generate an HTML page for each such word=>verses index
- if you want a master index showing all the words, a simple program to collect all the indexed terms and hyperlink them to their respective pages
- optionally, a stopword list. i left out some words because their pages would have been excessively large, though this is just a practical matter. i created my list by sorting all the words by frequency and cutting everything that occurred more than 100 times. Then you use the list to filter which words you index.
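The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the code behind the RSV hyper-concordance: the verse data is made-up placeholder text standing in for a parsed OSIS file, and every name here (`VERSES`, `LEMMAS`, `build_index`, `stopwords`, `word_page`, `master_index`, `build_site`) is hypothetical.

```python
import re
from collections import Counter, defaultdict
from html import escape

# Placeholder (reference, text) pairs standing in for a parsed Bible text.
# The references and wording are invented for illustration only.
VERSES = [
    ("Book 1:1", "Where is your brother?"),
    ("Book 1:2", "How good it is when brethren dwell in unity"),
    ("Book 2:1", "A broken and contrite heart"),
]

# Hand-made mapping from inflected forms to dictionary forms.
LEMMAS = {"brethren": "brother", "broken": "break"}

def tokenize(text):
    """Split a verse into lowercase words (a crude, English-only tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def lemma(word, lemmas):
    """Map a word to its dictionary form, falling back to the word itself."""
    return lemmas.get(word, word)

def build_index(verses, lemmas):
    """Map each dictionary form to the references of the verses it occurs in."""
    index = defaultdict(list)
    for ref, text in verses:
        # A set, so a verse is listed once even if the word repeats in it.
        for form in {lemma(w, lemmas) for w in tokenize(text)}:
            index[form].append(ref)
    return dict(index)

def stopwords(verses, lemmas, max_count=100):
    """Return the words that occur more than max_count times overall."""
    counts = Counter(
        lemma(w, lemmas) for _, text in verses for w in tokenize(text)
    )
    return {w for w, n in counts.items() if n > max_count}

def word_page(word, refs):
    """Render the HTML page for one word and its verses."""
    items = "\n".join(f"<li>{escape(r)}</li>" for r in refs)
    return f"<html><body><h1>{escape(word)}</h1>\n<ul>\n{items}\n</ul></body></html>"

def master_index(words):
    """Render a master page linking every indexed word to its own page."""
    links = "\n".join(
        f'<li><a href="{escape(w)}.html">{escape(w)}</a></li>' for w in sorted(words)
    )
    return f"<html><body>\n<ul>\n{links}\n</ul></body></html>"

def build_site(verses, lemmas, max_count=100):
    """Return a dict of filename -> HTML for all word pages plus the master index."""
    stop = stopwords(verses, lemmas, max_count)
    kept = {w: refs for w, refs in build_index(verses, lemmas).items() if w not in stop}
    pages = {f"{w}.html": word_page(w, refs) for w, refs in kept.items()}
    pages["index.html"] = master_index(kept)
    return pages
```

Calling `build_site(VERSES, LEMMAS)` returns the pages in memory; writing them to disk, and reading real verses out of OSIS, are left out. The sketch puts bare references in each word page where a real hyper-concordance would presumably show (and hyperlink) the verse text itself.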
Much as i'd love to take credit for my brilliant programming, there are no real difficulties in any of this. But the proof will be to put my money where my mouth is, and create another one. Stay tuned.
4:37:49 PM #

Donna and i had the privilege this week of having lunch with two people deeply involved in using innovative technology to expand use of the Bible. Mike Perez helps lead a technology initiative group of the American Bible Society, including ForMinistry.com, which is a portal site for churches and other ministries. Steve deRose is the chairman of the Bible Technologies Group, which is sponsoring the OSIS initiative.
We had a wide-ranging discussion about things that might help the ABS with their mission of "Scripture engagement." That's a nice tight phrase with very broad scope: in particular, what's different in the Digital Age about how we present God's Word to people? This is what Blogos is all about, and it's gotten me thinking in a lot of different directions, many of which will be the focus of subsequent posts. A sampler:
- in the pre-Internet era, printing bound together the costs and values of content, production, and distribution of Bible information. That's already changing, and content is becoming a completely separable value from binding and printing. How does that change the ministries of the Bible Societies and their traditional roles in Bible printing and physical distribution?
- how can weblogs help churches in their ministry? what new opportunities does RSS open up?
- what would it take to do hyper-concordances for other translations and languages? what new kind of Bible search paradigms might this lead to?
- what other ways might you present the content of Scripture, moving beyond print to hyperlinked digital media that include sounds, images, and ways of organizing information other than the traditional book/chapter/verse divisions?
We also talked about my interests in semantic annotation of the Bible, and i'm hoping we'll find some ways to work together toward this goal. Stay tuned!
3:21:19 PM #

Copyright 2004 sean boisen