Wednesday, June 28, 2006

I'm still working away (far too slowly for my impatient tastes) on the first complete version of New Testament Names, a semantic knowledgebase of named things in the New Testament and their relationships: you can get a sense of it from these representations of a browser prototype. But I'm also looking beyond to what will come next (one reason these projects take too long! I keep starting new ones ...).

After cataloging the names and their information, clearly the next step is to add Scriptural references. The first pass here can be done automatically (it's largely just string matching). But of course, there are a lot of different Johns, Marys and Simons in the New Testament, and it's a lot more useful to know which one is which: this is something people do so easily they hardly recognize it, but it can be surprisingly tough to do automatically.
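Here's a minimal sketch of what that first string-matching pass might look like. The verse text is abbreviated placeholder wording (not exact ESV quotations), and the `VERSES` dictionary and `find_mentions` function are names I've made up for illustration, not part of any existing tool.

```python
import re

# Placeholder verse text for illustration only (not exact ESV quotations),
# keyed by Book.Chapter.Verse references.
VERSES = {
    "Matt.1.19": "Joseph was a just man.",
    "Luke.23.50": "a man named Joseph, from the Jewish town of Arimathea",
    "Acts.7.13": "Joseph made himself known, and the family of Joseph became known.",
    "John.1.1": "In the beginning was the Word.",
}


def find_mentions(name, verses):
    """Return (reference, character offset) for each whole-word match of name."""
    # \b keeps "Joseph" from matching inside a longer word,
    # while still matching possessives like "Joseph's".
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    return [
        (ref, match.start())
        for ref, text in verses.items()
        for match in pattern.finditer(text)
    ]


mentions = find_mentions("Joseph", VERSES)
verse_count = len({ref for ref, _ in mentions})
print(len(mentions), verse_count)  # 4 3
```

Note that mentions and verses are counted separately, since a single verse can mention the same name more than once. What this pass can't tell you is which Joseph each mention refers to.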

As an example, there are 36 mentions of "Joseph" in the ESV NT text (in 35 verses: Acts.7.13 mentions him twice). Obviously in the birth narratives of Jesus, Joseph refers to Jesus' (earthly) father: by my count, that's 14 of the 36 references. Joseph of Arimathea is mentioned in the Passion narratives, since he provided a tomb for Jesus to be buried in: 7 of the 36 are this Joseph. As an aside, here's a case that requires a little more than string matching: Luke.23.50, "a man named Joseph, from the Jewish town of Arimathea". You'd need to be pretty smart about the use of context to figure out which Joseph this is. 

For other cases like John.4.5, where the mention of Joseph refers to the Old Testament figure, only real human understanding of the text can determine the correct reference. This Joseph is more frequent outside the Gospels (though he's in Luke's genealogy), 10 of the NT Josephs in all. There are also two references in Acts to Joseph whom the apostles nicknamed Barnabas, and then a few others: Jesus' brother (Matt.13.55, perhaps also Matt.27.56), and two Josephs in Luke's genealogy (Luke.3.24 and Luke.3.30).

My guess is maybe 80% of the name references in the New Testament can be easily disambiguated, either because they're not ambiguous in the first place, or because simple heuristics clarify them. But for the rest, somebody will actually have to look at them and make a decision (in a very few cases, some really hard decisions). Maybe Amazon's Mechanical Turk is an appropriate mechanism: the ESV blog reports on an interesting experiment here.
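To make the split between "easy" and "needs a person" concrete, here's a rough sketch of a triage step. Everything in it is an assumption for illustration: the cue words, the referent labels, the `UNAMBIGUOUS` and `CUES` tables, and the placeholder verse text (again, not exact ESV quotations). A real set of heuristics would need to be much richer than a same-verse keyword check.

```python
# Names assumed (for this sketch) to have only one referent in the NT.
UNAMBIGUOUS = {"Nicodemus": "Nicodemus"}

# Hypothetical cue words: if the cue appears in the same verse,
# pick the associated referent.
CUES = {
    "Joseph": [
        ("Arimathea", "Joseph of Arimathea"),
        ("Barnabas", "Joseph called Barnabas"),
    ],
}


def disambiguate(name, verse_text):
    """Return a referent label, or None if a human needs to decide."""
    if name in UNAMBIGUOUS:
        return UNAMBIGUOUS[name]
    for cue, referent in CUES.get(name, []):
        if cue in verse_text:
            return referent
    return None


def triage(mentions):
    """Split mentions into automatically resolved ones and a review queue."""
    resolved, review = [], []
    for ref, name, text in mentions:
        referent = disambiguate(name, text)
        if referent is None:
            review.append((ref, name))
        else:
            resolved.append((ref, name, referent))
    return resolved, review


# Placeholder text, not exact ESV quotations.
sample = [
    ("Luke.23.50", "Joseph", "a man named Joseph, from the Jewish town of Arimathea"),
    ("John.3.1", "Nicodemus", "a man of the Pharisees named Nicodemus"),
    ("John.4.5", "Joseph", "the field that Jacob had given to his son Joseph"),
]
resolved, review = triage(sample)
print(review)  # [('John.4.5', 'Joseph')]
```

The review queue is the part that would go to human readers: anything the heuristics can't settle is left undecided rather than guessed at.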

7:50:22 AM