Thursday, August 31, 2006

I've moved to a new blogging platform (goodbye Radio Userland, hello WordPress).

But if you read through an RSS aggregator (this is really important, so pay attention):

If you read directly from the website, everything will work as before at my preferred URL. The new site includes several syndication buttons that make it easy to add Blogos to your Bloglines, MyYahoo!, or other readers.

If you have any problems with this, please send me (sean) an email at semanticbible daht com. I don't want to lose any readers in the transition (there aren't that many to start with!).

7:49:44 AM
 Thursday, September 16, 2004

I've been working for several weeks with the GEDCOM 6.0 XML format for genealogy data, and i've entered about half the random written information i have into this XML file, which is rendered by this XSLT into this browsable version (all are works in progress). The XSL closely follows this work by Michael Kay (who literally wrote the book on XSLT), and his example taught me a number of cool new tricks for this sometimes baroque but very powerful language.

One difficulty with using the GEDCOM 6.0 format is that information about an individual is distributed across several different elements, linked by an ID. For example, here's my individual information:

 <IndividualRec Id="BoisenSean">
   <NamePart Type="given name" Level="3">Sean Cornell</NamePart>
   <NamePart Type="surname" Level="1">Boisen</NamePart>
   <PersInfo Type="occupation">
     <Information>computer scientist, manager</Information>
   </PersInfo>
 </IndividualRec>

but then my birth is represented in a separate event record

 <EventRec Id="BoisenSeanBirth" Type="birth" VitalType="birth">
   <Link Target="IndividualRec" Ref="BoisenSean" />
   <Link Target="IndividualRec" Ref="ClaycombDorothy" />
   <Link Target="IndividualRec" Ref="BoisenElliott" />
   <Date Calendar="Julian">September 21, 1958</Date>
   <PlacePart Type="town" Level="4">Tacoma</PlacePart>,
   <PlacePart Type="state" Level="2">Washington</PlacePart>,
   <PlacePart Type="country" Level="1">United States</PlacePart>
 </EventRec>

My death (had it already taken place) would be yet another event record, likewise for my marriage or other events. Yet another element type is used to record my membership in a family.
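
To see what this linking by ID means in practice, here is a minimal Python sketch (not the code behind this site) that gathers every event record pointing at one individual. The `GEDCOM` wrapper element, the `events_for` helper, and the trimmed sample data are assumptions made for illustration; the real 6.0 schema has more structure than the fragments above show.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed sample modeled on the fragments above.
# The <GEDCOM> wrapper is an assumption, used only to hold both records.
SAMPLE = """
<GEDCOM>
  <IndividualRec Id="BoisenSean">
    <NamePart Type="given name" Level="3">Sean Cornell</NamePart>
    <NamePart Type="surname" Level="1">Boisen</NamePart>
  </IndividualRec>
  <EventRec Id="BoisenSeanBirth" Type="birth" VitalType="birth">
    <Link Target="IndividualRec" Ref="BoisenSean" />
    <Link Target="IndividualRec" Ref="ClaycombDorothy" />
    <Link Target="IndividualRec" Ref="BoisenElliott" />
    <Date>September 21, 1958</Date>
  </EventRec>
</GEDCOM>
"""


def events_for(root, individual_id):
    """Return (event type, date text) pairs for each EventRec linked to the ID."""
    found = []
    for event in root.iter("EventRec"):
        # Collect the IDs of every individual this event links to.
        refs = {
            link.get("Ref")
            for link in event.iter("Link")
            if link.get("Target") == "IndividualRec"
        }
        if individual_id in refs:
            found.append((event.get("Type"), event.findtext("Date")))
    return found


root = ET.fromstring(SAMPLE)
print(events_for(root, "BoisenSean"))
```

Because the links in the fragment carry no role, the same call with "BoisenElliott" returns the birth event too: the lookup finds everyone involved in an event, not just its principal.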

All of this gives it very much the feel of a relational data structure, because that's just what it is, for all the reasons that make relational structures appropriate. But for the simpler cases, i'm thinking it would be nice to take a more compact structure like this:

ID: BoisenSean
NAME: Boisen, Sean Cornell
OCCUPATION: computer scientist
BORN: September 21, 1958
   AT Tacoma, Washington, United States
   OF father [BoisenElliott]
   OF mother Boisen, Dorothy Louise (Claycomb)
MARRIED: July 25, 1998
   TO Zarba, Donna Irene (Jones)
   AT Andover, Essex County, Massachusetts, United States
NOTE: married at Free Christian Church by Jack L. Daniel
DIED: September 20, 2018
NOTE: this hasn't happened yet

and use a program to generate the various informational elements. Of course, this output won't be fully linked in to other records (if it could be, you wouldn't need the distributed representation in the first place), and will therefore require some manual adjustment. But particularly since i hope to gather a lot more information from relatives who don't even know how to spell XML, it seems some more amenable format may be required.

I'm working on some Perl code to process this, which i'll post once it's done (not quite yet).
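
Until that Perl code is available, the idea can be sketched in Python. This is only an illustration under stated assumptions, not the eventual implementation: the names `parse_compact`, `to_elements`, and `EVENT_TYPES` are hypothetical, the event ID convention (individual ID plus event type) is inferred from the single birth example above, and the element layout copies the fragments shown earlier rather than the full GEDCOM 6.0 schema.

```python
import re
import xml.etree.ElementTree as ET

COMPACT = """\
ID: BoisenSean
NAME: Boisen, Sean Cornell
OCCUPATION: computer scientist
BORN: September 21, 1958
   AT Tacoma, Washington, United States
   OF father [BoisenElliott]
MARRIED: July 25, 1998
   AT Andover, Essex County, Massachusetts, United States
"""

FIELD = re.compile(r"^([A-Z]+):\s*(.*)$")          # e.g. "BORN: September 21, 1958"
QUALIFIER = re.compile(r"^\s+([A-Z]+)\s+(.*)$")    # e.g. "   AT Tacoma, ..."
EVENT_TYPES = {"BORN": "birth", "MARRIED": "marriage", "DIED": "death"}


def parse_compact(text):
    """Parse the compact format into (key, value, qualifiers) tuples."""
    fields = []
    for line in text.splitlines():
        if not line.strip():
            continue
        field = FIELD.match(line)
        if field:
            fields.append((field.group(1), field.group(2).strip(), []))
            continue
        qualifier = QUALIFIER.match(line)
        if qualifier and fields:
            # Indented lines qualify the most recent field.
            fields[-1][2].append((qualifier.group(1), qualifier.group(2).strip()))
        else:
            raise ValueError(f"unrecognized line: {line!r}")
    return fields


def to_elements(fields):
    """Build one IndividualRec and one EventRec per event field."""
    record_id = next(value for key, value, _ in fields if key == "ID")
    individual = ET.Element("IndividualRec", Id=record_id)
    events = []
    for key, value, qualifiers in fields:
        if key == "NAME":
            surname, _, given = value.partition(",")
            ET.SubElement(individual, "NamePart",
                          Type="given name", Level="3").text = given.strip()
            ET.SubElement(individual, "NamePart",
                          Type="surname", Level="1").text = surname.strip()
        elif key == "OCCUPATION":
            info = ET.SubElement(individual, "PersInfo", Type="occupation")
            ET.SubElement(info, "Information").text = value
        elif key in EVENT_TYPES:
            kind = EVENT_TYPES[key]
            event = ET.Element("EventRec", Id=record_id + kind.capitalize(), Type=kind)
            ET.SubElement(event, "Link", Target="IndividualRec", Ref=record_id)
            ET.SubElement(event, "Date").text = value
            for _, qualifier_value in qualifiers:
                # Only bracketed IDs such as [BoisenElliott] become links;
                # places and plain names are left for manual adjustment.
                ref = re.fullmatch(r".*\[(\w+)\]", qualifier_value)
                if ref:
                    ET.SubElement(event, "Link",
                                  Target="IndividualRec", Ref=ref.group(1))
            events.append(event)
        # Other keys (ID, NOTE, ...) are skipped in this sketch.
    return individual, events


individual, events = to_elements(parse_compact(COMPACT))
for element in [individual, *events]:
    print(ET.tostring(element, encoding="unicode"))
```

The sketch deliberately drops the AT places and any relative given by name rather than by ID, which is exactly the part that would still need manual adjustment.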

11:51:26 PM
 Saturday, August 14, 2004

I got started by looking around at XML specifications for genealogy. There's a nice summary here. Michael Kay, who wrote the book i rely on for XSLT, has some interesting work called GedML, including SAX parser code and XSL transformations to generate HTML representations (which i'll definitely want to do eventually). But i opted to follow the GEDCOM 6.0 beta specification (PDF documentation). It's probably not the last word, but it seems closer to a true XML spec in spirit, and farther from the quirks of GEDCOM, which can hardly be faulted for showing its roots in the stone age of data processing. There's also some history of various formats here.

Enough data has been recorded in GEDCOM over the years that i'm betting it's the one with the most traction, and the one most likely to succeed in "future-proofing" data. Even if that bet loses, any structured spec will always be better than none. Note i'm completely skirting the issue of how to get all the GEDCOM data that's currently out there into a more forward-looking representation, though the DAML folks have done some work on this, including some by my colleague Mike Dean. Since i'm essentially starting from scratch, i'll just be entering data by hand for a while.

5:30:59 PM

Funny coincidence: i was listening to homegirl Nicole C. Mullen's self-titled album while i was finishing the last post on genealogy, only to hear this song.

Dedicated in loving memory of Napoleon Coleman, Sr., Bessie (Smith) Coleman, and Eloise & Isaac Roberson. Words and Music by: Nicole Coleman-Mullen

He was
A beautiful shade of chocolate
She was
A beautiful shade of red
And under the watchful eyes of heaven
Afro Indian girl boy were wed
Little did they know
So long ago
Flowers would come
From the seed they’d sown
Yeah, little did they know
What would come to be
A forest would grow
From the soil and the seed
And these are the branches
In my family tree
Napoleon, Betsy, Isaac, Eloise
Under their branches
I can feel a breeze
Where the leaves from the trees
Make a canopy for me to
Live in the shade, yeah
The leaves from their trees
Made a canopy for me
To live in the shade . . .
I wanna thank you
Cause you took the heat for me
You took the heat for me . . .

I don't have the same background, but i share the sense of gratitude for those who went before and did things i'll never know that allow me to stand here today.

5:01:58 PM

My dad had an extended visit last month, which among other things gave us some time to talk about family history. During a trip to Manhattan we also visited Ellis Island, which i highly recommend. A better understanding of history would sure help with a lot of issues we face today: instead, we keep reinventing and re-stumbling because we've lost context.

As it turned out, the ancestors we were searching for came to the US prior to the big "third wave" of US immigration, so Ellis Island didn't have any of their records. But this whole process, and other activities in structured data (like SemanticBible), reignited my interest in capturing our family tree in a reusable way.

My mom gave me some notes from some genealogy work a relative did back in the 70s. How the times have changed! These are forms developed by the Mormons (who have long been leaders in genealogy research because of some peculiar beliefs about being baptized for deceased relatives), filled out by old-fashioned typewriter, or with handwritten notes, along with some xerographic copies (that's what they used to call them) of wedding announcements, obituaries, and the like. There's some interesting correspondence with the head of the Custer County (Nebraska) Historical Society, who said he had collected some 40,000 obituaries (doubtless in some very large file drawers, in the pre-digital age), including some information about my great-grandfather, Arthur Napoleon Robinett(e).

So all this has inspired me to try to organize the information i have in an appropriate way for the digital age: i'm starting with XML, though eventually OWL might be more appropriate. My hope is to enter what i have (already quite a bit for the past three generations), and then get it posted on the web so the extended family can review and hopefully extend and correct what i have.

This category will operate somewhat independently from the rest of Blogos, though everything will get posted to the main URL as always. If you're only interested in this category, you can view it here, or follow it via RSS with this channel (you'll need an RSS reader like Bloglines).

4:48:37 PM