The Future of Sharing (Genealogical Data)

It’s no secret that the current standard for sharing genealogical data, GEDCOM, is woefully out of date. The last official revision to the GEDCOM standard, 5.5, was completed in 1996. A minor update, 5.5.1, was released in 1999 but never officially approved (even though some of its provisions have been adopted by various genealogy programs). Revision 5.5.1 added one very important feature: support for UTF-8 character encoding, a form of Unicode that supports multiple character sets (including, for example, Hebrew).

GEDCOM has, for all intents and purposes, been abandoned by the Church of Jesus Christ of Latter-day Saints (the Mormons), which created and owns the standard. The church has indicated that it will not be updating GEDCOM, and indeed is replacing the need for it with a new API (Application Programming Interface) which will allow genealogy programs to exchange data with its website (FamilySearch.org). One problem with this approach is the need to go through that website, and the fact that the church has not made this API publicly available (i.e. it’s not a public standard, just a private interface to their web site). Another major problem is that there is no data format that allows one to create a family tree that can be shared independently, the way GEDCOM is used today. FamilySearch itself has no need for such a format, since its mammoth size and importance in the genealogical world will force genealogy programs to support its API, as many have already done.

Over the years, there have been many attempts to either upgrade or replace GEDCOM. These efforts have all failed. In general the problem has been that the companies that create genealogy programs need to agree to adopt any new standard, and they really haven’t had much incentive to do so. Supporting the import of GEDCOM files gives them a basic file interchange. It will never cover the full feature set of their programs, which have become much more sophisticated since 1996, but it is enough to allow customers to exchange information with their relatives. If they supported a fully-featured GEDCOM replacement (one that, for example, would better support photographs and evidence management), it would only make it easier for customers to try other programs. Hence the disincentive for the companies to support a modern replacement for GEDCOM.

Another problem with replacing GEDCOM has been arguments over the data model used. GEDCOM is based on a nuclear family data model (i.e. one mother, one father and their children), and other forms of families are harder to support. This problem has caused some to support a data model based not on the family but on the individual. This is a philosophical debate, and as you might imagine different people take very strong positions in this battle.
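To make the difference between the two data models a bit more concrete, here is a minimal sketch in Python. It is only an illustration, not how GEDCOM or any proposed replacement is actually implemented: the class names, field names and people are all made up for this example. The family-centered model loosely mirrors the way a GEDCOM family record has one husband slot, one wife slot and a list of children, while the individual-centered model lets each person carry any number of typed links to parents.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


# Family-centered model (hypothetical, loosely modeled on a GEDCOM family
# record): one slot for a husband, one for a wife, plus a list of children.
@dataclass
class Family:
    husband: Optional[str] = None
    wife: Optional[str] = None
    children: List[str] = field(default_factory=list)


# Individual-centered model (hypothetical): each person holds their own
# list of (parent name, relationship type) links, so no family shape
# is assumed.
@dataclass
class Person:
    name: str
    parent_links: List[Tuple[str, str]] = field(default_factory=list)


def parents_in_family_model(child: str, families: List[Family]) -> List[str]:
    """Collect a child's parents by scanning every family that lists the child."""
    found = []
    for fam in families:
        if child in fam.children:
            found.extend(p for p in (fam.husband, fam.wife) if p is not None)
    return found


def parents_in_person_model(person: Person) -> List[str]:
    """Read the parents straight off the person's own links."""
    return [name for name, _kind in person.parent_links]


# Made-up sample data for illustration only.
families = [Family(husband="Avram", wife="Sara", children=["Leah"])]
leah = Person(
    "Leah",
    parent_links=[("Sara", "birth"), ("Avram", "birth"), ("Miriam", "adoptive")],
)
```

In the family-centered version, recording Leah’s adoptive mother would mean creating a second family record and deciding who fills the husband and wife slots. In the individual-centered version it is just one more link on Leah herself. That trade-off is essentially what the debate is about.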

Even with this history, there are a few new initiatives to come up with a replacement for GEDCOM. One initiative that has garnered some attention recently is BetterGEDCOM. The BetterGEDCOM initiative came from the frustration of many genealogists over the lack of updates to GEDCOM and is an attempt to create an open forum for the creation of a new standard. Like many attempts at ‘openness’, however, it has run into its own in-fighting and conflicts. It remains to be seen how successful this attempt will be. Another recent initiative is the International OpenGen Alliance (OpenGen). This effort is a bit more of a top-down approach, being managed by the company that runs AppleTree.com, an online family tree web site. OpenGen is, however, a non-profit organization that is supposed to include more than just the team at AppleTree. There have been some attempts between BetterGEDCOM and OpenGen to coordinate, or at least follow each other’s efforts closely. Time will tell which effort, if either, will be successful in creating a new genealogical data sharing standard.

In case you think it isn’t complicated enough, other web sites beyond FamilySearch.org are also developing their own APIs for exchanging genealogical data. OneGreatFamily.com last year introduced an API called GenealogyCloud. It seems that no third-party applications yet support this API.

Geni.com, which boasts nearly a hundred million profiles on its site, and nearly 50 million that are interconnected in what it calls its World Family Tree, just yesterday introduced its own API. Unlike FamilySearch.org, however, Geni.com is releasing documentation and sample applications on its web site. This will allow anyone to write applications that interact with Geni.com, similar to the way Facebook allows outside developers to create applications that access information on Facebook. This is a very positive step. It’s no coincidence that one of the other large family tree web sites, AppleTree.com, is pushing another initiative to replace GEDCOM (OpenGen). These large sites need to create ways to exchange data and interact with other programs and web sites in order to maintain their growth rates.

MyHeritage.com, another one of the big family tree web sites, has taken a slightly different approach in that they have their own application (Family Tree Builder) that runs on a computer and can sync data to their web site. While this approach allows them more control over what modifies data on their platform, it has its shortcomings as well, not the least of which is that it requires Windows to run (this coming from a Mac user). I suspect that MyHeritage.com will release their own public API in the future, if only to compete with Geni.com, their biggest competitor.

We can always hope that FamilySearch.org, Geni.com, MyHeritage.com and AppleTree.com will all come together and create a single API and data format for sharing data, but unfortunately if the past is any guide, this is unlikely to happen.

One indication of the direction the wind is blowing in this regard will be the upcoming RootsTech conference, taking place in February 2011 in Salt Lake City. This conference is the first RootsTech conference, although according to the organizers it replaces three earlier technical conferences – The Conference on Computerized Family History, the Family History Technology Workshop and the FamilySearch Developers Conference. Note that these previous conferences were all connected in some way to the Mormon church. It’s unclear how open this new conference will be to new ideas, or if it is really only looking for input for the existing Mormon church efforts such as FamilySearch.org. I imagine representatives from most of the genealogy software companies and web sites will be in attendance at the conference, as will people associated with the BetterGEDCOM and OpenGen efforts. During the week of the conference there will probably be a lot of blogging about what is going on, but the real test will come after the conference: whether companies announce intentions to seek a common API or data format to move forward with, or whether everyone will just continue the same disjointed approach that has been pursued for nearly 15 years.

7 thoughts on “The Future of Sharing (Genealogical Data)”

  1. No problem, I’ve changed the URL. If you have an RSS feed, I can add you to my blog roll. I think I actually looked for an RSS feed for your site when I set up my blog, but didn’t find one.

  2. Thanks for letting your readers know about initiatives to improve file sharing between researchers, and between a researcher and a genealogy website.

    BetterGEDCOM developers hope to arrive at a better transfer protocol than currently exists.
