
Around the World in 40 Blogs

Family Tree Magazine recently published their list of the top 40 international (outside the United States) blogs, titled Around the World in 40 Blogs, and one of them is the Israel Genealogy Research Association (IGRA) site (genealogy.org.il).

As some of you may know, I built most of that site late last year and early this year for IGRA. It’s rewarding that something I spent so much time and effort on is being recognized. If you haven’t checked it out, or haven’t seen it recently, I recommend going to the site and seeing what’s there. There are videos and articles and dozens of searchable databases with information you cannot find anywhere else online.

Genealogy standards, another look

Over a year ago I took a look at genealogy data standards and where they were headed in my article The Future of Sharing (Genealogical Data). In some ways a lot has changed since I wrote the article, but in some ways we’re really at the same point we were then, with no clear picture of the future. This past week’s 2nd annual Rootstech conference (my last article mentioned the then-upcoming 1st Rootstech) has brought some of those questions into focus, so I thought it was worth reviewing what has happened.

GEDCOM X

On the face of it, the biggest news to come out of the conference was the release of the long-awaited successor to GEDCOM, GEDCOM X. FamilySearch, the online presence of the LDS church, which created and maintains the original GEDCOM standard, released the new standard at the conference a few days ago. FamilySearch hits a lot of the right keywords in the release – the format can be XML- or JSON-based, is released under a Creative Commons license, supports metadata standards including Dublin Core and FOAF, is developed on GitHub, offers both a file format (like traditional GEDCOM) and an API, and more. Yet there are also some strange decisions that seem to have been made, with no explanation given. One that stands out is the decision to base the file format on MIME, a format created for sending e-mail attachments (MIME is an acronym for Multipurpose Internet Mail Extensions). So far the logic behind many of these decisions seems very opaque. The entire development of GEDCOM X seems to have been done up to this point without any input from the industry at large, or even from the well-known efforts to improve GEDCOM, such as the Better GEDCOM group. Indeed, the answer in their FAQ about these efforts seems largely patronizing:

Have you heard about FHISO (BetterGEDCOM), OpenGen, ?
Of course. We’ve heard about them and many others who are making efforts to standardize genealogical technologies. We applaud the work of everybody willing to contribute to the standardization effort, and we hope they will continue to contribute their voices.

In other words, at least to my ears, it’s saying they know other people want to improve GEDCOM, but they are going to do their own thing and maybe they’ll listen occasionally (but no promises). In short, while it’s great that FamilySearch has come out with a new standard, their approach to doing so does not seem geared towards gaining widespread adoption from the industry at large, or at least not in a very friendly manner.
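To make the XML-or-JSON point concrete, here is a minimal sketch in Python of serializing the same genealogical record both ways. To be clear, the field names (`person`, `name`, `birth`) are purely illustrative and are not the actual GEDCOM X schema – this only shows the idea of one data model with two interchangeable serializations.

```python
import json
import xml.etree.ElementTree as ET

# An illustrative record -- NOT the real GEDCOM X schema, just a sketch
# of serializing one data model as either JSON or XML.
person = {
    "id": "I1",
    "name": "Anne Example",
    "birth": {"date": "1 JAN 1900", "place": "Warsaw"},
}

# JSON serialization
as_json = json.dumps(person, indent=2)

# Equivalent XML serialization of the same record
root = ET.Element("person", id=person["id"])
ET.SubElement(root, "name").text = person["name"]
birth = ET.SubElement(root, "birth")
ET.SubElement(birth, "date").text = person["birth"]["date"]
ET.SubElement(birth, "place").text = person["birth"]["place"]
as_xml = ET.tostring(root, encoding="unicode")
```

Either serialization carries the same information; the interesting (and so far unexplained) part of GEDCOM X is the choice to package records like these inside a MIME envelope rather than a plain file.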

Of course, the huge advantage FamilySearch has over just about anyone else is the very large developer network they’ve cultivated for accessing familysearch.org. They are essentially a non-profit organization which has many commercial companies using their current API. To the extent that they transition these existing companies from their legacy API to GEDCOM X, they will certainly have a major advantage over other efforts to replace GEDCOM.

Progress On Other Fronts

So what happened to the other efforts mentioned in my last article?

The most visible effort has been the BetterGEDCOM wiki, which is moving from an informal group to a formal organization, the Family History Information Standards Organisation (FHISO), which will now sponsor the wiki. While they have been the most active effort to create a replacement for GEDCOM, they seem to have run into the too-many-cooks problem, and how they plan to reach a consensus remains to be seen, let alone how they will convince industry organizations and companies to agree with them. It will be interesting to see FHISO’s response to GEDCOM X, and whether they will focus their efforts on implementing their ideas within the GEDCOM X framework, or continue to go it alone.

The OpenGen International Alliance, started by the people at AppleTree.com, doesn’t seem to have taken off. Neither, for that matter, has AppleTree, which may explain why the OpenGen site hasn’t been updated in the past year (and still refers to an upcoming webinar last March).

APIs

One of the most interesting developments last year was the introduction of Application Programming Interfaces (APIs) for genealogy web sites. Indeed, the rumors around what would become GEDCOM X were that it was only an API, and not a file format, but luckily that turned out not to be true and it is both. The only APIs that had been released before my last article were Geni.com‘s API and OneWorldTree.com‘s GenealogyCloud API.

Geni seems to have at least gotten some traction with their API, with future support for syncing data coming from AncestorSync, which presumably uses Geni’s API. I haven’t heard of other uses of the Geni API, however. If you know of other developers using the Geni API, let me know in the comments.

I have not heard of anyone using the GenealogyCloud API. If you know of anyone using GenealogyCloud, let me know in the comments.

As I predicted in the last article, MyHeritage introduced their own API, smartly named Family Graph. I say smartly because it clearly mimics Facebook’s Social Graph API. They’re not comparing themselves to Geni, but to Facebook, which is smart. The other very smart thing they did was introduce a contest to develop applications that use the Family Graph API; if no one uses your API, what’s the point, right? The winner receives $10,000. The deadline for that contest is actually about a week from now, with judging by a panel taking place in the first half of March and the results announced on March 15th. The real test will be the quality of the applications submitted, and whether they were submitted by individual developers or by larger companies. If the contest results are published next month with no major applications, then this will, in my estimation, be a setback for MyHeritage, not an achievement.

Conclusion

It will be very interesting to see how the introduction of GEDCOM X is received by the genealogy companies at large that are needed to make a new format successful. FamilySearch has some key advantages in that they are a non-profit organization (even though in many ways they compete with large commercial companies like Ancestry.com and MyHeritage.com) and that they already have a large developer network. While many of the largest genealogy companies are not currently part of that developer network, if all of the ones who are start adopting GEDCOM X as their export format of choice, I think it will be hard for other companies not to adopt it. GEDCOM X’s dual format/API functionality also gives it a major edge, especially if FamilySearch’s legacy API is replaced by the API functionality in GEDCOM X.

Some have predicted there would never be a true replacement for GEDCOM, and others have said that technology such as AncestorSync’s upcoming products would make a file format unnecessary. I think both of these assertions are incorrect. There will be a replacement for GEDCOM, and it is necessary. Whether or not GEDCOM X is the ideal replacement seems to me to be a moot point; FamilySearch will get the traction they need to push GEDCOM X into the mainstream. The real question is whether they will truly make it an open standard, or continue to hold it close to the chest. The real test will come when other groups insist on various features, and we see how FamilySearch handles those demands. FamilySearch has put in place all the trappings of an open and transparent development process, so let’s hope they keep moving in that direction.

Great-Grandma’s Cherry Pie: An entertaining look at copyright issues

The California State Genealogical Alliance (CSGA) recently launched two blogs. The first one is simply the CSGA Blog, covering genealogical issues in California.

The second blog, Csgacopyright, is of interest even to those with no connection to California, as it covers the thorny issues of copyright, as they pertain to genealogy.

Image from Wikimedia Commons.

This second blog just posted a very entertaining look at what copyright issues might exist when a great-grandmother passes down her secret cherry pie recipe through various generations. It’s worth a read if only to remind us of the complicated issues families sometimes find themselves dealing with…

As for the blog, I have no idea who is actually writing it, or whether they are qualified copyright attorneys, so until they let people know who is authoring their articles, I suggest taking the legal advice with a grain of salt, or cherry pie, whichever you prefer.

Perceptions of Relationship

In a project I’m working on, I have been giving some thought to how we relate to others, but also how we perceive we relate to others. These are not necessarily the same. Certainly it’s possible to be socially closer to more distantly related cousins than to closer ones, but that’s a choice. What I am thinking about is how we perceive our relatedness to others, and whether we are right. How would we judge that in any case?

I’m sure many of you are familiar with the traditional ‘cousin calculator’ chart, such as the one below (click to enlarge):

Traditional Cousin Calculator Chart

For those of you unfamiliar with how a cousin calculator works, you take two people and determine their common ancestor. You move in one direction (i.e. along the top) from the common ancestor until you reach the relationship of the first person to the common ancestor. You then move in the other direction (i.e. down along the side) until you reach the relationship of the second person to the common ancestor. The box where those two lines intersect is the relationship between the two people. For example, if you are the great-grandchild and someone else is the grandchild of a common ancestor, you move along the top to the third column for great-grandchild, and down to the second row for grandchild, and the box in the third column and second row is 1st Cousin, Once Removed.
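The chart-walking procedure just described can be expressed as a small function. This is simply a sketch of the chart’s logic – the function name and output strings are my own, not any standard library:

```python
def cousin_relationship(gen_a, gen_b):
    """Relationship between two descendants of a common ancestor.

    gen_a and gen_b are how many generations each person is below the
    shared ancestor (child = 1, grandchild = 2, great-grandchild = 3).
    This mirrors walking the cousin-calculator chart: the person closer
    to the ancestor sets the cousin degree, and the difference in
    generations sets the 'removed' count.
    """
    closer, farther = sorted((gen_a, gen_b))
    if closer < 2:
        raise ValueError("this sketch handles cousin relationships only")
    degree = closer - 1                # 1st, 2nd, 3rd cousin, ...
    removed = farther - closer         # generations 'removed'
    ordinal = {1: "1st", 2: "2nd", 3: "3rd"}.get(degree, f"{degree}th")
    if removed == 0:
        return f"{ordinal} Cousin"
    times = {1: "Once", 2: "Twice"}.get(removed, f"{removed} Times")
    return f"{ordinal} Cousin, {times} Removed"
```

Plugging in the example above – great-grandchild (3) and grandchild (2) – returns “1st Cousin, Once Removed”, the same box the chart gives.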

If you take a close look, you’ll notice I’ve color-coded the chart to show how I think we normally perceive relationships. Essentially, our siblings and parents are one degree away, our nieces/nephews and 1st cousins are two degrees away, and so forth. A second cousin is generally perceived as one degree further away from us than a first cousin. A first cousin, once removed is, at least to me, in the same category as a second cousin, and that’s what this chart shows.

Now how can we actually determine how closely we’re related? One simple method is by how much DNA we share. If we add in the percentage of DNA present between any two relatives to the chart it looks a bit different (click to enlarge):

DNA Cousin Calculator Chart

Note in the above chart that I’ve changed the color coding to match the percentages of shared DNA. The colors no longer take a box shape around the common ancestor, but instead move out in straight lines. What we can see by looking at the numbers is that the degree of relationship actually grows twice as fast as we perceived before. From a first cousin to a second cousin, the amount of shared DNA drops to one quarter, not one half. We perceive a second cousin as being twice as distant a relative as a first cousin, but from the perspective of DNA, they are actually four times as distant!

I know one of my 5th cousins, and we share just 0.049% DNA. That’s half of a tenth of a percent. Not very much. Anyway, this was just an attempt to create some kind of objective view of family relationships. Of course, nothing having to do with family is really objective, right?
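The halving pattern behind the DNA chart reduces to a one-line formula: each additional generation on either side of the common ancestral couple halves the expected share. A sketch (the function name is mine, and these are statistical averages for full relatives – actual shared DNA varies around them):

```python
def expected_shared_dna(gen_a, gen_b):
    """Average % of autosomal DNA shared by two descendants of a common
    ancestral couple, where gen_a and gen_b count generations down from
    that couple (child = 1, grandchild = 2, ...)."""
    return 100 / 2 ** (gen_a + gen_b - 1)

# First cousins (2 and 2 generations down): 12.5%
# Second cousins (3 and 3): 3.125% -- a quarter, not half, of first cousins
# Fifth cousins (6 and 6): ~0.049%, matching the figure above
```

This is why the DNA view of distance moves twice as fast as the perceived one: stepping from nth to (n+1)th cousins adds a generation on both sides, dividing the expected share by four.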

The End of the Printed Book (coming soon, but not yet)

So I live in Israel, and while it’s not too hard to get popular books from best-selling authors in English, it’s a bit harder to get things like technical books, or more niche books like those that deal with genealogy. Finding ways to get English books to Israel cheaply is somewhat of an obsession for much of the English-speaking community here, and it’s not so simple. Amazon.com was a favorite for many years, although now-Amazon-owned BookDepository.com seems the better deal (books cost a little more, but shipping is free). Of course, with the rise of eBooks one would think they are the simple solution – usually cheaper, with no shipping charges. My wife recently got an iPad, and when I decided to order a book recently (Ancestors and Relatives: Genealogy, Identity, and Community by Eviatar Zerubavel) I thought about getting it as an eBook. The price was almost half the printed version ($9.99 on Kindle versus $18.21 in hardcover on Amazon), and that’s without considering shipping for the hardcover.

I’ve been a book collector for more than twenty years, and while not all my books make it out of my library, I do lend many books out. Considering how hard it is to get a niche, academically published book like Ancestors and Relatives… here in Israel, I figured it would be highly likely I would be loaning out the book at some point. So how does one lend out an eBook? First, I think it’s worth taking a look at who the different players are in the eBook field.

So the big players in eBooks are Amazon (with the Kindle), Barnes & Noble (with the Nook), Apple (with iBooks) and Google (with Google Books). Amazon has long been the leader in this field, with both the hardware (the Kindle) and the store (Amazon.com) to provide the total package for eBook reading. In fact, Amazon is really the only company that offers software on just about every type of device (Mac, Windows, iPhone, iPad, Android, and of course their own Kindle devices), and in that they have a real advantage. When Barnes & Noble, the retail leader in book sales in the US, launched their eBook platform, the Nook, they introduced one feature which had been missing from the Kindle – the ability to lend books. Amazon quickly copied that feature and made it available on the Kindle, but with the same odd restrictions: you can only lend a book once to a friend, and only for 14 days. Sure, I wish everyone I lent a book to would return it in less than two weeks, but that’s not reality. Why does it matter how long the book is lent for, exactly? When a book is lent out, you cannot view it yourself, which makes sense. But if I can’t view it while it is lent out, who cares how long it is lent out, and to whom? Herein lies the problem with eBooks as they currently stand – you’re not buying the book, you’re essentially leasing it. In fact, even with the lending features of the Kindle and Nook, not all publishers allow books to be lent – you need to check each book when you buy it to see if lending is enabled as a ‘feature’.

In the days before Apple launched iTunes and the iPod, digital music failed to take off in a major way. The reason it failed was that it was easier to freely download pirated music than it was to buy and use music from the big labels. Apple fixed that, not by eliminating all the restrictions music companies wanted on the files, but by removing enough of them that using digital music legally became easy enough that most people wouldn’t bother trying to get it illegally. The big breakthrough was that Apple had the store (what Amazon and Barnes & Noble now have for books) tightly integrated, and that Apple got the music companies to loosen their restrictions so that customers could play music on multiple devices (their Mac, their iPod, and now their iPhone, for example) and could even burn CDs of their music for their own use. Most people don’t really remember what digital music was like before Apple, but none of that was possible. Sure, the iPod was a breakthrough device when it came out, but the real reason it was so successful was the integration with the iTunes Store and the improved licensing from the music companies.

The problem with eBooks is that none of the companies have yet hit that sweet spot of great device, great store integration and good enough licensing. It’s hard to even think about licensing a book. It reminds me of a used book store I used to visit almost 20 years ago in Jerusalem that had a copy of a book that was out of print, yet highly in demand, so they rented it out. It was bizarre and I didn’t rent it. I waited a little longer and I found a copy for sale elsewhere. Eventually the book came back into print and everyone could get a copy. The iPad is a great device for reading books, and the various Kindles and Nooks are also good devices. The new Kindle Fire is really trying to compete with the iPad, and is perhaps the first device that will be able to do so, but while there are devices that are great, and there is store integration which works okay (I wouldn’t yet call it great on any platform), no one has gotten the licensing right yet.

It took years of battling between Steve Jobs and the music companies to get the licensing right for music – and that battle included a visionary like Steve Jobs and music company executives who finally ‘got it’ (perhaps they were forced into ‘getting it’ by Jobs). How long it will take for book publishers to ‘get it’ is anyone’s guess. It’s already possible to download illegal eBooks, although I don’t know if the book-reading public will adopt that as quickly as the music-listening public did in the days before the iPod and iTunes.

One company that seems to be getting ready for the inevitable move to eBooks is, believe it or not, IKEA. Apparently, they are creating a deeper version of their popular (some might say ubiquitous) BILLY bookcase in order to accommodate the display of physical items, perhaps larger coffee-table style books, but not actually rows of books.

Music needed easy purchasing and a liberal licensing scheme so that people could listen to their music on all their devices. Books need the same things, but something more. People listen to the same music over and over, but they don’t read the same book over and over – instead they lend it out to others. The book publishing industry needs to come to grips with this difference and make their eBooks as lendable as their printed cousins. Until that point, buying books for reading on digital devices will not be ubiquitous (not even as ubiquitous as BILLY bookcases). What’s worse, as a ‘leased’ product instead of an owned one, what happens if the publisher decides to change the terms after the purchase, further restricting the usage of the book? What can you do about that? Not much, other than wait for the publishers to wake up and figure out that books are not music, and need to be treated differently.

So in the end, I ordered the book from the Book Depository web site, and will get it in a couple of weeks. It’s a little pricier, but I get to own the book and lend it out as often and to as many people as I like, without having to worry about what the publisher thinks. Of course, since Amazon bought Book Depository they’ll still be getting my money, but at least I’m getting something tangible for that money. In the future, no doubt, I will be buying eBooks along with the rest of society (I do not believe my grandkids will be buying physical textbooks), but for the time being I’m doing my share to help the paper industry.