Phil Moir's Blog

RootsTech2012


Published on 15 Feb 2012 15:18 : rootstech : 2 comments : 5875 views

Firstly, I would like to apologise that this blog was not completed until today; it should have been posted over a week ago. Hopefully, despite my negligence, you will still enjoy the report. Well, the final day at RootsTech2012 is over (for me at least), and what an amazing, informative, motivating experience it has been. RootsTech2012? I was going to equate it to the "Who Do You Think You Are" show at Olympia in London (UK), but that would be an unfair comparison. RootsTech is in its 2nd year, hosted by FamilySearch.org in Salt Lake City, Utah, USA (and jointly sponsored by brightsolid and several other companies), and is unique in terms of content. It is both a genealogy show for genealogists and family history enthusiasts and a technical conference where the major and minor for-profit and not-for-profit companies in the genealogy business get together to discuss the technical innovations being incorporated into this field. These are discussed openly in the hope that we all rise to the challenge of providing and securing genealogical records, research and discoveries for now and the future. It was so pleasing to see non-technical genealogists showing a direct interest in the technology seminars, developers and the like attending user-focused seminars, and the cross-fertilisation of ideas, concepts and requirements between the groups.

And there were some fun events too, with the brightsolid-sponsored evening entertainment on the Thursday, where two comedians had to entertain a room of 3000 individuals, a mix of technologists and genealogists. I'm not sure the comedians had ever before experienced the sort of heckling they got that evening, all in a light-hearted manner, with people raising hands to ask questions or even suggesting that one male comedian looked like an attendee's mother. Everyone seemed to have a lot of fun though. And then on the last night, The Church of Jesus Christ of Latter-day Saints (LDS) kept the doors of their Family History Centre open until midnight. I really enjoyed just browsing the corridors of their UK vault.

We had a team manning the stand covering all brightsolid sites, including FindMyPast (UK, Ireland, Australia and now US), Genes Reunited, ScotlandsPeople, and the newly launched Census1940 site (which is also for the US market). I was there in my developer capacity, and spent very little time on the stand and far more time running from seminar to seminar. Planning was essential, and we tried to distribute attendance between the few developers we had. The choice of seminars was critical, and although some didn't quite live up to their billing, the majority definitely threw some new considerations into the mix for things that need review or action.

And so to kick off the Thursday morning schedule, we had the keynote speech by Jay Verkler entitled "Inventing the Future, as a Community", in front of a crowd that I would estimate at about 3000. It was huge. The focus was not on today, but on what genealogy is going to be like for our descendants in 2060, and what we have to do as a technical community to ensure that today's records are still available then, in much the same way that books written in medieval times and before are still available today. The term used was persistent links: the aim is that if you have a link to a page today, the link will still work 10 years from now, whereas currently some links stop working within a year of first being published. Additionally, we were given an insight into some new genealogy-based microdata tagging (using schemas published at http://www.historical-data.org) that can be applied to our web pages to enable search engines like Google to gain specific intelligence about the genealogy data available. This would mean that a search from Google's own search page could direct you to the exact record in Genes Reunited. There was also a very brief demonstration of a Google Chrome browser extension, and if there was ever a reason to use Chrome instead of any of the other browsers for genealogy then this would be it. (I will post details on this as soon as it becomes publicly available.) Definitely some interesting challenges ahead.
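As a rough illustration of the persistent link idea, here is a minimal Python sketch. It is not anything shown in the keynote: the class name, the identifier format and the example.org URLs are all made up for illustration. The assumption is simply that a published link carries a stable identifier, and only a lookup table behind the scenes changes when a page moves.

```python
# Hypothetical sketch of a persistent link resolver.
# The published identifier never changes; only its current location does.

class PersistentLinkResolver:
    def __init__(self):
        # Maps a stable identifier to the URL where the record lives today.
        self._locations = {}

    def register(self, stable_id, url):
        """Publish a new stable identifier pointing at its first location."""
        self._locations[stable_id] = url

    def move(self, stable_id, new_url):
        """Record that the page has moved; the stable identifier is unchanged."""
        if stable_id not in self._locations:
            raise KeyError(stable_id)
        self._locations[stable_id] = new_url

    def resolve(self, stable_id):
        """Return the current location for a stable identifier."""
        return self._locations[stable_id]


resolver = PersistentLinkResolver()
# Illustrative identifier and URLs only.
resolver.register("record/1881-census/12345", "http://example.org/old-site/view?id=12345")
resolver.move("record/1881-census/12345", "http://example.org/records/1881/12345")
current_url = resolver.resolve("record/1881-census/12345")
```

A link published with the stable identifier keeps working after the site is reorganised, because only the resolver's table has to be updated.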

The first two seminars for me were hosted by Ryan Heaton from FamilySearch, and covered GEDCOMX, a new standard for the transfer of genealogical data from one source to another (e.g. from a desktop application to a website like Genes Reunited). Please note the "X" at the end. To be very clear, this is not a revision of the existing GEDCOM, nor is it going to be an exclusively FamilySearch (LDS) driven activity. It is a new design from the ground up and will not resemble any previous version of GEDCOM. Although the initial lead has been taken by FamilySearch following break-out discussions at RootsTech2011, it will evolve as an open-source community effort. That is, all interested parties will be free to comment, contribute and help deliver its capabilities. This is not the same as the existing GEDCOM, where FamilySearch (LDS) delivered a product and separate groups and individuals used a reduced subset of it or extended it for their own purposes. People will still be able to do this if they wish, but the aim is to encourage any changes needed to be adopted into the main standards definition. brightsolid (aka Genes Reunited, FindMyPast, ScotlandsPeople, etc.) will be getting involved in this standards definition, and we will endeavour to incorporate the full set of features that it covers in Genes Reunited when it finally becomes a viable standard. We also discussed directly the handling of old GEDCOM files and their conversion into the new format. This won't be an instant change, but I am sure you will see the new GEDCOMX format being taken up by all the major genealogy websites and software manufacturers. One clear statement was made at the end of the first session: there will be NO effort from FamilySearch to provide a tool that will take a new GEDCOMX file and convert it into the old GEDCOM style. It just won't match, but they will definitely be working on a tool to take the old version and convert it to the new version.

There were a couple of lunches going on and I attended the one hosted by Chris van der Kuyl (CEO brightsolid). Well that was expected. Chris presented a talk on "Why Everyone Deserves Their Own Episode of Who Do You Think You Are and How Brightsolid Will Help You Get There", which introduced all the sites that brightsolid develops to the US (and Canadian!) contingent. It was an interesting presentation that brought home the message that telling stories is a key part of personal genealogy, and one that I agree with, and would like to see evolve considerably on Genes.


Chris van der Kuyl presents brightsolid to the audience

I then followed that up with a seminar and workshop combination on a product called MongoDB. For readers who are interested, companies (and not just genealogy companies) traditionally use relational databases to store all the information that is needed to run their businesses, from customer data to transcription data, as these databases allow for complex querying of the data in a relatively fast and efficient way. However, the more complex you want to make the data, and genealogy gets very complicated, the more difficult it is to maintain, and performance starts to deteriorate. And so it has been this way for over 40 years! More recently, new technologies called document databases have appeared, which store data in a less structured and less limiting way. The performance gains can be massive, but the cost has been in making complex queries. To explain the differences, relational databases tend to store data in fixed columns with limiting parameters, such as a field of 10 characters in length. The imposed structure makes it better for finding similar data, or sorting, but it is very restrictive. Document databases store data in a relatively unstructured way, which gives you flexibility. For example, if you wanted to record your own piece of information, such as in my case a unique "MoirFamilyReference", and someone else wanted to record the height of all their relations, you both can. Obviously you would keep some conventions in place, but it gives the opportunity to let the user decide what to record and save, and most importantly retrieve and extract. We already use similar tools for the British Newspaper Archive and the new US Census website, so we will be exploring this within the Genes Reunited environment.
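To make the difference concrete, here is a minimal sketch in plain Python. It deliberately uses ordinary dictionaries rather than MongoDB itself, and the people, field names and helper functions are invented for illustration. Each "document" carries whatever fields its owner chose to record, which a fixed set of relational columns would not allow without changing the table.

```python
# Hypothetical documents: each record may carry its own extra fields.
people = [
    {"name": "Example Person A", "born": 1850, "MoirFamilyReference": "MFR-0001"},
    {"name": "Example Person B", "born": 1882, "height_cm": 175},
    {"name": "Example Person C", "born": 1910},
]


def find(documents, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc for doc in documents
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


def with_field(documents, field):
    """Return the documents that record the given field at all."""
    return [doc for doc in documents if field in doc]


# Only one person recorded a height, and only one has a family reference.
people_with_height = with_field(people, "height_cm")
born_1850 = find(people, born=1850)
```

The trade-off mentioned above shows up here too: nothing stops two users spelling the same field differently, which is why some conventions still need to be kept in place.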

Friday morning's keynote was maybe of less interest to the once again massive audience. It focused on the size of data, and how we have evolved from stone tablets storing a few bytes of information to servers that now hold exabytes of information. An exabyte is a number of bytes with an awful lot of zeros after it (a 1 followed by 18 zeros) or, in the terms used by the speaker, the equivalent of every tree in the world of 8 inches or bigger being chopped down, made into paper and written on.
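For anyone who wants to see the scale written out, the following small Python sketch does the arithmetic. It assumes the decimal (SI) definition of an exabyte, and the constant and variable names are just for illustration.

```python
# Decimal (SI) units: each step up is a factor of 1000.
GIGABYTE = 10 ** 9
EXABYTE = 10 ** 18

# Count the zeros after the leading 1 when an exabyte is written out in bytes.
zeros_in_exabyte = len(str(EXABYTE)) - 1

# How many gigabytes fit into one exabyte.
gigabytes_per_exabyte = EXABYTE // GIGABYTE
```

So a single exabyte is a billion gigabytes.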

The first seminar on Friday was called Deep Linking and covered the concept of creating your own search pages that effectively scrape results and store this information for personal and re-publishing purposes. Yes, the search could be tweaked to personal preference, but the image/transcription was still behind the payment barriers of the sites that held this information, and for his own site the presenter had sought approval from the data providers. What presenter Stephen Morris showed was that sometimes there are other ways of searching, gathering and displaying data that can often be missed by the genealogy websites. I attended Stephen's follow-up seminar on the Saturday, and in combination with this first one it really exposed some interesting options to consider. Stephen works with a very experienced and knowledgeable chap who specialises in languages, scripts and speech. Between the two of them, they have evolved some new approaches to searching records for names based on how the name would be pronounced in different parts of the world. It still needs work, but he presented some interesting test results. The other main seminar for me on Friday was about large-scale JavaScript development using Google Closure. This is of real interest to us as it may help us improve development and performance on our tree tool.
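To give a flavour of what matching names by sound means, here is a minimal Python sketch of classic American Soundex. This is a much older and simpler technique than the approach presented in the seminar, it is tuned to English pronunciation only, and it is used here purely as a stand-in; the function and variable names are my own.

```python
# Classic American Soundex: a name becomes a letter plus three digits,
# so that names which sound alike tend to share the same code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(name):
    """Return the four-character Soundex code for a name ("" if no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "HW":
            continue  # H and W are skipped and do not split a run of equal codes
        code = CODES.get(letter, "")  # vowels (and Y) have no code
        if code and code != previous:
            encoded += code
        previous = code
    # Pad with zeros, or cut down, to exactly one letter and three digits.
    return (encoded + "000")[:4]


# Different spellings that sound alike end up with the same code.
same_sound = soundex("Moir") == soundex("Moore") == soundex("Moyer")
```

A search built on codes like these can return "Smyth" when you type "Smith", which is the basic idea behind the pronunciation-based searching described above, even though the seminar's approach goes well beyond this.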


brightsolid stand in main hall

The Saturday event was only a half-day experience, but I still had time to cram in three seminars (well, I attended two half seminars) before catching a plane home. One was the Stephen Morris follow-up on Phonetic Matching mentioned above. The final one of the event covered another FamilySearch-led approach, this time on persistent identifiers. There was a discussion about the various methods used in technology to provide persistence to record sources, each of which provides various benefits and also some pitfalls. FamilySearch are developing an idea for themselves and again want to include the community. Genes and brightsolid should certainly consider this as an option, although I am also aware that the British Library are looking at one option already in the market.

And so my brief visit to Salt Lake City and the RootsTech2012 conference came to an end. This turned out to be a very worthwhile trip, not just for the conference and the chance to discuss common ideas and development, but also to see how the market is evolving in the US. It has again stoked my motivation to help deliver a quality product in Genes Reunited that meets the needs of the members, whether they be hobby family historians or professional genealogists. If you have the opportunity to attend next year, then I would encourage it, and would love to meet up and hear from you. In the meantime, we are only days away from the "Who Do You Think You Are" show at Olympia, London, where Genes Reunited will once again be fully present. This is a great opportunity to meet us and tell us what you think of the site and how we can better meet your needs. I will be there on the Friday, and am looking forward to chatting with all those lucky enough to attend.

To see some of the presentations, watch the videos at http://rootstech.org/live.

Comments

by Joy on 15 Feb 2012 22:18
I have read about Chris van der Kuyl and about that presentation.

I would love to go to Salt Lake City. I have ancestral family from Lenham in Kent that became Mormons and emigrated to America, sailing on the ship 'Windermere' in 1854 from Liverpool. And I have a distant relative in Utah.

I have to admit to not understanding much of what you have typed :)

Thank you for posting about your visit, and thank you for enabling emoticons here.

Joy
by Simon on 22 Feb 2012 14:44
You clearly had an interesting time there Phil. It is extraordinary how this technology has progressed in the past few years. I recall the Computer Room at my college back in the early 70s - a whole ground floor wing of the College to house The Computer - and I guess it could do far less than my mobile phone can do today! And to think that the first pc I had at work in the 80s had a whole 20Mb hard drive!!

A bit different today... and as for tomorrow...?

Simon