Yesterday we held a technical working meeting with Luther Tychonievich, Tod Robbins, and Brandon Plewe to discuss issues and strategies for federating (not merging) all of the databases that EarlySaints scholars are working on.
We developed the following short-term (this summer/fall) strategy:
- Match Table. A list of possible matches between persons in the various databases developed by EarlySaints participants; each match is an assertion that a person record in one database may be the same as a person record in another (or the same) database. For example, “FamilySearch KWJZ-DLC = Nauvoo Community #22006,” recorded with a stated degree of certainty, so we do not have to be sure yet. This will be built by some combination of automated data mining and interactive collaboration.
- Federated Search. A simple search interface that will look for a person in all of the databases and return a list of matches (linking to the websites for more information). This will both use the match table and encourage people to add the records they find to the match table.
- Hosted Databases. Develop a service (or install one like CKAN) to share some of the major databases that are not currently online, thus allowing them to be part of the match table and federated search. At first, we would only do a few as a prototype, but eventually we want to make something where you can upload data yourself.
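To make the match table and federated search ideas more concrete, here is a minimal sketch in Python. It is only an illustration, not the design we settled on: the class and function names, the 0.0 to 1.0 certainty scale, the 0.8 value, and the placeholder person names are all assumptions made up for this example, and the "databases" are plain in-memory dictionaries standing in for the real ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordRef:
    """Points at one person record in one database (hypothetical structure)."""
    database: str
    record_id: str


@dataclass(frozen=True)
class Match:
    """An assertion that two records may describe the same person."""
    a: RecordRef
    b: RecordRef
    certainty: float  # assumed scale: 0.0 (pure guess) to 1.0 (sure)


# Hypothetical stand-ins for real databases: name -> {record id: display name}.
# The person names are placeholders, not real data.
DATABASES = {
    "FamilySearch": {"KWJZ-DLC": "Placeholder Person"},
    "Nauvoo Community": {"22006": "Placeholder Person"},
}

# The example match from the list above, with an invented certainty value.
MATCH_TABLE = [
    Match(
        RecordRef("FamilySearch", "KWJZ-DLC"),
        RecordRef("Nauvoo Community", "22006"),
        certainty=0.8,
    ),
]


def federated_search(query, databases, match_table, min_certainty=0.5):
    """Search every database by name and attach possible matches to each hit.

    Returns a list of (record, name, linked) tuples, where linked is a list
    of (other_record, certainty) pairs taken from the match table.
    """
    needle = query.casefold()
    results = []
    for db_name, records in databases.items():
        for record_id, name in records.items():
            if needle not in name.casefold():
                continue
            ref = RecordRef(db_name, record_id)
            linked = [
                (m.b if m.a == ref else m.a, m.certainty)
                for m in match_table
                if ref in (m.a, m.b) and m.certainty >= min_certainty
            ]
            results.append((ref, name, linked))
    return results


if __name__ == "__main__":
    for ref, name, linked in federated_search("placeholder", DATABASES, MATCH_TABLE):
        print(ref.database, ref.record_id, name, linked)
```

Keeping each match as a separate assertion with its own certainty, rather than merging the two records, is what lets the databases stay federated: a match can be added, revised, or dropped without touching anyone's data.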
To develop these, we need as much data as possible. If you have an online person database (we are not collecting source transcriptions yet), would you be willing to give us either a data dump or read-only database access? We promise not to republish it or compete with your website. If your database is not online, could we get a copy? We will not make it public if you don’t want us to.