Principal goal: to construct a data harvesting system, with an associated semantic web-enabled store for genealogical data and a method for querying that data, which you will test using at least one query.
In the past decade there has been a massive increase of interest in genealogy, due in part to the ready availability of genealogical data via the web. One of the most important freely available sources of genealogical information is provided by the Church of Jesus Christ of Latter-day Saints [1]. The main aim of this project is to develop a system that automates the extraction, and facilitates the analysis and visualisation, of Mormon GEDCOM [2] data. In particular, the focus will be on Scottish genealogical data. The data will be stored in a format that can be readily used in programs devoted to spatial analysis, such as geographic information systems (GIS). Having the information in such a format will benefit a large number of fields, e.g. health, history, social science and economics. This project will involve building a data harvesting system, developing an ontology for genealogical data and developing a structure in which to store the harvested data (e.g. in a semantic web-compatible format).
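As a rough illustration of the storage and query steps, the Python sketch below reads GEDCOM text and flattens it into subject-predicate-object triples, the shape of data a semantic web store holds. It relies only on the standard GEDCOM line layout (level, optional cross-reference id, tag, optional value). The sample record, the dotted predicate names such as BIRT.PLAC and the functions gedcom_to_triples and query are assumptions made for this example; a real project would map tags onto terms from its own ontology and would probably use an RDF library and SPARQL rather than plain tuples.

```python
import re

# One GEDCOM line: level, optional cross-reference id, tag, optional value.
GEDCOM_LINE = re.compile(r"^(\d+)\s+(?:(@[^@]+@)\s+)?(\w+)(?:\s+(.*))?$")


def gedcom_to_triples(text):
    """Flatten GEDCOM text into (subject, predicate, object) triples.

    Only levels 0 to 2 are handled; deeper lines are ignored in this sketch.
    """
    triples = []
    subject = None  # cross-reference id of the current level-0 record
    parent = None   # tag of the current level-1 line, e.g. BIRT
    for raw in text.splitlines():
        match = GEDCOM_LINE.match(raw.strip())
        if not match:
            continue  # skip blank or malformed lines
        level, xref, tag, value = match.groups()
        level = int(level)
        if level == 0:
            subject, parent = xref, None
            if xref:
                triples.append((xref, "type", tag))
        elif subject and level == 1:
            parent = tag
            if value:
                triples.append((subject, tag, value))
        elif subject and level == 2 and parent and value:
            # Invented naming scheme: parent tag and child tag joined by a dot.
            triples.append((subject, f"{parent}.{tag}", value))
    return triples


def query(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Invented sample record, not real harvested data.
SAMPLE = """\
0 @I1@ INDI
1 NAME Magnus /Flett/
1 BIRT
2 DATE 12 MAR 1820
2 PLAC Kirkwall, Orkney, Scotland
1 FAMS @F1@
0 @F1@ FAM
1 HUSB @I1@
"""

triples = gedcom_to_triples(SAMPLE)
birthplaces = query(triples, p="BIRT.PLAC")
```

Here query(triples, p="BIRT.PLAC") plays the role of the single test query: it returns every recorded birthplace, which is the kind of attribute that would later be geocoded for use in a GIS.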
The analysis and visualisation phase of the project aims to build family tree networks from the extracted data in such a way that information about the size, topology and shape of these trees can be extracted easily. By mapping this information to other data via a GIS, it can then be put into new contexts. This allows geoscientists to answer questions about how the topology of family trees is affected by environmental issues, wars and epidemics. If you have sufficient time, you may try to answer one such question: “Using data from the Orkney islands, what are the main lines of communication between families on the main and more remote islands (i.e., the connectivity in the network)?” The answer to this question could then be displayed on a simple GIS such as DIY Map [3].
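One possible way to approach the connectivity question is sketched below in Python. It assumes the harvested data has already been reduced to marriage records that name the two families and the island each lived on; that record layout, the sample rows and the functions island_links and connected_groups are all invented for illustration. The sketch counts marriages that link different islands and then uses a breadth-first search to group islands that are connected to one another.

```python
from collections import Counter, defaultdict, deque

# Invented sample records: (family_a, island_a, family_b, island_b).
MARRIAGES = [
    ("Flett", "Mainland", "Rendall", "Westray"),
    ("Flett", "Mainland", "Drever", "Westray"),
    ("Rendall", "Westray", "Scott", "Papa Westray"),
    ("Linklater", "Hoy", "Mowat", "Hoy"),
]


def island_links(marriages):
    """Count marriages joining families that live on two different islands."""
    links = Counter()
    for _, island_a, _, island_b in marriages:
        if island_a != island_b:
            # Sort the pair so (A, B) and (B, A) count as the same link.
            links[tuple(sorted((island_a, island_b)))] += 1
    return links


def connected_groups(marriages):
    """Group islands reachable from one another through marriage links."""
    neighbours = defaultdict(set)
    for _, island_a, _, island_b in marriages:
        neighbours[island_a].add(island_b)
        neighbours[island_b].add(island_a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Breadth-first search from each island not yet visited.
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:
            island = queue.popleft()
            group.add(island)
            for nxt in neighbours[island]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        groups.append(sorted(group))
    return groups


links = island_links(MARRIAGES)
groups = connected_groups(MARRIAGES)
```

With real data, the link counts could be drawn as weighted lines between islands on a map, and the groups would show which islands, if any, have no recorded ties to the rest of the network.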