Original early maps are usually accessible only to a small group of researchers and librarians because they are very old, fragile, and easily damaged. However, they are a valuable source of knowledge for historical research because they are also political and cultural evidence of their time. In the age of Digital Humanities, online access to and search in digitized historical documents and early maps allow people from all over the world to work with such artefacts of cultural heritage. However, digitization alone generates only images of the artefacts, without any access to the semantics of the documents.
For most digital libraries of early maps (e.g. http://www.oldmapsonline.org/), the available metadata include only information about the map itself, e.g. author, title, size, and creation date. Unfortunately, there is little information about the data contained in the map. Tools for information retrieval in digitized early maps need to support users in typical queries, for instance:
Place markers and their text labels contain this information, for instance the place type: town, village, village with church, factory, or mill. This makes the annotation and georeferencing of place markers a crucial task. The task is also very challenging because early maps were created by hand and the symbols they use vary widely. It is further complicated by the fact that a single map can easily contain many thousands of place markers. Therefore, proper tool support and automation of annotation and georeferencing are of interest.
The Referencing and Annotation Tool (RAT) supports annotators in identifying place markers in a digitized early map and helps to link them to a modern map with minimum effort. Because many different symbols are used as place markers and they differ across maps, the user needs to select a template for each type of place marker and to manually annotate a small subset of the map. Based on this, the templates are adjusted and all parameters for the template matching are calculated, so that place markers can be preselected automatically with high confidence and assigned the most likely of the possible types. Annotations can also be added by hand, and the automatically generated annotations can be corrected.
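As an illustration of the general idea only, the following minimal sketch uses plain normalized cross-correlation, a standard technique and not the optimized algorithm used in RAT. The function names, the margin-based threshold rule, and the synthetic test image are assumptions made for this example.

```python
import numpy as np


def ncc_scores(image, template):
    """Normalized cross-correlation of `template` at every position in `image`."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            patch = image[y:y + th, x:x + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum()) * t_norm
            if denom > 0:  # flat patches carry no signal; their score stays 0
                scores[y, x] = (p * t).sum() / denom
    return scores


def calibrate_threshold(scores, annotated, margin=0.05):
    """Hypothetical rule: accept scores close to the weakest hand-annotated marker."""
    return min(scores[y, x] for y, x in annotated) - margin


def find_markers(scores, threshold):
    """Return the top-left (row, column) of every position at or above the threshold."""
    ys, xs = np.where(scores >= threshold)
    return [(int(y), int(x)) for y, x in zip(ys, xs)]


# Synthetic example: a cross-shaped marker drawn twice on an empty image.
template = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=float)
image = np.zeros((20, 20))
image[5:8, 7:10] = template
image[12:15, 3:6] = template

scores = ncc_scores(image, template)
threshold = calibrate_threshold(scores, annotated=[(5, 7)])  # one manual annotation
markers = find_markers(scores, threshold)
```

In this sketch a single manual annotation fixes the acceptance threshold, and the second marker is then found automatically. Real map scans would also need the robustness to neighbouring text and line work that the abstract attributes to RAT's own algorithm.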
RAT facilitates georeferencing by suggesting the most likely modern places based on an estimated mapping between the pixel coordinates and the geocoordinates of the place markers that have already been georeferenced. The number of suggestions can be further restricted through a phonetic search to places whose names sound similar to the name given on the map. This makes it possible to identify the modern place from its historic name if the spelling has changed but the names still sound similar.
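The suggestion step can be sketched as follows, under stated assumptions: the abstract does not specify the form of the estimated mapping or the phonetic algorithm, so this example assumes an affine mapping fitted by least squares and uses American Soundex as a stand-in. All function names and the small gazetteer are hypothetical.

```python
import numpy as np


def fit_affine(pixel_xy, geo_lonlat):
    """Least-squares affine map from pixel (x, y) to (lon, lat); returns a 3x2 matrix."""
    px = np.asarray(pixel_xy, dtype=float)
    design = np.hstack([px, np.ones((len(px), 1))])
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(geo_lonlat, dtype=float), rcond=None)
    return coeffs


def pixel_to_geo(coeffs, pixel_xy):
    x, y = pixel_xy
    return np.array([x, y, 1.0]) @ coeffs


def soundex(name):
    """American Soundex code; stands in for whatever phonetic search RAT uses."""
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":  # H and W do not separate repeated codes
            prev = code
    return (result + "000")[:4]


def suggest_places(coeffs, pixel_xy, gazetteer, historic_name=None, k=3):
    """Rank gazetteer entries (name, lon, lat) by distance to the predicted position."""
    lon, lat = pixel_to_geo(coeffs, pixel_xy)
    candidates = gazetteer
    if historic_name is not None:
        key = soundex(historic_name)
        candidates = [g for g in gazetteer if soundex(g[0]) == key]
    # Distance in degrees is a simplification that is adequate for small regions.
    ranked = sorted(candidates, key=lambda g: np.hypot(g[1] - lon, g[2] - lat))
    return [g[0] for g in ranked[:k]]


# Hypothetical control points: markers that were already georeferenced by hand.
coeffs = fit_affine([(0, 0), (1000, 0), (0, 1000), (1000, 1000)],
                    [(10.0, 50.0), (11.0, 50.0), (10.0, 49.0), (11.0, 49.0)])
gazetteer = [("Neustadt", 10.5, 49.5), ("Altdorf", 10.52, 49.5), ("Bamberg", 10.9, 49.9)]
suggestions = suggest_places(coeffs, (500, 500), gazetteer, historic_name="Newstat")
```

Each additional georeferenced marker becomes another control point for the fit, which is one way the suggestions could improve as annotation proceeds. Soundex was designed for English names, so a language-specific phonetic code may suit historical place names better.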
Currently, RAT is in a prototype stage, which we tested with a range of maps from the 16th to the 18th century. For instance, one of the maps in the test set contained 3809 place markers. An area with 47 place markers was manually annotated. Based on this initial annotation, 3399 place markers were identified, and 3339 of them were correct matches.
In other words, 98.2% of the identified markers were correct and 87.7% of the existing place markers were identified semi-automatically. A detailed description of the implementation and functionality of an earlier version of RAT can be found in (Höhn et al., 2013). The latest version, however, uses a new template matching algorithm specifically optimized for maps where text, rivers and other structures are frequently next to the place markers and can disturb standard template matching algorithms.
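The two percentages follow directly from the counts reported for the test map: the first is the precision of the identified markers and the second is the recall over all markers on the map. The short check below reproduces them; the function name is chosen for this example only.

```python
def precision_recall(total_markers, identified, correct):
    """Precision = correct / identified; recall = correct / total markers on the map."""
    return correct / identified, correct / total_markers


precision, recall = precision_recall(total_markers=3809, identified=3399, correct=3339)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
# prints "precision = 98.2%, recall = 87.7%"
```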
We plan to reduce the manual work needed in all areas by identifying similar maps of the same region and period and exploiting their similarities to provide better suggestions for existing places and their georeferencing on new maps.