Title
-
A Data Set of Annotated Historical Maps
-
Description
-
This data set is designed for testing the performance of optical character recognition (OCR) algorithms on text in scanned historical map images. Thirty maps from the nineteenth and early twentieth centuries (1866–1927) were chosen from nine atlases in the David Rumsey Map Collection. Most maps are of individual states, though some are regional and one is of the entire U.S.; most are manually typeset, with occasional handwritten text. Each place name and many other textual items are annotated with baseline, bounds, character heights, and ground-truth text transcription. Several maps also feature USGS GNIS Feature ID tags for labeled items. Java code for processing is provided with the text-based XML annotations, alongside the original compressed MrSID images and uncompressed TIFF images. We gratefully acknowledge the David Rumsey Map Collection as the source of the map images, which come with the following notice: "Images copyright 2000 by Cartography Associates. Images may be reproduced or transmitted, but not for commercial use. This work is licensed under a Creative Commons [Attribution-NonCommercial-ShareAlike 3.0 Unported] license. By downloading any images from this site, you agree to the terms of that license." The remaining (textual and data) material is hereby licensed CC-BY-NC-SA 4.0 International; the included code is licensed under GPL v3.0 (the license text is included in the tar file).
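As an illustration of how ground-truth transcriptions like these might be used to score OCR output, the following is a minimal sketch of a character error rate based on Levenshtein edit distance. It is not the Java code distributed with the data set, and it does not read the XML annotations, whose schema is not described here; the class and method names (`OcrScore`, `editDistance`, `characterErrorRate`) are hypothetical, and the choice of metric is an assumption rather than the authors' stated evaluation method.

```java
public final class OcrScore {

    private OcrScore() {}

    /**
     * Levenshtein edit distance: the minimum number of single-character
     * insertions, deletions, and substitutions that turn a into b.
     */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int substitute = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int delete = prev[j] + 1;
                int insert = curr[j - 1] + 1;
                curr[j] = Math.min(substitute, Math.min(delete, insert));
            }
            int[] tmp = prev; // reuse the two rows instead of a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Character error rate: edit distance divided by the length of the
     * ground-truth string. Assumes a non-empty ground truth.
     */
    static double characterErrorRate(String groundTruth, String ocrOutput) {
        if (groundTruth.isEmpty()) {
            throw new IllegalArgumentException("ground truth must not be empty");
        }
        return (double) editDistance(groundTruth, ocrOutput) / groundTruth.length();
    }

    public static void main(String[] args) {
        // Hypothetical example: an OCR engine misreads "ll" as "11".
        System.out.println(characterErrorRate("Grinnell", "Grinne11"));
    }
}
```

A rate of 0 means the OCR output matches the transcription exactly; values can exceed 1 when the OCR output is much longer than the ground truth.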
-
Date Created
-
2017
-
PID
-
grinnell:19349