A Data Set of Annotated Historical Maps

This data set is designed for testing the performance of optical character recognition (OCR) algorithms on text in scanned historical map images. Thirty maps from the nineteenth and early twentieth centuries were chosen from the David Rumsey Map Collection (http://davidrumsey.com). Most maps depict individual states, though some are regional and one covers the entire U.S.; most contain little handwritten text. The original MrSID files were converted into uncompressed TIFF images for manual annotation, and the annotations are stored in XML format.

The authors gratefully acknowledge the David Rumsey Map Collection as the source of the map images, which come with the following notice: Images copyright © 2000 by Cartography Associates. Images may be reproduced or transmitted, but not for commercial use. This work is licensed under a Creative Commons [Attribution-NonCommercial-ShareAlike 3.0 Unported] license (http://creativecommons.org/licenses/by-nc-sa/3.0). By downloading any images from this site, you agree to the terms of that license.