Abstract
Documents written in cuneiform script are among the largest sources of information on ancient history. The script was written by imprinting wedges (Latin: cunei) into clay tablets and was in use for almost four millennia. This three-dimensional script is typically transcribed by hand with ink on paper. These transcriptions are available in large quantities as raster graphics from online sources such as the Cuneiform Digital Library Initiative (CDLI). In this article we present an approach to extract 2D Scalable Vector Graphics (SVG) from raster images, as we previously did from 3D models. This enlarges our base of data sets for tasks such as word spotting. In the first step of vectorizing the raster images, we extract smooth outlines and a minimal graph representation of sets of wedges, i.e., the main components of cuneiform characters. We then discretize these outlines and apply a Delaunay triangulation to extract skeletons of sets of connected wedges. To separate the sets into single wedges, we experimented with different conflict resolution strategies and candidate pruning. A thorough evaluation of our methods and their parameters on real-world data shows that the wedges are extracted with a true positive rate of 0.98. At the same time, the false positive rate is 0.2, which calls for a future extension using statistics about geometric configurations of wedge sets.
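As a reading aid for the two rates reported in the abstract, the following minimal sketch shows how a true positive rate and a false positive rate are conventionally computed from confusion counts. It assumes the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN); the paper may define the rates over wedge candidates differently. The function name `detection_rates` and all counts are hypothetical and chosen only for illustration; they are not taken from the paper's evaluation.

```python
def detection_rates(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float]:
    """Return (true positive rate, false positive rate) from confusion counts.

    Assumes the standard definitions:
      TPR = TP / (TP + FN)   share of real wedges that were found
      FPR = FP / (FP + TN)   share of non-wedge candidates wrongly accepted
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical counts for illustration only (not the paper's data).
tpr, fpr = detection_rates(tp=98, fn=2, fp=20, tn=80)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Under these definitions, a high true positive rate together with a non-negligible false positive rate means that nearly all real wedges are found, while a noticeable share of spurious candidates is accepted as well.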
Document type: | Conference Item |
---|---|
Date Deposited: | 10 Feb 2016 13:32 |
Date: | 3 February 2016 |
Event Dates: | 3-5. Feb. 2016 |
Event Location: | Rimske Toplice, Slovenia |
Event Title: | 21st Computer Vision Winter Workshop (CVWW'16) |
Faculties / Institutes: | The Faculty of Mathematics and Computer Science > Department of Computer Science; Philosophische Fakultät > Seminar für Sprachen und Kulturen des Vorderen Orients; Service facilities > Interdisciplinary Center for Scientific Computing; Service facilities > Graduiertenschulen > Graduiertenschule Wissenschaftliches Rechnen |
DDC-classification: | 004 Data processing, Computer science; 090 Manuscripts and rare books; 490 Other languages; 760 Graphic arts, Printmaking and prints |