Building and Improving an OCR Classifier for Republican Chinese Newspaper Text

Author: Konstantin Henke
Year: 2021
Repository record: #30845
Document: Bachelor_Thesis.pdf (PDF)

Abstract: This work presents methods and results of an initial step towards full text extraction from a Republican Chinese newspaper. My basis is a small fraction of the image corpus for which text ground truth exists. I introduce a character segmentation method which produces over 90,000 labeled images of single characters. I then pre-train a GoogLeNet classifier as an OCR model on character images extracted from font files and randomly augmented on the fly, and fine-tune it on the previously segmented character images. I show that the pre-training step increases OCR accuracy on the test set from 95.49% to 96.95%. Finally, I show that post-processing with a masked language model corrects up to 16% of the remaining errors, increasing accuracy on the test set to 97.44%.
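
The figures quoted in the abstract can be cross-checked with a short calculation. The sketch below is not taken from the thesis; the function name `relative_error_reduction` and the variable names are hypothetical and chosen only for illustration. It treats each accuracy as a percentage, converts it to an error rate, and reports what share of the earlier errors disappears at each step.

```python
def relative_error_reduction(acc_before: float, acc_after: float) -> float:
    """Share of the errors at acc_before that are gone at acc_after.

    Both arguments are accuracies in percent. This helper is a
    hypothetical illustration, not code from the thesis.
    """
    err_before = 100.0 - acc_before
    err_after = 100.0 - acc_after
    return (err_before - err_after) / err_before


# Test-set accuracies quoted in the abstract (percent).
without_pretraining = 95.49
with_pretraining = 96.95
with_postprocessing = 97.44

pretraining_gain = relative_error_reduction(without_pretraining, with_pretraining)
postprocessing_gain = relative_error_reduction(with_pretraining, with_postprocessing)

print(f"pre-training removes {pretraining_gain:.1%} of the errors")
print(f"post-processing removes {postprocessing_gain:.1%} of the remaining errors")
```

Under these assumptions the post-processing step comes out at roughly 16% of the remaining errors, which agrees with the abstract, and the pre-training step at roughly 32% of the errors made without it.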