Unexpected result from an attempt to fix IA Upload failures
Alex Brollo
2017-12-22 22:14:29 UTC
While I was trying to fix some failures of IA Upload, an unexpected result
emerged: an easy way to fix some common OCR errors in the djvu text layer.

In brief, the script xml2dsed.py
converts IA _djvu.xml files into "dsed" (Lisp-like) code, so that the text
layer can be loaded into the djvu file in a much faster and more controllable
way using djvused.exe. While the xml tree is parsed, every word of the text
layer is exposed at WORD level to the script environment as plain text; this
offers a unique opportunity to fix many scannos without any risk of breaking
the xml or the dsed code.
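
To make the idea concrete, here is a minimal sketch of the word-level step. It is not the code of xml2dsed.py: the names SCANNOS, fix_word, dsed_string and words_to_dsed and the sample data are invented for illustration. It assumes that each WORD element carries a coords attribute of the form "left,bottom,right,top" measured from the top of the page, and that the page height is given by the height attribute of the enclosing OBJECT element. It emits only the word expressions; wrapping them into the page and line expressions is left out.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical scanno table; real rules depend on the book and its language.
SCANNOS = {"tbe": "the", "aud": "and"}


def fix_word(text):
    """Correct a single word, leaving leading/trailing punctuation untouched."""
    match = re.fullmatch(r"(\W*)(\w+)(\W*)", text)
    if not match:
        return text  # anything unusual is passed through unchanged
    head, core, tail = match.groups()
    return head + SCANNOS.get(core, core) + tail


def dsed_string(text):
    """Quote a string for dsed output, escaping backslashes and double quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def words_to_dsed(xml_text):
    """Return one '(word x0 y0 x1 y1 "text")' expression per WORD element."""
    root = ET.fromstring(xml_text)
    lines = []
    for obj in root.iter("OBJECT"):
        height = int(obj.get("height"))
        for word in obj.iter("WORD"):
            # ASSUMPTION: coords are left,bottom,right,top with y counted from
            # the top of the page; the dsed output counts y from the bottom.
            left, bottom, right, top = (
                int(v) for v in word.get("coords").split(",")[:4]
            )
            y0 = height - 1 - bottom
            y1 = height - 1 - top
            text = fix_word(word.text or "")
            lines.append(f"(word {left} {y0} {right} {y1} {dsed_string(text)})")
    return lines


# Invented sample input, shaped like a fragment of a _djvu.xml file.
SAMPLE = (
    '<DjVuXML><BODY><OBJECT height="100" width="200"><HIDDENTEXT>'
    "<PAGECOLUMN><REGION><PARAGRAPH><LINE>"
    '<WORD coords="10,30,40,20">tbe,</WORD>'
    '<WORD coords="50,30,90,20">"cat"</WORD>'
    "</LINE></PARAGRAPH></REGION></PAGECOLUMN>"
    "</HIDDENTEXT></OBJECT></BODY></DjVuXML>"
)

if __name__ == "__main__":
    for line in words_to_dsed(SAMPLE):
        print(line)
```

Because the correction works on the plain text of each word before it is quoted, a wrong rule can at worst produce a wrong word; it cannot unbalance a tag or a parenthesis.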

Here is the first djvu file
on which this has been successfully tested.

Alex Brollo
