| DJVU2HOCR(1) | djvu2hocr manual | DJVU2HOCR(1) |
djvu2hocr - DjVu to hOCR converter
djvu2hocr [option...] djvu-file
djvu2hocr {--version | --help | -h}
djvu2hocr converts hidden text from a DjVu file to the hOCR[1] format.
-p, --pages=page-range
The default is to convert all pages.
--word-segmentation=simple
This is the default.
--word-segmentation=uax29
--title=title
The default is “DjVu hidden text layer”.
--css=style
For example, --css='.ocrx_line { display: block; }' can be used to visually preserve line breaks.
--version
-h, --help
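The options above can be assembled into a command line programmatically. The following is a minimal sketch in Python, not part of djvu2hocr itself: the helper name build_djvu2hocr_command, the file name book.djvu and the page-range value "1-3" are illustrative assumptions, and only the option names are taken from this manual.

```python
import shlex


def build_djvu2hocr_command(djvu_file, pages=None, word_segmentation=None,
                            title=None, css=None):
    """Return an argv list for djvu2hocr using only options listed above.

    Hypothetical helper: it builds the command but does not run it.
    """
    # The manual lists exactly two word-segmentation modes.
    if word_segmentation not in (None, "simple", "uax29"):
        raise ValueError("word_segmentation must be 'simple' or 'uax29'")
    cmd = ["djvu2hocr"]
    if pages is not None:
        # The page-range syntax is passed through unchanged (format assumed).
        cmd.append(f"--pages={pages}")
    if word_segmentation is not None:
        cmd.append(f"--word-segmentation={word_segmentation}")
    if title is not None:
        cmd.append(f"--title={title}")
    if css is not None:
        cmd.append(f"--css={css}")
    # The DjVu file is the single positional argument and comes last.
    cmd.append(djvu_file)
    return cmd


# Example values are assumptions; the CSS rule is the one quoted for --css.
example = build_djvu2hocr_command(
    "book.djvu",
    pages="1-3",
    word_segmentation="uax29",
    css=".ocrx_line { display: block; }",
)
# shlex.join renders the list as a shell-quoted string for display.
print(shlex.join(example))
```

Passing the resulting list to subprocess.run avoids shell quoting problems with the CSS argument, since each option is handed to the program as a separate element.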
djvu2hocr uses a custom extension to hOCR to retain characters that cannot be directly represented in an HTML/XML document. For example, the control character BEL (^G, U+0007) is converted into the following HTML chunk: <span class="djvu_char" title="#x07"> </span>
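The mapping can be illustrated with a short Python sketch. It is not djvu2hocr's actual implementation: the function names are hypothetical, the set of affected characters (C0 control characters other than tab, line feed and carriage return) is an assumption, and so is the lowercase, two-digit hexadecimal form of the title attribute, generalized from the single BEL example.

```python
import re
from html import escape, unescape

# Pattern for the custom span shown in the BEL example.
_DJVU_CHAR = re.compile(
    r'<span class="djvu_char" title="#x([0-9A-Fa-f]+)"> </span>'
)


def is_unrepresentable(ch):
    """Assumption: C0 controls other than tab, LF and CR need the extension."""
    return ord(ch) < 0x20 and ch not in "\t\n\r"


def encode_text(text):
    """Escape text for HTML, wrapping unrepresentable characters in spans."""
    parts = []
    for ch in text:
        if is_unrepresentable(ch):
            # Hex formatting (two lowercase digits) is assumed from "#x07".
            parts.append(
                f'<span class="djvu_char" title="#x{ord(ch):02x}"> </span>'
            )
        else:
            parts.append(escape(ch, quote=False))
    return "".join(parts)


def decode_text(markup):
    """Reverse encode_text: restore characters from djvu_char spans."""
    restored = _DJVU_CHAR.sub(lambda m: chr(int(m.group(1), 16)), markup)
    return unescape(restored)


print(encode_text("ring\x07"))
```

A consumer of the hOCR output could use a step like decode_text to recover the original characters, provided the spans have exactly the form shown in the example.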
Please report bugs at: https://github.com/jwilk/ocrodjvu/issues
| 2018-07-12 | djvu2hocr 0.10.4 |