DOKK / manpages / debian 10 / tesseract-ocr / merge_unicharsets.1.en
MERGE_UNICHARSETS(1)   MERGE_UNICHARSETS(1)

merge_unicharsets - Simple tool to merge two or more unicharsets.

merge_unicharsets unicharset-in-1 ... unicharset-in-n unicharset-out

merge_unicharsets(1) is a simple tool to merge two or more unicharsets. It could be used to create a combined unicharset for a script-level engine, like the new Latin or Devanagari.

unicharset-in-1

(Input) The name of the first unicharset file to be merged.

unicharset-in-n

(Input) The name of the nth unicharset file to be merged.

unicharset-out

(Output) The name of the merged unicharset file.

merge_unicharsets(1) was first made available for tesseract4.00.00alpha.

Main web site: https://github.com/tesseract-ocr Information on training tesseract LSTM: https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00

tesseract(1)

Copyright (C) 2012 Google, Inc. Licensed under the Apache License, Version 2.0

The Tesseract OCR engine was written by Ray Smith and his research groups at Hewlett Packard (1985-1995) and Google (2006-present).

01/21/2019