Tesseract
Tesseract is an open-source OCR engine that converts scanned images into documents with searchable text (for further details about Tesseract, refer to Tesseract OCR).
Note
The supported and tested Tesseract version is 3.02.02, which can be used with Genius Server 2.10.0 or later.
This section explains how to enable and configure Tesseract.
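Before enabling the engine, it can help to confirm which Tesseract version is installed. The following is a minimal Python sketch, not part of Genius Server: the function names are hypothetical, and it assumes the version banner starts with a line such as `tesseract 3.02.02`. Because some releases print that banner on the error stream, the sketch merges both output streams.

```python
import re
import subprocess


def parse_tesseract_version(output):
    """Return the version string found in a Tesseract banner, or None."""
    # Assumes a banner such as "tesseract 3.02.02"; an optional "v" prefix is tolerated.
    match = re.search(r"tesseract\s+v?(\d+(?:\.\d+)*)", output, re.IGNORECASE)
    return match.group(1) if match else None


def installed_tesseract_version(executable="tesseract"):
    """Run `<executable> --version` and return the parsed version, or None."""
    result = subprocess.run(
        [executable, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge the streams: the banner may be on stderr
        universal_newlines=True,
    )
    return parse_tesseract_version(result.stdout)
```

Calling `installed_tesseract_version(r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe")` and comparing the result with `"3.02.02"` shows whether the installed engine matches the tested version.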
Configuration
Enabled: if checked, Tesseract functionality is enabled.
Pool size: the number of threads used by the OCR process (changing the default value is not recommended).
Executable path: the full path of the Tesseract executable file (e.g. C:\Program Files (x86)\Tesseract-OCR\tesseract.exe).
Installation path: the path of the Tesseract installation folder (e.g. C:\Program Files (x86)\Tesseract-OCR).
Execution temp path: the path of the folder where the temporary files that Tesseract creates during execution are stored (changing the default value is not recommended).
Execution output path: the path of the folder where converted documents are stored (changing the default value is not recommended).
Timeout: the timeout in seconds.
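The values entered above can be sanity-checked before they are saved. The sketch below is illustrative only: `TesseractSettings` and `find_problems` are hypothetical names, not part of Genius Server. It assumes that the executable path points at the executable file itself (as in the example above), that the pool size must be at least 1, and that the timeout must be a positive number of seconds.

```python
import os
from dataclasses import dataclass


@dataclass
class TesseractSettings:
    # Hypothetical container; the fields mirror the labels in the config tool.
    executable_path: str
    installation_path: str
    pool_size: int
    timeout_seconds: int


def find_problems(settings):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    if not os.path.isfile(settings.executable_path):
        problems.append("Executable path does not point to an existing file")
    if not os.path.isdir(settings.installation_path):
        problems.append("Installation path is not an existing folder")
    # Assumption: at least one thread is needed for OCR to run.
    if settings.pool_size < 1:
        problems.append("Pool size must be at least 1")
    # Assumption: the timeout is a positive number of seconds.
    if settings.timeout_seconds <= 0:
        problems.append("Timeout must be a positive number of seconds")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets all fields be corrected in a single pass before clicking Save.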
Hint
Click Save to store the changes. Once everything in the config tool is configured, restart Genius Server for the changes to take effect.