gImageReader 2.93 Manual
gImageReader is a front-end to tesseract-ocr written in C++ using the gtkmm bindings.
- Allows the user to select the parts of the image they want to be recognized or directly recognize the entire image.
- Supports page layout autodetection.
- Supports opening images and PDF documents, as well as importing images from scanning devices, from the clipboard and from screenshots.
- Recognized text is displayed directly next to the image, or copied directly to the clipboard.
- Multilingual recognition.
- Basic editing of output text, including search/replace and removing line breaks on selected text.
- Spell-checking is enabled for the selected language in the output text field if the corresponding dictionary is installed.
- User is prompted to install missing spellcheck languages.
A detailed list of changes can be found in the commit log: https://github.com/manisandro/gImageReader/commits/master
gImageReader 2.93 (Apr 30 2014):
- gImageReader 3.0 beta 4
- Add possibility to choose multiple recognition languages
- Add button to show/hide output pane
- Allow toggling spell checking from context menu
- Fix a crash when loading a scanned document
gImageReader 2.92 (Mar 19 2014):
- gImageReader 3.0 beta 3
- Add replacement-list feature, allowing the user to specify a list of replacements to perform on the recognized text
- Fix saving output resulting in empty files
- Fix crashes when rendering PDF files
- Keep line-breaks if preceded by line-break
- Fix localization not working on Windows
gImageReader 2.91 (Feb 20 2014):
- gImageReader 3.0 beta 2
- Improve page-layout autodetection by merging overlapping regions
- Use native file-chooser dialogs on Gnome/KDE/Windows
- Allow performing multipage-recognition with page-layout autodetection
- Fix broken search/replace which caused the application to crash
- Add Win64 packages
gImageReader 2.90 (Feb 11 2014):
- gImageReader 3.0 beta 1
- Support multiple selections (via CTRL-key). Right-clicking a selection opens a context menu which allows one to:
- Delete and reorder individual selections
- Recognize the selected text, either to clipboard or to the output pane
- Basic automatic page layout detection
- The output pane now supports undo and redo
- Configuration is now automatic
- Proper arbitrary rotation of images
- Detect deleted/renamed files
- Cleaner UI
- Port to Gtk+3, rewrite in C++ using the Gtkmm bindings
- Images can be opened/imported from the sources pane, which is activated by clicking on the top-left button in the main toolbar.
- To open an existing image or PDF document, click on the open button in the images tab.
- To capture a screenshot, paste image data from the clipboard, or open a recently opened file, click on the arrow next to the open button.
- You can manage the list of opened images with the buttons next to the open button. Temporary files (such as screenshots and clipboard data) are automatically deleted when the program exits.
- To acquire an image from a scanning device, click on the acquire tab in the sources pane.
- The buttons in the main toolbar allow zooming in and out as well as rotating the image by an arbitrary angle. Zooming can also be performed by scrolling on the image with the CTRL key pressed.
- Scrolling on the image pans the image vertically. If SHIFT is pressed while scrolling, the image is panned horizontally.
- Basic image manipulation tools are provided in the image controls toolbar, which is activated by clicking on the image controls button in the main toolbar. The provided tools currently allow brightness and contrast adjustments as well as adjusting the image resolution (through interpolation).
- Areas to be recognized can be selected by dragging (left click + mouse move) a rectangular area around portions of the image. Multiple selections are possible by pressing the CTRL key while selecting.
- Alternatively, the automatic layout detection button, accessible from the main toolbar, will attempt to automatically define appropriate recognition areas and to adjust the rotation of the image if necessary.
- Selections can be deleted and reordered by right-clicking on them. It is also possible to resize existing selections by dragging the corners of the selection rectangle.
- The recognition language can be selected from the drop-down menu of the recognize button in the main toolbar. If a spelling dictionary is installed for a tesseract language definition, it is possible to choose between the available regional flavors of the language. This choice only affects the language used for spell-checking the recognized text. Unrecognized tesseract language definitions appear under their filename prefix; one can, however, teach the program to recognize such files by defining appropriate rules in the program configuration (see below). Multiple recognition languages can be specified at once from the Multilingual submenu of the drop-down menu.
- The selected portions of the image (or the entire image, if no selections are defined) can be recognized by clicking the recognize button in the main toolbar. Alternatively, individual areas can be recognized by right-clicking a selection. From the selection menu, it is also possible to redirect the recognized text to the clipboard instead of the output pane.
- Recognized text will appear in the output pane (unless the text was redirected to the clipboard), which opens as soon as some text has been recognized.
- If a spelling dictionary for the recognition language is available, automatic spell-checking will be enabled for the output text. The spelling dictionary in use can be changed either from the language menu next to the recognize button, or from the menu which appears when right-clicking in the text area.
- When additional text is recognized, it will either get appended, inserted at cursor position, or replace the previous content of the text buffer, depending on the mode selected in the append mode menu, which can be found in the output pane toolbar.
- Other post-processing tools include stripping line breaks, collapsing spaces and more (available from the second button in the output pane toolbar), as well as searching and replacing text. A list of search and replace rules can be defined by clicking on the Find and replace button in the output pane toolbar and then expanding the Replacement list section.
- Changes to the text buffer can be undone and redone by clicking on the appropriate buttons in the output pane toolbar.
- The contents of the text buffer can be saved to a file by clicking on the save button in the output pane toolbar.
- The output pane can be hidden by clicking the right-most button in the output pane toolbar. This will also clear the contents of the text buffer.
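The post-processing tools described above (stripping line breaks, collapsing spaces, and the replacement list) can be illustrated with a small sketch. This is not gImageReader's actual implementation: the function names are hypothetical, and the rule that a line break is kept when it is adjacent to another line break is an assumption loosely based on the 2.92 changelog entry.

```python
import re


def strip_line_breaks(text: str) -> str:
    """Join lines within a paragraph, keeping blank-line paragraph breaks."""
    # Assumption: a lone newline becomes a space, while consecutive
    # newlines (paragraph breaks) are left untouched.
    return re.sub(r"(?<!\n)\n(?!\n)", " ", text)


def collapse_spaces(text: str) -> str:
    """Replace runs of spaces and tabs with a single space."""
    return re.sub(r"[ \t]+", " ", text)


def apply_replacements(text: str, rules: list[tuple[str, str]]) -> str:
    """Apply (search, replace) pairs in order, as plain substrings."""
    for search, replace in rules:
        text = text.replace(search, replace)
    return text


# Hypothetical OCR output with a wrapped line, doubled spaces and a misread.
recognized = "The quick  brown\nfox jumps.\n\nNew  paragraph with a 1igature."
cleaned = apply_replacements(
    collapse_spaces(strip_line_breaks(recognized)),
    [("1igature", "ligature")],
)
print(cleaned)
```

Applying the replacement list last means its rules see the text after whitespace has been normalized, so a rule does not need separate variants for wrapped or double-spaced input.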
- The program options can be accessed from the application menu, which opens when clicking the right-most button of the main toolbar.
- Options allow setting the orientation of the output pane (if vertical, it will occupy the right portion of the application, if horizontal, it will occupy the lower portion), the font of the output pane, as well as determining whether the application will notify about missing spelling dictionaries and new program versions.
- Additionally, one can define new rules to match tesseract language definitions to a language (unfortunately, the tesseract language definitions do not include this information). The list of predefined rules can be seen in the Predefined language definitions section. Additional definitions can be added by clicking on the Add button below. The rules for a language definition, which consists of three fields, are as follows:
- Filename prefix: The filename of tesseract language data files is of the format <prefix>.traineddata, e.g. for English, the file is called eng.traineddata and the prefix is eng.
- ISO code: This is a combination of the ISO 639-1 language code and the ISO 3166-1 alpha-2 country code, separated by an underscore (e.g. en_US). This information is necessary to match spelling dictionaries to the language. The choice of the actual country code is not strictly relevant, but it is necessary for the automatic installation of spelling dictionaries to find a relevant package of dictionaries. This code can also be made up if no appropriate choice exists; the only consequence is that no spelling dictionaries will be matched with the language.
- Native name: The native name of the language simply determines the label of the entry for the language in the language menu.
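As an illustration of how the three fields fit together, here is a minimal sketch that maps a tesseract data file to a menu label. The class, the rule table and the label format are hypothetical and only mirror the description above; they are not taken from the program's source or its predefined list.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class LanguageRule:
    prefix: str       # filename prefix, e.g. "eng" for eng.traineddata
    iso_code: str     # language_COUNTRY, e.g. "en_US"
    native_name: str  # label shown in the language menu


# Illustrative rules only; not the program's actual predefined list.
RULES = {
    rule.prefix: rule
    for rule in [
        LanguageRule("eng", "en_US", "English"),
        LanguageRule("deu", "de_DE", "Deutsch"),
    ]
}


def describe(traineddata_file: str) -> str:
    """Return a menu label for a tesseract language data file."""
    path = Path(traineddata_file)
    if path.suffix != ".traineddata":
        raise ValueError(f"not a tesseract language file: {traineddata_file}")
    rule = RULES.get(path.stem)
    # Definitions without a matching rule fall back to their filename prefix.
    return f"{rule.native_name} ({rule.iso_code})" if rule else path.stem


print(describe("eng.traineddata"))
print(describe("chi_sim.traineddata"))
```

The fallback branch corresponds to the behavior described earlier, where unrecognized language definitions appear under their filename prefix until a rule is added.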
- On Linux, it's sufficient to install the package corresponding to the language definition one wants to install via the package management application (the packages may be called something like tesseract-langpack-<lang>).
- On Windows, one needs to download the desired language definitions from the project download page, then copy the <lang>.traineddata files to Start→All Programs→gImageReader→Tesseract language definitions.
- To re-detect the available languages, one can restart the program, or select Redetect Languages from the application menu.
- On Linux, if your distribution supports PackageKit, the program will offer to automatically install missing dictionaries when necessary. If automatic installation does not work for some reason, you can install the spelling dictionaries from the package management application (the packages may be called something like hunspell-<lang>).
- On Windows, you need to download the desired spelling dictionary from http://wiki.openoffice.org/wiki/Dictionaries, and extract the *.dic and *.aff files to Start→All Programs→gImageReader→Spelling dictionaries. Caution: Be sure to download zip (or tar.gz) archives, not OpenOffice 3 extensions.
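The manual extraction step on Windows could also be scripted. The sketch below copies only the *.dic and *.aff members of a downloaded zip archive into a target folder. The function name is hypothetical, and the target folder must be whatever folder the Spelling dictionaries shortcut opens on your system; its actual path is not given here.

```python
import zipfile
from pathlib import Path


def extract_dictionary(archive: str, target_dir: str) -> list[str]:
    """Extract only the *.dic and *.aff members of a zip archive."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            # Drop any folder structure inside the archive.
            name = Path(info.filename).name
            if info.is_dir() or not name.lower().endswith((".dic", ".aff")):
                continue
            (target / name).write_bytes(zf.read(info))
            extracted.append(name)
    return sorted(extracted)
```

A call such as `extract_dictionary("de_DE.zip", dictionaries_folder)` returns the names of the files it wrote, which makes it easy to notice an archive that contained no dictionary files at all (for example an OpenOffice 3 extension in a different layout).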
For suggestions and contributions of any kind, please contact me at manisandro@gmail.com. I'd especially appreciate translations - here are the main steps for creating a translation:
- Download the source archive from the project homepage.
- Enter the po folder.
- To create a new translation, copy the gimagereader.pot file to <language>.po (e.g. de.po for German). To edit an existing translation, simply pick the corresponding file.
- Translate the strings in <language>.po.
- Send the po file to manisandro@gmail.com. Thanks!
If you find an issue or have a suggestion, please file a ticket in the gImageReader issue tracker. Feel free to also contact me directly at manisandro@gmail.com.
Copyright ©2009-2014 Sandro Mani, revision: Mon, 28 Apr 2014