KNOWLEDGE BASE ARTICLE

Improving processing speeds

There are many things that can affect data extraction and export speeds within Umango. To ensure optimal processing speed, consider the following:

  1. Limit the use of image enhancement options. Deskew, blank page detection and similar options slow processing, so only use them when they are really required.
  2. Anchors slow processing. The larger the anchor the slower the anchor detection.
  3. The fewer (and smaller) the OCR zones, the faster the processing. Make sure every zone needs to be there.
  4. Scanning resolution. We normally recommend no more than 300 dpi for black and white and 200 dpi for color. Higher resolutions will sometimes be required, but these resolutions are typically high enough for quality data extraction.
  5. Server resources. Use at least our recommended specifications and, in virtual machine environments, use fixed RAM rather than dynamic RAM.
  6. OCR accuracy vs speed. In most instances a setting that leans toward the fastest speed (possibly even the fastest setting itself) will be more than adequate. The accuracy vs speed setting for full page OCR (text searchable PDF/A) makes a significant difference to processing speeds.
  7. Typically the ABBYY engine will be the faster choice for full page OCR (export file format OCR engine).
  8. If you are reading from or writing to a network destination, make sure your network and (if applicable) internet connection are not creating a bottleneck. Large files can take considerable time to upload/download on slower connections.
  9. Consider scanning in black and white instead of color or grayscale. This can make a significant difference to processing speed.
  10. Source image quality. If your source file quality is poor (e.g. speckled, skewed, lots of grey areas, lines and artifacts), OCR engine speed will be significantly affected.
  11. Add document processor licenses. This can significantly improve processing speed in some instances. To understand if this option is right for your requirements, read our article explaining the functionality.
  12. Image processing consumes large amounts of RAM. Make sure you have adequate RAM available. Refer to our documentation for details, but at least 4 GB of RAM per concurrent process is often a good rule of thumb.
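
To get a rough feel for how resolution, color mode, network speed and RAM interact, the arithmetic behind several of the points above can be sketched in a few lines of Python. This is an illustrative estimate only and not part of Umango: the function names are hypothetical, the page is assumed to be US Letter (8.5 x 11 inches), and the sizes are for uncompressed image data, so real files will usually be smaller once compressed. The 4 GB per process figure is the rule of thumb from point 12.

```python
import math


def raw_page_bytes(dpi, bits_per_pixel, width_in=8.5, height_in=11.0):
    """Uncompressed size in bytes of one scanned page (assumes US Letter)."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return math.ceil(pixels * bits_per_pixel / 8)


def transfer_seconds(file_bytes, link_mbps):
    """Ideal time to move a file over a link rated in megabits per second."""
    return file_bytes * 8 / (link_mbps * 1_000_000)


def max_concurrent_processes(available_ram_gb, gb_per_process=4):
    """Concurrent processes supported by the 4 GB per process rule of thumb."""
    return int(available_ram_gb // gb_per_process)


if __name__ == "__main__":
    bw_300 = raw_page_bytes(300, 1)       # 1 bit per pixel (black and white)
    color_200 = raw_page_bytes(200, 24)   # 24 bits per pixel (color)
    color_300 = raw_page_bytes(300, 24)
    print(f"300 dpi B&W page:   {bw_300:,} bytes")
    print(f"200 dpi color page: {color_200:,} bytes")
    print(f"300 dpi color page: {color_300:,} bytes")
    print(f"Upload of 300 dpi color page at 10 Mbps: "
          f"{transfer_seconds(color_300, 10):.1f} s")
    print(f"Processes on a 16 GB server: {max_concurrent_processes(16)}")
```

Under these assumptions a 300 dpi color page holds 24 times as much raw data as the same page in black and white, which is why color mode and resolution affect both processing and transfer times so strongly.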

Link to this article http://umango.com/KB?article=61