Although Optical Character Recognition (OCR) scanning technology has improved rapidly over the years, it still has limitations with regard to source materials and character formatting.
- Text from a source with a font size of less than 12 points will result in more errors.
- Most document formatting is lost during text scanning, except for paragraph marks and tab stops. Bold, italics and underline are sometimes recognized, depending on your software.
- The output from a finished text scan will be a single-column, editable computer file. This file will always require spellchecking and proofreading, as well as reformatting to the desired final layout.
- Scanning plain text files or spreadsheet printouts usually works; however, the data needs to be reformatted to match the original.
- Source materials that often cause issues:
  - Small text
  - Blurry copies
  - Mathematical formulas
  - Draft copies
  - Colored paper
  - Handwritten text
  - Unusual or script-type fonts
- Document formatting may be lost during text scanning (e.g., bold, italic and underline are not always recognized).
- Output from a finished text scan may be a single-column, editable text file. The text file will always require spellchecking and proofreading, as well as reformatting to the desired final layout.
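
Because scanned output always needs spellchecking and proofreading, a first automated pass can help point a proofreader at likely misreads. The following is only a minimal sketch in Python, not a feature of any particular OCR package. The function name `flag_suspect_words`, the tiny `KNOWN_WORDS` list and the sample text are all assumptions made up for illustration; a real check would load a full dictionary.

```python
import re

# Hypothetical mini word list for illustration only;
# a real check would load a complete dictionary file.
KNOWN_WORDS = {"the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"}


def flag_suspect_words(text, known_words=KNOWN_WORDS):
    """Return tokens a proofreader should look at, in order of appearance."""
    suspects = []
    for token in re.findall(r"[A-Za-z0-9]+", text):
        has_letter = any(ch.isalpha() for ch in token)
        has_digit = any(ch.isdigit() for ch in token)
        if has_letter and has_digit:
            # Letters mixed with digits (such as "d0g") often mean a misread character.
            suspects.append(token)
        elif has_letter and token.lower() not in known_words:
            # Purely alphabetic words that are missing from the word list.
            suspects.append(token)
    return suspects


# Made-up sample of scanned text with typical character confusions.
ocr_output = "The qu1ck brown fox jurnps over the lazy d0g"
print(flag_suspect_words(ocr_output))  # ['qu1ck', 'jurnps', 'd0g']
```

A pass like this only narrows down where to look. It cannot catch misreads that happen to form valid words, so manual proofreading is still required.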