
About this Project: Typo Woes

08 Jan

When I convert the scans into a format I can edit, the end results are still full of typos.  The software is not as self-learning as the brochure would have you believe: it can’t cope with the typeface, and correcting the errors is too time-consuming to be practical.

Back in 2011 when I first got the scans I opted for Abbyy PDF Transformer.  I’d have looked at the options at the time, so it must have seemed the best choice for some reason.  However, it struggled with the strange 1970s typeface and so I upgraded it to Abbyy Finereader in December.  For a while that looked as if it was up to the job.

The letters from Paul were all typed on a normal typewriter in Courier, and Finereader can cope with that. But now that I’m working on the letters from Ted and Ben, I’m discovering that Finereader really cannot cope. There are four problems with the text it produces:

  1. it mistakes letters, substituting b for h for example
  2. sometimes it inserts bizarre characters instead of letters, so like becomes 1;|<e,
  3. it flags up a lot of correct guesses as problems for checking, so even clean text involves a lot of checking
  4. i t  o f t e n  p u ts in e x t r a  spaces b e tw e e n letters, which is insanely time-consuming to correct
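The fourth problem at least looks mechanical enough that a small script might take some of the pain out of it. What follows is only a rough sketch in Python, not anything Finereader offers: the function name `collapse_spaced_letters` is made up for illustration, and it assumes that the real word breaks survive as double spaces, as they seem to in the sample above. It joins runs of three or more single letters separated by single spaces, then squeezes leftover double spaces down to one.

```python
import re

# Assumption: a "spaced-out" word is three or more single letters,
# each separated by exactly one space, with whitespace (or the line
# edge) on either side of the run.
SPACED_RUN = re.compile(r"(?<!\S)(?:[A-Za-z] ){2,}[A-Za-z](?!\S)")


def collapse_spaced_letters(text: str) -> str:
    """Join runs like 'e x t r a' into 'extra'; leave normal words alone."""
    # Remove the single spaces inside each matched run.
    joined = SPACED_RUN.sub(lambda m: m.group(0).replace(" ", ""), text)
    # Squeeze any leftover runs of spaces down to a single space.
    return re.sub(r" {2,}", " ", joined)


if __name__ == "__main__":
    print(collapse_spaced_letters("puts in e x t r a  spaces"))
```

This would not catch everything. Half-joined words such as "b e tw e e n" are only partly repaired, two-letter fragments are left alone so that ordinary short words are not glued together, and if the OCR used single spaces between words as well, neighbouring words would be merged. The output would still need proofreading.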

The unforgivable thing is that Finereader does not learn from all these run-of-the-mill corrections; it only learns when it is in a special letter-by-letter learning mode.

As a result, I’m back to looking for OCR software. I’m going to download a trial version of OmniPage, which will be £80 if I opt for a single-user licence.  The reviews also recommend Presto! but I can’t find a price for that.  I’m annoyed about spending money on FineReader and then having to ditch it, but it has really proved to be unusable.

 

Posted by on 8 January, '13 in About

 
