Among open source OCR engines, tesseract-ocr, whose development has long been sponsored by Google, is generally regarded as giving the most accurate results. Note that your case is a text-in-the-wild problem, not a controlled environment such as the inside of a scanner. The text-in-the-wild problem concerns detecting and recognizing text segments in an open environment, such as arbitrary images taken from Google Street View, as seen in the image below. Source: http://cmp.felk.cvut.cz/~matas/papers/neumann-2012-rt_text-cvpr.pdf For a theoretical description of Tesseract, see An Overview of the Tesseract OCR Engine.
You could write your own program around it, but if that is not your area it may be easier to get someone to do it for you. One final suggestion: have the program write the recognized text out to a text file, so you can inspect and reuse the results.
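As a rough illustration of that last suggestion, here is a minimal Python sketch that runs OCR on one image and saves the result to a text file. It assumes the Tesseract binary is installed along with the pytesseract wrapper and Pillow. The function names tesseract_ocr and save_ocr_text are made up for this example, and the OCR step can be swapped out through the optional ocr argument.

```python
from pathlib import Path
from typing import Callable, Optional


def tesseract_ocr(image_path: str) -> str:
    """Recognize text with Tesseract through the pytesseract wrapper.

    Assumes the tesseract binary, pytesseract and Pillow are installed.
    """
    # Imported lazily so the module still loads without these packages.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)


def save_ocr_text(image_path: str, out_path: str,
                  ocr: Optional[Callable[[str], str]] = None) -> str:
    """Run OCR on image_path, write the text to out_path, and return it.

    `ocr` is any function mapping an image path to a string; it defaults
    to tesseract_ocr (an illustrative helper, not part of any library).
    """
    recognize = ocr if ocr is not None else tesseract_ocr
    text = recognize(image_path)
    Path(out_path).write_text(text, encoding="utf-8")
    return text
```

A call such as save_ocr_text("street_sign.jpg", "street_sign.txt") would then leave the recognized text in street_sign.txt (both file names are placeholders). Expect noisy output on street-level photos; Tesseract tends to do better when the text region is cropped and cleaned up first.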