Instructions and Help about ocrmypdf ocr python

Hi, my name is Ponshankar, and in this video we will be looking into OCR, or optical character recognition, using Python. Optical character recognition is a technology that is used to recognize text in images. It is used to convert typed, handwritten, or printed text into machine-readable text.

What is the use of it, though? We are moving towards a digitalized world, and everyone is talking about digital transformation. Organizations will always have stacked-up files and documents as part of their business. The application of OCR is to help store these files in an organized way on laptops, computers, or other devices. Think about the PDF documents in our organizations: OCR plays a vital role in making PDF documents easily searchable using its information retrieval capabilities. Extracting information from handwritten documents is another important application of OCR. A lot of industries use OCR technology for automation. Banking is one among them, where OCR is used for processing handwritten checks.

Now, the plan for this session is to do a hands-on example of OCR. What we would first do is pick a clear PDF document from the internet and try to extract the text out of it. Then we would take handwritten text and see how OCR retrieves information from it. And finally, we will take a text image which is not so clear and apply some basic fine-tuning techniques to extract accurate information.

So, we are going to use Python for performing this task. Python is a great programming language, and what we need to perform this task in Python is just a few lines of code. I would be using Python through Anaconda in this session. For those of you who are interested in the same approach, I'm leaving the Anaconda download link in the video description. Google Tesseract is the OCR engine that we would be using in this session. I will walk you through where to download and install it, and I have also provided the download link in the video description for your convenience. OpenCV is a Python library that we would be using for processing images. It is a library that is mainly aimed at real-time computer vision.

This is the link from where you would be downloading the Google Tesseract software. You could choose the 64-bit or 32-bit version as per your operating system and install it on your system. Once that is done, I'm going into my command prompt and navigating into my Anaconda directory. Now I am installing pytesseract using the Python pip command. This would be the Python library for Google Tesseract that we would be using. Then I am installing the OpenCV library in Python. This is my PyCharm IDE that I would be using for writing Python code. I will leave the download link in the video description for those who would like to use the same IDE. First, I'm importing the cv2 interface of the OpenCV library, then I'm.
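
The transcript cuts off before any code is shown, so the following is only a minimal sketch of how the pieces described so far might fit together, not the presenter's actual code. It assumes the Tesseract engine is installed and that the two Python packages were installed with pip (the PyPI names are `pytesseract` and `opencv-python`). The function name `extract_text`, the grayscale conversion, and the Otsu thresholding are assumptions chosen for illustration, standing in for the "basic fine-tuning techniques" mentioned in the plan.

```python
# Setup assumed to have been run beforehand in the command prompt:
#   pip install pytesseract opencv-python
from pathlib import Path


def extract_text(image_path, lang="eng"):
    """Return the text Tesseract recognizes in the image at image_path.

    Hypothetical helper for illustration; the name and the preprocessing
    steps are assumptions, not taken from the video.
    """
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"No such image: {path}")

    # Imported after the cheap path check so this file still loads on a
    # machine where the OCR packages are not installed yet.
    import cv2
    import pytesseract

    image = cv2.imread(str(path))  # returns None if the file is not a readable image
    if image is None:
        raise ValueError(f"Could not decode image: {path}")

    # Basic cleanup: convert to grayscale, then binarize with Otsu's method
    # so the text stands out from the background.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

    # Hand the cleaned-up image to the Tesseract engine.
    return pytesseract.image_to_string(binary, lang=lang)
```

A call such as `print(extract_text("sample.png"))` would then print the recognized text, where `sample.png` is a placeholder file name. On Windows, if the Tesseract executable is not on the PATH, `pytesseract.pytesseract.tesseract_cmd` may need to be set to the full path of the installed `tesseract.exe`.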