Is there any good place for a beginner to learn the state of the art of OCR (optical character recognition)?
That very much depends on your definitions of "good" and "state of the art" (see the related question "What are the research areas in Optical Character Recognition?"). The bleeding edge is typically undocumented, or at least unpublished; one would need to be in the right place at the right time while the IP lawyers sort out patentability (see Patentability - Wikipedia). To resist reverse engineering, some algorithms go straight into hardware without disclosure (see Trade secret - Wikipedia). Otherwise, conference attendance and subscriptions to the relevant research journals are places to start, although attributing "goodness" is argumentative. Many universities and other research organizations disclose preliminary results through their own PR organs, e.g. "HSBC and IBM develop cognitive intelligence solution to digitise global trade", the press releases of Adlib Software, "ABBYY TextGrabber implements Real-Time Recognition to instantly connect to action", and "Leading Up to the Mid-Grant Technical Review".
Do you keep the wine labels of the wines you have drunk in a scrapbook journal?
I think that it's a nice idea to keep the labels in a scrapbook, but perhaps only keep the labels of the wine bottles that you really enjoyed drinking. Underneath each wine label you could write a short tasting description and note down what you particularly enjoyed about the wine. That way you have a journal which is like a recollection of all your favourite wines.
I want to learn text extraction from an image. Where can I find a basic/introductory/review paper that explains the idea of the topic?
Text extraction is a very wide topic when it comes to research, so I suggest starting with OCR, which helps you understand the base concepts of detection and extraction. Gonzalez's book on image processing will help you understand the various processes involved and can explain the concepts used in papers in more detail. I have worked a bit on detection, so I can refer you to papers on that: read the papers of Mr. P. Shivakumara (for detection) in ascending order of year of publication. Meanwhile, please follow one author and read from his first paper onward; it will help you understand from the basics.
In machine learning, what is over-fitting and why is it bad?
Imagine you have two-dimensional data consisting of two variables, X and Y. In an absolutely perfect world the data would fall perfectly along a line or curve, but in reality there will be some noise, represented by an error term in your model. Hopefully there will also be a trend or signal in your data as well as noise. The point of machine learning is to identify the signal and not the noise. So in our two-dimensional data, if you plot X and Y and see something like the data below (excluding the line), you would notice there is a definite shape or signal in your data, but there is also error.

Underfitting: Fitting the data with a line, as in a linear regression, would not be appropriate. The shape of the data is not linear, so the model would be underfit for most of the data points. See how only a few points fall close to the line, but some are way above and way below it? If you underfit your data, your model will have low accuracy on your training data and you won't be able to make reliable predictions with it.

Overfitting: On the other hand, fitting your predictions so closely that you map your model very tightly to your data causes another issue. While your training data will give you high accuracy, when you introduce new data your model's accuracy will fall. This is because your model is fitted to both the signal and the noise in your training data. So while you might be able to show off a high accuracy rate, again you won't be able to make good predictions with the model.

The best fit comes with the middle option, which is to try to fit the signal while ignoring the noise. This will result in a lower accuracy rate on your training data than the overfitted model, but your accuracy on new data should stay relatively steady. You'll be able to make reliable predictions with your data, and you should be able to confidently assess the strength of your model as a whole.
Image source: "What is underfitting and overfitting in machine learning and how to deal with it."
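The underfit/good-fit/overfit trade-off described above can be seen numerically with a small sketch: fit polynomials of increasing degree to noisy samples of a smooth curve and compare training error with error on held-out points. The data here is synthetic and the degrees (1, 4, 20) are illustrative choices, not values from the answer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a smooth signal (one period of a sine) plus Gaussian noise.
def signal(x):
    return np.sin(2 * np.pi * x)

x_train = np.linspace(0.0, 1.0, 30)
x_test = np.linspace(0.01, 0.99, 30)          # held-out points between the training points
y_train = signal(x_train) + rng.normal(0, 0.2, x_train.size)
y_test = signal(x_test) + rng.normal(0, 0.2, x_test.size)

def mse(degree):
    # Fit a polynomial of the given degree to the training data, then
    # measure mean squared error on both the training and held-out points.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

for degree in (1, 4, 20):
    train_err, test_err = mse(degree)
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```

Degree 1 underfits (high error everywhere), degree 4 tracks the signal, and degree 20 chases the noise, so its low training error does not carry over to the held-out points.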
Is there a good app/tool to keep track of the weekly progress for a PhD student?
As others have mentioned, Evernote is an awesome piece of software for keeping track of your daily progress. If you prefer using LaTeX for your academic writing, then it's better to use Overleaf, as it is a cloud-based academic writing tool; you'll have a head start when you want to start working on your conference paper. It also has Mendeley support, so it is even easier to cite as you write on Overleaf. If you don't use LaTeX, no worries: Microsoft Word has a OneDrive sync option in addition to the Mendeley Word plugin. If you are unsure what Mendeley is, it is a great reference manager for tracking your literature survey. You can create multiple folders within Mendeley (e.g. Read and Unread). Make Unread your default folder; as and when you finish reading papers, move them to the Read folder and cite them in your weekly progress report on Evernote or any other writing software. Evernote has very good OCR (optical character recognition) functionality, so if you come across interesting stuff on the blackboard after your lab meeting, you can take a snapshot of it and upload it to Evernote. Also, when you attend lab meetings, it is possible that you may not remember all the advice given by your supervisor, so use your smartphone and record the audio (e.g. the Audio Recorder app on Android). It'll serve as neat evidence :) I hope this helps :)
How does parsing help in Natural Language Processing?
EDIT: Added some tasks where parsing is used to check grammar and rank possible utterances (speech recognition and machine translation).

Thanks for the A2A.

"He was a grammarian, and could doubtless see further into the future than others." (J.R.R. Tolkien in Farmer Giles of Ham; found via Steven Abney)

Let's take the following definition of parsing: "Within computational linguistics the term is used to refer to the formal analysis by a computer of a sentence or other string of words into its constituents, resulting in a parse tree showing their syntactic relation to each other, which may also contain semantic and other information."

Parsing is a small step towards finding the meaning of a linguistic utterance (see the first two minutes of Eugene Charniak's talk "Syntactic Language Modeling for Machine Translation and Speech-Repair Detection"). In many Natural Language Processing tasks this is helpful. In some others, parsing does not help much, as you already have good features from lighter analysis (e.g. POS tagging, bigrams).

Here are some NLP tasks where parsing helps:

Let's start small with the textbook example from news: "Man bites dog" (journalism). Consider the utterances "Dog bites man" vs. "Man bites dog". If you reduce them to bags of words, losing the order, then you lose the specific meaning. Knowing that "man" is the subject and "dog" is the object of the sentence gives it its whole point.

Sentiment Analysis: Compare "I like Frozen", "I do not like Frozen", and "I like frozen yogurt". The three sentences' words are very similar to each other, yet the first and the second contain inverse statements about the movie Frozen, while the third is a statement about something else entirely. Parsing here is crucial for understanding.

Relation Extraction: "Rome is the capital of Italy and the region of Lazio." While entity extraction can give us the entities here, we need parsing to see which entity is the capital of which other entities.

Question Answering: When answering "Who was the first man in space?", you need to parse the question and use parsed sentences to build the answer.

Speech Recognition: While not an NLP task per se, speech recognition involves choosing among many possible strings. Parsing scores the strings with either a pass or a likelihood score, giving a powerful language model for speech recognition. Similarly, parsing could help in spell checking, optical character recognition (OCR), predictive text (T9), handwriting recognition, etc.

Machine Translation: Again, parsing allows us to choose between several possible translations. It also makes translation of phrases and terms easier.

Grammar Checking: Parsing can help in checking the grammaticality of a document, even though you could get some leverage out of several canned patterns.

Here are some NLP tasks where parsing almost does not help:

Information Retrieval: If you enter search terms like "Buenos Aires" as a query to a web search engine, it can give you great hits without using parsing, by virtue of good indexing of words and phrases.

Topical Classification: Many classification methods (e.g. Naive Bayes, SVM, logistic regression) treat the words as atoms and can still classify documents with great accuracy.

The heuristic is that when you are trying to find out "who did what to whom?", you need parsing.
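The "Man bites dog" point above can be made concrete with a tiny sketch: bag-of-words representations of the two utterances are identical, while (subject, verb, object) triples of the kind a dependency parser produces are not. The triples here are written by hand purely for illustration; a real parser such as spaCy or Stanza would derive them automatically.

```python
from collections import Counter

sent_a = "dog bites man"
sent_b = "man bites dog"

# Bag-of-words representations throw away order, so the two
# sentences become indistinguishable.
bow_a = Counter(sent_a.split())
bow_b = Counter(sent_b.split())
print(bow_a == bow_b)   # the bags are equal

# Hand-built (subject, verb, object) triples, standing in for the
# output of a dependency parse, keep the roles and so stay distinct.
parse_a = ("dog", "bites", "man")   # dog is the subject, man the object
parse_b = ("man", "bites", "dog")   # roles reversed
print(parse_a == parse_b)           # the parses differ
```

This is exactly the "who did what to whom?" heuristic: any feature that preserves syntactic roles can separate the two sentences, and any purely bag-of-words feature cannot.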
What are the top sites to download books by Indian authors?
Try NDL or DLI (Indian Institute of Science: DLI Home; Browsing Digital Library of India by Subject). The National Digital Library (NDL) has taken under its fold more than 1 institutes and rolled out a collection of more than one million digitised books and journals as part of an ongoing process. The objective of NDL is to offer uniform, high standards of e-content, free of cost, on a single platform. The Digital Library of India (DLI) is a digital collection of freely accessible rare books collected from various libraries in India. The DLI project started in early 2 with the vision to archive all the significant literary, artistic and scientific works of mankind, to preserve them digitally, and to make them freely available to everyone over the Internet for education, study, appreciation and for future generations. As a first step in realizing this vision, it was proposed to create the Digital Library with a free-to-read, searchable collection of one million books, predominantly in Indian languages. The project was initiated by the Office of the Principal Scientific Adviser to the Government of India and subsequently taken over by the Department of Electronics and Information Technology (DeitY), Ministry of Communications and Information Technology (MCIT), Govt. of India. The idea was also to create a test bed for researchers to improve scanning techniques, optical character recognition and intelligent indexing, and in general to promote Indian language technology research. The Indian Institute of Science (IISc) has been digitising lakhs of books and thousands of doctoral theses at its library. (See "One million books, 1 institutes under National Digital Library".)
What scanner should I get for occasional document/receipt scanning?
Hi, I'll recommend the ET16/ET16 Plus smart book scanner from CZUR. The ET16 Plus is a versatile smart book scanner incorporating smart scanning technology and advanced processing software. Being an efficient and high-speed book scanner, the CZUR is able to quickly digitise documents without damaging them (a non-destructive scanning method). Unlike traditional scanners, the CZUR ET Smart Book Scanner revolutionizes the scanning experience by making book scanning as easy as turning pages. All books, records, magazines, journals and any paper documents within A3 size can be scanned directly. Plus, all scanned documents can be converted to editable Word and Excel files via the OCR (optical character recognition) function. In your case, the CZUR is able to scan documents and receipts efficiently. Besides, the CZUR book scanner is small in size and designed like a desk light, so you can place it on the table and start it when you need it. You can buy it on Amazon; please visit for more information.