If you have ever worked in an office equipped with a document scanner, you have probably come across the term Optical Character Recognition (OCR) more than once. But what is OCR and what is it used for? This article explains what OCR means and covers the most popular use cases.
What is OCR?
OCR stands for Optical Character Recognition. It is a widely used technology for recognising text inside images, such as scanned documents and photos. OCR technology is used to convert virtually any kind of image containing written text (typed, handwritten or printed) into machine-readable text data.
OCR technology became popular in the early 1990s, during efforts to digitise historic newspapers. Since then the technology has undergone several improvements. Nowadays, solutions deliver near-perfect OCR accuracy. Advanced methods like Zonal OCR are used to automate complex document-based workflows.
Popular use cases for OCR technology
Probably the best-known use case for OCR is converting printed paper documents into machine-readable text documents. Once a scanned paper document has gone through OCR processing, the text of the document can be edited with word processors like Microsoft Word or Google Docs. Before OCR technology was available, the only option for digitising printed paper documents was to retype the text manually. Not only was this massively time-consuming, it also introduced inaccuracies and typing errors.
OCR is often used as a “hidden” technology, powering many well-known systems and services in our daily lives. Lesser-known but equally important use cases for OCR technology include data entry automation, indexing documents for search engines, automatic number plate recognition, and assisting blind and visually impaired people.
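To make the data entry use case more concrete, here is a minimal Python sketch of what can happen after OCR has run: the recognised text is scanned for a few fields that would otherwise be typed in by hand. The sample text, the field names and the patterns are all assumptions made up for illustration. In practice the text would come from an OCR engine and the patterns would need tuning for each document layout.

```python
import re

# Hypothetical OCR output for a scanned invoice; in practice this string
# would come from an OCR engine rather than being hard-coded.
ocr_text = """ACME Ltd.
Invoice No: INV-2041
Date: 2023-03-14
Total: 1,249.50 EUR
"""

# Illustrative patterns (assumptions); real documents usually need
# per-layout tuning and some tolerance for recognition errors.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice No:\s*(\S+)"),
    "date": re.compile(r"Date:\s*(\d{4}-\d{2}-\d{2})"),
    "total": re.compile(r"Total:\s*([\d,]+\.\d{2})"),
}


def extract_fields(text):
    """Return a dict mapping each field name to its value, or None if absent."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


print(extract_fields(ocr_text))
```

Fields that cannot be found are returned as None rather than guessed, so that a person can review them instead of a wrong value silently entering a database.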
OCR technology has proven immensely useful in digitising historic newspapers and texts. These have now been converted into fully searchable formats, which has made accessing those earlier texts easier and faster.