Optical Character Recognition or Optical Character Reader (or OCR) describes the process of converting printed or handwritten text into a digital format.
Optical Character Recognition is a significant area of research in artificial intelligence, pattern recognition, and computer vision. OCR was also one of the earliest fields of artificial technology research and has emerged as a mature technology.
OCR began back in 1913 when Dr. Edmund Fournier d’Albe invented the Optophone to scan and convert text into sound for visually impaired people. Since then, OCR technology has experienced multiple developmental phases.
In the 1990s, the technology became prominent with the digitization of historical newspapers. In addition, the emergence of smartphones, electronic documents also lead to further advancements in OCR technology.
What is Optical Character Recognition (OCR)?
Optical Character Recognition refers to a software technology that electronically identifies text (written or printed) inside an image file or physical document, such as a scanned document, and converts it into a machine-readable text form to be used for data processing. It is also known as text recognition.
In short, optical character recognition software helps convert images or physical documents into a searchable form. Examples of OCR are text extraction tools, PDF to .txt converters, and Google’s image search function.
To see OCR software in action, you can try using Text Extractor Tool by Brandfolder. This optical character recognition online tool can convert an image of text (such as a screenshot) into plaintext.
How does Optical Character Recognition work?
The concept of OCR is straightforward. However, its implementation can be quite challenging due to several factors, such as the variety of fonts or the methods used for letter formation. For example, an OCR implementation can get exponentially more complex when non-digital handwriting samples are used as input instead of typed writing.
The entire process of OCR involves a series of steps that mainly contain three objectives: pre-processing of the image, character recognition, and post-processing the specific output.
In the following, we will show how optical character recognition works and explain the main steps of OCR technologies.
1. Scanning the Document
This is the prime step of OCR which connects to a scanner to scan the document. Scanning the document decreases the number of variables to account for when creating the OCR software since it standardizes the inputs. Also, this step specifically enhances the efficiency of the entire process by ensuring perfect alignment and sizing of the specific document.
2. Refining the Image
In this step, the optical character recognition software improves the elements of the document that need to be captured. Any imperfections such as dust particles are eliminated, and edges, as well as pixels, are smoothed to get a plain and clear text. This step makes it easier for the program to capture and be able to clearly “see” the words being inputted without, for instance, smudges or irregular dark areas.
The refined image document is then converted into a bi-level document image, containing only black and white colors, where black or dark areas are identified as characters. At the same time, white or light areas are identified as background. This step aims to apply segmentation to the document to easily differentiate the foreground text from the background, which allows for the optimal recognition of characters.
4. Recognizing the Characters
In this step, the black areas are further processed to identify letters or digits. Usually, an OCR focuses on one character or block of text at a time. The recognition of characters is carried out by using one of the following two algorithms:
- Pattern recognition. The pattern recognition algorithm involves inserting text in different fonts and formats into the OCR software. The modified software is then used for comparing and recognizing the characters in the scanned document.
- Feature detection. Through the feature detection algorithm, OCR software applies rules considering the features of a certain letter or number to identify characters in the scanned document. Examples of features include the number of angled lines, crossed lines, or curves used for comparing and identifying characters.
Simple OCR software compares the pixels of every scanned letter with an existing database to identify the closest match. However, sophisticated forms of OCR divide every character into its components, such as curves and corners, to compare and match physical features with corresponding letters.
5. Verifying the Accuracy
After the successful recognition of characters, the results are cross-referenced by utilizing the internal dictionaries of the OCR software to ensure accuracy. Measuring OCR accuracy is done by taking the output of an analysis conducted by an OCR and comparing it to the contents of the original version.
There are two typical methods for analyzing the accuracy of OCR software:
- Character-level accuracy, counting how many characters were detected correctly.
- Word-level accuracy, counting how many words were recognized correctly.
In most cases, 98-99% accuracy is the acceptable accuracy rate, measured at the page level. This means that in a page of around 1,000 characters, 980-990 characters should be accurately identified by the OCR software.
Optical Character Recognition Use Cases
In 2021, where everything is becoming digitalized and advanced, OCR technology is being used by various businesses to streamline business processes, improve accessibility, and enhance customer satisfaction. Below will be some of the most prominent use cases of OCR in the industry today.
Number Plate Recognition with OCR
Automatic number-plate recognition (ANPR) uses OCR technology to identify the numbers on license plates. Today, number-plate recognition is used in a diverse set of commercial applications to find stolen cars, calculate fees for parking, invoice tolls or for access control to safety zones, and more.
OCR Applications in Banking
The banking industry is deemed one of the largest consumers of OCR technology as it helps enhance security, improves data management, optimizes risk management, and enhances customer experience.
Before applying OCR technology, most banking documents were physical, including customer records, checks, bank statements, and others. With OCR technology, it became possible to digitize and store even older documents in databases.
OCR technology has also completely revolutionized the banking industry by:
- Providing easy verification: OCR allows a real-time verification of money deposit checks and a signature by scanning them using an OCR-based application. An example of this can be seen in mobile banking apps, where checks can be deposited digitally and processed within days through OCR-based check depositing features.
- Enhancing security: The electronic deposition of checks through OCR technology results in fraud prevention and increasingly secure transactions, fostering a better user experience.
OCR Use Cases in Healthcare
OCR technology has proved to be beneficial for the healthcare industry. In the healthcare sector, OCR technology allows patient medical histories to be accessed digitally by patients and doctors alike.
In addition, patient records, including their X-rays, treatments, tests, hospital records, and insurance payments, can easily be scanned, searched, and stored using OCR technology.
Thus, optical character recognition helps streamline the workflow and reduce manual work at hospitals while keeping the records up to date.
How OCR is used in Travelling
OCR technology has revolutionized the travel industry and has made traveling much easier. Whether you’re booking a flight or a hotel, checking in to the airport or your hotel room, or managing your travel expenses, OCR technology is being used at every single place to enhance customer experience.
The majority of airports and mobile travel apps use OCR technology for security and data storage purposes. The applications of OCR range from scanning passports to storing personal data when booking a flight or a hotel.
Advantages Of Optical Character Recognition
Optical Character Recognition offers a wide range of benefits, many of which were reviewed in this article. However, the most important benefits of OCR are listed below for your reference.
- Improved accuracy: Software-based character recognition eliminates human errors, resulting in improved accuracy.
- Speed-up the processes: The technology converts unstructured data into searchable information, providing the required data available at faster rates and subsequently speeding up business processes.
- Cost-effective: OCR technology does not require a lot of resources which reduces the processing costs and subsequently reduces the overall costs of a business.
- Enhanced customer satisfaction: The accessibility of searchable data by the customers ensures a good experience, assuring better customer satisfaction.
- Improved productivity: The easy accessibility of searchable data makes a stress-free environment for the employees, allowing them to focus on the main goals, boosting the productivity of a business.
Optical character recognition (OCR) is used to turn scanned images and other visuals into text. This turns paper-based documents into editable and searchable digital mediums and enables the development of automated systems.
If you enjoyed this article, we suggest you read more about other applications of Computer Vision: