Reducing paper workload in purchasing – is OCR the right solution?

Optical Character Recognition (OCR) is used to enable electronic B2B communication. The software scans printed documents, images or handwritten text and converts the information they contain into a digital format. Below we explore what OCR is used for and how it works, and we assess the software’s suitability for reducing paper workload by processing documents automatically in an ever-changing business world.

What is OCR used for?

OCR is a technology that enables you to convert different types of documents, such as scanned paper documents, PDF files or images captured by a digital camera, into editable and machine-readable data. For example, instead of retyping a written text manually, you can convert all the required materials into a digital format within minutes using a scanner (or a digital camera) and Optical Character Recognition software.

When is OCR used in procurement?

In procurement, buyers mainly use OCR to scan and digitise information from printed documents such as invoices or purchase order confirmations. After capturing the documents, buyers transfer the data into their downstream systems.

How does OCR work?

Using OCR software involves three steps:  

Step 1: Pre-processing the document image 

  • The programme pre-processes images to improve the chances of successful recognition 
  • The aim of image pre-processing is to improve the quality of the image data 
  • The programme suppresses unwanted distortions and enhances specific image features 
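As a rough illustration of the pre-processing step, the sketch below turns a greyscale scan into a black-and-white image and removes isolated specks. It is a minimal example under stated assumptions, not how any particular OCR product works: the image is assumed to be a list of rows of pixel values from 0 (black) to 255 (white), and the function names binarise and remove_specks as well as the threshold of 128 are hypothetical choices made for this example.

```python
def binarise(image, threshold=128):
    """Map each greyscale pixel (0-255) to 1 (ink) or 0 (background)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


def remove_specks(binary):
    """Clear ink pixels that have no ink neighbour (isolated scanner noise)."""
    height, width = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1:
                continue
            neighbours = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbours += binary[ny][nx]
            if neighbours == 0:
                cleaned[y][x] = 0  # a lone dark pixel is treated as noise
    return cleaned


# Hypothetical 4x5 scan: a small dark shape plus one stray dark pixel.
scan = [
    [250, 250, 250, 250, 250],
    [250,  30,  40, 250, 250],
    [250,  35, 250, 250, 250],
    [250, 250, 250, 250,  20],
]
cleaned = remove_specks(binarise(scan))
```

In this example the stray pixel in the bottom-right corner is removed, while the connected shape is kept for the recognition step.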

Step 2: Character Recognition  

  • The programme analyses the structure of the document image
  • The page is divided into elements such as blocks of texts, tables, images, etc.  
  • The lines are divided into words and then into characters 
  • The isolated characters are compared with a set of pattern images 
  • The programme generates several hypotheses about each character, selects the most likely one and then presents the recognised text 
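The comparison with pattern images can be sketched as simple template matching. The example below is an illustrative assumption rather than the method of a specific OCR engine: each character is a tiny 3x5 bitmap written as strings of "1" (ink) and "0" (background), the TEMPLATES dictionary holds only three hypothetical patterns, and the score is simply the share of pixels that agree.

```python
# Hypothetical pattern images for three characters (3 pixels wide, 5 high).
TEMPLATES = {
    "I": ("111", "010", "010", "010", "111"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
}


def similarity(glyph, template):
    """Return the fraction of pixels on which glyph and template agree."""
    total = matches = 0
    for glyph_row, template_row in zip(glyph, template):
        for g, t in zip(glyph_row, template_row):
            total += 1
            matches += g == t
    return matches / total


def hypotheses(glyph):
    """Rank every known character by how well its pattern fits the glyph."""
    scored = [(similarity(glyph, t), char) for char, t in TEMPLATES.items()]
    return sorted(scored, reverse=True)


def recognise(glyphs):
    """Pick the best hypothesis for each glyph and join them into text."""
    return "".join(hypotheses(glyph)[0][1] for glyph in glyphs)


# A "T" with one wrong pixel, as might come out of an imperfect scan.
noisy_t = ("111", "010", "010", "110", "010")
ranking = hypotheses(noisy_t)
```

Here the noisy glyph still ranks "T" first because only one pixel differs from the pattern. With heavier noise, two patterns can score almost equally, which is one way letters end up misread.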

Step 3: Post-processing the recognised text 

  • The programme corrects recognition errors to improve accuracy 
  • The programme converts the data into standalone documents (for example, text or PDF files) or exports it for use in other software 
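Error correction in post-processing can take many forms. The sketch below shows two simple ones using only the Python standard library: matching a recognised word against a list of expected terms, and replacing letters that look like digits in a field known to be numeric. The VOCABULARY list, the DIGIT_LOOKALIKES table, the function names and the cutoff of 0.8 are all assumptions made for this example.

```python
import difflib

# Hypothetical list of terms expected on procurement documents.
VOCABULARY = ["invoice", "purchase", "order", "confirmation", "total", "quantity"]

# Letters that are easily confused with digits in scanned text.
DIGIT_LOOKALIKES = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)


def correct_word(word, cutoff=0.8):
    """Replace a word with its closest vocabulary entry, if one is close enough."""
    matches = difflib.get_close_matches(word.lower(), VOCABULARY, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_number(field):
    """Swap digit look-alikes in a field that is known to contain a number."""
    return field.translate(DIGIT_LOOKALIKES)


fixed_word = correct_word("lnvoice")    # close to "invoice", so it is corrected
fixed_amount = correct_number("1O5.S0")  # letters replaced by look-alike digits
```

A word that is not close to any expected term is returned unchanged, so names and other unknown words are left for a human to review.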

What are the advantages of OCR?

  • A paper-based document can be turned into an electronic document 
  • Paper workload is reduced 
  • Digitised information can be processed quickly 
  • Large quantities of text can be input quickly  
  • OCR makes scanned documents editable 

What are the disadvantages of OCR?

OCR systems are generally quite expensive: they require scanning software, training and materials, and they incur ongoing staffing costs. Further, recognition is often inaccurate, and mistakes are common.

Images produced by a scanner consume significant amounts of storage space, which can increase recurring storage fees. Further, these images lose quality during the scanning and digitising process. This loss of image quality contributes to some of the common errors that occur during OCR, including:

  • misreading letters 
  • skipping over unreadable letters 
  • mixing text from adjacent columns or image captions 

As a result, all documents need to be checked carefully by a human and then manually corrected. This labour-intensive process takes valuable time away from team members. 
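One common way to put a number on how much correction a document needed is the character error rate: the number of single-character substitutions, insertions and deletions required to turn the OCR output into the manually corrected text, divided by the length of the corrected text. The sketch below is a minimal illustration of that idea, and the function names and sample strings are assumptions made for this example.

```python
def edit_distance(recognised, reference):
    """Count the single-character edits needed to turn recognised into reference."""
    previous = list(range(len(reference) + 1))
    for i, rec_char in enumerate(recognised, start=1):
        current = [i]
        for j, ref_char in enumerate(reference, start=1):
            cost = 0 if rec_char == ref_char else 1
            current.append(
                min(
                    previous[j] + 1,         # drop a character from recognised
                    current[j - 1] + 1,      # add a character missing from recognised
                    previous[j - 1] + cost,  # keep or substitute a character
                )
            )
        previous = current
    return previous[-1]


def character_error_rate(recognised, reference):
    """Edits per reference character; 0.0 means the OCR output was perfect."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(recognised, reference) / len(reference)


# OCR output compared with the version a human corrected by hand.
rate = character_error_rate("lnvoice 1O5", "Invoice 105")
```

In this example two of the eleven characters were misread, so roughly 18% of the characters had to be corrected by hand.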

Modern enterprises benefit from an extensive network of business partners. Typically, each of these business partners formats its documents differently, and OCR systems are limited in their ability to handle these deviations. In addition, documents from partners that are used only for infrequent purchases often cannot be processed, since those partners are usually not familiar with the requirements of the document recognition system.

Conclusion

OCR is an expensive software solution that helps to reduce the paper workload. However, it lacks the data quality, flexibility and reliability of more sophisticated technologies. While OCR certainly offers advantages over manual data entry, the disadvantages often outweigh the benefits. Since OCR is highly limited in its use, it can even increase the manual workload, because errors have to be checked and corrected.

However, seamless B2B workflows, automated exchange of data and documents and digitised document flows are the pillars of modern procurement.  

So is there an alternative?

The Netfira Platform offers a unique alternative to OCR solutions for automating B2B purchasing processes and digitising document flows.