The Problem with OCR Invoice Scanning

OCR (Optical Character Recognition) invoice scanning is what most people think of when they hear “supplier invoice automation.” Scanning is very old technology: it dates back to the 1930s and was often used by banks to read account numbers on cheques. Although the technology has improved, is it reliable?

Today, we can take a photograph with a phone and have the text extracted from the image, and many companies have added layers of AI (artificial intelligence) on top of existing OCR to enhance its capabilities. But just how good is it, and can we rely on such old technology?

Providers of AI-enhanced OCR invoice scanning solutions often claim a 99.8% accuracy rate, but does that accuracy translate into invoices that are processed correctly?

We wanted to test this claim, so we signed up for an account with a leading OCR company and uploaded 7 sample invoices. The invoices were based on real invoices that we have received, but all of the content was entirely fictitious.

The Good News

After we uploaded the 7 invoices, we observed that key information from 5 of them was read perfectly. These invoices could be processed electronically without issue. Of the remaining two invoices, one was read with 99.79% accuracy and the other with 99.84% accuracy.

The Bad News

Unfortunately, there was one error on each of the other two invoices. The OCR imaging technology on which AI supplier invoice processing relies returned an invoice total of $52,389.25 on the Jerry Madden Live Wire Electrical invoice when the actual total was $2,389.25. This is a very large discrepancy, and you can imagine the series of problems it creates for anyone relying on such technology.

There were 466 characters on the invoice and only 1 was incorrect, which gave the 99.79% read accuracy. However, the misread character, a ‘5’ read in place of the ‘$’, altered the invoice total by $50,000. A big problem!

The Sam Stone Inc invoice had an even larger discrepancy. Of the 622 characters read, only one was incorrect, giving a 99.84% accuracy rate, but the OCR invoice scanning technology again misread the dollar symbol, this time as a ‘6’. A $3,643.61 invoice was returned as a $63,643.61 invoice, a $60,000 discrepancy.
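To make the arithmetic behind these two results explicit, here is a minimal Python sketch that recomputes the character accuracy and the dollar discrepancy from the figures quoted above. The function name and data layout are our own illustration and are not part of any OCR product.

```python
from decimal import Decimal


def char_accuracy(total_chars: int, wrong_chars: int) -> float:
    """Percentage of characters read correctly, rounded to two decimals."""
    return round(100 * (total_chars - wrong_chars) / total_chars, 2)


# Figures quoted in the article: (characters read, characters wrong,
# actual invoice total, total returned by OCR).
invoices = {
    "Jerry Madden Live Wire Electrical": (466, 1, Decimal("2389.25"), Decimal("52389.25")),
    "Sam Stone Inc": (622, 1, Decimal("3643.61"), Decimal("63643.61")),
}

for name, (chars, wrong, actual, returned) in invoices.items():
    accuracy = char_accuracy(chars, wrong)
    discrepancy = returned - actual
    print(f"{name}: {accuracy}% character accuracy, ${discrepancy:,} discrepancy")
```

A single wrong character barely moves the accuracy figure, yet because it lands in the leading position of the total, it changes the amount by tens of thousands of dollars.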

This invoice test tells us that even if current OCR invoice scanning solutions are AI-assisted, they are still attempting to read and decipher numbers (and letters) from ‘photos’ of invoices. It is unlikely that the underlying technology can ever work flawlessly in a real-world situation because of the varying quality of the invoices presented. Issues with fonts, overprinting, and many of the other challenges associated with processing vendor invoices can create these discrepancies.

Overall Test Resulted in a 71.43% Success Rate

This is based on invoices successfully processed, not the number of characters read. 7 invoices were submitted and 2 contained errors, so 5 of 7 were processed correctly, which equates to a 71.43% success rate. It is also impossible to predict which invoices will be misread.
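The invoice-level figure can be recomputed in a few lines. This is only an illustrative sketch: the function name and the list of pass/fail flags are our own stand-ins for the test results described above.

```python
def invoice_success_rate(results: list[bool]) -> float:
    """Percentage of invoices read with no errors, rounded to two decimals."""
    return round(100 * sum(results) / len(results), 2)


# Test results from the article: 5 invoices read perfectly, 2 with an error.
results = [True] * 5 + [False] * 2

print(f"{invoice_success_rate(results)}% of invoices processed without error")
```

Measured per invoice rather than per character, one wrong character is enough to count the whole invoice as a failure, which is why this number is so much lower than the character accuracy.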

What is the Solution?

With OnePosting, we don’t use scanning or OCR technology to read invoices. Instead, we have developed a unique ZERO ERROR data extraction technique that reads the PostScript language used to display and print PDF documents. Because we read the true data from the invoice, not an interpretation of a “photo,” the extracted data is 100% accurate.

If you are looking to get 100% accuracy when it comes to your invoice data, book a meeting with us.