In 2019, KBC Bank Bulgaria faced a significant challenge: digitising many personal documents kept on paper. It was a time-consuming process that required a lot of manpower. The bank turned to DSS for help developing an ID Card Cropping & OCR solution capable of processing these documents quickly and accurately.
The solution accepted a single input file containing three objects: the front of an ID card, the back of the ID card, and a text block with a signature. It located the three objects, cropped each one, saved them as three separate files, and ran full OCR on the ID card images to extract personal data. The solution could also evaluate image quality in terms of contrast and readability and check the extracted data for validity.
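The contrast part of that quality evaluation can be illustrated with a short sketch. The source does not say how DSS measured contrast, so this is only one plausible approach under stated assumptions: RMS contrast (the standard deviation of 8-bit grey values) mapped onto a 0 to 100 scale. The class and method names are hypothetical.

```java
// Hypothetical helper, not the DSS implementation: RMS contrast on a 0-100 scale.
final class ContrastScore {

    /** Converts packed ARGB pixels to 8-bit grey values using the common Rec. 601 luma weights. */
    static int[] toGray(int[] argb) {
        int[] gray = new int[argb.length];
        for (int i = 0; i < argb.length; i++) {
            int r = (argb[i] >> 16) & 0xFF;
            int g = (argb[i] >> 8) & 0xFF;
            int b = argb[i] & 0xFF;
            gray[i] = (299 * r + 587 * g + 114 * b) / 1000;
        }
        return gray;
    }

    /**
     * Returns 0 for a completely flat image and 100 for the maximum possible
     * spread (half the pixels at 0, half at 255, standard deviation 127.5).
     */
    static int score(int[] gray) {
        if (gray.length == 0) {
            throw new IllegalArgumentException("empty image");
        }
        double sum = 0;
        double sumSq = 0;
        for (int g : gray) {
            sum += g;
            sumSq += (double) g * g;
        }
        double mean = sum / gray.length;
        // Clamp at zero to guard against tiny negative values from rounding.
        double variance = Math.max(0, sumSq / gray.length - mean * mean);
        double stdDev = Math.sqrt(variance);
        return (int) Math.round(Math.min(100.0, stdDev / 127.5 * 100.0));
    }
}
```

A uniformly grey scan scores 0 and a pure black-and-white checkerboard scores 100. A real readability check would also need to consider blur, skew, and resolution, which this sketch leaves out.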
DSS delivered the ID Card Cropping & OCR solution as a library written in Java that used a C++ digital library together with the OpenCV and Tesseract libraries. The solution was designed to work either in offline (batch) mode or in real time. In addition, customer servicing officers could easily upload and process specific customer-scanned documents.
Electronic processing of the documents covered the most frequently used raster image formats, including JPEG/JFIF, TIFF, GIF, BMP, and PNG, as well as PDF. All variations of colour, resolution, and quality were considered eligible for processing. The solution could recognise and analyse various personal document types, including Bulgarian and EU standardised ID cards (front and back) and international travel passports.
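A pipeline that accepts this many formats usually has to identify the format before decoding. The sketch below does that by inspecting the well-known leading "magic" bytes of each file type. It is an illustration only: the source does not describe how DSS detected formats, and `FormatSniffer` is a hypothetical name.

```java
// Hypothetical helper: identify a file format from its first bytes.
final class FormatSniffer {

    enum Format { JPEG, PNG, GIF, BMP, TIFF, PDF, UNKNOWN }

    static Format detect(byte[] head) {
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) return Format.JPEG;
        if (startsWith(head, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)) return Format.PNG;
        if (startsWith(head, 'G', 'I', 'F', '8', '7', 'a')
                || startsWith(head, 'G', 'I', 'F', '8', '9', 'a')) return Format.GIF;
        // TIFF: little-endian "II*\0" or big-endian "MM\0*".
        if (startsWith(head, 'I', 'I', 0x2A, 0x00)
                || startsWith(head, 'M', 'M', 0x00, 0x2A)) return Format.TIFF;
        if (startsWith(head, '%', 'P', 'D', 'F', '-')) return Format.PDF;
        // "BM" is a short signature, so it is checked last.
        if (startsWith(head, 'B', 'M')) return Format.BMP;
        return Format.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare as unsigned byte values.
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

Checking content rather than the file extension avoids mis-decoding a file that was renamed or uploaded with the wrong suffix. A PDF would still need to be rasterised before cropping and OCR, which is outside this sketch.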
The solution provided several electronic processing functions: using OCR to read scanned ID cards and passports, checking document validity (expiry date), and checking for changes or inaccuracies. It could also identify the document type, extract all data needed for further processing, and read specialised machine-readable symbols within the document (if available).
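Two of these checks can be sketched in a few lines. The example assumes that the "machine-readable symbols" are an ICAO Doc 9303 machine-readable zone (MRZ), which the source does not state explicitly, and uses the standard MRZ check-digit rule (weights 7, 3, 1, sum modulo 10). It also assumes that two-digit expiry years fall in the 2000s and that a document is still valid on its expiry date. All names are hypothetical.

```java
import java.time.LocalDate;

// Hypothetical helper: MRZ check digit and expiry validation.
final class DocumentChecks {

    private static final int[] WEIGHTS = {7, 3, 1};

    /** ICAO 9303 check digit: digits keep their value, A-Z map to 10-35, '<' counts as 0. */
    static int mrzCheckDigit(String field) {
        int sum = 0;
        for (int i = 0; i < field.length(); i++) {
            sum += charValue(field.charAt(i)) * WEIGHTS[i % 3];
        }
        return sum % 10;
    }

    private static int charValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
        if (c == '<') return 0;
        throw new IllegalArgumentException("not an MRZ character: " + c);
    }

    /** Parses an MRZ date in YYMMDD form. Assumption: the century is always 20xx. */
    static LocalDate parseExpiry(String yymmdd) {
        if (!yymmdd.matches("\\d{6}")) {
            throw new IllegalArgumentException("expected YYMMDD: " + yymmdd);
        }
        int year = 2000 + Integer.parseInt(yymmdd.substring(0, 2));
        int month = Integer.parseInt(yymmdd.substring(2, 4));
        int day = Integer.parseInt(yymmdd.substring(4, 6));
        return LocalDate.of(year, month, day);
    }

    /** Assumption: the document counts as valid up to and including the expiry date. */
    static boolean isValidOn(LocalDate expiry, LocalDate today) {
        return !today.isAfter(expiry);
    }
}
```

For the ICAO specimen document number `L898902C3`, `mrzCheckDigit` returns 6, which matches the check digit printed in the specimen MRZ. Comparing a computed digit with the one read by OCR is a cheap way to flag misread characters.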
Given a raster image as input, the solution recognised the document type and returned it in the response. It also recognised the front and back sides of ID cards within the input image and produced one or two cropped images of the processed documents, each accompanied by an indication of whether it showed the front or the back.
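The shape of such a response might look like the sketch below. It is not the actual DSS API: every type and field name is hypothetical, and the only facts taken from the description are that the response carries a document type and one or two cropped images, each labelled as front or back.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical response type; names are illustrative, not the DSS library's API.
final class CroppingResult {

    enum DocumentType { ID_CARD, PASSPORT, UNKNOWN }

    enum Side { FRONT, BACK }

    /** One cropped image together with the side of the document it shows. */
    static final class Crop {
        final Side side;
        final byte[] imageBytes;

        Crop(Side side, byte[] imageBytes) {
            this.side = side;
            this.imageBytes = imageBytes.clone(); // defensive copy
        }
    }

    final DocumentType type;
    final List<Crop> crops;

    CroppingResult(DocumentType type, List<Crop> crops) {
        // The description mentions one or two cropped images per input.
        if (crops.isEmpty() || crops.size() > 2) {
            throw new IllegalArgumentException("expected one or two crops, got " + crops.size());
        }
        this.type = type;
        this.crops = Collections.unmodifiableList(new ArrayList<>(crops));
    }
}
```

Keeping the side label next to each image means a caller never has to infer front or back from the order of the crops.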
The implemented solution also included a machine learning component trained to improve OCR accuracy by detecting and correcting errors. It was designed to learn from the data it processed, so that its accuracy could improve over time.
Lastly, the solution rated image quality in several categories and provided a score from 0 to 100 indicating how feasible OCR was. The solution architecture was designed to be scalable and to support 24x7x365 availability with 99.7% uptime. The designed average throughput was at least 10 ops/sec, with a peak throughput of at least 20 ops/sec.
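To put those targets in perspective, the arithmetic can be worked through in a short sketch. It assumes a 365-day year and, for the daily volume, sustained load at the stated rate. The class name is hypothetical.

```java
// Hypothetical helper: back-of-the-envelope figures for the stated targets.
final class CapacityMath {

    /** Hours per 365-day year that may be lost while still meeting the uptime target. */
    static double allowedDowntimeHoursPerYear(double uptimePercent) {
        return (100.0 - uptimePercent) / 100.0 * 365 * 24;
    }

    /** Operations completed in 24 hours at a constant rate. */
    static long operationsPerDay(long opsPerSecond) {
        return opsPerSecond * 60 * 60 * 24;
    }

    public static void main(String[] args) {
        System.out.printf("Downtime budget: %.2f hours/year%n", allowedDowntimeHoursPerYear(99.7));
        System.out.println("At 10 ops/sec: " + operationsPerDay(10) + " ops/day");
        System.out.println("At 20 ops/sec: " + operationsPerDay(20) + " ops/day");
    }
}
```

A 99.7% uptime target leaves a budget of roughly 26 hours of downtime per year, and a sustained 10 ops/sec corresponds to 864,000 operations per day.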
Overall, the ID Card Cropping & OCR solution developed by DSS successfully provided KBC Bank Bulgaria with a reliable, accurate, and efficient way to digitise personal documents, saving hours of manual human work.