Entia Software | Digitalize and Automate | Data Capture Singapore

For less-structured document types, the subfield of intelligent data capture (IDC) was developed. It takes a content-based, rather than layout-based, approach to documents. Most modern capture solutions that use IDC depend on a pre-production learning phase, during which human operators provide example documents. The software then scans and analyzes the words on every page to build a statistical model of word relationships and probabilities. For example, an operator may provide one mortgage document and one land usage document; the system builds a model that notes the presence of terms like borrower, SSN, interest, and principal in the former, while prioritizing words such as title, bounds, survey, and easement in the latter. This example is deliberately simplistic; the matrices that today’s systems generate are far larger and more nuanced.
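The learning phase described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a plain bag-of-words model with relative word frequencies; the text does not say which statistical model a given IDC product uses, and the function names (`tokenize`, `build_model`) and the two sample documents are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_model(examples):
    """Map each document type to the relative frequency of each word
    seen in its example documents.

    `examples` is a dict of {doc_type: [example_text, ...]}.
    """
    model = {}
    for doc_type, texts in examples.items():
        counts = Counter()
        for text in texts:
            counts.update(tokenize(text))
        total = sum(counts.values())
        model[doc_type] = {word: n / total for word, n in counts.items()}
    return model


# Hypothetical operator-provided examples, invented for illustration only.
examples = {
    "mortgage": [
        "The borrower shall repay principal and interest. Borrower SSN on file."
    ],
    "land_usage": [
        "The survey records title, bounds, and one easement. Survey bounds attached."
    ],
}

model = build_model(examples)

# Words that characterize each type get a higher probability in that type.
print(sorted(model["mortgage"], key=model["mortgage"].get, reverse=True)[:3])
print(sorted(model["land_usage"], key=model["land_usage"].get, reverse=True)[:3])
```

A production system would learn from many examples per type and would typically model word relationships as well as single-word frequencies; the sketch only shows how example documents turn into per-type probabilities.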

Having created predictive models for these different types of documents, a modern capture system can then reliably recognize other instances of the same document, e.g. two title surveys from the same company. Much more usefully, it can also recognize and classify novel documents of the same type, such as a title survey from a different surveyor, which might have an entirely different layout and a handful of different terms. How is this possible? Because IDC relies on probabilities rather than absolute relationships, it is flexible enough to tolerate moderate differences in the data. That novel set of title surveys might use somewhat different wording, but will likely retain more than 90% of the same overall vocabulary because it is still a survey. This is the paradigm at the heart of IDC: document recognition in today’s solutions is no longer a rote, mechanical process, but one that is content-based and adaptable.
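To make the role of probabilities concrete, here is a small sketch of classifying a previously unseen document. It assumes a multinomial naive Bayes classifier with add-one smoothing and equal priors, which is one common way to score text by word probabilities and not necessarily what any particular IDC product does. The names `train` and `classify` and all sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per document type and collect the shared vocabulary."""
    counts = {label: Counter() for label in examples}
    for label, texts in examples.items():
        for text in texts:
            counts[label].update(tokenize(text))
    vocab = set().union(*counts.values())
    return counts, vocab


def classify(text, counts, vocab):
    """Return the most probable label and the log-score of every label."""
    scores = {}
    for label, word_counts in counts.items():
        total = sum(word_counts.values())
        # Add-one smoothing; the extra +1 reserves mass for unseen words,
        # so a few unfamiliar terms lower the score without zeroing it.
        denom = total + len(vocab) + 1
        scores[label] = sum(
            math.log((word_counts[word] + 1) / denom) for word in tokenize(text)
        )
    return max(scores, key=scores.get), scores


# Hypothetical training examples, invented for illustration only.
examples = {
    "mortgage": [
        "The borrower agrees to repay the principal with interest. "
        "Borrower SSN is required."
    ],
    "title_survey": [
        "This survey describes the title, the bounds of the parcel, "
        "and a utility easement."
    ],
}
counts, vocab = train(examples)

# A survey from a "different surveyor": new words (boundary, showing,
# drainage) and a different word order, but mostly the same vocabulary.
novel = "Boundary survey of the parcel showing title bounds and a drainage easement"
label, scores = classify(novel, counts, vocab)
print(label)
```

Because each word only contributes a factor to the overall score, the handful of unfamiliar terms in the novel document is outweighed by the many terms it shares with the survey examples.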

IDC Technology

IDC

“Capture, Classify, Extract, Deliver”: Intelligent Data Capture (IDC) technology automates document processing and data entry for you.
