What is our primary use case?
We use Document Understanding to process invoices, purchase orders, and addresses. It extracts data from a scanned document and converts it into structured data in a spreadsheet. Predominantly, we use Document Understanding for payroll, procurement, and invoice processing in the finance department. Document Understanding also has multiple models for extracting data from receipts. Departments have different use cases, but it's mostly used on the finance side to extract invoice data.
The volume of documents varies from customer to customer. When customers start using the product, they typically process between 10,000 and 20,000 pages in the first year. Once they've achieved a stable environment, they might reach around 500,000 pages in the second or third year. It depends on the project and the customer's budget because pricing is based on the number of pages.
We are not talking about 100 percent data automation end to end. If our customers work with hundreds of vendors, they deal with various templates. If a new vendor comes in, there is a possibility that the model may not identify that particular document. It's also possible that the upload quality isn't that great because of a bad scan, so there is always a channel for manual processing to handle exceptions.
When we implement Document Understanding, we may start with 40 percent automated and 60 percent manual processing. As the implementation matures, the percentage gradually improves. We may eventually achieve 80 percent or more fully automated processing, with the remaining exceptions handled through human intervention.
How has it helped my organization?
Traditionally, the operations team has done many of these activities manually. A human takes information from the document and enters it into the system. There are many challenges inherent in performing these tasks manually. One is human error. Also, a department might receive documents in the middle of the night, when no one is around to process them. Document Understanding enables round-the-clock support and automatic processing.
The implementation is fast compared to other solutions. Document Understanding is more flexible because it has the artificial intelligence to understand new formats when they come in. It may read the information automatically.
The amount of human validation depends on the type of input document. For example, we had to extract data from passports. The solution can properly scan the documents. There are 192 countries with different passports, and the bots are already trained on all the different types.
However, if the solution encounters a new format for receipts, invoices, etc., it may not identify it properly. During COVID, we had to process PCR tests from different diagnostic centers with different formats, so we created a model to figure out whether the person had negative results, but if a different format came in from a new diagnostics center, we might not have enough data to train the model.
It will scan correctly without human intervention if it's a well-established document type, but if the model hasn't had enough training, a human needs to come into the picture. Also, if the input is poorly scanned and the system cannot read it, human-in-the-loop validation comes in.
The time needed for a human to validate a document depends on the number of fields and whether the file is a PDF form, invoice, etc. If you only need to validate the invoice number, you can complete that in one or two seconds, but it will take more time to validate all the line items in every field.
Document Understanding has reduced our processing time by around 70 percent. In some cases, it may be 90 percent. It obviously takes more time for an employee to process a document with three or four pages and pull the data from various places. Using a solution with an OCR component like Document Understanding is much faster. It frees up employee time because we're not using resources to punch in data manually. We can use those employees to do other things that require more human intelligence.
The solution has reduced human error. Previously, somebody opened each document manually and typed whatever they saw on the screen. Now the data extraction happens systematically. If things look fine and the confidence score is high, it inserts the data into the system. If the confidence score is low, it shows the screen to the user and asks them to correct it. Instead of merely typing the information, the user verifies what the solution has done. It's easily a 30 to 40 percent error reduction, and operational efficiency has increased drastically.
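The confidence-based routing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not UiPath's implementation: the `ExtractedField` class, the `route_document` function, and the 0.85 threshold are all hypothetical, and the review does not state what threshold is used in practice.

```python
from dataclasses import dataclass

# Hypothetical threshold; the actual value is a project-level decision.
CONFIDENCE_THRESHOLD = 0.85


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the extraction model


def route_document(fields, threshold=CONFIDENCE_THRESHOLD):
    """Return ("auto", []) if every field meets the threshold,
    otherwise ("review", [names of the low-confidence fields])."""
    low = [f.name for f in fields if f.confidence < threshold]
    if low:
        return "review", low
    return "auto", []


invoice = [
    ExtractedField("invoice_number", "INV-1001", 0.98),
    ExtractedField("total", "1,250.00", 0.62),
]
decision, to_check = route_document(invoice)
print(decision, to_check)  # review ['total']
```

In this sketch, only the fields that fall below the threshold are sent to the user, which matches the point that the user verifies the extraction rather than retyping the whole document.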
What is most valuable?
Invoice processing is the most valuable feature. Most of my customers use Document Understanding for invoice processing. That's one of the most common use cases. Typically, each customer starts their RPA journey with the finance department because that's the area where you can see the most benefit.
It can extract checkboxes, signatures, and printed text. The extraction and conversion of printed letters is perfect. Document Understanding can also process handwriting and signatures using a machine learning model on the backend. UiPath's product team is continuously training this model. Every two weeks, they train it with a new set of data, so the model is constantly becoming more mature. I've seen a tremendous improvement since 2021.
The solution's machine learning model gives it the flexibility to accommodate documents with varying structures. Before Document Understanding came along, data extraction was done using template-based extraction tools. UiPath created a machine learning model that can be retrained for any number of templates. With machine learning, you don't have to explicitly identify fields like "Bill To" and "Ship To." Without it, you have to tell the tool the location where you want it to find the data.
UiPath has already trained its machine learning model on these types of invoices. You're building a better solution that requires less effort to implement because the product does a lot of that work for you. The deployment time is faster. It's more intelligent than conventional coding, which is just listing a set of rules. Everybody needs flexibility. It's not enough to have a solution that handles documents in one particular format. Whatever you do, it should have the intelligence to understand data in a semi-structured format, even when it arrives in a different layout from the documents that came before.
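To show why rule-based extraction is brittle, here is a minimal sketch of a template-style rule. It is an assumption for illustration only: the `BILL_TO_RULE` pattern, the `extract_bill_to` function, and the sample invoice text are hypothetical, and a text label stands in for the fixed page location that a real template tool would use.

```python
import re

# Hypothetical rule: the value is expected right after the literal label "Bill To:".
BILL_TO_RULE = re.compile(r"Bill To:\s*(.+)")


def extract_bill_to(text):
    """Return the text following "Bill To:" on the same line, or None."""
    match = BILL_TO_RULE.search(text)
    return match.group(1).strip() if match else None


known_vendor = "Invoice 1001\nBill To: Acme Corp\nTotal: 500.00"
new_vendor = "Invoice 2002\nInvoiced Party - Acme Corp\nTotal: 500.00"

print(extract_bill_to(known_vendor))  # Acme Corp
print(extract_bill_to(new_vendor))    # None
```

The second invoice carries the same information under a different label, so the rule finds nothing and someone has to write a new rule for that vendor. A trained model is meant to avoid that per-template maintenance.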
What needs improvement?
Document Understanding's handwriting comprehension is improving, but it's still not as good as its handling of printed documents. Machine learning models, in general, are maturing, but they're not yet at a point where I would give them five stars. I may give it a two or three. It is still not advanced enough to identify whatever handwritten content you give it. It can process handwriting, but you need a human to validate it. With more training, it will become more automated. It will be better by 2025, but it is still not mature enough.
Similarly, there is still room for improvement in reading printed documents. Ideally, if you have a model, Document Understanding should be able to extract every field from the document. That's what customers expect.
For how long have I used the solution?
We have used Document Understanding for about six months.
What do I think about the stability of the solution?
I rate Document Understanding seven out of ten for stability. It has some room for improvement.
What do I think about the scalability of the solution?
I rate Document Understanding seven out of ten for scalability.
How are customer service and support?
I rate UiPath support four out of ten. Their support has degraded badly. Presently, they are mainly focused on closing tickets. They have trouble communicating with our business users and end up closing the ticket because they don't understand what the issue is. It's a problem because the customer will lose interest in the product if they are not getting technical support.
How was the initial setup?
UiPath can be deployed on the cloud or on-prem. The infrastructure costs of hosting it on-prem are high. We have done many cloud deployments, but I would say it's not that easy. Normally, we subscribe to UiPath's SaaS cloud offering and configure it for the customer. I believe Document Understanding is hosted in Azure, but the customer can opt for AWS, Google, etc. There are no restrictions if customers want to put it on their private cloud.
An on-prem installation takes about two or three weeks, depending on the complexity of the environment. Cloud installation is plug-and-play, so you can get it up and running in a day. The customer needs to issue the purchase order, and then we get the licenses. Once the customer has the license, they can log into the UiPath Cloud portal, and it will be activated. Within five days, they can start using Document Understanding. After that, you need to build the automations for your use case. The development time frame depends on the use case. The solution requires maintenance because you must train the model continuously as new templates come in.
What was our ROI?
The price is high, so it will take you about a year and a half or two years before you break even.
What's my experience with pricing, setup cost, and licensing?
Document Understanding's pricing is reasonable for developed markets because manual entry cannot match the cost of automatically processing a page. However, labor is much cheaper in developing markets like India. It's not easy to sell Document Understanding in markets where you can hire workers to do this type of activity cheaply.
What other advice do I have?
I rate UiPath Document Understanding seven out of ten. It's an add-on for UiPath, so it isn't a standalone solution. If you already have a license for another third-party solution for RPA, you should consider whether it's beneficial to switch.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: partner