Grooper Room for Improvement
President and COO at a computer software company with 51-200 employees
Currently, we're still using version 2.72, and they're about to release the beta of version 2021. We expect that some of our issues will be fixed in this coming version.
We've had challenges in classification tasks where similar documents were flagged as multiple matches. The system would identify them and say, "Hey, I think I've got multiple matches. It could either be this one or that one." Because of that, it required us to instruct the system to either leave it unclassified, or we had to halt the process for somebody to look at it.
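To illustrate the kind of handling described above, here is a minimal sketch, not Grooper's actual API. It assumes a classifier that returns a similarity score per document type. The function name `classify` and the `min_score` and `min_margin` thresholds are hypothetical. When the top two candidates are too close to call, it returns `None` so the document can be left unclassified or routed to a person.

```python
from typing import Dict, Optional


def classify(
    scores: Dict[str, float],
    min_score: float = 0.5,   # hypothetical: minimum score to accept any match
    min_margin: float = 0.1,  # hypothetical: required lead over the runner-up
) -> Optional[str]:
    """Return the best-scoring document type, or None when the result is ambiguous."""
    if not scores:
        return None

    # Rank candidate document types from highest to lowest score.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_label, best_score = ranked[0]

    # Reject weak matches outright.
    if best_score < min_score:
        return None

    # "Multiple matches": the runner-up is too close to the winner.
    if len(ranked) > 1 and best_score - ranked[1][1] < min_margin:
        return None

    return best_label


if __name__ == "__main__":
    print(classify({"invoice": 0.90, "receipt": 0.40}))
    print(classify({"invoice": 0.90, "credit_memo": 0.88}))
```

A `None` result corresponds to the two options mentioned in the review: leave the document unclassified, or stop the batch for manual review.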
With the new version for 2021, they have changed the paradigm. As it is now, we're using something called a form type, where pages within the document are referenced using a specific page number. For example, in a ten-page document, you might refer to information specifically on the first or fifth page. In the new paradigm, there is a first, middle, and last page concept, as opposed to having the different form types with all of the different pages. What they're telling me is that it's going to make the classification more accurate. Just because the first page of two different documents looks the same, they will not be considered duplicates. Having multiple points of reference will now allow it to better distinguish them.
The other area where we have had challenges is table extraction, specifically when the data headers were not defined or the tables did not have descriptions for the columns. My understanding is that they have now shown the 2021 version handling that. Again, we don't have it and haven't been able to test it, but it's coming.
Technical support is definitely an area where they need to improve, particularly the front-line individuals.
Senior Consultant at a tech services company with 11-50 employees
Grooper is new. It's new beta stuff, so we've had some issues, but that's understandable. Getting the beta product to more of a true release is where it needs improvement. I'm going through training now, so it's hard to judge what they have and don't have until I get through that training. Training is the main thing for me because I'm trying to learn and take things I've learned from other products and try to transfer that knowledge to this one.
But from what I've seen so far, it does very well. In the beginning, it was very frustrating because I didn't know much about it, but now, as I'm getting more into it, it's not as bad as I thought it was going to be. I'm starting to use it better. I'm able to configure things really quickly. I see this as a really good product for us in the long term.
There should be more detail on how things were done, but because of how some of the things are being extracted, it's hard to judge that because it may be already out there. It's not a fair statement to say it's lacking in that area.
The stability of the environment needs improvement because it's new and they had some hiccups, but we got through it.
When editing the extractors, the name should be shown.
They should have more sub-extractors or exclusion extractors so that the user does not have to make a parent data type.
I would like to see them clarify the remark on the positive extractors and negative extractors, which reads: "If the extractor provided here successfully extracts one or more values from the document, the document will be classified as this Document Type with no further processing."
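As a reading aid, the positive-extractor rule quoted above can be sketched as follows. This is only an illustration of the rule as worded, not Grooper's implementation. The function name, the use of regular expressions as "extractors", and the sample pattern are all assumptions. Negative extractors are left out because the remark does not describe how they behave.

```python
import re
from typing import Dict, Optional, Pattern


def classify_by_positive_extractor(
    text: str,
    positive_extractors: Dict[str, Pattern[str]],  # hypothetical: doc type -> regex
) -> Optional[str]:
    """Return the first document type whose positive extractor finds a value."""
    for doc_type, pattern in positive_extractors.items():
        # One or more extracted values is enough: classify and stop processing.
        if pattern.search(text):
            return doc_type
    # No positive extractor matched; other classification logic would run here.
    return None


if __name__ == "__main__":
    rules = {"Invoice": re.compile(r"Invoice\s+No\.?\s*\d+")}
    print(classify_by_positive_extractor("Invoice No. 1042", rules))
```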
There are a few bugs that my superior has posted on the Grooper exchange concerning errors with the OCR. Even when we ran recognition again at the page level and the page results turned out to be correct, the document level was still wrong.
There are different degrees of OCR issues. A smaller issue might be that some characters were missed, whereas a bigger problem might be that the values are missed. In those cases, the fuzzy mode translation function has to be used.
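For readers unfamiliar with fuzzy matching, here is a minimal sketch of the general idea using Python's standard `difflib` module. It is not Grooper's fuzzy mode. The function name `fuzzy_lookup`, the vocabulary, and the `cutoff` value are assumptions chosen for illustration. An OCR result with a few wrong characters is mapped to the closest known value, if one is similar enough.

```python
import difflib
from typing import List, Optional


def fuzzy_lookup(ocr_text: str, vocabulary: List[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the vocabulary entry most similar to ocr_text, or None if nothing is close."""
    # get_close_matches ranks candidates by SequenceMatcher similarity ratio
    # and drops anything scoring below the cutoff.
    matches = difflib.get_close_matches(ocr_text, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    known_values = ["invoice", "receipt", "statement"]
    print(fuzzy_lookup("lnvoice", known_values))  # OCR read "i" as "l"
    print(fuzzy_lookup("xyz", known_values))
```

This helps when a few characters are misread. It cannot recover a value that the OCR missed entirely.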
The biggest issue might be when characters are hard to read, even by humans. In this case, so far, there is not much that can be done.
One file that we have contains a table with the same four headers repeated many times. In this case, there is nothing Grooper can do, at least as far as we can tell based on the knowledge and skills we have learned.