What is our primary use case?
The implementation I previously worked on was related to capturing technical metadata for analytics. That said, that particular implementation missed the boat on business use and never got proper buy-in from the users.
What is most valuable?
We learned some lessons from a past implementation, and with the new implementation, we're going beyond the data catalog and looking at OwlDQ, data quality, and data lineage. We're really selecting some of the best modules within Collibra's toolset.
Right now, we're just implementing it. We're still in the purchasing process; however, having prior experience with it, I would say a great feature is its usability and the way it engages users across the enterprise. The crowdsourcing and just making it very accessible is foundational. At the end of the day, it always comes down to terms. For example, we might be using the exact same term yet mean completely different things. It brings the nuances to light, especially within a larger, global enterprise organization. In organizations of this size, we've realized just how differently we view terminology. It sheds a good light on this and helps clarify.
The data lineage piece is very useful for us. The ability to understand data flows, and where systems and changes originate, is great. A lot of the time, you might have something on paper that isn't necessarily working in real life. This product brings the right visibility to have the right conversations between business and IT.
There is a good community setup around the solution that can provide insights.
What needs improvement?
This isn't necessarily tool-specific; with any sort of application, there's an investment in learning the way you need to use it. There is a lot of upfront work that has to be considered. That's just a common reality with any software implementation. There's a lot of pre-work. You don't just turn on the lights and assume it's going to work exactly as you envisioned. There is input and planning required.
If anything, I would say that the licensing is one area that could be improved. We have basically three roles: an admin, an editor, and a view-only role. It is limiting. For example, we may want users to be view-only, but if we want them to be able to approve workflows, they need editor rights. That makes sense, except it doesn't necessarily meet all the business cases we have. In some instances, you might just need proper approvals, and you are not asking anyone to edit things. Yet in order for them to approve, they must have edit rights.
The last implementation was focused on IT and on capturing the IT view of data. Even the data definitions were really focused on data standards, such as how we're going to name the technical fields or the entities. This new deployment is focused not just on the IT side but also on the business and operational sides. It's based more around analytics and operational governance. I'm hoping to use more of the modules and come away with a more favorable opinion of the solution's capabilities. While overall I have the sense it's good, the last company I was with didn't have the right business partners, and it really just became another IT tool, which wasn't helpful to the company as a whole.
The initial setup requires more of a trial-and-error approach, and there isn't much documentation available to help you figure things out. There needs to be more online support around the sharing of best practices. There are a lot of use cases and people like the tool. That said, you hear a lot of pain points around large amounts of data being ingested, which creates backlogs of data that need to be cataloged, and there's really no way to prioritize them.
Ultimately, it's a tool that should help to coordinate a lot of efforts. It would be nice to be able to see how someone else's experience is similar to yours and pick up a lesson learned before you have to learn it yourself.
This is more of a data governance tool, not necessarily a centralized tool for data cleansing. However, with the data quality module, that's the next evolution that's possible. Looking at data quality issues without ultimately being able to correct them is a lost opportunity. Data changes all the time. We're measuring it all the time. It would be advantageous to build this into more of a data quality tool in which users could cleanse data and push the corrections back to source systems. That said, that's encroaching on MDM solution territory.
For how long have I used the solution?
I've been using the solution for about two years or so at this point.
What do I think about the stability of the solution?
It's largely a good, stable solution. This is not an MDM solution. From a governance standpoint, there are some things that Collibra does better than some of its competitors. However, there's always a challenge in having multiple tools and getting users to accept them. It would be great if they could partner with an MDM solution provider to give more of a seamless look and feel.
In the last implementation, I do not recall dealing with bugs or glitches. In this new implementation, it's still too early to tell.
What do I think about the scalability of the solution?
The scalability potential all depends on the framework that's specific to the company. It'd be good to have some general best practices from Collibra. That said, it is scalable. First and foremost, however, you need to implement it and really look at how the tool functions out of the box before you put your own strategy on it.
Many times, though, as projects go, you're really not afforded that freedom. You might have a specific use case that you're trying to get implemented so you'll get a quick win from a governance standpoint and can continue to incrementally add value. It's a balance, because governance is an investment for sure, even as we're trying to provide a solution. While there's certainly scalability potential, there are structures that need to be in place from a foundational standpoint for it to scale as you need it to.
In the last implementation, there were about 20 users on the product. In that case, it was not that extensively used. During a data warehouse migration from Cloudera to Azure, things were collected; however, what was missing were the business definitions and the scenario-based understanding. Because of how it was implemented last time, it offered a very flat view of the data. You didn't understand how everything was related or how things were scenario-based, et cetera. You couldn't get a sense of how fields were ultimately connected, and the KPIs that were ultimately built didn't help with understanding. The intention was that it was going to be an enterprise data catalog, and it missed that chance.
How are customer service and technical support?
I haven't been in touch with technical support. I can't speak to how helpful or responsive they are.
How was the initial setup?
The implementation is not that easy. All the sell sheets and everything make it seem as though it's more structured: here you have this catalog. In reality, however, you have to define the structure, including the data that you're going to be collecting, how you're going to define it, what the workflows are, what the user groups are, et cetera. There's a whole change management initiative even beyond just turning it on.
With any application, even if it's cloud-based but especially if it's on-prem, there is a level of pre-work that needs to be done. It's not just a turn-it-on type of event. Overall, that's sometimes lost in the process.
Getting it installed and all that is pretty straightforward if you can get a system integrator, or maybe if you have the in-house knowledge. However, it's really the strategy behind it that makes for an easy or difficult rollout.
The community is pretty good. However, I haven't necessarily found anything like user groups that can help guide implementation. A lot of it is that you make a mistake and you have to go back and try to remedy it. There is a lot of trial and error involved.
What about the implementation team?
We handled the entire implementation ourselves, in-house.
What other advice do I have?
While I do not have a sense of the version number, I would say that we are not on the latest version of the solution at this time.
I would advise new users looking at getting it implemented to really use the out-of-the-box features before overlaying their specific strategy on it. Upfront investigation and creating a repeatable framework for how this will ultimately operate are important to the success of the solution. One of the crucial early factors is to get this to be part of the operating fabric within your company. There's a lot of pre-work and pre-thought that needs to be in place in that sense. Having well-engaged business folks as part of it will help with the level of success as well. This is not necessarily a big bang type of development and release. It's very incremental. You've got to think backward from the user experience, meaning how things are going to be searched and located, and bring that back to your IT process.
I'd rate the solution at a seven out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: My company does not have a business relationship with this vendor other than being a customer.