What is our primary use case?
My main use case for Monte Carlo started with data quality problems, but it has since become an essential part of our whole data pipeline. It reports on data quality metrics including completeness, correctness, accuracy, and, most importantly, freshness.
A specific example of how Monte Carlo helped me with a data quality issue comes from our architecture. We ingest all the data and store every historical snapshot in a raw layer, deduplicate it in a second layer, and curate it in a third layer. We have alerts and monitors on all layers, so whenever data moves from raw to dedupe or from dedupe to curated, we check whether the number of IDs or the volume of data is as expected. In the past, alerts triggered on the orders table helped us investigate pipeline-related failures.
I often rely on Monte Carlo for our tables that are refreshed every three hours. It raises an alert when a table is not updated in line with its usual trend, for example if the table has not been updated for four or five hours.
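The layer-to-layer ID check described above can be sketched in plain Python. This is only an illustration of the idea, not Monte Carlo's API or how its monitors are configured. The names `LayerCheckResult` and `check_transition`, and the sample IDs, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LayerCheckResult:
    """Outcome of comparing the IDs on both sides of one layer transition."""
    transition: str
    upstream_distinct_ids: int
    downstream_distinct_ids: int
    missing_ids: set      # present upstream, absent downstream
    unexpected_ids: set   # present downstream, absent upstream

    @property
    def passed(self) -> bool:
        return not self.missing_ids and not self.unexpected_ids


def check_transition(name, upstream_ids, downstream_ids):
    """Compare distinct IDs before and after a transition (hypothetical helper).

    Duplicates are collapsed on both sides, so a raw layer that holds
    several snapshots of the same ID still matches its deduplicated layer.
    """
    upstream = set(upstream_ids)
    downstream = set(downstream_ids)
    return LayerCheckResult(
        transition=name,
        upstream_distinct_ids=len(upstream),
        downstream_distinct_ids=len(downstream),
        missing_ids=upstream - downstream,
        unexpected_ids=downstream - upstream,
    )


# Made-up order IDs for illustration only.
raw_ids = [101, 101, 102, 103, 103]   # historical snapshots, duplicates expected
dedupe_ids = [101, 102, 103]
curated_ids = [101, 102]              # 103 was lost somewhere in the pipeline

for result in (
    check_transition("raw -> dedupe", raw_ids, dedupe_ids),
    check_transition("dedupe -> curated", dedupe_ids, curated_ids),
):
    if not result.passed:
        print(f"ALERT {result.transition}: missing={sorted(result.missing_ids)}")
```

In practice the ID lists would come from queries against each layer. A real setup might also allow a tolerance instead of requiring an exact match.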
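As a rough sketch of the freshness rule described here, the check below flags a table whose last update is older than its usual refresh interval plus a grace period. It uses a fixed threshold as a stand-in, whereas Monte Carlo works from the table's usual trend. The function name `freshness_breached` and the default values are assumptions for illustration.

```python
from datetime import datetime, timedelta


def freshness_breached(
    last_updated: datetime,
    now: datetime,
    expected_interval: timedelta = timedelta(hours=3),  # assumed refresh cadence
    grace: timedelta = timedelta(hours=1),              # assumed slack before alerting
) -> bool:
    """Return True when the table has gone too long without an update."""
    return now - last_updated > expected_interval + grace


last_update = datetime(2024, 1, 1, 0, 0)

# Three and a half hours since the last update: inside the grace period.
print(freshness_breached(last_update, datetime(2024, 1, 1, 3, 30)))

# Four and a half hours since the last update: should raise an alert.
print(freshness_breached(last_update, datetime(2024, 1, 1, 4, 30)))
```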
How has it helped my organization?
Monte Carlo has positively impacted my organization by removing the need for manual monitoring of pipelines. It automatically triggers alarms whenever something goes wrong, which lets the data engineering team focus on more productive work. It also helps with tasks that used to be manual: after every logic change we can track the data and its updates, and proper catalog maintenance lets us find specific information in our data warehouse.
Since we started using Monte Carlo, the freshness of our data has improved considerably, from less than eighty percent to above ninety percent. It has also saved significant time. We do not keep a precise record, but the time spent on monitoring and related activities has dropped steeply.
What is most valuable?
The best features Monte Carlo offers include an AI-based trend analysis tool. It checks the number of records in a table, or the kinds of records affected by delete, insert, or update operations, and triggers alerts if those numbers become unusual. It also provides a triage solution for investigating the specific base tables or parent tables behind an issue.
The AI trend analysis and triage solution have helped my team by alerting us during manual deletion or update activities and whenever there is a logic change in the main curated layer. If deletion rates deviate from the expected numbers, we receive an alert that we may have made a mistake in the code, which prompts us to check the logic we have implemented.
Regarding Monte Carlo's AI capabilities, it provides a mechanism to select or exclude specific parts of the data from the training cycle. This lets companies correct for incorrect or ambiguous trends in the data, which shows consideration for governance.
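To illustrate the kind of check involved, here is a minimal sketch that flags a daily delete count as unusual when it sits far from its recent history. It uses a simple z-score as a stand-in. How Monte Carlo models trends internally is not described in this review, and the name `is_unusual`, the threshold, and the sample counts are assumptions.

```python
from statistics import mean, stdev


def is_unusual(history, latest, threshold=3.0):
    """Flag `latest` when it is more than `threshold` standard deviations
    away from the mean of `history` (a simple z-score, used as a stand-in)."""
    if len(history) < 2:
        return False  # not enough data to estimate a trend
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat history: any change at all counts as unusual.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Made-up daily delete counts for a curated table.
daily_deletes = [100, 102, 98, 101, 99]

for todays_deletes in (103, 500):
    if is_unusual(daily_deletes, todays_deletes):
        print(f"ALERT: {todays_deletes} deletes is outside the usual range")
```

A spike like the second value is the sort of signal that would prompt a look at recent logic changes or manual delete jobs.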
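The idea of excluding part of the data from training can be sketched as filtering known-bad periods out of the history before a baseline is computed. This is a conceptual illustration only, not Monte Carlo's actual mechanism. The function name `training_values`, the date ranges, and the row counts are hypothetical.

```python
from datetime import date
from statistics import mean


def training_values(observations, excluded_ranges):
    """Return the values whose dates fall outside every excluded range.

    observations:    list of (date, value) pairs
    excluded_ranges: list of (start, end) date pairs, both ends inclusive
    """
    def is_excluded(day):
        return any(start <= day <= end for start, end in excluded_ranges)

    return [value for day, value in observations if not is_excluded(day)]


# Made-up daily row counts; the spike on 3 January stands for a one-off backfill.
observations = [
    (date(2024, 1, 1), 1000),
    (date(2024, 1, 2), 1010),
    (date(2024, 1, 3), 9000),
    (date(2024, 1, 4), 990),
]
backfill = [(date(2024, 1, 3), date(2024, 1, 3))]

print("baseline with backfill:   ", mean(training_values(observations, [])))
print("baseline without backfill:", mean(training_values(observations, backfill)))
```

Leaving the backfill in would inflate the baseline and could hide real anomalies later, which is why being able to exclude it matters.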
What needs improvement?
Monte Carlo's code used for migration needs improvement, as it has not been set up well. Better documentation and better features for migrating from other services, along with stronger history maintenance and version control for Monte Carlo's code, would help greatly.
With multiple tables, the UI sometimes crashes, but it is still the best I have seen so far, which makes it a great tool overall.
For how long have I used the solution?
I used Monte Carlo at my previous organization for about one year, and I have been using it at my current organization for one and a half years.
What do I think about the stability of the solution?
Monte Carlo is stable, with ongoing feature improvements. There were some breaking issues initially, but issues are fixed quickly when reported.
What do I think about the scalability of the solution?
In terms of scalability, Monte Carlo handles our organization's large number of tables and connections well, although implementing it at scale could be improved, particularly with monitors as code.
How are customer service and support?
Customer support was great, with dedicated resources from the Monte Carlo team who assisted with issues during our weekly calls, ensuring we understood specific features.
Which solution did I use previously and why did I switch?
We previously used the Great Expectations library, which did not offer anything like AI trend analysis and only provided basic data-based monitoring. The lack of these features led us to switch to Monte Carlo.
How was the initial setup?
In the beginning, I found that Monte Carlo took time to learn and understand our metrics and trends, but after six or seven months it has been giving very accurate responses.
What was our ROI?
We have not been tracking our return on investment for long, but we have already saved significantly on our team's time and in catching critical issues through this solution.
What's my experience with pricing, setup cost, and licensing?
In terms of pricing, setup cost, and licensing, I would rate it a bit high on the pricing side. It is pricey, but given the features and the flexibility it offers during implementation, it stands out against specific libraries that are less handy to use and require extensive documentation.
Which other solutions did I evaluate?
Before choosing Monte Carlo, we evaluated Evidently AI. Our existing organizational ties to Monte Carlo also influenced our decision.
What other advice do I have?
My advice for others considering Monte Carlo is to assess whether their data platform is large enough to benefit from the AI capabilities, as smaller-scale organizations that only need basic rule-based monitoring might find it a bit pricey. I would rate this solution a nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Google