I provide product management and subject-matter-expert (SME) services to oil companies as a consultant. My company has partnered with SparkCognition to bundle its products into a package of services that I provide to my customers. For the most part, when I'm working with SparkCognition, and Darwin in particular, I'm working with it on behalf of one of my customers.
We do different engagements. We've done PoC projects with customers using versions 1.4 and onward.
The biggest use case we've seen is automatic classification of data streaming in from oil and gas operations, whether exploration or production. We see customers using it to quickly and intelligently classify the data. Traditionally, that would be done either through very complicated branching code, which is difficult to troubleshoot, or manually, by SMEs or people in the office who know how to interpret the data and classify it for analytics.
The customers have looked at using machine learning for that, but they run into challenges — and this is really what Darwin is all about. Typically there is an SME who can look at the data and properly classify it or identify problems, but taking what he knows and what he does instinctively and communicating it to a data scientist who could build a model for that is a very difficult process. Additionally, data scientists are in very high demand, so they're expensive.
SMEs can look at data and quickly make interpretations. They've probably been looking at the data for 10 or 15 years. So it's not a matter of just, "Oh, we can plunk this SME beside a data scientist and in a couple of months they can turn out a model that does this." First, SMEs don't have time to be pulled out of their normal workload to educate the data scientists. And second, even if they do that, you end up with something very rigid.
With Darwin, customers can empower the SMEs to build the models themselves without having to go through the process of educating the data scientists, who may leave next week for a better paying job.
Most of the projects that we've done, which are PoCs, are typically done in the cloud, for ease of use. Because we work in the oil and gas space, public cloud is the preferred option in the U.S., with its simplified administration and slightly lower cost. Overseas, the customers we've talked to have noted there are laws and restrictions that require their stuff to be on-premises. We've talked to potential customers about it, but we haven't actually done an on-premises project so far.