What is our primary use case?
I use it in a project specifically with models from Microsoft. One of these was Florence-2, a model specialized in multiple computer vision tasks. The other was Phi-3, a generative AI model, also from Microsoft. Both were used for testing rather than in production. I also have a channel on YouTube where I do many experiments with different toolkits, and I have tested it with other non-commercial models as well. For this specific project, I used it with these two models from Microsoft. The idea was to run Florence-2 and Phi-3 on CPU for two specific use cases.
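As an illustration of that use case, running a Phi-3-class model on CPU with OpenVINO typically goes through the optimum-intel integration. This is a minimal sketch, assuming `pip install "optimum[openvino]"` and the public Hugging Face model ID; the `build_prompt` helper is my own simplified illustration, not part of either library:

```python
# Sketch: running a Phi-3-class model on CPU through OpenVINO via optimum-intel.
# Assumes `pip install "optimum[openvino]"`. Heavy imports are kept inside the
# function so the prompt helper below works even without OpenVINO installed.

model_id = "microsoft/Phi-3-mini-4k-instruct"  # public Hugging Face model ID

def build_prompt(user_message: str) -> str:
    """Simplified single-turn prompt using Phi-3's chat markers."""
    return f"<|user|>\n{user_message}<|end|>\n<|assistant|>\n"

def generate(user_message: str, max_new_tokens: int = 64) -> str:
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # export=True converts the PyTorch checkpoint to OpenVINO IR on the fly;
    # device="CPU" runs inference entirely on the CPU.
    model = OVModelForCausalLM.from_pretrained(model_id, export=True, device="CPU")
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The same pattern works for other decoder models supported by optimum-intel; only the model ID and prompt template change.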
What is most valuable?
What I most appreciate about OpenVINO is the possibility to run models on CPU. That was obviously the reason to choose this solution. Another valuable feature is the possibility to run on non-NVIDIA GPUs. As this is a product developed by Intel, they designed it around their own hardware: running inference with LLMs or deep learning models on Intel CPUs and also Intel GPUs.
Before, that was not possible. The GPU market is completely dominated by NVIDIA, and deep learning frameworks such as PyTorch and TensorFlow mainly target NVIDIA GPUs. OpenVINO allowed me to run a model on an Intel CPU, and it opened the door to running LLMs on non-NVIDIA GPUs.
The benefit of using OpenVINO is that NVIDIA dominates the GPU market and sets the price; there is no competitor. If I am able to run LLM inference on commodity hardware, I am saving costs. An NVIDIA GPU capable of running one of these models costs a minimum of 5,000 pounds, and that is a very small NVIDIA GPU. That cost is pretty high in comparison with an Intel GPU, or even an Intel CPU with similar performance. It is a cost-saving solution that offers the possibility to run on cheaper hardware.
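To make the device choice concrete, OpenVINO's runtime exposes the Intel devices it can see through its `Core` object, and a model can be compiled for whichever one is present. This is a minimal sketch assuming `pip install openvino` and an IR file `model.xml` on disk; the `pick_device` fallback helper is my own hypothetical addition, not an OpenVINO API:

```python
# Sketch: picking an Intel device and compiling a model with the OpenVINO
# runtime. Assumes `pip install openvino` and an IR file "model.xml" on disk.

def pick_device(available: list[str], preferred: tuple[str, ...] = ("GPU", "CPU")) -> str:
    """Hypothetical helper: prefer an Intel GPU, fall back to CPU."""
    for wanted in preferred:
        for device in available:
            if device.startswith(wanted):  # device names like "GPU", "GPU.0", "CPU"
                return device
    raise RuntimeError(f"none of {preferred} found in {available}")

def compile_for_best_device(xml_path: str = "model.xml"):
    import openvino as ov

    core = ov.Core()
    device = pick_device(core.available_devices)  # e.g. ["CPU", "GPU.0"]
    model = core.read_model(xml_path)
    return core.compile_model(model, device)  # ready to create inference requests
```

On a machine with only an Intel CPU, this compiles the model for `"CPU"`; on a machine with an Intel GPU, that device is preferred.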
What needs improvement?
What could be improved in OpenVINO is making the product more cross-platform. I know they are working on third-party plugins to extend the toolkit so that I can use it with NVIDIA GPUs or other hardware, because right now it primarily works on Intel hardware: CPUs, GPUs, and NPUs, but only from Intel. More cross-platform functionality would be great. It is difficult to make it work faster than NVIDIA's own toolkit on NVIDIA GPUs, but at least having that possibility, and making it work faster than it does now on non-Intel hardware, would be beneficial.
For how long have I used the solution?
I have been using OpenVINO for the last two years.
What do I think about the stability of the solution?
Regarding stability, it is good. Once I have a model running in OpenVINO, it is quite stable and performs well on the machine. Intel designed OpenVINO knowing their own hardware, the same way NVIDIA does with TensorRT: their engineers know the hardware and produce a product of very good quality. When I am running OpenVINO on Intel hardware, it is quite stable and works very efficiently.
What do I think about the scalability of the solution?
In my opinion, OpenVINO is more designed for the end user or a single solution, because when I run it on CPUs, the reason is to save costs by running on cheap hardware. I don't think it is properly designed for scalability. It is designed for other purposes, specifically to run inference with generative or deep learning models on Intel hardware. I don't think the product itself is designed to scale across multiple clusters of servers and that kind of implementation; it was probably not designed with that intention.
How are customer service and support?
The documentation is excellent, though it is complex and requires deep reading. You need some skills in engineering and ML products. It is complicated, but to be fair, others such as TensorRT from NVIDIA or ONNX Runtime from Microsoft are quite similar: they are complex to set up and to convert models into their own formats.
You need to have experience, and there are many cases of compatibility problems. The product is not easy to use, and engineering skills and experience are necessary for these tasks. That said, the documentation is comprehensive. I was never in touch with Intel directly; I used the library as it is open source, so I can use it for free, and I didn't need to use the support.
When I had issues, I was able to find what I needed in the documentation available in the repository or on the OpenVINO website. It is quite thoroughly documented, with many examples of using OpenVINO with different models, for different purposes and use cases. I can take the examples and adapt them to my particular problem. Good skills in Python and machine learning are necessary, which is similar to other complex tools.
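The conversion step mentioned above, taking a model into OpenVINO's IR format, is where much of that engineering effort goes. This is a minimal sketch using the OpenVINO conversion API, assuming `pip install openvino torch`; the tiny `torch.nn.Linear` stands in for a real model such as Florence-2, and the `ir_paths` helper is my own illustration:

```python
# Sketch: converting a PyTorch model to OpenVINO IR and saving it to disk.
# Assumes `pip install openvino torch`. IR is stored as an .xml/.bin pair.

def ir_paths(stem: str) -> tuple[str, str]:
    """Hypothetical helper: IR topology (.xml) and weights (.bin) file names."""
    return f"{stem}.xml", f"{stem}.bin"

def export_to_ir(stem: str = "model"):
    import torch
    import openvino as ov

    torch_model = torch.nn.Linear(8, 2).eval()  # stand-in for a real model
    example = torch.zeros(1, 8)                 # example input used for tracing
    # convert_model traces the PyTorch module into an OpenVINO model graph
    ov_model = ov.convert_model(torch_model, example_input=example)
    xml_path, _ = ir_paths(stem)
    ov.save_model(ov_model, xml_path)  # writes the .xml and the matching .bin
    return xml_path
```

The compatibility problems I mentioned typically show up at this step, when a model uses operations the converter does not yet support.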
How would you rate customer service and support?
How was the initial setup?
The initial setup and deployment of OpenVINO is complex, but this is similar to other solutions. These products are not easy to set up and require experience in machine learning; it is not a job for a junior developer. You need a senior engineer who knows what they are doing, especially when working with containerization and needing to train or fine-tune models using the toolkit. It is a complex setup, and the tool requires a good specialist to work with.
What's my experience with pricing, setup cost, and licensing?
Regarding pricing, it is open source with no cost for using it, and there is no licensing involved. This follows the line of other products in the same area: TensorRT from NVIDIA is free and open source as well, and ONNX Runtime from Microsoft, which does something similar, is also open source. I don't need to pay for a license to use it.
What other advice do I have?
My company doesn't have a partnership with Intel. It was simply a product that I needed and used. The company makes a data governance and eDiscovery product and is introducing AI features into it; I was helping with this project to add AI capabilities to the product. The company itself, Santax AI, doesn't have a partnership with Intel. From Intel's perspective, we are just users.
Disclosure: My company does not have a business relationship with this vendor other than being a customer.