What is our primary use case?
I worked for a French company that used TensorFlow for image classification. After that, I started working with a Russian-American company that used TensorFlow mainly for object detection. TensorFlow is very good at object detection. We also used it once for natural language processing and audio processing, but I was not directly involved in that project; I was just assisting with deployment issues. Some of our clients wanted us to deploy on the cloud. Others are deploying TensorFlow on new edge devices as an alternative to the cloud. That offering is going to be called NextGen AI or something like that, from AWS. We use TensorFlow for all aspects, including data processing, training, and sometimes deployment, but ML use cases differ in practice. As a result, we sometimes stay with TensorFlow and sometimes move to AWS-specific architectures.
How has it helped my organization?
TensorFlow has benefitted us by enabling faster training and deployment. With TensorFlow, we no longer really need DevOps engineers to do the deployments. Even data scientists can do the deployment part. This has saved about 30% of the time we used to spend on deployments.
What is most valuable?
Our clients were not aware they were using TensorFlow, so that aspect was transparent. I think we personally chose TensorFlow because it provided us with more of an end-to-end package that you can use for all the steps of building our models: data processing, training the model, evaluating the model, updating the model, and deploying the model, all without having to change to a new environment. The most valuable part is that you can train the model again and then evaluate whether it's better than the previous versions. If it is, the deployment happens on its own. The end-users will not really see the change, as the update takes place without any downtime.
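The retrain, evaluate, and deploy-if-better loop described above can be sketched roughly as follows. This is a minimal illustration in plain Python and not the tooling we actually used: the `Registry` class, the `accuracy` helper, the toy threshold models, and the holdout data are all hypothetical stand-ins. In a real setup the models would be TensorFlow models and the swap would be handled by the serving layer.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical types for illustration: a "model" is any callable that maps
# a feature to a predicted label.
Model = Callable[[float], int]
Example = Tuple[float, int]  # (feature, true label)


def accuracy(model: Model, examples: Sequence[Example]) -> float:
    """Fraction of examples the model labels correctly."""
    correct = sum(1 for x, y in examples if model(x) == y)
    return correct / len(examples)


@dataclass
class Registry:
    """Holds the model that is currently serving traffic (hypothetical)."""
    serving: Optional[Model] = None
    serving_score: float = float("-inf")

    def maybe_promote(self, candidate: Model, examples: Sequence[Example]) -> bool:
        """Swap in the candidate only if it scores higher than the serving model."""
        score = accuracy(candidate, examples)
        if score > self.serving_score:
            # Single reference swap: callers always see a complete model.
            self.serving, self.serving_score = candidate, score
            return True
        return False


# Made-up holdout set and toy models, purely for demonstration.
holdout = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
registry = Registry()


def old_model(x: float) -> int:
    return int(x > 0.8)  # mislabels the 0.6 example


def new_model(x: float) -> int:
    return int(x > 0.5)  # labels all four examples correctly


first = registry.maybe_promote(old_model, holdout)   # nothing serving yet
second = registry.maybe_promote(new_model, holdout)  # better, so promoted
third = registry.maybe_promote(old_model, holdout)   # worse, so rejected
```

The point of the gate is that a retrained model only replaces the serving one when it scores higher on the same held-out data, so a bad retraining run never reaches the end-users.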
What needs improvement?
It doesn't allow for fast prototyping. Usually when we prototype, we start with PyTorch, and once we have a good model that we trust, we convert it to TensorFlow. So TensorFlow is definitely not very flexible.
For how long have I used the solution?
We have been using TensorFlow since 2017, so three years.
What do I think about the stability of the solution?
It's very stable, so we usually don't get any problems. Once any bugs are fixed, you shouldn't have any problems with TensorFlow. Once the deployment process is completed, you can monitor your model and datasets to ensure the model is correctly deployed and working as it's supposed to, including services.
What do I think about the scalability of the solution?
TensorFlow is very scalable.
How are customer service and support?
I've never really contacted TensorFlow support, but I can definitely say you don't really need to, because the community is pretty strong. Whatever problem you face, there's always going to be some Stack Overflow answer for it, or at least some GitHub issue where you can find your solution.
Which solution did I use previously and why did I switch?
We used PyTorch and MXNet. A couple of my friends actually used MXNet, but I did not use it personally. Now I work mainly between PyTorch and TensorFlow.
How was the initial setup?
Setup got easier with TensorFlow 2. With TensorFlow 1 it was a bit more complicated. There are fewer compatibility issues with the newer version. Training takes at least a week. Deployment usually takes one or two days, but this changes depending on the client. The usual method is to get our TensorFlow models up and running and then convert them into specific formats depending on the client's requirements. Some clients actually require AWS-specific formats. To accommodate that, we usually just convert our TensorFlow models to AWS-compatible models.
What about the implementation team?
We use in-house teams. I think the ML team has around 20 people. There are teams in Russia, Ukraine, and the U.S. I am part of the team in Russia. In Russia, we have around 30 people who have used TensorFlow, including data analysts, who basically handle data pre-processing. We also have ML engineers and ML Ops.
What's my experience with pricing, setup cost, and licensing?
TensorFlow is free, so cost is not an issue.
Which other solutions did I evaluate?
PyTorch is very flexible, definitely more flexible than TensorFlow, and it is faster to prototype with. You can make all the changes that you want, but it's not very stable. TensorFlow, by contrast, is more of a stable solution but is not very flexible. It allowed us to deploy our models faster and more robustly, and you don't encounter a lot of errors. We looked at other alternatives, but they don't scale to large problems.
What other advice do I have?
Have a look at TensorFlow Extended (TFX). It's very useful, especially if you know how to use the old system. It will speed up the process of deploying your model. Don't reinvent the wheel. There's always going to be a good GitHub repo out there that more or less solves your problem. You shouldn't spend a lot of time trying to build new models when some other open source project has already done a good job of the modelling part. You definitely need to have your own pipelines for this process. Try to build pipelines that automate most of the tasks for you. Then all you need to concern yourself with is the architecture. Obtain a pipeline template from GitHub for what you are trying to achieve, amend it for your needs, and then you are ready to go. Your model is training already. I would rate TensorFlow 8 out of 10.
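To illustrate the pipeline advice above, here is a minimal sketch of a template in which the stages stay fixed and only the architecture is swapped in. It is plain Python, not TFX: every name here (`load_data`, `preprocess`, `make_model_stage`, `run_pipeline`, `doubling_model`) is hypothetical, the data is made up, and the "model" is a stand-in function where real training code would go.

```python
from typing import Callable, Dict, List

# Hypothetical stage signature: each stage reads and extends a shared context.
Stage = Callable[[Dict], Dict]


def load_data(ctx: Dict) -> Dict:
    # Stand-in for real data loading; the values are made up.
    ctx["raw"] = [1.0, 2.0, 3.0, 4.0]
    return ctx


def preprocess(ctx: Dict) -> Dict:
    # Scale features into (0, 1] by dividing by the maximum value.
    peak = max(ctx["raw"])
    ctx["features"] = [x / peak for x in ctx["raw"]]
    return ctx


def make_model_stage(build_model: Callable[[], Callable[[float], float]]) -> Stage:
    """Wrap a caller-supplied architecture as a pipeline stage."""
    def apply_model(ctx: Dict) -> Dict:
        model = build_model()  # real training would happen here
        ctx["predictions"] = [model(x) for x in ctx["features"]]
        return ctx
    return apply_model


def run_pipeline(stages: List[Stage]) -> Dict:
    """Run the stages in order, passing the context from one to the next."""
    ctx: Dict = {}
    for stage in stages:
        ctx = stage(ctx)
    return ctx


# Swapping the architecture means changing only this function.
def doubling_model() -> Callable[[float], float]:
    return lambda x: 2 * x


result = run_pipeline([load_data, preprocess, make_model_stage(doubling_model)])
```

With a template like this, trying a different architecture means passing a different `build_model` function, while data loading and pre-processing stay untouched.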
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
*Disclosure: My company does not have a business relationship with this vendor other than being a customer.