What is our primary use case?
We have been using the most recent version. It's version 4.6.
What is most valuable?
I used to be a Pascal programmer, and then I did a bit of Python. KNIME does many of the things that I would otherwise have had to write code for. I don't think it does everything, but it does most of what I need. It can read many different file formats. It makes it very easy to tidy up your data, for example by deleting blank rows or rows where certain columns are missing. It also lets you make lots of changes within the workflow, using JavaScript to express the conditions. For example, I have one data set in which all of the data is encoded. There was one variable, called code or something like that, which held codes for the topic being discussed, whether the comment was positive or negative, whether it was strongly or weakly worded, and many other things like that. I had to split those out into columns, such as sentiment, strength of sentiment, and topic being discussed, and I could do that very easily with simple JavaScript in their column expressions.
It also has very good fundamental machine learning. It has decision trees, linear regression, and neural nets. It has a lot of text mining facilities as well. It's fairly fully featured.
They are also very careful to distinguish lab variants from issued variants, because they have some labs that develop nodes, which are new chunks of code represented as an icon. They make it very clear that the lab ones are not fully tested, and they're very glad to get comments back if you have problems. I haven't had that difficulty myself. They seem to be aware that they have the community there as their testing base, and they seem not to be embarrassed about that. They will tell you when things go wrong and try to put them right.
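To give a sense of the kind of code this replaces, here is a minimal Python sketch of the two steps described above: dropping rows with missing values, then splitting one encoded column into several. The encoding format (`topic-polarity-strength`), the column names, and the helper functions are all hypothetical, since the actual coding scheme is not described here, and this is not how KNIME does it internally.

```python
# Hypothetical encoding: "<topic>-<polarity>-<strength>", e.g. "housing-pos-strong".
# The real data set's coding scheme is not known; this is only an illustration.

def is_blank(value):
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""

def drop_incomplete(rows, required):
    """Keep only rows where every required column has a non-blank value."""
    return [
        row for row in rows
        if not any(is_blank(row.get(col)) for col in required)
    ]

def split_code(row, sep="-"):
    """Return a copy of the row with the 'code' field split into three columns."""
    topic, polarity, strength = row["code"].split(sep)
    out = dict(row)
    out.update({"topic": topic, "sentiment": polarity, "strength": strength})
    return out

rows = [
    {"id": 1, "code": "housing-pos-strong"},
    {"id": 2, "code": ""},                    # missing code: dropped
    {"id": 3, "code": "transport-neg-weak"},
    {"id": None, "code": None},               # blank row: dropped
]

clean = [split_code(r) for r in drop_incomplete(rows, ["id", "code"])]
for r in clean:
    print(r)
```

In a visual tool the same logic is spread over a row filter and a column expression, which is why it is easier to show to people who are averse to programming.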
What needs improvement?
So far, I haven't had problems with it, so I haven't really thought about room for improvement. It's so much better than many other things. It's useful in that you can at least get people who are pretty averse to programming to start thinking about putting something into a program of any kind, because they can see what's happening. It's visual. It's codeless. For some purposes, I'd want to add Python or R, but I haven't had to do that so far, so I haven't seen the shortcomings of it. There must be some. All software has shortcomings, but I haven't recognized any myself. Not just for KNIME, but generally for software for analyzing data, I would welcome facilities for analyzing different sorts of scale data, like Likert scales, Thurstone scales, magnitude ratio scales, and Guttman scales (the last of which I don't use myself). I use both Thurstone scales and magnitude ratio scales quite a bit, and they're very powerful. But I've always had to do all the analysis myself in some simple code. I don't think that's provided. You could probably build it in KNIME, but I haven't tried to do it. If it just said, "Analyze scales," let you choose which sort of scale you want to analyze, and gave you the options of normalizing or reversing or whatever it happens to be, that would be helpful. There are lots of simple functions that you want to apply to scales, which would be useful in any software, including KNIME.
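The sort of "simple code" meant here might look like the following Python sketch, which reverses and normalizes Likert-style ratings. The function names, the 1 to 5 range, and the item names are assumptions made for illustration; this is not an existing KNIME feature, and Thurstone or magnitude ratio scales would need different handling.

```python
def reverse_score(value, low=1, high=5):
    """Flip a rating on a low..high scale (on 1..5, 1 becomes 5 and 4 becomes 2)."""
    return low + high - value

def normalize(value, low=1, high=5):
    """Rescale a rating linearly from low..high to 0..1."""
    return (value - low) / (high - low)

def analyze_scale(responses, reversed_items=(), low=1, high=5):
    """Reverse the negatively worded items, then normalize every item.

    responses: dict mapping item name to raw rating (hypothetical layout).
    reversed_items: names of items whose wording runs the opposite way.
    """
    result = {}
    for item, rating in responses.items():
        if item in reversed_items:
            rating = reverse_score(rating, low, high)
        result[item] = normalize(rating, low, high)
    return result

# Two items measuring the same attitude, one of them negatively worded.
answers = {"q1_like_it": 5, "q2_dislike_it": 1}
scores = analyze_scale(answers, reversed_items={"q2_dislike_it"})
print(scores)
```

A built-in "Analyze scales" option would essentially wrap choices like `reversed_items` and the scale range in a dialog.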
For how long have I used the solution?
I have been using this solution for about a year, but most particularly in the last six months.
What do I think about the stability of the solution?
It's been remarkably stable, much more so than most software. They have an active community forum. Problems seem to get fixed pretty quickly. I haven't had problems, but other people do report problems. So, there must be problems there, I just haven't had any.
Which solution did I use previously and why did I switch?
Compared to RapidMiner, at the moment I would go for KNIME, but that's largely because I haven't used RapidMiner much for the last year. It may have improved enormously since then. It was a very good package. They do much the same thing. I'm more familiar with KNIME, so I would be able to talk more about it, although I was very enthusiastic about RapidMiner when I used it. KNIME is a bit cheaper in a sense. In RapidMiner, you can have up to 10,000 rows of data free of charge. For many things that I do, 10,000 rows of data is enough. I use quite a few UK government surveys, and I get the raw data from the UK Data Archive. They're often of the order of 8,000 to 10,000 rows, so under the 10,000-row limit, and I could use it free of charge.
What's my experience with pricing, setup cost, and licensing?
With KNIME, you can use the desktop version free of charge as much as you like. I've yet to hit its limits. If I did, I'd have to go to the server version, and for that you have to pay. Fortunately, I don't have to at the moment.
What other advice do I have?
I would rate this solution 8 out of 10. I'm unwilling to give anything a ten, because everything can be improved. But it's been very useful to me so far and has saved me many hours of work. I could have written it all in Python if necessary, but it would have taken me weeks for what has been a few days' work. My advice is to just download it and use it. The documentation is pretty good. There are many good videos online for it. If you go to YouTube, you can get pages and pages of RapidMiner tutorials. They're pretty clear, and they are produced by people who've used it. It's not just company advertising, as far as I can see.
Disclosure: I am a real user, and this review is based on my own experience and opinions.