What is our primary use case?
My main use case for Datadog involves projects around how our sales reps register new clients. I've been using Datadog to pull up their sessions while they beta test a product we're rolling out, to see where their errors occur and what their behavior was leading up to them.
I can't recall all of the specific details, but one sales rep was running into a particular error message during the sales registration process, and they weren't giving us many screenshots or other error information to help us troubleshoot. I went into Datadog, looked at the timestamp, and was able to see the actual steps they took in our platform during registration and determine the cause of the error. If I remember correctly, it was user error; they were clicking something incorrectly.
One option I've seen in my main use case for Datadog, which our team can add on, is the ability to track behavior based on the user ID. I'm not sure whether our team has turned that on yet, but I think it's a really valuable feature, especially with real-time user monitoring where you can watch the replay. Because so many users are on our platform, the ability to filter those replay videos by user ID would be much more helpful. It matters especially when we're testing a specific product we're rolling out: we start with smaller beta tests, so being able to filter by the user IDs of the beta testers would be far better than looking at every interaction in Datadog as a whole.
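For teams that want to enable this, a minimal sketch of how user-ID tracking is typically wired up with Datadog's Browser RUM SDK is below. The application ID, client token, service name, and the user values are placeholders, and the exact setup will depend on how your organization's web app is instrumented:

```typescript
import { datadogRum } from '@datadog/browser-rum';

// Initialize Real User Monitoring with Session Replay recording enabled.
datadogRum.init({
  applicationId: '<APPLICATION_ID>',      // placeholder
  clientToken: '<CLIENT_TOKEN>',          // placeholder
  site: 'datadoghq.com',
  service: 'sales-registration',          // hypothetical service name
  sessionSampleRate: 100,
  sessionReplaySampleRate: 100,           // record a replay for every session
  trackUserInteractions: true,
});

// Attach the logged-in user's identity so sessions and replays can be
// filtered by user ID in the RUM explorer (e.g. a search on @usr.id).
datadogRum.setUser({
  id: 'rep-1234',                         // hypothetical sales-rep ID
  name: 'Jane Doe',
  email: 'jane.doe@example.com',
});
```

Once `setUser` is called, replays from a beta group can be narrowed down by searching on the user ID attribute instead of scrubbing through every session.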
What is most valuable?
The best features Datadog offers are the replay videos, which I find super helpful as someone who works in QA. So much of testing is looking at the UI, and being able to look back at the actual visual steps a user took is really valuable.
Datadog has impacted our organization positively in a major way, not just because I have access to the real-time replays as a QA engineer, but because the whole team can access this data and see which parts of our system are causing the most errors or the most frustration for users. I can't speak for everybody else because I don't know how each segment of the business is using it, but given how beneficial it's been to me, I imagine it's benefiting everybody else too, letting them see which areas of the system cause more frustration versus less.
What needs improvement?
I think Datadog can be improved, but I'm not totally sure how. Because my use case is pretty specific, I haven't used or even really explored all of the features Datadog offers, so I don't know where the gaps are in terms of features that should be there but aren't.
I will go back to the ability to filter based on user ID, which is an option that has to be set up by an organization; I would recommend presenting that as a first step in an organization's onboarding. As an organization grows, or if it is already large when it starts using Datadog, it becomes potentially much harder to troubleshoot specific scenarios when you're sorting through such a large amount of data.
For how long have I used the solution?
I have been working in this role for a little over a year now.
What do I think about the stability of the solution?
As far as I can tell, Datadog has been stable.
What do I think about the scalability of the solution?
I believe we have about 500 or so employees in our organization using our platform, and Datadog seems to be able to handle that load sufficiently, as far as I can tell. So I think scalability is good.
How are customer service and support?
I haven't had an instance where I've reached out to customer support for Datadog, so I do not know.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
I do not believe we used a different solution previously for this.
What was our ROI?
I cannot say whether we have seen a return on investment; I'm not part of the leadership making that decision. Regarding time saved, in my specific use case as a QA engineer, Datadog probably didn't save me a ton of time, because there were so many replay videos to sort through to find the particular sales reps in our beta test group. That's why I think the ability to filter videos by user ID would be so much more helpful; features that let you narrow down and filter the type of frustration or user interaction you're looking for would provide a lot of time savings. But as to the specific question, I don't think I'm qualified to answer it.
Which other solutions did I evaluate?
I was not part of the decision-making process before choosing Datadog, so I cannot speak to whether we evaluated other options.
What other advice do I have?
Right now our users are in the middle of the beta test. At the beginning of rolling the test out, I probably used the replay videos more, as the users were still getting familiar with the tool and were running into more errors than they are now. So it kind of ebbs and flows; at the beginning of a test, I'm using it pretty frequently, and as the test goes on, less often.
It does help resolve issues faster, especially because our sales reps are used to working really quickly through the sales registration; they race through it. They're more likely to accidentally click something incorrectly and not fully pay attention to what they're doing because they're used to their flow. Being able to go back, watch the replay, and see that a person clicked one button when they intended to click another, or to identify the exact action that caused an error, is much more reliable than going off their memory.
I have not noticed measurable outcomes in terms of a reduction in support tickets or faster resolution times since I started using Datadog. For me, none of the issues from our beta test group came through support tickets; they came from messages in Microsoft Teams with the people in the beta group. We have seen fewer messages related to the beta test because the reps are more familiar with the tool. Now that they know their flow during the beta test may differ from their usual flow, they send fewer messages, either because they are being more careful or because they've figured out the points that would result in an error.
My biggest piece of advice for others looking into Datadog is to use the filters based on user ID; it will save so much time when troubleshooting specific error interactions or occurrences. I would also suggest a simpler UI for less technical people. For example, when you log into Datadog, the dashboard is pretty overwhelming with all of the bar charts and options; a simplified view for people who aren't looking for all of that data, plus a more technical toggle for people who want more granular data, would be helpful.
I rate Datadog 10 out of 10.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: My company does not have a business relationship with this vendor other than being a customer.