Buyer's Guide
Application Performance Management (APM)
November 2022
Get our free report covering Dynatrace, Datadog, Splunk, and other competitors of AppDynamics. Updated: November 2022.
655,465 professionals have used our research since 2012.

Read reviews of AppDynamics alternatives and competitors

Richard Mitchell - PeerSpot reviewer
DevOps Leader at a legal firm with 501-1,000 employees
Real User
Good executive-level dashboards with powerful automation and AI capabilities, but the management interface could be more intuitive
Pros and Cons
    • "The user interface for the management functions is not particularly intuitive for even the most common features."

    What is our primary use case?

    Our primary use case is the consolidation of observability platforms.

    How has it helped my organization?

    Looking at Dynatrace's automation and AI capabilities, automation is generally a great place to start. In products where there has been no observability or a very limited amount, the automation can give a great deal of insight, telling people things that they didn't know that they needed to know.

    Davis will do its best to provide root cause analysis, but you, as a human, are still responsible for joining as many of the dots together as possible in order to provide as big a picture as possible. As long as you accept that you still have to do some work, you'll get a good result.

I have not used Dynatrace for dynamic microservices within a Kubernetes environment in this company, but I have had an AWS microservice cluster in the past. Its ability to cope with ephemeral instances, as Kubernetes workloads usually are, was very good. The fact that we didn't have to manually scale out to match any autoscaling rules on the Kubernetes clusters was very useful. Its representation of them at the time wasn't the best. Other products, Datadog, for example, had a better representation in the actual portal of the SaaS platform. That was about three years ago, and Dynatrace has changed, but I haven't yet reused the Kubernetes monitoring to see if it has improved in that regard.

    Given that Dynatrace is a single platform, as opposed to needing multiple tools, the ease of management is good because there is only one place to go in order to manage things. You deal with all of the management in one place.

    The unified platform has allowed our teams to better collaborate. In particular, because of the platform consolidation, using Dynatrace has made the way we work generally more efficient. We don't have to hop between seven different monitoring tools. Instead, there's just one place to go. It's increased the level of observability throughout the business, where we now have development looking at their own metrics through APM, rather than waiting until there's a problem or an issue and then getting a bug report and then trying to recreate it.

    It's increased visibility for the executive and the senior management, where they're getting to see dashboards about what's happening right now across the business or across their products, which didn't used to exist. There's the rate at which we can monitor new infrastructure, or applications, or custom devices. We had a rollout this week, which started two days ago, and by yesterday afternoon, I was able to provide dashboards giving feedback on the very infrastructure and applications that they had set the monitoring up on the day before.

    As we've only been using Dynatrace in production for the past month in this company, the estimate as to the measurement of impact isn't ready yet. We need more time, more data, and more real use cases as opposed to the synthetic outages we've been creating. In my experience, Dynatrace is generally quite accurate for assessing the level of severity. Even in scenarios where you simply rely on the automation without any custom thresholds or anything like that, it does a good job of providing business awareness as to what is happening in your product.

Dynatrace has a single agent that we need to install for automated deployment and discovery. It uses up to four processes, and we found it especially useful in dealing with things like old Linux distros. For example, Gentoo Linux couldn't handle TLS 1.2 for transport and thus could not download the agent directly. We only had to move the one agent over SSH to the Gentoo server and install it, which was much easier than if we'd had to repeat that two or three times.

    The automated discovery and analysis features have helped us to proactively troubleshoot products and pinpoint the underlying root cause. There was one particular product that benefited during the proof of concept period, where a product owner convened a war room and it took about nine hours of group time to try and reason out what might be the problem by looking at the codebase and other components. Then, when we did the same exercise for a different issue but with Dynatrace and the war room was convened, we had a likely root cause to work from in about 30 minutes.

In previous companies, where the deployment was more mature, Dynatrace was definitely allowing DevOps to concentrate on shipping quality, as opposed to where I am now, which is still deploying Dynatrace. The biggest change in that organization was the use of APM and the insights it gave developers.

Concurrent with the deployment of Dynatrace, we adopted a different methodology, using Scrum and Agile for development. By following the Scrum pattern of meetings, we were able to observe the time estimates given in planning sessions for various tasks; they started to come down once the output of the APM had been considered. Ultimately, Dynatrace APM provided the insight that allowed the developers to complete tasks faster.

    What is most valuable?

    The most valuable features for us right now are the auto-instrumentation, the automatic threshold creation, and the Davis AI-based root cause analysis, along with the dashboarding for executives and product owners.

    These features are important because of the improved time it takes for deployment. There is a relatively small team deploying to a relatively large number of products, and therefore infrastructure types and technology stacks. If I had to manually instrument this, like how it is accomplished using Nagios or Zabbix, for example, it would take an extremely long time, perhaps years, to complete on my own. But with Dynatrace, I can install the agent, and as long as there is a properly formed connection between the agent and the SaaS platform, then I know that there is something to begin working with immediately and I can move on to the next and then review it so that the time to deployment is much shorter. It can be completed in months or less.

    We employ real user monitoring, session replay, and synthetic monitoring functionalities. We have quite a few web applications and they generally have little to no observability beyond the infrastructure on which the applications run. The real user monitoring has been quite valuable in demonstrating to product owners and managers how the round-trips, or the key user actions, or expensive queries, for example, have been impacting the user experience.

    By combining that with session replay and actually watching through a problematic session for a user, they get to experience the context as well as the raw data. For a developer, for example, it's helpful that you can tell them that a particular action is slow, or it has a low Apdex score, for example, but if you can show them what the customer is experiencing and they can see state changes in the page coupled with the slowness, then that gives a much richer diagnostic experience.

    We use the synthetics in conjunction either with the real user monitoring or as standalone events for sites that either aren't public-facing, such as internal administration sites, or for APIs where we want to measure things in a timely manner. Rather than waiting for seasonal activity from a user as they go to work, go home, et cetera, we want it at a constant rate. Synthetics are very useful for that.

    The benefit of Dynatrace's visualization capabilities has been more apparent for those that haven't used Dynatrace before or not for very long. When I show a product owner a dashboard highlighting the infrastructure health and any problems, or the general state of the infrastructure with Data Explorer graphs on it, that's normally a very exciting moment for them because they're getting to see things that they could only imagine before.

    In terms of triaging, it has been useful for the sysadmins and the platform engineering team, as they normally had to rely on multiple tools up until now. We have had a consolidation of observability tools, originally starting with seven different monitoring platforms. It was very difficult for our sysadmins as they watched a data center running VMware with so many tools. Consolidating that into Dynatrace has been the biggest help, especially with Davis backing you up with RCAs.

    The Smartscape topology has also been useful, although it is more for systems administrators than for product owners. Sysadmins have reveled in being able to see the interconnectedness of various infrastructures, even in the way that Dynatrace can discover things to which it isn't directly instrumented. When you have an agent on a server surrounded by other servers, but they do not have an agent installed, it will still allow a degree of discovery which can be represented in the Smartscape topology and help you plan where you need to move next or just highlight things that you hadn't even realized were connected.

    What needs improvement?

    The user interface for the management functions is not particularly intuitive for even the most common features. For example, you can't share dashboards en masse. You have to open each dashboard, go into settings, change the sharing options, go back to dashboards, et cetera. It's quite laborious. Whereas, Datadog does a better job in the same scenario of being a single platform of making these options accessible.
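Bulk operations like this can usually be scripted around the Dynatrace Configuration API instead of the UI. The sketch below, assuming a SaaS tenant URL, an API token with configuration scopes, and the v1 `/api/config/v1/dashboards` endpoints (the exact JSON field for sharing, `dashboardMetadata.shared`, should be verified against your environment's API version), lists every dashboard and writes each one back with sharing enabled:

```python
"""Bulk-share Dynatrace dashboards via the Configuration API (sketch).

Assumptions: BASE and TOKEN are placeholders; the dashboard JSON
carries a dashboardMetadata.shared flag as in the v1 Config API.
"""
import json
import urllib.request

BASE = "https://YOUR-ENV.live.dynatrace.com"   # placeholder tenant URL
TOKEN = "dt0c01.XXXX"                          # placeholder API token

def mark_shared(dashboard):
    """Flip the sharing flag on a dashboard definition (pure helper)."""
    meta = dashboard.setdefault("dashboardMetadata", {})
    meta["shared"] = True
    return dashboard

def api(path, method="GET", body=None):
    """Small wrapper around the tenant's REST API."""
    req = urllib.request.Request(
        BASE + path,
        data=json.dumps(body).encode() if body else None,
        method=method,
        headers={"Authorization": f"Api-Token {TOKEN}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read() or b"{}")

if __name__ == "__main__":
    # List every dashboard, then PUT each one back with sharing enabled.
    for stub in api("/api/config/v1/dashboards")["dashboards"]:
        dash = api(f"/api/config/v1/dashboards/{stub['id']}")
        api(f"/api/config/v1/dashboards/{stub['id']}", "PUT", mark_shared(dash))
```

A script like this turns the open-dashboard, open-settings, toggle, go-back loop into a single pass, at the cost of keeping a token around.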

    User and group management in the account settings for user permissions could be improved.

    The way that Dynatrace deals with time zones across multiple geographies is quite a bone of contention because Dynatrace only displays the browser's local time. This is a problem because when I'm talking with people in Canada, which I do every day, they either have to run, on the fly, time recalculations in their heads to work out the time zone we're actually talking about as relevant to them, or I have to spin up a VM in order to open the browser with the time zone set to their local one in order to make it obvious to them without them having to do any mental arithmetic.
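A lighter workaround than spinning up a VM is to reformat the timestamp in question for each audience outside the tool. This is a minimal sketch using Python's standard `zoneinfo` database; the zone names are examples for a UK/Canada split:

```python
"""Render one event time in each stakeholder's local zone.

Workaround sketch for a UI that only shows browser-local time.
"""
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def localize(utc_ts, zones):
    """Return the same instant formatted for each named time zone."""
    return {z: utc_ts.astimezone(ZoneInfo(z)).strftime("%Y-%m-%d %H:%M %Z")
            for z in zones}

event = datetime(2022, 11, 15, 12, 0, tzinfo=timezone.utc)
print(localize(event, ["Europe/London", "America/Toronto"]))
# 12:00 GMT in London is 07:00 EST in Toronto
```

Pasting both renderings into the call notes spares everyone the mental arithmetic.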

    For how long have I used the solution?

    Personally, I have been using Dynatrace since November of 2018. At the company I am at, we have been using it for approximately four months. It was used as a PoC for the first three months, and it has been in production for the past month.

    What do I think about the stability of the solution?

    The SaaS product hasn't had any downtime while I've been at my current company. I've experienced downtime in the past, but it's minimal.

    What do I think about the scalability of the solution?

    To this point, I've not had any problems with the scalability, aside from ensuring that you have provisioned enough units. However, that is another point that is related to pricing.

    Essentially, its ability to scale and continue to work is fine. On the other hand, its ability to predict the required scalability in order to purchase the correct number of various units is much harder.

    How are customer service and support?

    Talking about Dynatrace as a company, the people I've spoken to have always been responsive. The support is always available, partly because of our support package. As a whole, Dynatrace has always been a very responsive entity, whether I've been dealing with them in North America or in the UK.

    Which solution did I use previously and why did I switch?

    We have used several other solutions including Grafana, Prometheus, Nagios, Zabbix, New Relic, AWS CloudWatch, Azure App Insights, and AppDynamics. We switched to Dynatrace in order to consolidate all of our observability platforms.

    Aside from differences that I discuss in response to other questions, other differences would come from the product support rather than the product itself. Examples of this are Dynatrace University, the DT One support team, the post-sales goal-setting sessions, and training.

We have yet to receive our main body of training, but we're currently scheduled to train on about 35 modules. Last year, when I rolled out Datadog, the training wasn't handled in the same way; it was far more on-request for specific features. Here, there is an actual curriculum to familiarize end users with the product.

    How was the initial setup?

    In my experience, the initial setup has been straightforward, but I've done it a few times. When I compare it to tools like Nagios, Zabbix, Grafana, and Prometheus, it is very straightforward. This is largely for two reasons.

    First, they're not SaaS applications, whereas Dynatrace is, and second, the amount of backend configuration you have to do in preparation for those tools is much higher. That said, if we were to switch to Dynatrace Managed rather than Dynatrace SaaS, I imagine that the level of complexity for Dynatrace would rise significantly. As such, my answer is biased towards Dynatrace SaaS.

    What was our ROI?

    In my previous company, it allowed a very small team to manage what was a very fast-moving tech stack. In my current company, it is still very early.

    The consolidation of tools due to implementing Dynatrace has saved us money, although it's tricky to measure the impact. The list price of Dynatrace was more than the previous list price spend on monitoring tools because the various platforms had been provided as open-source tools, were provided through hosting companies, or had been acquired as part of acquisitions of other companies.

The open-source applications that we used included Grafana, Prometheus, Nagios, and Zabbix. New Relic, as an example, was provided through a hosting company, Carbon60 in Canada. AppDynamics came with a Canadian company we acquired, so it sat in the budget of the previous company rather than our own.

    The hope was that Dynatrace through consolidation would release the material cost of the administrative overheads of tools like Prometheus and Grafana and the cost of hosting infrastructure for solutions like Nagios, Zabbix, Prometheus, Grafana, et cetera. This means that it is more of an upstream cost-saving, where we would be saving human effort and hosting costs by consolidating into a SaaS platform, which is pretty much all-in-one.

    What's my experience with pricing, setup cost, and licensing?

    Dynatrace's pricing for their consumption units is rather arcane compared to some of the other tools, thus making forward-looking calculations based on capacity planning quite hard. This is because you have to do your capacity planning, work out what that would mean in real terms, then translate that into Dynatrace terms and try to ensure you have enough Davis units, synthetics units, DEM units, and host units.

Pinning those numbers down and making sure you've got them all right for anything up to a year in advance is quite hard. This means that its ability to scale and continue to work is fine, but predicting the correct number of various units to purchase is much harder.
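The translation step the reviewer describes is essentially arithmetic over per-unit conversion rates. The sketch below illustrates the shape of that calculation; the rates are assumptions for illustration (classic Dynatrace licensing counts roughly one host unit per 16 GB of monitored-host RAM, while DEM consumption per synthetic run or RUM session varies by contract), so check your own agreement before using numbers like these:

```python
"""Back-of-the-envelope Dynatrace unit estimate (illustrative rates only)."""
import math

RATES = {
    "ram_gb_per_host_unit": 16,     # assumption: classic host-unit sizing
    "dem_per_synthetic_run": 0.1,   # assumption: invented for illustration
    "dem_per_rum_session": 0.25,    # assumption: invented for illustration
}

def host_units(host_ram_gb):
    """Each host is billed individually, rounded up to whole units."""
    return sum(math.ceil(max(ram, 1) / RATES["ram_gb_per_host_unit"])
               for ram in host_ram_gb)

def dem_units(synthetic_runs, rum_sessions):
    """DEM units drawn down by synthetics and real-user sessions."""
    return (synthetic_runs * RATES["dem_per_synthetic_run"]
            + rum_sessions * RATES["dem_per_rum_session"])

# Three hosts of 8, 32, and 64 GB -> 1 + 2 + 4 = 7 host units
print(host_units([8, 32, 64]))
```

Even a rough model like this makes the capacity-planning conversation with procurement concrete, which is the hard part the reviewer is pointing at.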

    The premium support package is available for an additional charge.

    What other advice do I have?

At this point, we have not yet integrated Dynatrace with our CI/CD tool, which is Azure DevOps. However, in the future, our plan is to provide post-release measurements and automated rollbacks when necessary. Even further down the road, there's ServiceNow on the roadmap, which we're currently bringing in from an Australian acquisition in order to try and promote the ITSM side of the business.

There is nothing specific that has been implemented so far, although there have been general degrees of automation. When we get Agile, DevOps, and ServiceNow in place, the degree of automation will increase dramatically. For example, automated rollbacks in the case of deployment failure, and change-management automation informed by the current state of the target system, are being included in the ServiceNow automation.
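The planned post-release gate boils down to comparing a post-deployment metric against its baseline and deciding whether to roll back. This is a toy sketch of that decision logic only; in practice the values would come from the monitoring API inside the pipeline, and the 50% tolerance here is invented for illustration:

```python
"""Toy release gate: compare a post-release error rate to its baseline.

Sketch of the rollback decision described above; threshold is an
assumption, and real metric values would come from the APM's API.
"""
def should_roll_back(baseline_error_rate, current_error_rate, tolerance=0.5):
    """Roll back when errors exceed the baseline by more than `tolerance`."""
    allowed = baseline_error_rate * (1 + tolerance)
    return current_error_rate > allowed

# 2% baseline, 3.5% after the release -> above the 3% ceiling, roll back
print(should_roll_back(0.02, 0.035))  # True
```

Wired into a pipeline stage, a failing gate like this would trigger the rollback task instead of promoting the release.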

    The automation that has been done to alleviate the effort spent on manual tasks is still very light because I'm the only person doing the work. I generally don't have time to do the ancillary tasks at the moment, such as creating automations. It's mostly a case of deploying instruments, observing, and moving on. When we come back to revisit it, then we'll look at the automations.

    My advice for anybody who is looking into implementing Dynatrace is to make sure you talk constantly with your Dynatrace representatives during the PoC, or trial phase because there is invariably far more that Dynatrace can do than you realize. We only know what we know. I'm not suggesting that you let Dynatrace drive but instead, constantly provide the best practices. You will achieve faster returns afterward, whether that's labor savings, or recovery time, or costs from downtime. Basically, you want to make sure that you leverage the expertise of the company.

    In summary, this is a very good product but they need to sort out their user interface issues and provide a more logical experience.

    I would rate this solution a seven out of ten.

    Which deployment model are you using for this solution?

    Public Cloud
    Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.
    Technical Manager at Tech Mahindra Limited
    Real User
    Top 20
It's highly customizable but lacks many features available in competing solutions
    Pros and Cons
    • "DX allows you to customize and gives you a high degree of control."
    • "DX SaaS is a latecomer to the APM market. Some things that are straightforward in Dynatrace are complicated in DX. For example, upgrading the agents is a seamless process in Dynatrace, but it's a pain in DX SaaS. You should be able to upgrade in the Application Command Center. However, it is not working correctly."

    What is our primary use case?

    DX isn't something people use all day, every day. In some cases, the application teams use it when there is a problem in their application. It allows them to identify the core issue causing the problem in the application. 

    Any application team in the use case environment can access DX if it's integrated. Every application rolled out in the use case environment could be APM-enabled. We aren't sure how many people are using it at a time.

    We are using the SaaS version, but we were previously on-premise. There was no DX. It was known as Application Performance Monitoring at the time. Now it's DX APM, and DX means "digital experience." The cloud version is known as DX APM SaaS or DX SaaS APM.

    What is most valuable?

    DX allows you to customize and gives you a high degree of control. 

    What needs improvement?

    DX SaaS is a latecomer to the APM market. Some things that are straightforward in Dynatrace are complicated in DX. For example, upgrading the agents is a seamless process in Dynatrace, but it's a pain in DX SaaS. You should be able to upgrade in the Application Command Center. However, it is not working correctly.

    They upgrade the product every 15 to 30 days, and the process isn't seamless. It's like implementing the solution all over again. We monitor around 1,000-plus applications and have more than 100,000 agents, so we require a smooth upgrade process. It's nearly impossible to stay updated on the latest version.

    Upgrading the Dynatrace agent is smoother. You don't need to worry about it. If the agent is on the Dynatrace server, you only need to push it. After that, you will be notified to reboot the APM or CLM. That's it.

It took us three years to deploy the agent on 1,000-plus applications across 40,000-plus servers. Now, they are saying they are ending support for 7.0.49, and we need to upgrade. The path to upgrading isn't straightforward. The first step is manual, and only then can we push it out to the different servers. And what about our configuration? Who is going to redo the configuration?

It's neither typical nor practical, and I don't understand how the product teams don't see that. That feature is simply not there. We hope they add it to the new product, DX Platform, into which all of those network monitoring tools will be combined.

All the monitoring functionality is being moved to DX Platform. You can't see a trend of your metrics grouped by the last month, six months, one year, and so on. The resolution is not there either; I want granular visibility into data captured in the last 15 seconds. Those are essential features.

    I am not saying that DX lacks solid features, but they need to consider it. Some core functionality of the product is missing. We have around 50-plus requests to add previously available features in the on-premise version. That is one reason application teams are reluctant to go to DX SaaS. We are struggling to make them understand and trying to find alternatives for the existing features.

    We've had many discussions with the product team, telling them we need this functionality. However, they tell us it's not on their product roadmap. They are gradually adding other features, but we need our requirements to be a priority. You cannot say you will try to add those requested features that aren't on your product roadmap. 

    There is always a catch in the product. We use around 10 tenants in production and six in the test run. First of all, there is nothing in the pane. If we are trying to see the data from an application, how do we know which tenant and application are reporting? There was a feature called Enterprise Team Center, but that functionality has been removed.

All the applications are connected to the manager, which is connected to the ETC. If you go to the ETC, you can find the server and see your data, but that functionality is no longer there. Every product should have a management feature, but it is missing, and they say it is not on the roadmap either. It is a basic requirement; they need to understand that.

They have created a new tenant page temporarily, but it is not there currently, and it is not what we require. There was a feature called Domain, but that concept is gone. We've struggled a lot, and what they provided in the initial migration stage is no longer working. We were delayed for two months because they didn't give us the correct input. They don't know their own product. We tell them there is a problem, and they say they're fixing it. Are we their guinea pigs? You cannot treat your customers like this.

    For how long have I used the solution?

    I have used DX for more than two years.

    What do I think about the stability of the solution?

    We were using the on-premise version and moved to DX SaaS. A lot of metrics were clamped when we were on-premise. There is a limit to the number of agents it can handle. You will get a clamp once that's exceeded. That agent will not be able to pull the latest metrics from the application.

    There is no feature to tell us which agent is clamping and why. What actions can we take to reduce clamping? Suppose that agent selects unnecessary metrics that won't be useful for the application team. We must stop them and tell the agent not to pull the metrics.

Finding that information on the dashboard was challenging when we were on-premises. They still had the same problem when we moved to DX SaaS. They cannot fix it, but they are tweaking some of the hardware sizes of containers. Sometimes they increase the JVM heap sizing.

    What do I think about the scalability of the solution?

    Scalability isn't a problem. If you are paying for an enterprise-level product, it should be able to handle around 1,000 agents successfully and provide end-to-end visibility of application transaction processes. It may work for a smaller environment with 50 applications or 5,000 agents. 

A customer may be happy with the outcome of the product or the value added. Still, you need a different approach for a big implementation like ours, with 1,000-plus applications and 30,000 to 40,000 servers running the application agent. We are also using the APM manager. You need a different approach altogether. It's not that this product is terrible, but it is not built for our requirements. There is a performance issue now and then.

    How are customer service and support?

    I rate Broadcom customer support six out of 10. I have stopped submitting tickets because every time they say, "We can't do anything else. Our product team will handle it." They will raise a feature request that will be tracked separately. I need a resolution when I submit a case.

Why should I struggle? I know it's an issue, and support should fix it, but their priority is showing their manager how many tickets they have closed. I have been working on this product for the last 15 years. I know what support and operational services can do. I understand that, but I need the support engineer's intent to be clear. If it isn't, I will not open a new ticket. Why should I go and raise a support ticket for every issue I have encountered?

    I sincerely think they have released a good product, but they haven't tested it properly. They are updating the product to keep pace with their competitors. We use it to identify the missing features and tell them to add them. I think that is not the right way to develop a product.

    I have worked with Broadcom for many years, and I appreciate its technical support. They always respond quickly, but the outcome is often unsatisfactory. There have been many changes since Broadcom acquired CA. The strength of their support has been reduced. There are significantly fewer people, and they're handling more tickets. 

    I understand all these things, but I still interact with some of the top-notch Broadcom support. They will resolve the issue, then get off the call. I don't care if it takes three hours or 24 hours. I appreciate them, but, I am not seeing that approach with the technical support team I am currently working with.

    How would you rate customer service and support?

    Neutral

    How was the initial setup?

    We did the initial setup a while ago and then migrated from on-premise to DX SaaS. That process was relatively easy because we laid the groundwork well. We spent two months preparing ourselves and planning our strategy. Our deployment team includes around eight members offsite and one person on site. 

    Resources aren't a constraint. We have had this solution for five years, so we know the process in and out. We have a good team with the experience needed to handle the implementation and provide application support when needed.

    It took us around three years to deploy. It's more than simply deploying the agent. The whole goal of an APM in our environment is end-to-end transaction tracing. In the first year, we deployed all the agents. The second phase is end-to-end tracing, but we are unable to activate this for many applications. 

    Implementing APM aims to provide added value to the application team and the customer. If you want to show that, you need to have end-to-end visibility. Out-of-the-box end-to-end tracing isn't possible in Broadcom APM. You need to do a lot of configuration and identify missing components during the deployment. 

    What's my experience with pricing, setup cost, and licensing?

    I don't know the precise cost, but the licensing is based on the number of agents.

    Which other solutions did I evaluate?

    I have attended some presales presentations and training sessions for AppDynamics and Dynatrace.

    What other advice do I have?

    I rate DX SaaS seven out of 10. I have worked on DX APM for many years, and they are lagging behind the leading APM products. Many of the features readily available in other products are not there. At the same time, I am not saying it's a bad product. Every solution has its pain points. By that same token, I can't say Dynatrace is without issues. Unless you spend time working hands-on with a product, everything is theoretical.

    Which deployment model are you using for this solution?

    Public Cloud
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    Suraj Branwal - PeerSpot reviewer
    Senior Analyst Monitoring and Reporting at Empower
    Real User
    Top 10
    Has glitch capturing and notification features, quickly triggers alerts, fast and user-friendly, and makes data available in a single UI
    Pros and Cons
    • "I found multiple features in AlertSite that I like. One is data monitoring history, for example, a history of the blackouts and the data. Sometimes, there's a glitch in the application that AlertSite can capture, and the tool notifies you about that glitch. It also identifies issues that could happen in the future. AlertSite is also a fast tool that you can use to monitor on-premise servers, internal applications, and even global applications. The data is available on a single UI in AlertSite, so there's no need to monitor and click on different applications or URLs. I also love that the tool is quite user-friendly, apart from being fast."
    • "One area that needs improvement in AlertSite is the slow UI. I'm using the classic UI because it's quite user-friendly, or in human-readable format, but it's a little slow. It takes time to load all the monitors, so if that could be sped up, that would be helpful. The AlertSite support team also needs improvement in terms of availability. Most of the agents in the team aren't available 24 x 7, so if there's something you need to discuss, you need to tell the support team ahead of time to schedule it. Multiple times, when a case comes up during weekends or whenever my team is doing an update, support needs improvement, availability-wise. I use AlertSite for synthetic monitoring, so whatever is available right now is okay, but if I want to request a new feature on a website, currently, the UI refreshes automatically. I would like to have the option on AlertSite to configure that manually. The UI auto-updates every few minutes, so if I can control that manually, that would be helpful."

    What is our primary use case?

    My team uses AlertSite for multiple types of monitoring requirements. The main purpose of the tool is synthetic monitoring where you navigate to the website login, find some patterns, then save the searches.

AlertSite also has a feature called Deja-Click where you can create a recording and configure it against the website. Based on the timeframe, it executes each recording on schedule, so if there are changes in the UI, the recording gets stuck and the tool notifies you.

    AlertSite is also being used for URL checks, TCP connections, API monitoring, HTTP monitoring, etc., depending on internal applications.
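For readers unfamiliar with what these check types do, a minimal stand-in can be written with the standard library alone. This is a sketch, not AlertSite's implementation: the TCP check attempts a connection, and the URL check maps the HTTP status code onto a monitor state (the ok/warning/error bands here are an assumed convention, not AlertSite's):

```python
"""Minimal stand-ins for AlertSite-style URL and TCP checks (sketch)."""
import http.client
import socket
from urllib.parse import urlparse

def check_tcp(host, port, timeout=3.0):
    """True if a TCP connection can be established within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def classify_status(code):
    """Map an HTTP status code to a monitor state (assumed convention)."""
    if 200 <= code < 400:
        return "ok"
    if 400 <= code < 500:
        return "warning"   # client-side problem, server still responding
    return "error"         # 5xx or anything unexpected

def check_url(url, timeout=5.0):
    """GET the URL and classify the response, or 'error' on failure."""
    u = urlparse(url)
    cls = http.client.HTTPSConnection if u.scheme == "https" else http.client.HTTPConnection
    conn = cls(u.netloc, timeout=timeout)
    try:
        conn.request("GET", u.path or "/")
        return classify_status(conn.getresponse().status)
    except OSError:
        return "error"
    finally:
        conn.close()
```

A hosted tool like AlertSite adds what a script cannot easily provide: global probe locations, scheduling, history, and alert routing on top of checks of this shape.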

    How has it helped my organization?

    AlertSite benefits our company because we have configured or integrated it with PagerDuty for the alerts, especially because there are specific critical applications we need to check every minute or two, based on our requirements. These critical alerts are configured on AlertSite via PagerDuty, then connected to ServiceNow, so whenever there are issues, alerts are triggered very quickly. Everyone in our team uses AlertSite, and anyone on call finds the tool very useful.
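The PagerDuty side of an integration like this typically means sending a "trigger" event to the public Events API v2 when a monitor fails. The sketch below follows that API's documented body shape (`routing_key` / `event_action` / `payload`); the routing key, monitor name, and dedup-key scheme are placeholders:

```python
"""Build a PagerDuty Events API v2 trigger for a failed monitor (sketch).

Assumptions: routing key and monitor names are placeholders; the
dedup_key scheme is invented so later resolves match the trigger.
"""
import json
import urllib.request

EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

def build_trigger(routing_key, monitor, severity="critical"):
    """Assemble an Events v2 'trigger' body for a failed check."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "dedup_key": f"alertsite/{monitor}",   # lets resolves match triggers
        "payload": {
            "summary": f"Monitor {monitor} failed its check",
            "source": monitor,
            "severity": severity,              # critical/error/warning/info
        },
    }

def send(event):
    """POST the event to PagerDuty."""
    req = urllib.request.Request(
        EVENTS_URL, data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

if __name__ == "__main__":
    send(build_trigger("YOUR-ROUTING-KEY", "checkout-login"))
```

From there, PagerDuty's own rules decide who is paged, and a downstream ServiceNow integration can pick the incident up, matching the flow the reviewer describes.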

    What is most valuable?

    I found multiple features in AlertSite that I like. One is the monitoring data history, for example, a history of blackouts and the underlying data. Sometimes there's a glitch in the application that AlertSite can capture, and the tool notifies you about that glitch. It also helps identify issues that could happen in the future.

    AlertSite is also a fast tool that you can use to monitor on-premise servers, internal applications, and even global applications.

    The data is available on a single UI in AlertSite, so there's no need to monitor and click on different applications or URLs.

    I also love that the tool is quite user-friendly, apart from being fast.

    What needs improvement?

    One area that needs improvement in AlertSite is the slow UI. I'm using the classic UI because it's quite user-friendly, or in human-readable format, but it's a little slow. It takes time to load all the monitors, so if that could be sped up, that would be helpful.

    The AlertSite support team also needs improvement in terms of availability. Most of the agents on the team aren't available 24 x 7, so if there's something you need to discuss, you have to tell the support team ahead of time to schedule it. Multiple times, a case has come up during a weekend or while my team was doing an update, and support wasn't available.

    I use AlertSite for synthetic monitoring, and what's available right now is okay for that. If I were to request a new feature, it would be this: currently, the UI auto-updates every few minutes, and I would like the option in AlertSite to control that refresh manually. If I could control it manually, that would be helpful.

    For how long have I used the solution?

    I've been using AlertSite for more than five years.

    What do I think about the stability of the solution?

    AlertSite is a stable and reliable tool. Though there may have been a few times it wasn't, most of the time it's reliable and it works.

    What do I think about the scalability of the solution?

    AlertSite is quite good in terms of scalability. It's already been scaled in my company, and I found it very easy to scale. Each hardware appliance has a licensed capacity, and my team can increase the number of appliances based on requirements, so at that level, the tool is quite easy to scale.

    How are customer service and support?

    In terms of support, AlertSite has documentation. Though it's not completely up to date, I find most of the details are available, and people can try to sort things out themselves, particularly for basic issues.

    The issue is the response time of AlertSite technical support during weekends. On weekdays, the response time is quick, and there's an on-call number you can contact. However, my team normally does updates during weekends and at midnight, when people are in bed asleep, so you need to make the AlertSite support team aware in advance that you'll be doing an update and will need support. You have to schedule that with the team, and it takes time.

    How was the initial setup?

    In terms of setting up AlertSite, for the first version, my team didn't have to do anything because that was handled by the support team. It was just handed over to us after it was set up.

    As for the on-premise servers, there was a private node that required my team to work with the AlertSite support team to receive the hardware in the data center. After the hardware arrived, my team had to work with the network security team to open the firewall for the URL sensor, because the server appliance was in the data center. After the firewall was opened, it was pretty straightforward.

    The AlertSite team had the documentation, so whatever was required, such as codes, was provided. My team then worked with the networking team to get the firewall opened, and after that we could connect and use the tool.

    I was the only one on my team who did most of the updates, installation, and private nodes. AlertSite deployment only requires one person for an hour or two of work.

    How long the initial setup takes depends on the availability of the internal teams; for example, there's a required SLA for the firewall change, which depends on the internal team. It would take two to three days to receive the hardware, and once received, it would be taken to the data center, configured there, and so on.

    What's my experience with pricing, setup cost, and licensing?

    The management team handles the licensing policy for AlertSite. I have some billing details, but I'm unable to share them.

    Which other solutions did I evaluate?

    I've focused on enterprise monitoring for the last nine years, so I've worked on multiple monitoring tools such as AppDynamics and Splunk apart from AlertSite, and currently, my team is checking on Elasticsearch as well.

    AppDynamics, NudgeAPM, and AlertSite are used for different types of monitoring. The things you can do with AppDynamics can't be done in AlertSite, and because AlertSite is used for synthetic monitoring, you can't do that in AppDynamics either. All of these tools have different functionalities.

    For synthetic monitoring, for example, for your connections and checks, you can configure AlertSite, and the tool is comfortable to use. My team just needs to configure AlertSite and hand it over to the application team, and then it's easier for that team to monitor.

    What other advice do I have?

    I have experience with two versions of AlertSite. One is the SaaS version with UXM, the older version. The other is the private node version, where you install private node servers in the data center, and my team connects to the UI to get all the monitoring data.

    In terms of AlertSite maintenance, the support team releases updates for the private node servers, so you have to keep them updated to address vulnerabilities. Whenever a new update is released, you have to apply the software internally, and that's it. Sometimes it's just a quick update that the AlertSite team applies without requiring my team to do anything; my team just receives a notification from AlertSite.

    The enterprise monitoring team I'm part of has eight members. Three of them look into AlertSite issues, so when an application has an issue, the application team contacts my team to identify the error, and my team navigates the AlertSite monitor to get to the root cause.

    My team uses AlertSite daily, but upgrades happen rarely, for example, just once a quarter, depending on the release.

    My advice to anyone planning to implement AlertSite, especially if you want to install private nodes, is to get guidance from the support team and work with the ops team as well to get the implementation done without wasting time configuring it by yourself.

    I'm rating AlertSite eight out of ten because it needs improvement in UI performance and support.

    My company doesn't have a partnership with AlertSite.

    Which deployment model are you using for this solution?

    Hybrid Cloud
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    Hani Khalil - PeerSpot reviewer
    Service Assurance , Senior Manager at a computer software company with 1,001-5,000 employees
    Real User
    Top 5
    Great visibility, easy to set up, and has very responsive technical support
    Pros and Cons
    • "Sometimes when we face issues with the new technologies or very old technologies where we cannot enhance the service, they move to work with us directly and start doing some development on this area which is very good for us."
    • "The solution needs to enhance the management dashboards."

    What is our primary use case?

    We publish more than 140 business services to the whole Kingdom of Saudi Arabia. We need to monitor the behavior of service from the customer's perspective which is called real user monitoring. This solution gives us visibility on that. In the same tool, there's another part for diagnostics where we can drill down to the code level and find issues with the code or the data to see, for example, web services code. 

    How has it helped my organization?

    We have a service serving the whole Kingdom of Saudi Arabia. Anyone who wants to do a transaction on vehicles, or anything related to public relations, needs to log into the portal. The solution has made that easy and has enhanced protection.

    Another way it benefits the organization is in the number of calls from customers. Where before we would get reports from customers when service was interrupted, we now know from eG what the issues are, and the team can directly drill down to find the cause. We can find out if it's from our applications, our data center, or a third-party integration, as the service integrates with many other companies and ministries in Saudi Arabia.

    In this way, we reduce downtime for the service, the number of impacted users, and the issues customers face. It allows us to solve issues faster.

    What is most valuable?

    The visibility and the ability to monitor user behavior are very useful to us. So is the fact that we have diagnostic capabilities. We divided the usage of these two parts - one for our business team and one for our support team where they can see the availability, performance, and application or service updates from a customer perspective. 

    We appreciate that the team can understand the issue from our application, database, code, data integration, et cetera. 

    We like that two different teams can use it and it fits each of their individual needs. 

    Sometimes when we face issues with the new technologies or very old technologies where we cannot enhance the service, they move to work with us directly and start doing some development on this area which is very good for us.

    The initial setup is pretty simple. 

    The solution can scale.

    Technical support has been great.

    What needs improvement?

    The solution needs to enhance the management dashboards. They have small dashboards, and you can do small customizations on them; however, when you have business requirements that call for executive dashboards, it costs us too much. They need to develop that capability.

    We are looking to have tenure-based services covered in eG. They promised us this in the future.

    For how long have I used the solution?

    I've been aware of the solution for 20 years. We recently did a POC and we've been using it for about a year or so now. 

    What do I think about the stability of the solution?

    The solution has been very stable. We faced one issue related to traffic volume; after we increased the JVM heap on the servers and added some resources, it became stable. So stability is great for us.

    What do I think about the scalability of the solution?

    The scalability is quite good. We do have plans to expand usage in the future.

    The main thing here is that we have something called connectors. These must be published over the internet to connect, and we're able to select the services for each connector, which means it's scalable for us.

    How are customer service and technical support?

    Technical support is very responsive. If there's a critical issue, they connect directly with you the same day and do their best to solve it. If there needs to be some development or an escalation, they'll do it the same day as well.

    Which solution did I use previously and why did I switch?

    We previously had Dynatrace. It was costly already, and then they changed the licensing scheme and made it very, very expensive. It cost us almost 3 million Saudi Riyal a year to cover our own services, which is why we decided to switch.

    How was the initial setup?

    The initial setup is not complex. It's almost easy to implement, especially the real user monitoring part, where you just need to generate a tag for each service and share it with the development team.

    We have a team for monitoring. We manage almost 10 tools, one of them being eG. We have one primary administrator and one backup, and both did the implementation together. It doesn't take too much time actually. Once you know the procedure it's okay.

    The deployment took about two weeks.

    What about the implementation team?

    We did the implementation by ourselves based on the best practices from the data center. We asked them just to do a health check for us just to see how things looked overall as sometimes you need to do some calculations in terms of the traffic volume that you're going to receive and how many agents will be reporting to your system. You need to make sure everything is accurate and it's best to have a second opinion.

    What was our ROI?

    We definitely have seen an ROI. We, for example, cut the costs by using the tool. We also reduced the number of calls. This was one of the main value-adds for our operations department. We were able to cut service requests by 75%.

    What's my experience with pricing, setup cost, and licensing?

    We paid about 300,000 Saudi Riyal for the solution, and it was quite affordable compared to the competition. That's less than $100K for a subscription covering almost 150 services, or 100 system licenses, where one portal counts as a license and each agent counts as a license. We run almost 30 portals with 10 integration services. We utilize only 85 agents, and under this license we can manage more.

    If a user wants to try the solution, they can do a one-year license and, if they decide to continue, can do a perpetual license.

    If you need a professional service from their side, it costs nothing extra.

    Which other solutions did I evaluate?

    I have done POCs on many tools, and I found this solution to be one of the easiest, even in the administration part.

    I tested AppDynamics, Dynatrace OneAgent, and Micro Focus ADM. These we have on-premise. We have one tool, Catchpoint, on the cloud; it's quite easy and not very technical, due to the fact that it runs in the cloud.

    One of the main reasons we moved ahead with eG is not only that it's an on-premise deployment but also the cost. When we selected the tool, we were at a point where we needed to cut costs for the whole operations department. We could not move ahead with Dynatrace or AppDynamics, as it would be close to half a million to cover all the services, whereas eG only cost us around 300,000 Saudi Riyal and would cover 85 to 90% of the business requirements. We saw everything we needed at that time, and we got it at a proper cost.

    What other advice do I have?

    We're just a customer.

    Whether the solution is right for a company depends on the environment, especially if your environment is very complex, like ours, with many security requirements. You need to test everything before buying it.

    From almost six months of doing POCs on other tools to find a good one for us, I noticed that 95% of these tools use the same technology and the same behavior; however, you need to know your business requirements to decide which would work best for you. If you decide based only on a fancy interface, you will pay a lot of money for it. However, if you just need to get value for the technical team and then present the data any way you need, this product would be the better choice.

    I'd rate the solution at an eight out of ten overall.

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    RangaNathan - PeerSpot reviewer
    Technical Consultant at a manufacturing company with 5,001-10,000 employees
    Real User
    The best full stack observability compared to any other tool
    Pros and Cons
    • "For full stack observability, Elastic is the best tool compared with any other tool."
    • "Elastic APM's visualization is not that great compared to other tools. Its number of metrics is very low."

    What is our primary use case?

    Elastic APM is a kind of log aggregation tool and we're using it for that purpose. 

    What is most valuable?

    Elastic APM is very new, so we haven't explored it much, but it's quite interesting. It comes as a free offering included in the same license, so we are looking to explore more. It is still not as mature as other tools related to application performance monitoring, like Kibana, AppDynamics, or New Relic products. Elastic APM is still evolving, but it's quite interesting to get all the similar options and features in it.

    What needs improvement?

    In terms of what could be improved, Elastic APM's visualization is not that great compared to other tools, and its number of metrics is very low. Its JVM metrics are limited to things like CPU and memory usage, with thread usage on top of that; they're not giving much in the way of application performance metrics. In that respect, they have to improve a little. Compared with other tools such as New Relic, which also doesn't give many insights, it would be good to get internal calls or to see backend calls; we are not getting that kind of metric.

    On the other hand, if you go to the trace view, it gives you a good view of backend calls. That backend call view captures everything, and we need some kind of control over it, which it does not have. For example, if I don't want some of the SQL statements selected, there should be controls for that. Moreover, you need to do all of this manually; nowadays, imagine any product opting to do configuration manually, that would be really disastrous. We don't want to do it manually. This needs to be done either by API or by some kind of automated procedure. Because installing the APM agent is manual, we would need to tune it so that APIs are available on the APM side. That's one drawback.

    Additionally, the synthetic monitoring and real user monitoring services are not available here. Whereas in New Relic the user does get such services.

    The third drawback I see is on the control side. For now, only one role is defined for this APM. If I want to restrict a user to a domain, for example, if your organization has two or three domains, domain A, domain B, and domain C, and you want to give access only to a specific domain or a specific application, I am not sure how to do that here.

    Both synthetic and process monitoring should be improved. For JVM and Java process monitoring, and any process monitoring, they have to offer more metrics and a breakdown of things like TCP/IP; the tools don't provide many metrics in that area. You get everything, but you fail to visualize it. New Relic focuses only on transactions, and Elastic APM focuses on similar things, but I am still looking for other options like thread usage, backend calls, and front-end calls, or how many front-end and backend calls there are. That kind of metric is definitely required.

    We don't have much control. For example, some backend calls trigger thousands of prepared statements, update statements, or select statements, and we have no control over that. If I only want select statements, not update statements, that kind of control should be there and properly supported. The property file is very big and still manual, so if you want to control agent properties, you need UI control or API control. Nowadays, the world is looking to the API side to be able to develop more smartly; people are looking for these kinds of options to enrich their dashboard creation and management.
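For context on what "manual" means here: the Elastic APM Java agent is configured through a flat properties file. A minimal sketch, assuming the commonly documented option names (all values below are placeholders, not from this deployment), looks like this:

```properties
# elasticapm.properties - read by the Elastic APM Java agent at startup
service_name=billing-service
server_urls=http://apm-server.internal:8200
secret_token=REPLACE_ME
application_packages=com.example.billing
# Sampling and instrumentation tuning live in this same flat file;
# there is no built-in UI to change them per statement type, which
# is the kind of control the reviewer is asking for.
transaction_sample_rate=0.5
```

Every change of this kind means editing the file and restarting or reloading the agent, which is what makes API-driven or UI-driven configuration attractive at scale.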

    For how long have I used the solution?

    I'm new to Elastic APM, but I have very good APM knowledge, since I have been using APM tools for almost 10 years, and Elastic APM for just two years. I see that Elastic APM is still evolving.

    How are customer service and technical support?

    Elastic APM's technical support is pretty good, and we have a platinum license for log aggregation. They respond very quickly and follow a very good strategy: they have one dedicated resource especially for us. I'm not sure whether that is common for other customers, but they assigned a dedicated resource, so for any technical issue, that dedicated resource responds. If that resource is busy or unavailable, someone else will take the call or respond with support. That way, Elastic support fully understands your environment.

    Otherwise, if you go with the global support model, they have to understand your environment first and keep asking the same questions again and again: how many clusters do you have, what nodes do you have, those kinds of questions. Then you need to supply that diagnostic information, which is a challenge. With a dedicated support resource, they usually don't ask these questions, because they already understand your environment well from working with you on previous cases. In that sense, they provide very good support and answer questions immediately.

    They provide immediate support; usually they get back to you the same day or the next day. I think it's pretty good compared to any other support, even very good compared to New Relic.

    What other advice do I have?

    There are two advantages to Elastic APM. It is open source, and if somebody wants to try it out in their organization, it's free to use. Also, it has full stack observability. For full stack observability, Elastic is the best tool compared with any other, like New Relic, AppDynamics, or Dynatrace. I'm not sure about Dynatrace, since I never worked with it, but I have worked with AppDynamics and New Relic. However, on the log aggregation side, there is still a lot to be implemented here.

    I'd like greater flexibility, meaning we would get all the system logs, all the cloud logs, all kinds of logs aggregated in a single location. On top of that, if they had better metrics for handling the data together, it would give a great advantage for observability. The observability platform is pretty good because you already have log data and information like that; if you just add APM data and visualize it, you get much-needed information about how you are going to visualize and identify issues.

    For this purpose, Elastic is best. If you are really looking for an observability platform, Elastic provides both options, APM plus log aggregation. But they still have to improve, or provide APIs, for synthetic monitoring, internet monitoring, and so on. If I think about synthetic monitoring, you can't compare Elastic with New Relic today; New Relic is much better there.

    These are the improvements they have to look at. They support similar functionality for synthetic monitoring, so it's not a hundred percent APM-complete, but if you look at their observability platform, full stack observability together with log aggregation, Elastic APM is a great advantage.

    On a scale of one to ten, I would rate Elastic APM an eight out of ten.

    Disclosure: I am a real user, and this review is based on my own experience and opinions.