Datadog Primary Use Case
We use Datadog for log aggregation and management, metrics aggregation, application performance monitoring, infrastructure monitoring (serverless Lambda functions, EKS containers, and standalone EC2 hosts), database monitoring (RDS), and alerting based on metric thresholds and anomalies, log events, APM anomalies, forecasted threshold breaches, host behaviors, and synthetic tests.
Datadog serves a whole host of purposes for us through an all-in-one UI, with the integrations between those capabilities built in and handled without any effort required from us.
We use Datadog for nearly all of our monitoring and information analysis from the infrastructure level up through the application stack.
We primarily use the solution for a variety of purposes, including:
- Watching RUM data for frontend site, using LCP and INP metrics to compare across the old and new architecture to inform rollout decisions.
- Watching APM data for backend services, observing how the backend server reacts (CPU util, memory, requests/second) to make sure the backend can handle the load.
- Using Datadog CCM during our free trial period to get visibility into our AWS spend across accounts and resources, and reviewing its recommendations and acting on them.
- Browsing the service catalog to look at the current state of the services that are running and what resources they use.
We use Datadog for monitoring and observing all of our systems, which range in complexity from lightweight, user-facing serverless lambda functions with millions of daily calls to huge, monolithic internal applications that are essential to our core operations. The value we derive from Datadog stems from its ability to handle and parse a massive volume of incoming data from many different sources and tie it together into a single, informative view of reliability and performance across our architecture.
We have multiple nodes integrated into our Azure infrastructure and our AKS clusters. These nodes are integrated with traces (as APM hosts).
We also have infrastructure hosts integrated so we can see the metrics and resources of each host, mainly for Azure VMs and AKS nodes. Additionally, we have hosts from our Azure VMs that run ActiveMQ, and we integrate them as messaging queues so they show up in the ActiveMQ dashboard.
We have recently added ActiveMQ as containers in AKS, and we are also integrating those as messaging queues to show up in the ActiveMQ dashboard integration.
We utilize Datadog to monitor both some legacy products and a new PaaS solution that we are building out here at Icario, which uses a microservice architecture.
All of our infrastructure is in AWS, with very few legacy systems remaining on Rackspace. For the PaaS, we mainly utilize the K8s orchestrator, which brings the APM libraries into services deployed there as well as giving us infrastructure information about the cluster.
For the legacy systems, we mainly utilize the Agent or the AWS integration, with APM in specific places. We monitor mainly prod in the legacy systems and the full scope in the PaaS for now.
The primary use case of Datadog within our organization encompasses providing a comprehensive and sophisticated solution that caters to the diverse needs of our internal customers. We have strategically implemented Datadog to serve as a centralized platform for monitoring, analyzing, and optimizing various aspects of our operations. With a robust suite of functionalities, Datadog empowers us to meet the dynamic requirements of over 40 internal customers efficiently.
Through Datadog, we offer a wide array of services to our internal stakeholders, allowing them to access and leverage its capabilities to enhance performance, troubleshoot issues, and make data-driven decisions. The tool's versatility enables different teams within our organization to monitor and track distinct metrics, such as application performance, infrastructure health, and logs, tailored to their specific requirements.
Moreover, Datadog serves as a pivotal component in our organizational ecosystem by streamlining processes, enhancing collaboration, and fostering a culture of data-driven decision-making. By harnessing the power of Datadog, our internal customers can proactively address issues, optimize resources, and ultimately improve operational efficiency across the board.
In essence, the primary use case of Datadog in our organization revolves around empowering our internal customers with a comprehensive and feature-rich solution that enables them to monitor, analyze, and optimize various aspects of our operations seamlessly and effectively. This strategic implementation of Datadog plays a vital role in enhancing our overall performance, fostering transparency, and driving continuous improvement within our organization.
We use Datadog across the enterprise for observability of infrastructure, APM, RUM, SLO management, alert management and monitoring, and other features. We're also planning on using the upcoming cloud cost management features and product analytics.
For infrastructure, we integrate with our Kube systems to show all hosts and their data.
For APM, we use it with all of our API and worker services, as well as cronjobs and other Kube deployments.
We use serverless to monitor our Cloud Functions.
We use RUM for all of our user interfaces, including web and mobile.
The primary use case for this solution is to enhance our monitoring visibility, determine the root cause of incidents, understand end-user behaviour from their point of view (RUM), and understand application performance.
Our technical environment consists of a local dev env where Datadog is not enabled, plus deployed environments that range from UAT testing with our product org to ephemeral stacks that our developers use to test their code off their own computers. We also have a mobile app where testing is performed.
We use Datadog for monitoring the performance of our infrastructure across multiple types of hosts in multiple environments. We also use APM to monitor our applications in production.
We have some Kubernetes clusters and multi-cloud hosts with Datadog agents installed. We have recently added RUM to monitor our application from the user side, including session replays, and are hoping to use those to replace our existing monitoring for errors and session replay for debugging issues in the application.
We use the solution to monitor and investigate issues with production services at work. We periodically review the service catalog view for the various applications, and I use it to identify any anomalies with service metrics, any changes in user behavior evident via API calls, and/or spikes in errors.
We use monitors to trigger alerts for on-call engineers to act upon. The monitors have set thresholds for request latency, error rates, and throughput.
We also use automated rules to block bad actors based on request volume or patterns.
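To give a rough idea of what a threshold monitor like the ones described above can look like, here is a minimal sketch using the `datadog` Python client; the metric name, service, thresholds, and notification handle are placeholders for illustration only, not this team's actual configuration.

```python
from datadog import initialize, api

# Authenticate against the Datadog API (keys are placeholders).
initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")

# Hypothetical latency monitor: page on-call when average request latency
# for a service stays above 500 ms over the last 5 minutes.
api.Monitor.create(
    type="metric alert",
    query="avg(last_5m):avg:trace.web.request.duration{service:checkout-api} > 0.5",
    name="[checkout-api] High request latency",
    message="Request latency is above 500 ms. @pagerduty-checkout-oncall",
    tags=["team:payments", "service:checkout-api"],
    options={"thresholds": {"critical": 0.5, "warning": 0.3}, "notify_no_data": False},
)
```

Similar monitors can be defined for error rates and throughput by swapping in the appropriate metric and threshold.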
The current use case for Datadog in our environment is observability. We use Datadog as the primary log ingestion and analysis point, along with consolidation of application/infrastructure metrics across cloud environments and real-time alerting on issues that arise in production.
Datadog integrates within all aspects of our infrastructure and applications to provide valuable insights into Containers, Serverless functions, Deep Logging Analysis, Virtualized Hardware and Cost Optimizations.
We are using Datadog to improve our monitoring and observability so we can hopefully improve our customer experience and reliability.
I have been using Datadog to build better, actionable alerts to help teams across the enterprise. By using Datadog, we are hoping to have improved observability into our apps, and we are also taking advantage of this process to improve our tagging strategy so teams can troubleshoot incidents faster with a much-reduced mean time to resolve.
We have a lot of different resources we use like Kubernetes, App Gateway and Cosmos DB just to name a few.
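As a flavor of the tagging-strategy work mentioned above, the sketch below shows one common way to apply unified service tags in a Python service via `ddtrace`; the env/service/version values and the extra team tag are made-up examples, not this organization's real taxonomy.

```python
import os

# Unified service tagging: the Datadog libraries and Agent pick these up and
# apply env/service/version consistently across traces, metrics, and logs.
# Set them before importing ddtrace so the tracer sees them at startup.
os.environ.setdefault("DD_ENV", "staging")         # example environment
os.environ.setdefault("DD_SERVICE", "orders-api")  # example service name
os.environ.setdefault("DD_VERSION", "1.4.2")       # example version

from ddtrace import tracer

# Additional global tags (e.g., owning team) so every span carries them,
# which is what lets responders filter straight to the right service.
tracer.set_tags({"team": "fulfillment", "cost_center": "demo"})
```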
We use the solution to monitor production service uptime/downtime, latency, and log storage.
Our entire monitoring infrastructure runs off Datadog, so all our alarms are configured with it. We also use it for tracing API performance and identifying the biggest regression points.
Finally, we use it to compare performance on SEO metrics versus competitors. This is a primary use case, as SEO dictates our position in Google traffic, which drives a large portion of our customer views, so it is a vital part of the business that we rely on Datadog for.
The product monitors multiple systems, from customer interactions on our web applications down to the database and all layers in between. RUM, APM, logging, and infrastructure monitoring are all surfaced into single dashboards.
We initially started with application logs and generated long-term business metrics out of critical logs. We have turned those metrics and logs into a collection of alerts integrated into our pager system. As we have evolved, we have also used APM and RUM data to trigger additional alerts.
Franz Kettwig
Head of Software at Emporia
Our primary use case is for custom and vendor-supplied web application log aggregation, performance tracing, and alerting.
We run a mix of AWS EC2, Azure serverless, and colocated VMware servers to support higher education web applications. Managing a hybrid multi-cloud solution across hundreds of applications is always a challenge.
Datadog agents are on each web host, and we have native integrations with GitHub, AWS, and Azure to get all of our instrumentation and error data in one place for easy analysis and monitoring.
reviewer907251
Product Engineering Manager at FMG Suite
We use the solution for APM, AWS, Lambda, logging, and infrastructure. We have many different things all over AWS, and having one place to look is great.
We have all sorts of different AWS things out there that are in C# and Node. Having a single place to log and APM into is very important to us.
Keeping track of the cloud infrastructure is also important. We have Lambda, containers, EC2, etc.
Having a super simple interface to filter searches for APM and logging is great. It is super easy to show people how to use it. This is super important to us.
Our tech stack includes backend services written mostly in TS/Node, and as a full-stack engineer, it is crucial for me to keep track of new and existing errors. Our logs have been consolidated in Datadog and are accessible for search and review, so the service has become a daily tool for my work.
More recently, session replay has been adopted at my company, but I do not like it so much because the UI elements are not in their place, so it is very hard to see what the users on the web app are actually clicking on.
We were looking for an all-in-one observability platform that could handle a number of different environments and products. At a basic level, we have a variety of on-premises servers (Windows/Mac/Linux) as well as a number of commercial, cloud-hosted products.
While it's often possible to let each team rely on its own means for monitoring, we wanted something that the entire company could rally around - a unified platform that is developed and supported by the very same people, not others just slapping their name on some open source products they have no control over.
reviewer254673
Senior Engineer at a retailer with 1,001-5,000 employees
Our primary use of Datadog involves monitoring over 50 microservices deployed across three distinct environments. These services vary widely in their functions and resource requirements.
We rely on Datadog to track usage metrics, gather logs, and provide insight into service performance and health. Its flexibility allows us to efficiently monitor both production and development environments, ensuring quick detection and response to any anomalies.
We also have better insight into metrics like latency and memory usage.
Our primary use case for this solution is comprehensive cloud monitoring across our entire infrastructure and application stack.
We operate in a multi-cloud environment, utilizing services from AWS, Azure, and Google Cloud Platform.
Our applications are predominantly containerized and run on Kubernetes clusters. We have a microservices architecture with dozens of services communicating via REST APIs and message queues.
The solution helps us monitor the performance, availability, and resource utilization of our cloud resources, databases, application servers, and front-end applications.
It's essential for maintaining high availability, optimizing costs, and ensuring a smooth user experience for our global customer base. We particularly rely on it for real-time monitoring, alerting, and troubleshooting of production issues.
We currently have an error monitor watching our prod environment. Once we hit a certain threshold, we get an alert in Slack. This helps us address issues the moment they happen, before our users notice.
We also utilize synthetic tests on many pages of our site. They're easy to set up and are great for pinpointing when a bug is shipped, even one that takes down a less-visited page we wouldn't immediately notice. It's a great extra check to make sure the code we ship is free of bugs.
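For context on the kind of Slack-alerting error monitor described above, here is a hedged sketch of a log-based error monitor created with the `datadog` Python client; the log query, threshold, and Slack handle are illustrative placeholders rather than this team's actual setup.

```python
from datadog import initialize, api

initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")

# Hypothetical log monitor: notify a Slack channel when production error logs
# exceed 100 occurrences in the last 5 minutes.
api.Monitor.create(
    type="log alert",
    query='logs("status:error env:prod service:web").index("*").rollup("count").last("5m") > 100',
    name="[prod] Error log spike",
    message="Error volume is above threshold. @slack-prod-alerts",
    tags=["env:prod", "service:web"],
    options={"thresholds": {"critical": 100}},
)
```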
We use the solution for logs, infrastructure metrics, and APM. We have many different teams using it across both product and data engineering.
View full review »Our primary use case is custom and vendor supplied web application log aggregation, performance tracing and alerting. We run a mix of AWS EC2, Azure serverless, and colocated VMWare servers to support higher education web applications.
Managing a hybrid multi-cloud solution across hundreds of applications is always a challenge. Datadog agents on each web host, and native integrations with GitHub, AWS, and Azure gets all of our instrumentation and error data in one place for easy analysis and monitoring.
We are using Datadog to improve our cloud monitoring and observability across our enterprise apps. We have integrated a lot of different resources into Datadog, like Kubernetes, App Gateways, App Service Environments, App Service Plans, and other Web App resources.
I will be using the monitoring and observability features of Datadog. Dashboards are used very heavily by teams and SREs. We really have seen that Datadog has already improved both our monitoring and our observability.
We use the solution for monitoring microservices in a complex AWS-based cloud service.
The system is comprised of about a dozen services. This involves processing real-time data from tens of thousands of internet-connected devices that are providing telemetry. Thousands of user interactions are processed along with real-time reporting of device data over transaction intervals that can last for hours or even days. The need to view and filter data over periods of several months is not uncommon.
Datadog is used for daily monitoring and R&D research as well as during incident response.
In our fast-paced environment, managing and analyzing log data and performance metrics is crucial. That’s where Datadog comes in. We rely on it not just for monitoring but for deeper insights into our systems, and here’s how we make the most of it.
One of the first things we appreciate about Datadog is its ability to centralize logs from various sources—think applications, servers, and cloud services. This means we can access everything from one dashboard, which saves us a lot of time and hassle. Instead of digging through multiple platforms, we have all our log data in one place, making it much easier to track events and troubleshoot issues.
We use the product for instrumentation, observability, monitoring, and alerting of our system.
We have multiple environments and a variety of pieces of infrastructure including servers, databases, load balancers, cache, etc. and we need to be able to monitor all of these pieces, while also retaining visibility into how the various pieces interact with each other.
Tracing data between components and user interactions that trigger these data flows is particularly important for understanding where problems arise and how to resolve them quickly.
We use the solution for APM, anomaly detection, resource metrics, RUM, and synthetics.
We use it to build baseline metrics for our apps before we start focusing in on performance improvements. A lot of times that’s looking at methods that take too long to run and diving into db queries and parsing.
I’ve used it in multiple configurations in AWS and Azure. I’ve built it using Terraform as well as by hand.
I’ve used it predominantly with Ruby and Node and a little bit of Python.
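Since this reviewer mentions using the APM libraries with Python among other languages, here is a minimal `ddtrace` sketch of the kind of custom instrumentation that helps surface slow methods and queries; the function, span names, and data are hypothetical, and the exact helpers available can vary by ddtrace version.

```python
from ddtrace import patch_all, tracer

# Auto-instrument supported libraries (web framework, DB driver, etc.)
# so their spans show up in APM without manual wiring.
patch_all()

# Manually wrap a suspect function to see how long it takes relative to
# the rest of the request, e.g. a report rollup we think is slow.
@tracer.wrap(name="reports.generate", resource="monthly_rollup")
def generate_monthly_rollup(account_id: int) -> list:
    # Child span around the part we suspect is slow (stand-in for a DB call).
    with tracer.trace("reports.load_rows", resource="SELECT rollup rows"):
        rows = [{"account_id": account_id, "total": 0}]
    # Second child span for post-processing, so the flame graph splits the time.
    with tracer.trace("reports.parse"):
        return [dict(r, parsed=True) for r in rows]
```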
Traci Ortiz
Works at Koddi
We monitor our multiple platforms using Datadog and post alerts to Slack to notify us of server and end-user issues. We also monitor user sessions to help troubleshoot an issue being reported.
We monitor 3.5 platforms on our Datadog instance, and the team always monitors the trends and Dashboards we set up. We have two instances to span the 3.5 platforms and are currently looking to implement more platform monitoring over time. The user session monitoring is consistent for one of these platforms.
From day one, we have seamlessly integrated our new product into Datadog, a comprehensive monitoring and analytics platform. By doing so, we are continuously collecting essential data such as host information, system logs, and key performance metrics. This enables us to gain deep insights into product adoption, monitor usage patterns, and ensure optimal performance. Additionally, we use Datadog to capture and analyze errors in real-time, allowing us to troubleshoot, replay, and resolve production issues efficiently.
Internally, our primary usage of Datadog centers on APM/tracing, logging, RUM (real user monitoring), synthetic testing of service/application health and state, overall general monitoring and observability, and custom dashboards for aggregate observability. We are also more frequently leveraging the more recent service catalog feature.
We have several microservices, several databases, and a few web applications (both external- and internal-facing), and all of these within our systems are contained within several environments ranging from dev and SIT to UAT and production.
reviewer902462
Senior Software Engineer at Clearstory.build
Our primary use case for Datadog is monitoring, analyzing, and optimizing the performance and health of our applications and infrastructure.
We leverage its logging, metrics, and tracing capabilities to pinpoint issues, track system performance, and improve overall reliability. Datadog’s ability to provide real-time insights and alerting on key metrics helps us quickly address issues, ensuring smooth operations.
It’s integral for visibility across our microservices architecture and cloud environments.
Our primary use case for Datadog involves utilizing its dashboards, monitors, and alerts to monitor several key components of our infrastructure.
We track the performance of AWS-managed Airflow pipelines, focusing on metrics like data freshness, data volume, pipeline success rates, and overall performance.
In addition, we monitor Looker dashboard performance to ensure data is processed efficiently. Database performance is also closely tracked, allowing us to address any potential issues proactively. This setup provides comprehensive observability and ensures that our systems operate smoothly.
We have several teams and several different projects, all working in tandem, so there are a lot of logs and monitoring that need to be done. We use Datadog mostly for alerting when things go down.
We also have several dashboards to keep track of critical operations and to make sure things are running without issues. The Slack messaging is essential in our workflow in letting us know when an alert is triggered. I also appreciate all the graphs you can make, as it gives our team a good overview of how our services are doing.
We're using the product for logging and monitoring of various services in production environments.
It excels at providing real-time observability across a wide range of metrics, logs, and traces, making it ideal for DevOps teams and enterprises managing complex environments.
The platform integrates seamlessly with our cloud services, but browser-side logging is lagging a little.
Dashboards are very useful for quick insights, but they can be time-consuming to create, and the learning curve is steep. Documentation is vast, but not as detailed as I'd like.
Our primary use cases include:
- Alert on errors customers encounter in our product. We've set up log alerts that go to Slack to tell us when a certain error threshold is hit.
- Investigate slow page load times. We have pages in our app that are loading slowly and the logs help us figure out which queries are taking the longest time.
- Metrics. We collect metrics on product usage.
- Session replays. We watch session replays to see what a user was doing when a page took a long time to load or hit an error. This is helpful.
We use Datadog as our main log ingestion source, and Datadog is one of the first places we go to for analyzing logs.
This is especially true for debugging, monitoring, and alerting on errors and incidents, as we send traffic logs from K8s, Amazon Web Services, and many other services at our company to Datadog. In addition, many products and teams at our company have dashboards for monitoring statistics (sometimes based on these logs directly, other times on queries we set for these metrics) to alert us if there are any errors or health issues.
The solution is used for full stack enterprise performance monitoring for our primarily cloud-based stack on AWS. We have implemented monitoring coverage using RUM for critical apps and websites and utilize APM (integrated with RUM) for full stack traceability.
We use Datadog as our primary log repository for all apps and platforms, and the advanced log analytics enable accurate log-based monitoring/alerting and investigations.
Additionally, we use some advanced RUM capabilities and metrics to track and optimize client-side user experience. We track SLOs for our critical apps and platforms using Datadog.
Charlie W.
SecOps Engineer at Ava Labs
Our primary use case is custom and vendor-supplied web application log aggregation, performance tracing and alerting.
Julie Eyer
Operations Manager at TodayTix
We utilize Datadog mainly to monitor our API integrations and all of the inventory that comes in from our API partners. Each event has its own ID, so we can trace all activity related to each event and troubleshoot where needed.
Our company deploys the solution for our customers as an observability tool to define SLOs and SLIs along with logs and metrics.
The solution includes incident, post-mortem, and root cause analysis that provides a level of truth for incidents and issues with applications.
We have SREs and teams in operations, management, and applications who all have access to the solution and ensure proper integrations.
The solution is basically used for servers and applications.
I work in product design, and although we use Datadog for monitoring, etc., my use case is different as I mostly review and watch session recordings from users to gain insight into user feedback.
We watch multiple sessions per week to understand how users are using our product. From this data, we are able to hone in on specific problems that come up during the sessions. We then reach out to specific users to follow up with them via moderated testing sessions, which is very valuable for us.
Siva Badmaraj
Delivery Manager, DBA Services at a manufacturing company with 10,001+ employees
We use Datadog for monitoring to get the traces and logs of all our applications. Datadog provides dashboard and alert capabilities to identify if something is wrong with various teams. More than 200 users, mostly software engineers, work with Datadog.
We are using a mixture of on-prem and cloud solutions to bridge the gap with healthcare entities in the service of providing patients with the medication they need to live healthy lives.
Since we're a heavily regulated company, a lot of our solutions grew from on-premises monoliths. However, as we scaled out, it became harder and harder to move forward with that architecture. Today, we're investing heavily in transforming our systems from monoliths into distributed systems.
With this change in mind, the ability for us to connect the dots using Datadog has been invaluable.
Our primary use case for Datadog is to monitor and manage our fully cloud-native infrastructure. We utilize Datadog to gain real-time visibility into our cloud environments, ensuring that all our services are running smoothly and efficiently.
The platform’s extensive integration capabilities allow us to seamlessly track performance metrics across various cloud services, containers, and microservices.
With Datadog’s robust alerting and visualization tools, we can proactively identify and resolve issues, minimizing downtime and optimizing our system’s performance. This has been crucial in maintaining the reliability and scalability of our cloud-native applications.
Datadog is mainly used to set up alerts and thresholds to monitor real-time metrics and checks.
I have been using Datadog products and capabilities increasingly over the last 4 years, from POC to widespread adoption.
The capabilities we use are unique for each use case and can be combined in various ways to provide the full observability coverage needed to maintain stable operations and shift from becoming more reactive to proactive.
Our organization uses it for site/service reliability across a range of backend and frontend services, for custom monitoring, and for dashboards that can be dynamic and reused by multiple teams.
We deploy various services for our main platform on AWS across multiple regions. We have a development environment, a staging environment, a QA environment, and a production environment. We deploy our many services across hundreds of instances.
We have many server farms, all responsible for various services on our market intelligence platform. The deployment of each server farm, or even of individual instances, varies depending on how it was stood up. We have instances built in three different ways, with two different pipelines, and some even on user data scripts.
Brian Hanuska
Architect at SEI Investments
We primarily use Datadog for:
- Native memory
- Logging
- APM
- Context switching
- RUM
- Synthetic
- Databases
- Java
- JVM settings
- File i/o
- Socket i/o
- Linux
- Kubernetes
- Kafka
- Pods
- Sizing
We are testing Datadog as a way to reduce our operational time to fix things (mean time to repair). This is step one. We hope to use Datadog as a way to be proactive instead of reactive (mean time to failure).
So far, Datadog has shown very good options to work on all of our operational and development issues. We are also trying to use Datadog to shift left, and fix things before they break (MTTF increase).
We use Datadog for three main use cases, including:
- Infrastructure and application monitoring. It ensures that our services are available and performant at all times. This allows us to proactively address incidents and outages without customers contacting us. This includes monitoring of cloud resources (databases, load balancers, CPU usage, etc.), high-level application monitoring (response times, failure rates, etc.), and low-level application monitoring (business-oriented metrics and functional exceptions affecting customer experience).
- Analyzing application behavior, especially around performance. We often use Datadog's application performance monitoring on non-production environments to evaluate the impact of newly introduced features and gain confidence in changes.
- End-to-end regression testing for APIs and browser-based experiences. Using Datadog's synthetic testing checks periodically that the system behaves in the exact correct way. This is often used as a canary to detect issues even before users reach them organically.
Datadog is a SaaS solution we tried for URL and synthetic monitoring. You record a transaction going into a website and replay that transaction from various locations. Datadog is mainly used by the admin, but three or four other guys had access to the reports and notifications, so it's five altogether.
We probably tried no more than 8 percent of what Datadog can do. There are so many other bits and modules. I've only gone into about half of what APM can do in the Datadog stack.
We primarily use the solution for application monitoring (APM, logs, metrics, alerts).
It's useful for active monitoring (static monitors, threshold monitors). We get a lot of value out of anomaly detection as well. SLOs and monitoring of SLOs have been another value add.
In terms of metrics, the out-of-the-box infrastructure metrics that come with the Datadog agent installation are great. We have made use of both the custom metrics implementation as well as the log-based metrics which are extremely convenient.
We also leverage Datadog for use of RUM and want to explore session replay.
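As a hedged illustration of the custom-metrics usage this review mentions, the snippet below submits a couple of application metrics through the Agent's DogStatsD port using the `datadog` Python package; the metric names, values, and tags are invented for the example.

```python
from datadog import initialize, statsd

# DogStatsD sends metrics to the locally running Datadog Agent over UDP.
initialize(statsd_host="127.0.0.1", statsd_port=8125)

# Hypothetical business/application metrics.
statsd.increment("orders.created", tags=["env:prod", "service:orders-api"])
statsd.gauge("orders.queue_depth", 42, tags=["env:prod", "queue:fulfillment"])
statsd.histogram("orders.checkout.duration", 0.254, tags=["env:prod"])
```

Log-based metrics, by contrast, are defined against an indexed log query rather than emitted from code, which is what makes them convenient when the data already exists in logs.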
We use the solution primarily for distributed tracing, service insight and observability, metrics, and monitoring. We create custom metrics from outbound service calls to trace the availability of back-office systems.
We use the flame graph to get insights into our GraphQL implementation. It helps highlight how resolvers work.
However, it's lacking in tracing which GraphQL queries are run, and we use custom spans for that.
We are using the solution for migrating out of the data center. Old apps need to be re-architected. We are planning on moving to multi-cloud for disaster recovery and to avoid vendor lock-in.
The migration is a mix between an MSP (Infosys) and in-house developers. The hard part is ensuring these apps run the same in the cloud as they do on-premises. Then we also need to ensure that we improve performance when possible. With deadlines approaching quickly, it's important not to cut corners, which is why we needed observability.
We use this solution for enterprise monitoring across a large number of applications in multiple environments like production, development, and testing. It helps us track application performance, uptime, and resource usage in real time, providing alerts for issues like downtime or performance bottlenecks.
Our hybrid environment includes cloud and on-premise infrastructure. The solution is crucial for ensuring reliability, compliance, and high availability across our diverse application landscape.
Our company has a microservice architecture, with different teams in charge of different services. It is also a startup, which means that we have to build fast and move very fast. Before we were properly using Datadog, we often had issues of things breaking without much information on where in our system the breakage happened. This was quite a big time sink, as teams were unfamiliar with other teams' code and needed their help to debug, which slowed our building down a lot. Implementing Datadog traces fixed this.
We use it to monitor and alert our ECS instances as well as other AWS services, including DynamoDB, API Gateway, etc.
We have it connected to Pagerduty for alerting all our cloud applications.
We also use custom RUM monitoring and synthetic tests for both our internal and public-facing websites.
For our cloud applications, we can use Datadog to define our SLOs and SLIs and generate dashboards that are used to monitor the SLOs and report them to our senior leadership.
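To make the SLO piece slightly more concrete, here is a rough sketch of creating a metric-based SLO through the Datadog API with the Python client; the numerator/denominator queries, target, names, and tags are placeholders, and the exact client class and fields may differ from how this team actually defines its SLOs.

```python
from datadog import initialize, api

initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")

# Hypothetical availability SLO: share of successful requests over 30 days.
api.ServiceLevelObjective.create(
    type="metric",
    name="web availability",
    description="Fraction of successful requests for the web service",
    query={
        "numerator": "sum:requests.ok{service:web}.as_count()",
        "denominator": "sum:requests.total{service:web}.as_count()",
    },
    thresholds=[{"timeframe": "30d", "target": 99.9}],
    tags=["service:web", "team:platform"],
)
```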
We use the product for recording logs on our various services across different teams. For example, we use logs to keep track of info logs for events and error logs to catch exceptions.
When users ask us to investigate a situation, we use logs to keep track of events and where the user's code traveled to. We also use synthetic testing and monitoring features to keep track of our many alerts in the production and QA environments.
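As a loose sketch of that kind of info/error logging, the example below uses standard Python logging with `ddtrace`'s log-injection switch turned on so log lines can be correlated with the trace of the request they belong to; the logger name, messages, and function are illustrative only, not this team's actual setup.

```python
import logging
from ddtrace import patch

# Inject trace IDs into standard-library log records so logs and APM traces
# can be cross-linked in Datadog.
patch(logging=True)

FORMAT = ("%(asctime)s %(levelname)s [%(name)s] "
          "[dd.trace_id=%(dd.trace_id)s dd.span_id=%(dd.span_id)s] %(message)s")
logging.basicConfig(format=FORMAT, level=logging.INFO)
log = logging.getLogger("orders")  # hypothetical logger name

def handle_order(order_id: str) -> None:
    log.info("processing order %s", order_id)            # info log for the event
    try:
        raise ValueError("payment declined")              # stand-in for real work
    except ValueError:
        log.exception("failed to process order %s", order_id)  # error log with stack trace
```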
We use Datadog to view and aggregate logs and monitor all of our services. We have a lot of running infrastructure and it is very convenient to have logs and metrics all aggregated somewhere we can view and chart them.
I use Datadog to create dashboards, runbooks, and shareable graphs, which really help out my whole team. We mostly use logs and APM, but have been starting to use other products. I would like to use more synthetic monitors.
We use the solution for monitoring, logging, and alerts.
Thanks to Datadog, we report errors using the logger integrated into our services, which is crucial since we only do unit tests. The infrastructure team handles the monitoring part, so I can't give more insights about that. I am an API developer, so I use Datadog mainly for logging.
The alerts are connected to Microsoft Teams in a specific channel, and we pay a lot of attention to it, and we usually create tickets based on these alerts.
We use Datadog for observability and system/application health, mainly for product support, triaging, debugging, and incident responses.
We use a lot of the logging and the Datadog agent to collect logs, metrics, and traces from our GKE workloads. We use APM and continuous profiling for latency and performance measurement. We use RUM to observe frontend user events, such as tracing on request and what actions they take before errors occur. We also use error tracking and source maps to debug production failures.
We are still relatively new to the product, and we are planning to use more of the notebook functionality and power packs to record run books and break knowledge silos. We also need to utilize dashboards and continuous profiling more for performance measurement and integrate Datadog alerts for incident response.
This solution is for physical device monitoring across breweries, including PLCs, HMI Cameras, RFID panels, scales, etc. We want to gain visibility into these devices to influence predictive maintenance and unscheduled downtime. We want to monitor physical devices across the zone from a control tower perspective for end users and support teams alike. Understanding more about the performance of the devices and mechanical components will allow us to schedule downtime to fix imminent catastrophic failures and prevent unplanned downtime and lost revenue.
We primarily use the solution for the service catalog.
We use this type of offering for our microservices applications, and it gives a good view of the flow. It is a must when we have different developers working on different services.
The trace and log features are useful for helping the on-call person locate the right microservice.
We would like to see some more useful applications for health monitoring where we can customize the cases based on data from the database.
It needs to have the facility to monitor data inside tables and the status of the UI.
We use this solution for our customer's IP and to support their cloud infrastructure.
Sean Edgeington
Director of IT at a consumer goods company with 201-500 employees
I used Datadog typically for monitoring website statistics and some of the cloud networking equipment.
We primarily use the solution for centralized dashboarding and telemetry viewing for teams across the organization.
We're focused on ensuring that both development teams and leadership can reasonably gain insights into the status of various systems.
At the end of the day, managing various dashboards and metrics aggregators like Prometheus, Kubernetes server, AWS CloudWatch, and Grafana has led to some confusion, and we've had issues with teams not knowing where their data exists and where they can view their system metrics.
Datadog has proven to be easy to set up and legible for both development and operational teams.
We mainly use the product to monitor our infrastructure and apps. It is the go-to tool when we want to check that things are running properly. We use Datadog synthetic monitors to ensure our app works across different locations in the United States.
We also have set up Datadog monitors to send alerts if things stop working as expected.
We use Continuous Integration Pipeline visibility to make sure our developers are not being blocked by infrastructure and other things that might be out of their control.
We primarily use Datadog for alerts. If we're running out of database connections or CPU credits, we want to find out in Slack. Datadog provides nice features for that.
Secondarily, we use Datadog for analyzing historical trends and forecasting potential issues.
I'm trying to learn how to add in Continuous Profiler in our primary backend servers and set up Synthetic Tests for monitoring our front end.
Everything is mostly on AWS, and the Datadog integrations help a ton.
Our use case is to provide cloud organization application monitoring. I use it for insight into which host in which region has activity, or which market is using Datadog to its fullest potential, and I use that information for cost purposes. This also helps determine who is using monitoring and setting alerts versus just setting up monitoring and not doing anything about it. The use case can also be to check when hosts or applications are down, or if the usage of CPU, memory, etc., is too high.
The main use cases are to provide visibility to costs for each product in the company as well as to consolidate all the observability in one tool. We are moving the team from being an operational team that needs to keep the tool up and running (applying patches and resolving problems) to a team that is focused on providing meaningful visibility of the systems, applications, and services of the company. We want to add value where the developers and the systems administrators are not able to focus.
We currently use it for log aggregation and SIEM. We send logs from our AWS account (particularly our CloudTrail and S3 logs) and use them to give us security signals.
This has helped with our SOC2 certification process and has given us a window into our processes and the security holes in our system.
We are also considering using the APM features to help with our development effort. We want to be able to profile all of our code and see what is going on with it.
We use the solution primarily for platform monitoring for the services that are deployed in AWS. It gives a better way to monitor the services, including pods, cost, high availability, etc. This way, observability is ensured and also customer services are uninterrupted.
We also host the data pipelines between the cloud and on-prem, and Datadog is used to ensure better service for those. We report issues based on the metrics reported through it.
We primarily use the solution for log management and application performance monitoring. We have been getting into using more solutions on Datadog, such as runbooks, monitoring, and dashboards.
Another area that we've been investing some time in is the database monitoring. We've been able to get some relatively new employees onboarded into the tool, and they've been able to create some meaningful dashboards and reports without too much hand-holding at all.
We plan on exploring the synthetics solution as well.
We are trying to get a handle on observability. Currently, the overall health of the stack is very anecdotal. Users are reporting issues, and Kubernetes pods are going down. We need to be more scientific and be able to catch problems early and fix them faster.
Given the fact that we are a new company, our user base is relatively small, yet growing very fast. We need to predict usage growth better and identify problem implementations that could cause a bottleneck. Our relatively small size has allowed us to be somewhat complacent with performance monitoring. However, we need to have that visibility.
The product is used for APM solutions for the metrics and traces for the REST API requests and service maps to understand the upstream and downstream services.
We are creating dashboards and widgets to monitor the status. We are creating alerts and monitors as well. We integrated the alerts and ticketing system in our organization with SNOW and Netcool.
We are using Kubernetes, AWS, and infrastructure metrics. We are using Kafka and Aurora Postgres logs as well, and we are using HTTP status codes to identify the error types.
We use the solution for application hosting and a little bit of everything when it comes to supporting a worldwide logistics tracking service. It's used as a central service for collecting telemetrics and logs. We find it does the same work as all of our old tools combined, including Prometheus, Kibana, Google Logs, and more; putting all of this information in a single platform makes it easy to corroborate information and associate a request with the data, which might be lost when it is saved as logs.
We share dashboards, set up alerts, and monitor everything that happens in our system. We use it in staging, features, production, and our load test environment. It is exceptionally helpful for making our engineering more data-driven.
I came from a company that believes we should focus on being telemetry driven. Instilling this in a smaller, less mature engineering organization has been challenging. However, it is much easier while using Datadog.
We primarily use the solution for log monitoring across our entire cloud infra (EB, EC2, Batch, and Lambda).
This is in addition to RStudio Workbench, which has its own logs that would not be picked up via CloudWatch (https://docs.rstudio.com/ide/server-pro/server_management/logging.html#default-log-file-locations).
We own several dozen of these servers, and we used to manage instance logs by tailing logs when incidents occurred. Datadog allows for much better visibility across our entire fleet and has saved us countless hours.
My customers were using Datadog for monitoring purposes. They were using it only because the solution runs on AWS and is a microservices-based solution. They were using an application called Dynatrace for their logs.
We primarily use the solution for logging and APM, and for real user metrics.
The product is primarily used for the DevOps team.
We use the solution for monitoring our logs across distributed clusters. Right now, we have an Elasticsearch solution that is tied to each platform (our product is a PaaS solution).
We are looking at moving to a single pane of glass solution, which Datadog would be good for (plus, we could wrap up other tools like Prometheus, Grafana, Pagerduty, Pingdom, and more). We want to be able to have Datadog running on one single cluster and ingesting and processing logs from all our distributed clusters.
We use different tools for log collection and monitoring. Using Datadog will combine different use cases into one product that will be easier to manage.
The tools we use are open-source, so there is no commercial support. Having customer support would be ideal since we're a small team.
Profiling would be another great feature to have. Currently, it's manual. Having Datadog would give us a standard, and we don't have to do much manual work.
We are using Datadog for server metrics, log aggregation and searching, system monitoring, alerting the team about errors, and dashboards for our developers. It's used by the Site Reliability Engineering team and Management of all levels.
It's assisting us in proving SOC II compliance.
We're looking to improve our usage of Datadog's RUM and APM components to get better and more performance insights on our production environments.
We're also looking to leverage more synthetic monitors and runbooks for anyone responding to incidents.
We use Datadog for general observability into our infrastructure, as well as running analytics queries for our SLI/SLO platform. This helps all of our teams be informed of how well their products are actually performing in production, and aim their efforts at the thing that will provide the highest ROI.
We also use it for general monitoring and alerting during load tests and service releases to detect any issues related to the deployments. This helps us maintain our high contractual uptime promises to our clients.
We collect all data logs from all operating systems, such as Windows, Linux, and VMware, and from bare metal data centers. We also automate the installation of the agent on servers.
Now we are starting a POC to analyze the APM module. In the future, the next step is to do a POC of the security modules.
The final idea is to have a unique portal for observability. This will make troubleshooting easy for level 1 and level 2 support.
We use Datadog for observability and monitoring primarily. Various cross-functional teams have built various dashboards, including Developers, QA, DevOps, and SRE.
There are also some dashboards created for senior leadership to keep tabs on day-to-day activities like cost, scale, issues, etc.
Also, we've set up monitors and alarms that kick off when any metrics go beyond the threshold. With Slack and PagerDuty integration, correct team members get alerted and react to solve the issue based on various runbooks.
I use the solution to manage security-related logs and metrics, as well as create detection rules for security events. I am a security engineer, so one area of interest is the CSPM product, giving us the ability to look at findings across the cloud environment.
The great part about the Datadog security products is that they incorporate the context of the resources/hosts where the security event is found. This allows us to see exactly what is running on a host when we see a security alert for it.
I primarily use the solution to learn, watch and monitor business and engineering metrics in the production and QA environments of my team.
We create monitors on key business metrics and observe regressions and anomalies.
Less often, I leverage the events ability in Datadog to get notified about significant activities happening in my teams' deployments.
We learn about Datadog monitor alerts through Slack and often attempt to create SLOs using Terraform.
We use APM for observability.
Most recently, I learned about Watchdog alerts, which I will be looking into heavily.
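As a small, hedged example of the deployment-events usage mentioned above, this sketch posts a custom event through the `datadog` Python client; the title, text, and tags are placeholders, and an event monitor or dashboard overlay would be configured separately to surface it.

```python
from datadog import initialize, api

initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")

# Hypothetical deployment marker: shows up in the event stream and can be
# overlaid on dashboards or used to trigger event monitors.
api.Event.create(
    title="Deployed orders-api v1.4.2",
    text="Rolled out by CI pipeline (placeholder details).",
    tags=["service:orders-api", "env:prod", "deployment"],
    alert_type="info",
)
```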
We primarily use the solution for observability.
We mostly use it to handle log aggregation, monitor our web application, and alert us on data pipeline failures.
Our system is fully on AWS, and so we pipe all of our CloudWatch logs into Datadog to have a central place to index and search logs.
Our web app is built on an Elastic Beanstalk backend, and we use the Datadog agent to keep track of all of the requests that hit our backend and all of their components.
We also use the prebuilt AWS pipeline dashboards to monitor our batch jobs and lambdas.
Our primary use case is log management and we also use the solution for monitoring the application and underlying infrastructure. I'm an IT test manager.
We are evaluating Datadog for the observability and monitoring requirements that we have in our company. In our use case, our intention is to provide some kind of framework for multiple app teams to use the tool for our observability and engineering practices.
Attila Mate Kovacs
Senior Cyber Security Expert at a security firm with 11-50 employees
We implement these solutions for our clients. We have implemented Datadog as a SIEM solution.
BrianHeisler
Principal Enterprise Systems Engineer at a healthcare company with 10,001+ employees
We deploy agents on-premises to collect data from on-premises VM instances. We don't use Datadog in our cloud network, though we do have it on some cloud apps, and we also have containers. We have it at their headquarters; the main software for them is in their own cloud.
We're building out the process now and using it better. We plan to use Datadog for root cause analysis relating to any kind of issue we have with software - applications going down, latency issues, connection issues, etc. Eventually, we're going to use Datadog for application performance monitoring and management, to be proactive around thresholds, alerts, bottlenecks, etc.
Our developers and QA teams use this solution. They use it to analyze network traffic, load, CPU load, and CPU usage, as well as tracing, NPM, and API calls for their applications. There are roughly 100 users right now. Maybe there are 200 total, but on a given day, maybe 13 people are using this solution.
Our primary use case would be using the dashboards and getting proper insights based on the dashboards.
The monitoring, SLO, and SLA have been better and easier since we started using the Terraform infrastructure. APM has been easier as we had to enable it through the CronJob directly.
Profiling has been made easier. We are able to get many insights into the code. Profiling provides really good insights right now.
Logs are the most valuable and the best solution so far. Datadog can help solve any slow queries or database-related errors.
We use Datadog to monitor our Kubernetes clusters.
We have 3 different clusters for different parts of the SDLC. We run the Datadog agent DaemonSet as well as the Datadog cluster agent. Our services have the APM installed by default.
To create monitors, we use Terraform. This is provided out-of-the-box for our service owner.
We run on EKS; therefore, we also make use of some of the AWS monitoring capabilities that can be integrated into Datadog.
We are hugely reliant on Datadog for all aspects of our system.
Datadog provides us with a solution for ingesting all of our application metrics, resource metrics, APM/tracing data, etc.
We use it for use in dashboards, monitoring/alerting, SLO targets, incident response etc.
We have a lot of applications across multiple languages/frameworks etc., and have deployed in Kubernetes across multiple regions in AWS, along with underlying managed resources such as SQS, Aurora, etc.
Datadog makes understanding the state of these seamless. We are a company with millions of daily active users, and this level of detail is excellent.
I am using the solution for monitoring metrics, logs, traces, etc. It's mainly for making dashboards as well as monitoring our services.
We also use Datadog to help centralize our incident management to show the logs, where issues spiked, and some metrics.
We use Datadog to do troubleshooting in Kubernetes, specifically in our Azure Kubernetes service. Beyond that, we are looking to use open telemetry in tandem with Datadog to further our log-tracing efforts. In the future, this may be expanded.
Our use case is mainly deploying it into our applications for monitoring/logging observability. Currently, our microservices feed into an actuator that exists in each instance of our application, which extends to local and central Grafana instances for client and internal visibility; the application we currently use for this is Grafana.
Logging captures application and system logs that are ported to each application instance for querying.
Whenever anything occurs that is considered unhealthy from a range of health checks, we have notification rules configured internally and externally for a prompt response time.
We use it mostly for logging log messages from our Kubernetes and EC2 instances, for example, system messages and errors. Also, we want log messages from our firewalls and other network infrastructure in case of network issues. We intend to use it for application logging, et cetera, to get insight into internal problems in the applications in Kubernetes pods. We want to use it for monitoring in case of system problems and hardware failures so that it can notify us.
We use the application for our application monitoring, data security monitoring, and log management. What we like about the application is that it helps us to track issues more proactively instead of reactively.
There are other improvements we would like to see.
1. Being able to restrict users from seeing or viewing specific dashboards once they log in
2. They could cut down the prices for Cloud SIEM. It seems very useful; however, the prices are high. Some organizations are finding it difficult to make a decision on getting the tool.
View full review »We ingest data from various sources to monitor the system's logs and metrics, and we have an alerting mechanism to notify teammates if something goes wrong.
More specifically, having Datadog agents integrated with the different services provides easy access and management.
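As a small illustration of that agent-centric flow (the metric names and tags below are made up, not this reviewer's actual setup), an application can emit custom metrics to the local Datadog Agent over DogStatsD, and a monitor can then alert on them:

```python
from datadog import initialize, statsd

# The Datadog Agent listens for DogStatsD traffic on UDP port 8125 by default.
initialize(statsd_host="127.0.0.1", statsd_port=8125)

# Illustrative application metrics; the Agent forwards them to Datadog,
# where a monitor can notify the team if something goes wrong.
statsd.gauge("myapp.queue.depth", 42, tags=["env:prod", "service:worker"])
statsd.increment("myapp.jobs.failed", tags=["env:prod", "service:worker"])
```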
View full review »We have deep integration with Datadog for observability and monitoring.
We use everything from APM, logs, and RUM to monitors and dashboards for tracking system health.
We are trying to move from many different solutions for error tracking/observability to a single platform (Datadog).
We are currently in the process of setting up logging in Datadog in order to maintain our logs better. We are looking to create more insights into the real user flows by using real user monitoring (RUM) too.
We use Datadog for application logs, error tracking, performance tracking, alerting, and overall production state surveillance.
It helps us improve observability and ease of maintenance through better information for our support teams and their issue qualification.
We also use dashboards to keep all the information at the ready and easy to access, and SLOs, notably for our uptime but also for feature usage. It also feeds our alerting for our on-call SREs into PagerDuty by launching alerts when specific parameters are exceeded.
View full review »The RUM is implemented for customer support session replays to quickly route, triage, and troubleshoot support issues which can be sent to our engineering teams directly.
Customer Support will log in directly after receiving a customer request and work on the issue. Engineers will utilize the replay along with RUM to pinpoint the issue, combined with APM and infrastructure traces, to look for signals that reveal the direct cause of the customer impact.
Incident management will be utilized to open a Jira ticket for engineering, and it integrates with ITSM systems and on-call as needed.
View full review »We use metrics to track the behavior of our application. We use logging to log any errors or erroneous application behavior as well as successful behavior. We use events to log successful steps in our pipeline and failed steps in our deployments. We use a combination of all these features to diagnose bugs.
It makes it much more efficient to look at all the data in one place. This speeds up our development speed so that we can be agile.
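As an illustration of the event side of such a setup (the titles, tags, and keys below are placeholders, not this reviewer's actual pipeline), a deployment step can be recorded with Datadog's Python client so it appears next to the related metrics and logs:

```python
from datadog import initialize, api

initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")  # placeholder keys

# Post a pipeline step as a Datadog event; a failed step would use
# alert_type="error" so it stands out when diagnosing bugs.
api.Event.create(
    title="Deploy step succeeded: database migration",
    text="Hypothetical example: migration applied cleanly in staging.",
    alert_type="success",
    tags=["service:checkout", "env:staging", "pipeline:deploy"],
)
```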
View full review »Log aggregation was a key component for us since we have a fairly old-school app running on VMs on bare metal. We previously didn't have much insight into our logs unless we manually tunneled into each server.
The solution is reducing manual labor in troubleshooting problems in our environments server by server.
We also needed to monitor our Java app and MySQL database to understand their problems so that we could take action and resolve them.
Our use cases have since expanded to encompass all aspects of monitoring.
View full review »We use the solution for monitoring time spent on views and events triggered. For example, for one of our products, we have created a custom dashboard that lets us track all the custom events and multiple entry points in the same part of the application.
Knowing the entry point helps us choose which part of the program should be improved next. It also helps us with collecting important data about the overall usage of each module within our application.
We're moving towards the cloud yet still have several active data center contracts. As we move to the cloud, we are interested in knowing more about our services, and DataDog APM/logs should give us this perspective.
We currently use the infrastructure monitoring part of DataDog. Still, I've really seen the advantage of moving more data into the cloud for comparison and being able to have one place where we can view all related pieces of information regarding a possible incident or potential issue.
View full review »I'm a Datadog partner in Brazil, and I monitor all my applications with Datadog too. I would like to enable all features in my DPN portal and get access to custom demos. We resell Datadog and a full stack of pre-sales, sales, and post-sales services. We have customers for all sectors, including governmental, financial services, services in general, telecom, et cetera. Today, we are the biggest Datadog partner in Brazil, and we are searching for an expansion in our MSP environment.
View full review »We primarily use the solution for observability, metrics, logs, tracing, and end-to-end user flow monitoring.
We are looking to implement this as a company-wide standard for cloud solutions.
At this time, we're currently in a POC, and we're interested in using either a Datadog agent or the OTel agent with a Datadog exporter. We have dashboards with panels that correlate metrics and allow you to link through to traces, as well as flame graphs that show latency across services and their various spans.
While we are not security minded, we still require it and are interested in more. It's used for monitoring critical systems.
View full review »We primarily use the solution for charting application metrics.
We use it for all our application metrics, host metrics, and monitors with a PagerDuty integration.
We integrate our application logs. It is great to be able to tie our metrics and our traces together.
We use the APM module with traces. It is great to be able to link APM, logs, and metrics in one go, as it shortens our troubleshooting and RCA dramatically.
We are loving the tool; it is great to have all those insights in one place.
We hope that they keep making my life and our engineers' lives easier.
We primarily use the solution for monitoring and telemetry.
We use lots of log collections, log-based metrics, and dashboard visualization. The logging, metrics, and APM are vital.
One of the things we use it for is the same thing that we use FullStory for, which is to replay customer interactions with our platform. However, it also handles monitoring, like other cloud monitoring tools. We're really mostly monitoring our own software to make sure that everything is functioning properly. We can check a bunch of things, and we can even play back customer sessions. It's basically monitoring our application.
View full review »AP
Abdulla Pathan
Technology Competency and Solution Head at LearningMate
We use Datadog for application monitoring, to help identify errors. It is also used to monitor application performance.
It helps organizations understand the user experience through user behavior patterns.
View full review »We have used this solution primarily for application performance monitoring. To do this, we needed to make sure we had the right data in the system so that people could be able to monitor their applications end-to-end.
View full review »We use Datadog to monitor our product on the cloud.
View full review »RG
Romain Giacalone
Software Engineering Manager at a tech vendor with 51-200 employees
I am using Datadog for error reporting.
View full review »Our clients use it for monitoring applications. Its deployment depends on our customer's use case.
It is 100% cloud. We have got a multi-tenant environment, so we segment it out.
PA
Pablo Albornoz
Co-Founder at Exeo IT
We're in the process of doing a Proof of Concept with the solution right now.
View full review »We primarily use Datadog for logs, APM, infrastructure monitoring, and lambda visibility.
We have built a number of critical dashboards that we display within our office for engineers to have a good understanding of the application performance, as well as business partners to understand at a high level the traffic flowing through the app.
We started with logging, as our primary monitor, and have shifted to APM to get a deeper understanding of what our system is doing, and how the changes we are making impact the apps.
We primarily use DataDog for performance and log monitoring of cloud environments, which include VMs and Azure Services like Azure compute, storage, network, firewall, and app services via event hubs.
Alerting based on monitors via Teams and PagerDuty.
Logs collection for Azure services like Azure database, Azure Application Gateway, Azure AKS, and other Azure services.
Custom metrics using a Python script to collect metrics for components not natively supported by Datadog (a sketch of such a script follows this list).
Synthetic testing to ensure uptime and browser tests via CI/CD pipeline.
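Such a collection script might look roughly like the following; the component, metric name, and credentials are placeholders, since the actual components involved are not named in this review:

```python
import time
from datadog import initialize, api

initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")  # placeholder keys

def read_active_sessions() -> int:
    """Hypothetical reading from a component Datadog has no native
    integration for (e.g., scraped from an admin endpoint or status file)."""
    return 42

# Submit the reading as a custom gauge metric through the Datadog API.
api.Metric.send(
    metric="custom.legacy_component.active_sessions",
    points=[(time.time(), read_active_sessions())],
    type="gauge",
    tags=["env:prod", "component:legacy"],
)
```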
View full review »We were in need of a cloud monitoring tool that was operationally focused on the AWS Platform. We wanted to be able to responsibly and effectively monitor, troubleshoot, and operate the AWS platform, including Server, Network, and key AWS Services.
We wanted tooling that highlighted and detected problems and anomalies, provided best-practice recommendations, and expedited root-cause analysis and performance troubleshooting.
Datadog provided us the ability to monitor our cloud infrastructure (network, servers, storage), platform/middleware (database, web/applications servers, business process automation), and business applications across our cloud providers.
We use Datadog as a monitoring platform to achieve visibility into our container environments.
Almost all of our workloads are containerized and with DataDog, we are able to get metrics, logs, alerts, and events about all the containers that we are running. Our developers also extensively use APM to find and diagnose performance issues that might appear.
We use Terraform to automatically create all of the necessary monitors and dashboards that our developers need to make sure that our level of service is sufficient.
View full review »We primarily use this product for availability and performance monitoring, and log aggregation.
View full review »We primarily use Datadog for the monitoring of EC2 and ECS containers running mostly Rails applications that host a SaaS product. We also monitor ElasticSearch and RDS, and we are working on adding their Application Performance Monitoring solution to monitor our applications directly.
We use DataDog to create dashboards, graphs, and alerts based on interesting metrics. DataDog is our first place to look to find the performance of our system.
We also use their logging platform and it works well. Especially useful is that the logs and metrics are tightly integrated so you can jump between them easily.
View full review »Our primary use of Datadog includes:
- Keeping a close eye on our AWS resources. Monitoring our multiple RDS and ElastiCache instances plays a big role in our indicators.
- Kubernetes. We aren't using all of the available Kubernetes integrations, but the few of them that work out of the box add great value to our metrics.
- Monitoring and alerting. We wired our most relevant monitoring and alerts to services like PagerDuty, and for the rest of them, we keep our engineers up to date with constant Slack updates.
RC
Richard Chennault
Senior Director with 10,001+ employees
We used Datadog to capture the telemetry of our AWS fleet of around 1,200 servers.
View full review »We’re currently using logging, monitoring, metrics, APM, etc.
We've started to use SLOs; however, it takes a bit of time to work through those.
RUM has been very useful. I have used this in the past to debug problems in production, which has been great.
We also want to start using synthetics and tracing more.
Our application currently runs in many different environments based on our customers' requirements. This allows us to see everything in one place and filter by environment as required, which is extremely useful.
View full review »The main use case is observability and reliability as part of a platform/delivery engineering solution. We use the product to assist tenants and clients within the company to get more ramped up on SRE/DevOps.
View full review »We primarily use the product for tracing, metrics, and alarms in various deployment environments.
View full review »We primarily use the solution for monitoring and log analysis.
View full review »We use actual user monitoring and have set up thresholds for alerts to PagerDuty, Sentry, Slack, and so on. We also have dashboards set up for tracking latency and error rates.
As an individual contributor, I also try to set up dashboards for the individual feature projects I work on. I'd like to learn more ways to use this, though, especially when it comes to more proactive approaches to issues. A starter pack of common-use types would be nice.
View full review »We use the solution for testing all of our application's endpoints and making sure that they work consistently.
View full review »We are using the solution for scaling up the website for market data applications. EC2 and Datadog have enabled high-level monitoring of underlying infra and services.
The Datadog profiler comes in handy to pinpoint issues with resource utilization during peak hours, and traces/log management helps narrow down the root cause.
The network map is crucial in identifying bottlenecks and determining what needs more attention.
Host map helps identify problematic hardware and devise ways to counter issues that arise when scaling and deploying solutions on the cloud.
View full review »We use an enterprise version of a CMS platform which is enabling businesses to transmit content to their customers. The tool is fully customizable to the end user, including out-of-the-box integrations as well as APIs for custom plugin support.
Our systems fully manage content using AWS as the back-end cloud provider. Assets are kept in secure buckets and utilize the Kubernetes infrastructure to deliver our product to end users and internal authors. Using the CMS allows for business people to manage content without needing development efforts.
View full review »We primarily use the solution for the RUM, security monitoring, and streams.
We need to monitor users and what they access. We also need to identify security loopholes and attack patterns, and quickly respond to issues.
We can identify pushbacks and get insight into how application components stack up with each other, and understand which components, libraries, and code to alert teams about.
Using Datadog, we can raise incidents, track incidents to completion, and be able to gather data for reporting and post-mortem.
The solution allows us to track fixes and their test coverage. With it, we gain confidence in the fix/improvement phase and are able to provide a response.
View full review »After a security incident, we needed to find and migrate to a different cloud provider, and after evaluating different competitors and the skill set of the team, we decided to move to AWS. AWS also enables the team to have finer control over how our apps are deployed and how security and access are managed. By leveraging AWS's functionality, we have increased our application's security and sped up the deployment process. We've even been able to handle higher workloads due to AWS's auto-scaling functionality.
View full review »We use the solution for logs from all our applications. To monitor logs in Datadog, our team built automation that implements extensive logging across all our systems. Now, we are deploying it in our core systems.
View full review »We primarily use the dashboard/metric with many tags.
Our company is transitioning to using the solution for monitoring and analytic services we provide to customers. Once fully rolled out, there will be 80-100 users companywide.
View full review »NK
Nagendra K
Lead Blockchain and Back-End Developer at Torum Technology
We are using the solution from a monitoring and management perspective. We use it for alerts.
View full review »I'm not sure which version we're using, although I believe it to be the latest.
We essentially use the solution in advance of performance testing, performance monitoring, and troubleshooting.
We have a web infrastructure that uses Amazon Web Services containers with everything included, and we use Datadog to monitor them all.
View full review »GC
GAD COBBINA
Security Analyst at a tech services company with 11-50 employees
We are currently testing it. If the testing goes well, we'll purchase the full version, and it will probably be our main monitoring tool. We plan to use it for monitoring our activities and the attacks on our systems or network.
View full review »We use it for our infrastructure network and servers.
View full review »FC
Fernando Cunha JR
Tech Lead & Solutions Architect at DXC
I implement this solution for clients.
View full review »We use it for monitoring and instrumentation of security. We secure our databases and servers. It is typically for the security of apps, services, and systems. We are using its latest version.
View full review »I'm a senior cloud security engineer and we are customers of Datadog.
View full review »SB
Santhosh Batchu
Cloud Architect at a tech services company
We are a solution provider and Datadog is one of the products that I was working on with one of my clients. They are currently evaluating it for use in cloud monitoring.
Specifically, Datadog is used for monitoring cloud applications in terms of performance. The logs come into this solution from AWS and it provides dashboards for various environments.
View full review »AS
Aaron Singh
DevOps Engineer at Spark New Zealand
We use it for notifications, alerting, and capturing most of the information from Amazon, such as EC2 instances.
View full review »We are using the infrastructure and app monitoring side, such as process monitoring. We are using it in a very traditional way. We are not using the APM capabilities. When it comes to something like containers, we will generally use it on the host but not inside the container itself.
We are using it with our customers and in-house day-to-day.
View full review »- Monitoring
- Analytics
- Tracing
- APM
EY
Enrique Yanez
Software Engineer at Sony Corporation of America
If our app is up and running, we use it to monitor how many credits the app is using up on each node. We also monitor services by how long each call takes, with the help of the EC2 instances the application runs on.
View full review »BB
Brendan Buono
Software Engineer at Lovepop
The primary use case is application monitoring. We also use it to set custom metrics and watch our AWS metrics, as well as other data.
At my current job, I have only used it for a couple of months. However, I used it for a few years at a previous company.
View full review »We mainly use it to send metrics about CV and memory usage, in addition to the number of files descriptors on a socket.
View full review »We use it to monitor our infrastructure, particularly our different EC2 instances, and our containers. We also use it to capture our logs.
View full review »MI
Mark
Site Reliability Engineer at a computer software company with 201-500 employees
We use it for custom metrics of our applications and monitoring of our systems.
View full review »YS
Yun Sung Eun
Software Developer at AhnLab, Inc.
We use it to store editorial content.
We started out on the on-premise version, then moved to the AWS version.
View full review »DJ
Dusan Jovanovic
Software Engineer at a computer software company with 51-200 employees
We run the agent in AWS.
View full review »We are providing managed services to our customers across multiple industries. Datadog is key to delivering these services by bringing the observability, monitoring, and alerting capabilities we need to operate at scale.
We operate custom cloud native workloads as well as ISV products such as Atlassian Jira or Confluence.
Integrating Synthetics, infrastructure, and application performance monitoring, as well as piping all logs through Datadog, allows us to operate more with less, with good alerting right when it's needed.
View full review »We use Datadog incident management for our incident tooling. Whenever we run into an incident, we try to use it. It allows us to create a separate Slack channel for it.
We primarily use the solution for security monitoring and anomaly detection.
The solution is primarily used for better understanding the health of applications, modern environments, and many other solutions, which are the main focus of Datadog and many other monitoring tools.
With Datadog specifically, I can look at the health of the technology stack and services, and also integrate multiple metric sources, security, business data, and much more. This makes it a real software solution for centralizing data and unifying monitoring silos in one place. Datadog is like a hub - not just a monitoring software.
Observability is a key use case, as is security.
View full review »We use this solution to monitor our Kubernetes clusters, nodes, deployments, daemon sets, replica sets, and pods.
We primarily use the solution for monitoring applications and informing customers via PagerDuty and Statuspage. The monitoring and alerts can be personalized internally, and we are able to find problems and issues. The response time monitor has been great, and it has been useful for validating upgrades; we can check to see which step fails.