My main use case for PagerDuty Operations Cloud is cloud-based operations, including incident management and incident response to reduce downtime and improve reliability. PagerDuty Operations Cloud provides a central command center that collects data signals from various IT systems, which helps us detect high-priority incidents and reduce noise. Beyond my main use case, PagerDuty Operations Cloud also lets us handle on-call scheduling and routing very easily.
My main use case for PagerDuty Operations Cloud is incident management. If an issue occurred, I was the point of contact for resolving it, and in some cases I handled the integration as well. My usual use cases for PagerDuty Operations Cloud were alert notification, on-call scheduling, and escalations. If an incident arose and we needed to escalate it to a senior person within the service level agreement, we set up escalation policies so that we could ensure coverage at all times. Other use cases included setting up alarms and integrating with Slack or other tools so that we could monitor and resolve issues more effectively.
We took a subscription and started integrating many applications with it. We have integrated it with ServiceNow and also use our wiki repository by Spotify called Beacon. We onboard multiple services and integrate them with PagerDuty so that different engineering teams can update their escalation policies and the primary, secondary, and tertiary users who are on call 24/7 or on a follow-the-sun model. Additionally, we have integrated PagerDuty for incident commanders, who can acknowledge any incident page they receive and update their responses. If they need to hand an incident over to someone else, they can do that as well. We primarily leverage it for service onboarding. For instance, if you have created a product and need to support it, whichever cloud it runs on, such as AWS (for example, EC2) or Azure, the service teams or triage engineering teams have their members added to PagerDuty along with the playbooks, runbooks, and SOPs, integrated with whichever ticketing tool you are using, such as ServiceNow or Jira. PagerDuty Operations Cloud has been reliable: it has never gone down in my experience. I rely on PagerDuty Operations Cloud for on-call support for any high-severity incidents or sev zero scenarios. This is a great feature, and I can update incidents from the PagerDuty Operations Cloud mobile app, which we also use. Occasionally, I may be on call during weekends, and something might come up. For example, on October 20th, I was on leave but used PagerDuty Operations Cloud to stay in sync, even without my laptop. I could join Teams and Zoom calls and simultaneously update the required documentation for different incidents via Copilot and ChatGPT on PagerDuty Operations Cloud. This capability was extremely helpful.
L1 SecOps Analyst at a tech vendor with 10,001+ employees
Real User
Top 20
Feb 23, 2026
We use PagerDuty Operations Cloud to determine whether alerts are false positives. It is integrated into a system configured with phone numbers, such as those of team members or the shift supervisor. Once an alert triggers, it keeps calling that number until somebody picks up and acknowledges or escalates the alert or threat. In a recent incident, one appliance kept raising alerts. Even though we acknowledged and closed them and marked them as false positives, the alerts kept hitting us. There was some issue with the appliance or the server. Once we understood that, we escalated it to the company's SecOps team, who had to investigate further. The PagerDuty Operations Cloud team coordinated with them, after which they could dig deep and find out what the issue was. A high-priority P1 ticket was raised for it as per ITIL principles. PagerDuty Operations Cloud was being used inside the office only. Any number we enter, such as that of a shift supervisor or shift staff, keeps getting called whenever alerts fire.
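The call-until-acknowledged behaviour described in this review can be pictured with a small simulation. This is only a sketch under stated assumptions: the function name, the contact labels, and the retry limit are hypothetical, and nothing here calls PagerDuty or reflects its internal logic.

```python
def page_until_acknowledged(contacts, answers_on_attempt, max_attempts=3):
    """Simulate calling each contact in order until someone acknowledges.

    contacts: names in escalation order (hypothetical labels).
    answers_on_attempt: maps a contact to the attempt number on which
        they pick up; contacts missing from the map never answer.
    Returns (acknowledger or None, list of (contact, attempt) calls made).
    """
    call_log = []
    for contact in contacts:
        for attempt in range(1, max_attempts + 1):
            call_log.append((contact, attempt))
            if answers_on_attempt.get(contact) == attempt:
                # Someone picked up, so the paging loop stops here.
                return contact, call_log
        # No answer after max_attempts: fall through to the next contact.
    return None, call_log


if __name__ == "__main__":
    acknowledger, calls = page_until_acknowledged(
        ["analyst", "shift supervisor"],
        {"shift supervisor": 2},
    )
    print(acknowledger, len(calls))
```

In this run the analyst is called three times without answering, and the shift supervisor acknowledges on the second call.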
I have been working in my current field for over seven years as a DevOps and site reliability engineer, and my primary experience involves managing the reliability of infrastructure platforms hosted in multi-cloud and on-premises environments. I have predominantly worked with systems hosted on AWS, setting up infrastructure, CI/CD, and observability, and establishing the complete release process, where I use PagerDuty Operations Cloud for triage and other SRE operations. I have been using PagerDuty Operations Cloud for over four years in multiple ways. One is through enterprise services on a subscription model; I also used it in a project at Intel, where I used PagerDuty Operations Cloud from AWS for approximately one to one and a half years. Since then, I have been using it on a subscription at IBM. One of the main use cases for PagerDuty Operations Cloud is running the operations center, particularly incident resolution and triaging incidents as part of the score platform engineering team within a central IBM cloud where various IBM cloud services are hosted. To ensure continuous reliability, incidents are created automatically in PagerDuty Operations Cloud, and incident management automation is heavily used in the project. Previously, at Intel, I worked on integrating PagerDuty Operations Cloud with default AWS services to create incidents for the AWS services that made up the host infrastructure. Currently, I am creating incident workflows within IBM internal cloud operations to ensure an effective incident management process, using integrations with different LLMs as part of incident management, along with agentic SRE tasks that have arisen in the project. Since I am part of a larger platform engineering and SRE operations team, my team handles many incidents and services.
I handle over 26 IBM cloud score services hosted on our internal platform, which have seen many incidents related to service downtime, reliability issues, and update issues. A dedicated SRE team handles end-to-end incident management, and we wanted to automate the process because we receive hundreds of incidents per day, and up to thousands during critical release times for different services. The manual on-call process has therefore been automated using PagerDuty Operations Cloud.
We receive a notification if there are any failed jobs or operations. We have some Bamboo agents working, so if one of the jobs fails on one of these servers, PagerDuty Operations Cloud creates an incident and notifies us. We use PagerDuty Operations Cloud for monitoring purposes, and it works great for our current needs.
My main use case for PagerDuty Operations Cloud is monitoring and on-call management for downtime. Last week, we had a service go down, and PagerDuty Operations Cloud alerted us to the issue. One of our on-call engineers responded to the page and quickly resolved the problem through the PagerDuty Operations Cloud app.
My main use case for PagerDuty Operations Cloud is setting up alerts for failures, such as when a server is down, a particular service is down, or APIs are not responding due to technical issues. PagerDuty triggers an alert and also calls my personal mobile number to notify me about the issue, so I can acknowledge that I am looking into it and take the necessary actions. One example where PagerDuty Operations Cloud helped us handle an incident was when our payment system was about to go down. We usually monitor the system manually, but there are incidents where an automated system works more efficiently than a human. PagerDuty Operations Cloud identified the issue first by alerting us that something had gone wrong with the servers or services, which enabled us to contact the DevOps and Dev teams to identify the exact issue in our banking app. PagerDuty Operations Cloud is very helpful for monitoring: we can set up multiple alerting methods, such as SMS, email, and call alerting, all of which we commonly use. It is useful across various banking services, and teams including Dev, DevOps, and SecOps rely on it heavily.
Initially, I started using it for managing on-call schedules. As the tech stack developed, we began using it for service alerts and event routing, and then transitioned to operational views and dashboarding. It eventually became central to our alerting systems, where all monitoring tools would send information to PagerDuty, enabling event management and routing to whoever was on call.
The primary use case of the solution is to alert the on-call person when there are any critical errors or when the servers are down. It is also used for the on-call scheduling of personnel.
Principal Architect at an energy/utilities company with 10,001+ employees
Real User
Sep 19, 2022
It's mainly for IT call scheduling, emergency contacts, events, and those kinds of things. It's integrated with AWS, MS Teams, Remedy, and other solutions.
Compliance, Security & Testing Manager at a financial services firm with 11-50 employees
Real User
Oct 8, 2020
We are a 24-hour online business. We use it for scheduling our on-call engineers and making sure that there is follow-the-sun or round-the-clock coverage for alerting and network operations. It ingests all our alert paths, i.e., anything that generates an alert of any description, such as Splunk, AWS, and internal applications. We feed all our events into it, then it generates alerts which need a response from an engineer with a description. Another thing is that its built-in scheduling is pretty much hands-off for our on-call engineers unless somebody goes on holiday. That is the only time that we have to jump in there and make any changes.
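To make the follow-the-sun idea in this review concrete, here is a minimal sketch that picks which regional team is on call from the UTC hour. The three regions and their eight-hour windows are assumptions chosen for illustration; they are not taken from the review or from PagerDuty's scheduling feature.

```python
from datetime import datetime, timezone

# Hypothetical shift table: (start hour, end hour, region), in UTC.
# Together the three windows cover all 24 hours with no overlap.
SHIFTS = [
    (0, 8, "APAC"),
    (8, 16, "EMEA"),
    (16, 24, "AMER"),
]


def on_call_region(moment: datetime) -> str:
    """Return the region on call at a timezone-aware moment."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    hour = moment.astimezone(timezone.utc).hour
    for start, end, region in SHIFTS:
        if start <= hour < end:
            return region
    raise ValueError("shift table does not cover this hour")


if __name__ == "__main__":
    print(on_call_region(datetime.now(timezone.utc)))
```

A real rotation would also need holiday overrides, which is the one case the reviewer says requires manual changes.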
VP of Engineering at a comms service provider with 201-500 employees
Real User
Jun 25, 2020
We mostly use it for our on-call engineers, for schedules, alerting, and critical alerts. And, of course, we use it for the management of an issue, so that people acknowledge the alerts, reassign them, etc.
Tier 4 Support Team Leader at a comms service provider with 10,001+ employees
Real User
Mar 1, 2020
The most common use case is handling alerts from a monitoring system, like New Relic or Nagios, that we define as critical. They are alerts where we need someone to get on a bridge or to start working on them during the night. When such an alert fires, it triggers a PagerDuty alert, which pages the current on-call person scheduled in PagerDuty. The on-call person acknowledges the alert, looks into it to understand what is going on, and updates the status via PagerDuty. The update is sent to all the groups that are part of the PagerDuty schedule until the issue is resolved. We mostly integrate it with monitoring tools like New Relic or Nagios, or we use its email integration for on-call processes to page people in groups. We also use it for Sev 1 issues raised by alerts from New Relic, Nagios, or other monitoring systems.
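The trigger, acknowledge, and resolve lifecycle described in this review maps onto the three event actions of PagerDuty's Events API v2. The sketch below only builds the JSON bodies and sends nothing. The helper name `build_event`, the placeholder integration key, and the sample alert values are assumptions for illustration. The real API lets you omit `dedup_key` on a trigger, but this helper always requires one so that later events can refer to the same alert.

```python
import json

# Public Events API v2 endpoint, shown for reference; this sketch never calls it.
EVENTS_API_URL = "https://events.pagerduty.com/v2/enqueue"
VALID_ACTIONS = {"trigger", "acknowledge", "resolve"}
VALID_SEVERITIES = {"critical", "error", "warning", "info"}


def build_event(routing_key, action, dedup_key,
                summary=None, source=None, severity="critical"):
    """Build an Events API v2 request body (hypothetical helper, not an SDK call)."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported event_action: {action}")
    event = {
        "routing_key": routing_key,
        "event_action": action,
        "dedup_key": dedup_key,
    }
    if action == "trigger":
        # Only trigger events carry a payload describing the problem.
        if not summary or not source:
            raise ValueError("trigger events need a summary and a source")
        if severity not in VALID_SEVERITIES:
            raise ValueError(f"unsupported severity: {severity}")
        event["payload"] = {
            "summary": summary,
            "source": source,
            "severity": severity,
        }
    return event


if __name__ == "__main__":
    key = "YOUR_INTEGRATION_KEY"  # placeholder, not a real key
    trigger = build_event(key, "trigger", "nagios-db01-disk",
                          summary="Disk almost full on db01", source="nagios")
    resolve = build_event(key, "resolve", "nagios-db01-disk")
    print(json.dumps(trigger, indent=2))
    print(json.dumps(resolve, indent=2))
```

Reusing the same `dedup_key` is what ties the later acknowledge and resolve events to the alert opened by the original trigger.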
PagerDuty Operations Cloud specializes in incident management and alert automation, with integration to 700+ platforms, reducing resolution time and manual workloads through AI-driven features and automated alerts. PagerDuty Operations Cloud is a robust tool for automated incident management, integrating seamlessly with platforms like ServiceNow, Datadog, New Relic, and AWS. It provides AI-driven alert grouping, customizable escalation policies, and multiple notification methods, including SMS...
The major part of what we use PagerDuty Operations Cloud for is communication: pinging people, engaging them, and pulling them into the call.
We use the solution for incident management.
The solution is used to alert the on-call users if we have priority-one or business-critical issues.
Our use cases include generating alerts from our site 24/7. We are managing the cloud infrastructure there.
The two major use cases were alerts for events and scheduling of engineers to get pages based on incidents.
We primarily use this solution to track alerts from our cloud environment and monitor and respond to alerts on our cloud platform.
We use PagerDuty for incident management. We're looking at integrating PagerDuty with Rundeck in the future.
Our primary use case for this solution is alerting and mitigating threats in our organization.