PagerDuty Operations Cloud Primary Use Case
Nambi Srinivasan
Senior SRE at IBM
I have worked for over seven years as a DevOps and site reliability engineer, and my primary experience is managing the reliability of infrastructure platforms hosted in multi-cloud and on-premises environments. I have predominantly worked with systems hosted on AWS, setting up infrastructure, CI/CD, and observability and establishing the end-to-end release process, and I use PagerDuty Operations Cloud for triage and other SRE operations.
I have been using PagerDuty Operations Cloud for over four years, in more than one way. At Intel, I used PagerDuty Operations Cloud from AWS on a project for approximately one to one and a half years. Since then, I have been using it through an enterprise subscription at IBM.
One of my main use cases for PagerDuty Operations Cloud is running the operations center, particularly incident resolution and triage, as part of the core platform engineering team within a central IBM Cloud where various IBM Cloud services are hosted. To ensure continuous reliability, incidents are created automatically in PagerDuty Operations Cloud, and the project relies heavily on incident management automation. Previously, at Intel, I integrated PagerDuty Operations Cloud with default AWS services to create incidents for the different AWS services in the host infrastructure. Currently, I am creating incident workflows within IBM internal cloud operations to ensure an effective incident management process, using integrations with different LLMs as part of incident management, along with the agentic SRE tasks that have arisen in the project.
Since I am part of a larger platform engineering and SRE operations team, my team handles many incidents and services. I handle over 26 IBM Cloud core services hosted on our internal platform, which have seen many incidents related to service downtime, reliability issues, and update issues. A dedicated SRE team handles end-to-end incident management, and we wanted to automate that process because we receive hundreds of incidents per day, rising to thousands during critical release periods for different services. The manual on-call process has therefore been automated using PagerDuty Operations Cloud.
My main use case for PagerDuty Operations Cloud is incident management. If any issue occurred, I was the point of contact for resolving it, and in some cases I handled the integration as well.
My usual use cases for PagerDuty Operations Cloud were alert notification, on-call scheduling, and escalations. If an incident arose and we needed to escalate it to a senior person within the service level agreement, we set up escalation policies so that we had coverage at all times. Other use cases included setting up alarms and integrating with Slack or other tools so that we could monitor and resolve issues more effectively.
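The escalation behaviour this reviewer describes can be sketched as a small model. This is a simplified illustration under stated assumptions, not PagerDuty's implementation: the `EscalationLevel` class, the `responder_at` function, the tier names, and the timeout values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EscalationLevel:
    """One tier of a hypothetical escalation policy."""
    responder: str
    timeout_minutes: int  # how long this tier has to acknowledge


def responder_at(policy: List[EscalationLevel],
                 minutes_unacknowledged: int) -> Optional[str]:
    """Return who should be notified once an incident has gone
    unacknowledged for the given number of minutes, or None when
    every tier has timed out."""
    elapsed = 0
    for level in policy:
        elapsed += level.timeout_minutes
        if minutes_unacknowledged < elapsed:
            return level.responder
    return None


# Assumed three-tier policy; the timeouts are illustrative only.
policy = [
    EscalationLevel("primary on-call", 15),
    EscalationLevel("secondary on-call", 15),
    EscalationLevel("senior engineer", 30),
]

for minutes in (0, 20, 45, 90):
    print(minutes, responder_at(policy, minutes))
```

Each tier gets its own acknowledgement window, so an incident that nobody picks up moves towards the senior engineer. Returning `None` marks the point where a real policy would need a fallback, such as repeating the policy from the first tier.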
Our main use case for PagerDuty Operations Cloud is alerting. Whenever our application has an outage or a downstream incident causes downtime, PagerDuty Operations Cloud alerts us through calls and SMS so we are notified and can quickly remediate the issue.
A unique aspect of our main use case with PagerDuty Operations Cloud is using the Runbook flow. Whenever we experience a specific kind of incident, the Runbook will trigger automation to either remediate the issues or perform root cause analysis, thus enhancing our workflow automations.
Buyer's Guide
PagerDuty Operations Cloud
February 2026
Learn what your peers think about PagerDuty Operations Cloud. Get advice and tips from experienced pros sharing their opinions. Updated: February 2026.
884,328 professionals have used our research since 2012.
We took a subscription and started integrating many applications with it. We have integrated it with ServiceNow and also use our wiki repository by Spotify called Beacon. We onboard multiple services and integrate them with PagerDuty so that different engineering teams can update their line of escalation policy and primary, secondary, and tertiary users who are on call 24/7 or following a Follow the Sun model.
Additionally, we have integrated PagerDuty for incident commanders who can acknowledge any incident page they receive and update their responses. If they need to hand it over to someone else, they can do that as well. This is how we are actually using PagerDuty. We primarily leverage it for service onboarding. For instance, if you have created a product and need to support it, which might run on any cloud such as EC2, AWS, or Azure, the service teams or triage engineering teams have their members added into PagerDuty along with different playbooks, runbooks, and SOPs integrated with any ticketing tool such as ServiceNow or Jira, whichever you are using. On these fronts, we effectively use PagerDuty.
PagerDuty Operations Cloud has been reliable; in my experience it has never gone down or failed. I rely on PagerDuty Operations Cloud for on-call support for any high-severity incidents or Sev 0 scenarios. This is a great feature, and I can update incidents using the mobile app, because we also use the PagerDuty Operations Cloud mobile app. Occasionally, I may be on call during weekends, and something might come up. For example, on October 20th, I was on leave but used PagerDuty Operations Cloud to stay in sync, even without my laptop. I could join Teams and Zoom calls and simultaneously update the required documentation on different incidents in PagerDuty Operations Cloud via Copilot and ChatGPT. This capability was extremely helpful.
PagerDuty Operations Cloud is a platform that helps teams manage incidents, automate operations, and ensure system reliability by bringing alerts, on-call schedules, and real-time responses into one place. When we had to push things into production, we set up PagerDuty schedules on a weekly or biweekly basis. If an issue occurred at night, the roster would come up, and the respective engineer would have to handle it.
A specific incident where PagerDuty Operations Cloud helped my team was during the peak season in America in December, when hundreds of thousands (lakhs) of orders were being placed and a major S1 severity production issue suddenly occurred. Without a monitoring tool in place, the company would have been in serious trouble, with losses running to hundreds of thousands of dollars. PagerDuty came to our rescue at the last moment, when no one was actively watching. At 3:00 a.m. my time, while I was asleep, I received a message and then a call telling me that the issue had occurred. I logged in quickly and fixed it, and within an hour or so the issue was resolved with minimal damage. I even received appreciation for my quick response.
PagerDuty Operations Cloud helps in similar situations because whenever an issue happens that we are not aware of, PagerDuty flags it and tells us there is something that needs to be fixed before it becomes a major problem.
My main use case for PagerDuty Operations Cloud is to set up shifts for people on call.
A specific example is holiday coverage. In our team, we assign the people who will be on call, create an Excel sheet, and upload it to PagerDuty. It works normally, gives notifications, and everything else functions properly. It is very easy to set up and manage.
I usually discuss with my team who will be on-call during holidays, and we will set up how many people are needed. We create an Excel sheet, upload it to PagerDuty, and set up the line of who is the first person to reach, and if they miss it, then whom to escalate to. The web view and website are also very easy to use. I think this is the normal use case. Perhaps other teams are using it differently, but this works well for us. Before, it was very manual, and it was quite difficult.
My use case for PagerDuty Operations Cloud comes from the SRE and DevOps team. We use PagerDuty Operations Cloud for specific alerting purposes and for the pipeline process. When we build a pipeline and a job suddenly fails, we receive an error. We have connected PagerDuty Operations Cloud to Datadog, the monitoring service we currently use. Whenever Datadog detects an alert, a spike, or anything critical, it triggers an alert in PagerDuty Operations Cloud, and we quickly get a notification. We also maintain our on-call shift rotation in it. For example, on Monday, Wednesday, and Friday, I am the shift lead, and on Tuesday, Saturday, and Sunday, someone else is. Regarding MTTR and similar statistics, we can see how many alerts we received and how many we acknowledged this month, and we have a timeline as well. One of the valuable parts of PagerDuty Operations Cloud is that it lets our team have a competitive environment. For example, if I resolved the most alerts this month, someone else can do it next month, and whoever resolves the most critical alerts on time receives appreciation every month.
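The MTTR figure and the monthly "most alerts resolved" tally mentioned in this review can be computed from incident timestamps. The sketch below assumes a hypothetical list of incident records; the field names (`triggered`, `resolved`, `resolved_by`) and the sample values are illustrative and are not PagerDuty's export format.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Hypothetical incident records; field names and values are illustrative only.
incidents = [
    {"triggered": "2025-01-06T02:00:00", "resolved": "2025-01-06T02:30:00",
     "resolved_by": "asha"},
    {"triggered": "2025-01-08T14:00:00", "resolved": "2025-01-08T15:00:00",
     "resolved_by": "ben"},
    {"triggered": "2025-01-10T09:15:00", "resolved": "2025-01-10T09:45:00",
     "resolved_by": "asha"},
]


def minutes_to_resolve(incident: dict) -> float:
    """Minutes between the alert firing and the incident being resolved."""
    start = datetime.fromisoformat(incident["triggered"])
    end = datetime.fromisoformat(incident["resolved"])
    return (end - start).total_seconds() / 60


# Mean time to resolve across all incidents in the period.
mttr_minutes = mean(minutes_to_resolve(i) for i in incidents)

# How many incidents each engineer resolved, for the monthly tally.
resolved_per_person = Counter(i["resolved_by"] for i in incidents)

print(f"MTTR: {mttr_minutes:.1f} minutes")
print("Most incidents resolved:", resolved_per_person.most_common(1))
```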
My main use case for PagerDuty Operations Cloud is to set up alerts for failures, such as when a server is down, a particular service is down, or APIs are not responding due to technical issues. PagerDuty triggers an alert and also calls my personal mobile number to notify me about the issue, so I can acknowledge that I am looking into it and take the necessary actions. One example of PagerDuty Operations Cloud helping us handle an incident was when our payment system was about to go down. We usually monitor the system manually, but there are incidents where an automated system works more efficiently than a human. PagerDuty Operations Cloud identified the issue first by alerting us that something had gone wrong with the servers or services, which enabled us to contact the DevOps and Dev teams to identify the exact issue in our banking app. PagerDuty Operations Cloud is very helpful for monitoring purposes. It lets us set up multiple alerting methods, such as SMS, email, and call alerting, all of which we commonly use. It has proven useful across various banking services, and teams including Dev, DevOps, and SecOps rely on it heavily.
My main use case for PagerDuty Operations Cloud is monitoring and on-call management for downtime.
Last week, one of our services went down, and PagerDuty Operations Cloud alerted us to the issue. One of our on-call engineers responded to the page and quickly resolved the problem through the PagerDuty Operations Cloud app.
In my organization, we use PagerDuty Operations Cloud to acknowledge alerts. It is set up so that it is often used to page the SMEs. Whenever we work on a task and hit a critical situation that we cannot troubleshoot ourselves, we page the SMEs. Irrespective of the team, if the issue is infra-related, we page infra. If it is related to some other product, we page that product's SMEs and bring them into a PagerDuty Operations Cloud call. We inform them of the issues, and they acknowledge the alerts. After acknowledging the alerts, they start working on that particular error.
PagerDuty Operations Cloud is also set up so that any critical issue creates an alert that reaches the SME as a notification with a phone call, stating that a critical issue is in progress. The SME acknowledges the alert and joins the call, mentioning that they have been paged for this issue. Then we work with that team to resolve the issue from our end.
Regarding the incident command system, we use the Freshservice tool. Freshservice and PagerDuty Operations Cloud have been synced in my organization. The incident command system is a structured way of handling major incidents. Whenever there is an outage, we use the incident command system in PagerDuty Operations Cloud to manage the communication flow. Everyone jumps into a call, and multiple people start fixing the issues. Everyone works hard to bring that instance back online or to restore that particular environment.
I have been in my current role for the past 18 months, and we started using PagerDuty Operations Cloud earlier this year, around January or February, to manage our operations.
PagerDuty Operations Cloud's primary use case is alerting. We switched to ensure alerts are efficient and effective so that the on-call engineer does not miss any alert. We instrument many alerts on it, including VPN downtime, transaction monitoring, success rate, and latency. We configured PagerDuty Operations Cloud so that if any of those metric thresholds are met or any of those SLOs and SLIs are breached, we can quickly take action and resolve the issue. For day-to-day use, we run a 24-hour shift where all shifts are entered into the system, and every on-call engineer uses PagerDuty Operations Cloud to receive alerts. Beyond alerting, we also use scheduling, incident management, and incident reports.
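The threshold checks this reviewer describes (success rate and latency against an SLO) can be sketched as follows. This is a minimal illustration with assumed values: the `SLO` dictionary, the metric names, and the thresholds are hypothetical, and in practice the check would usually live in the monitoring tool that feeds PagerDuty.

```python
from typing import Dict, List

# Hypothetical SLO thresholds; real values depend on the service.
SLO = {
    "success_rate_min": 0.99,    # at least 99% of transactions succeed
    "p95_latency_ms_max": 800,   # ceiling for 95th-percentile latency
}


def breached_slos(metrics: Dict[str, float]) -> List[str]:
    """Return the names of the metrics that violate their SLO threshold."""
    breaches = []
    if metrics["success_rate"] < SLO["success_rate_min"]:
        breaches.append("success_rate")
    if metrics["p95_latency_ms"] > SLO["p95_latency_ms_max"]:
        breaches.append("p95_latency_ms")
    return breaches


# A healthy sample and a degraded sample (illustrative numbers).
healthy = {"success_rate": 0.998, "p95_latency_ms": 420}
degraded = {"success_rate": 0.970, "p95_latency_ms": 1250}

print(breached_slos(healthy))
print(breached_slos(degraded))
```

A non-empty result is the condition under which an alert would be raised for the on-call engineer.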
PagerDuty Operations Cloud is used to identify false positives or false alerts. It is integrated into a system configured with a phone number, such as a team member's or the shift supervisor's. Once an alert triggers, it keeps calling that number until somebody picks up and acknowledges or escalates the alert or threat.
In a recent incident through PagerDuty Operations Cloud, one appliance kept raising alerts. Even though we acknowledged and closed them and had marked them as false positives, they kept hitting us. There was some issue with the appliance or with the server. Once we understood that, we escalated it to the company's SecOps team, who had to investigate further. The PagerDuty Operations Cloud team coordinated with them, after which they could dig deep and find out what the issue was. A high-priority P1 ticket was raised for it, in line with ITIL principles.
PagerDuty Operations Cloud was used inside the office only. Whatever numbers are entered, such as those of a shift supervisor or shift staff, it keeps calling them at any time whenever there are alerts.
We receive a notification if there are any failed jobs or operations. We have some Bamboo agents working, so if one of the jobs fails on one of these servers, PagerDuty Operations Cloud creates an incident and notifies us. We use PagerDuty Operations Cloud for monitoring purposes, and it works great for our current needs.
My main use case for PagerDuty Operations Cloud is incident management, as we use it for alerting people who are on call.
I definitely use PagerDuty Operations Cloud for incident management. We have set up the account, schedules, teams, and so on, and we continuously monitor our logs for anomalies with proactive alerts. We define priorities because we don't want to alert people by phone unnecessarily. We categorize alerts by severity and business disruption and send the information via the integrated APIs to the relevant teams, choosing Slack or phone based on the severity.
This is the main use case we have; it is the tool that makes the last-mile connection to the people who need to respond.
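The severity-based choice between Slack and phone described in this review can be sketched as a small routing function. The rules below are assumptions made for illustration: the `Severity` levels, the `notification_channels` function, and the cut-offs are hypothetical and are not taken from the reviewer's configuration.

```python
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    """Hypothetical severity scale, lowest to highest."""
    INFO = 1
    WARNING = 2
    ERROR = 3
    CRITICAL = 4


def notification_channels(severity: Severity,
                          business_disruption: bool) -> List[str]:
    """Choose channels so that phone calls are reserved for alerts
    that are critical or that disrupt the business."""
    if severity >= Severity.CRITICAL:
        return ["phone", "slack"]
    if severity >= Severity.ERROR and business_disruption:
        return ["phone", "slack"]
    if severity >= Severity.WARNING:
        return ["slack"]
    return []  # informational alerts are recorded but notify nobody


print(notification_channels(Severity.CRITICAL, False))
print(notification_channels(Severity.ERROR, True))
print(notification_channels(Severity.ERROR, False))
print(notification_channels(Severity.INFO, False))
```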
Initially, I started using it for managing on-call schedules. As the tech stack developed, we began using it for service alerts and event routing, and then transitioned to operational views and dashboarding. It eventually became central to our alerting systems, where all monitoring tools would send information to PagerDuty, enabling event management and routing to whoever was on call.
Prathik Rokhade
Associate Sr. Manager at Financial Insight Technology, Inc.
We use the solution for incident management.
Our use cases include generating alerts from our site 24/7. We are managing the cloud infrastructure there.
We have a support team consisting of roughly 20 support agents, and we used PagerDuty to raise alerts to people on the roster. It was integrated with our help desk ticketing system and AppDynamics, which we use for application monitoring.
There were rules in place for when AppDynamics generated an alert. For example, if a transaction is slow or something is about to go down, PagerDuty would notify the IT team members to look at the issue.
It's mainly for IT call scheduling, emergency contacts, events, and those kinds of things. It's integrated with AWS, MS Teams, Remedy, and other solutions.
We primarily use this solution to track alerts from our cloud environment and monitor and respond to alerts on our cloud platform.
I interact with customers and then see the use cases and how they would use the PagerDuty tool and the integrations that come with it. I basically see how the customers will use the product. I then have conversations with the higher members or the stakeholders in the company on what ROIs they would get from this tool and the measurements of metrics, etc.
We use PagerDuty for alerting and escalations. We have it integrated with our monitoring tools.
The primary use case of the solution is to alert the on-call person when there are any critical errors or when the servers are down. It is also used for the on-call scheduling of personnel.
Deepak Malik
Director at a computer software company with 1,001-5,000 employees
The two major use cases were alerts for events and scheduling of engineers to get pages based on incidents.
I tie alerts from our GCP tenants to PagerDuty.
I use PagerDuty for incidents, event orchestration, alerts, and creating services.
We use this solution to alert us to system errors.
I use the solution for log aggregation, scheduling, and filtering out false positives.
We use PagerDuty for incident management. We're looking at integrating PagerDuty with Rundeck in the future.
Darrin Khan
Compliance, Security & Testing Manager at a financial services firm with 11-50 employees
We are a 24-hour online business. We use it for scheduling our on-call engineers and making sure that there is follow-the-sun or round-the-clock coverage for alerting and network operations.
It ingests all our alert paths, i.e., anything that generates an alert of any description, such as Splunk, AWS, and internal applications. We feed all our events into it, and it generates alerts, each with a description, that need a response from an engineer. Another thing is that its built-in scheduling is pretty much hands-off for our on-call engineers unless somebody goes on holiday. That is the only time that we have to jump in there and make any changes.
Gilad Karmy
Tier 4 Support Team Leader at a comms service provider with 10,001+ employees
The most common use case is the result of alerts coming from a monitoring system, like New Relic or Nagios, alerts that we define as critical. They are alerts where we need someone to get on a bridge or to start working on them during the night. Once such an alert fires, it raises a PagerDuty alert, which pages whoever is currently on call in PagerDuty's schedule.
The on-call person acknowledges the alert and looks into it to understand what is going on and to update, via PagerDuty, what the status is. The update will be sent to all the groups that are part of the PagerDuty schedule until the issue is resolved.
We mostly integrate it with other monitoring tools like New Relic or Nagios, or we are using their email integration for on-call processes to page people in groups. We also use it for Sev 1 issues that are coming from alerts from New Relic or from Nagios or other monitoring systems.
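The trigger, acknowledge, and resolve flow described in this review maps onto the event bodies accepted by PagerDuty's Events API v2. The sketch below only builds the JSON bodies and does not send them. The `build_event` helper, the placeholder routing key, and the sample alert are hypothetical; the field names (`routing_key`, `event_action`, `dedup_key`, and `payload` with `summary`, `source`, `severity`) follow the Events API v2 format, and the exact integration a monitoring tool uses may differ.

```python
import json
from typing import Optional

ROUTING_KEY = "YOUR_INTEGRATION_KEY"  # placeholder, not a real key


def build_event(action: str, dedup_key: str,
                summary: Optional[str] = None,
                source: Optional[str] = None,
                severity: str = "critical") -> dict:
    """Build an Events API v2 style body for one step of an alert's life.

    The same dedup_key ties the trigger, acknowledge, and resolve
    events to a single alert.
    """
    if action not in {"trigger", "acknowledge", "resolve"}:
        raise ValueError(f"unsupported event_action: {action}")
    event = {
        "routing_key": ROUTING_KEY,
        "event_action": action,
        "dedup_key": dedup_key,
    }
    if action == "trigger":
        # A trigger event describes the problem; follow-ups only need the key.
        if not summary or not source:
            raise ValueError("trigger events need a summary and a source")
        event["payload"] = {
            "summary": summary,
            "source": source,
            "severity": severity,
        }
    return event


# Hypothetical critical alert raised by a monitoring system.
trigger = build_event("trigger", "nagios-db01-disk",
                      summary="Disk full on db01", source="nagios")
resolve = build_event("resolve", "nagios-db01-disk")

print(json.dumps(trigger, indent=2))
print(json.dumps(resolve, indent=2))
```

In a real integration, each body would be sent as an HTTP POST to the Events API endpoint, using the integration key of the PagerDuty service that owns the on-call schedule.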
Emery Lyne Manayan
Manager, Service Delivery at Coherent Capital Advisors
The solution is used to alert the on-call users if we have priority-one or business-critical issues.
We mostly use it for our on-call engineers, for schedules, alerting, and critical alerts. And, of course, we use it for the management of an issue, so that people acknowledge the alerts, reassign them, etc.
Michael Maack
Owner at IT Verke limited
Our primary use case of this solution is for alarming and to mitigate threats in our organization.
We use the product for intrusion management.