Site Reliability Engineer 2 at a tech vendor with 1,001-5,000 employees
Real User
Top 5
Feb 19, 2026
Insights Hub serves as a centralized monitoring and data-driven decision-making platform: a single place where data is collected, analyzed, and turned into actionable insights. The primary use case changes slightly depending on the platform. For application monitoring, the main use case is to monitor application performance and troubleshoot issues: tracking response times and failures, monitoring database and API dependencies, analyzing logs, alerting on issues, and enabling performance optimizations. This is very common in DevOps and SRE environments.

To illustrate the troubleshooting side, here is a real example from an application monitoring scenario in Microsoft Azure and AWS contexts. When users report that the application is very slow, Insights Hub helps me troubleshoot step by step.

First, I check the overview dashboard of Insights Hub, where the request rate, average response time, failure rate, and server response trends are displayed. In this case, I immediately noticed that the response time had suddenly increased from 200 milliseconds to 3 seconds and that the failure rate had also risen slightly, indicating that something had changed.

In step two, I drill into performance by opening the performance section and checking requests. Insights Hub shows the slowest endpoints, percentile response times, and a request breakdown by operation. This let me narrow the issue to one specific API operation.

In step three, I check dependencies by opening the dependencies section in Insights Hub, which shows SQL calls, external APIs, and service-to-service calls. I noticed that the SQL dependency call to the order database was taking 3.5 seconds, which showed that the API itself was not slow; the database call was.

In step four, I use logs with KQL queries, running a query like "requests | where duration > 3000" to pull the requests that took longer than three seconds. From there I can correlate the slow requests, the database dependency duration, and any error patterns, and check whether a specific query is causing table scans, hitting a missing index, or running into lock contention.

In the final step, I check live metrics to see whether the issue is still ongoing. I open the live metrics stream and watch real-time CPU, memory, and request rate.

The root cause in this example was a new deployment that introduced a poorly optimized SQL query. A missing index forced a full table scan, and the resulting slow database query made the API slow and led to the user complaints.
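
The correlation in steps three and four can be sketched offline in Python. This is only an illustration of the idea, not Insights Hub's API: the `Request` and `Dependency` record shapes, the `operation_id` join key, and the sample values are all assumptions made for the example, with durations chosen to mirror the numbers in the scenario above.

```python
from dataclasses import dataclass


@dataclass
class Request:
    operation_id: str   # hypothetical key linking a request to its dependency calls
    name: str
    duration_ms: float


@dataclass
class Dependency:
    operation_id: str
    kind: str           # e.g. "SQL" or "HTTP"
    target: str         # hypothetical target name, e.g. a database or external API
    duration_ms: float


def slow_requests(requests, threshold_ms=3000):
    """Keep only requests slower than the threshold (the 'duration > 3000' filter)."""
    return [r for r in requests if r.duration_ms > threshold_ms]


def slowest_dependency(request, dependencies):
    """Return the longest dependency call made under the same operation, or None."""
    related = [d for d in dependencies if d.operation_id == request.operation_id]
    return max(related, key=lambda d: d.duration_ms, default=None)


def correlate(requests, dependencies, threshold_ms=3000):
    """For each slow request, report its slowest dependency and that call's share of the time."""
    report = []
    for req in slow_requests(requests, threshold_ms):
        dep = slowest_dependency(req, dependencies)
        share = round(dep.duration_ms / req.duration_ms, 2) if dep else 0.0
        report.append((req.name, req.duration_ms, dep.target if dep else None, share))
    return report


# Made-up sample data for illustration only.
SAMPLE_REQUESTS = [
    Request("op-1", "GET /api/orders", 3700.0),
    Request("op-2", "GET /api/health", 180.0),
]
SAMPLE_DEPENDENCIES = [
    Dependency("op-1", "SQL", "orders-db", 3500.0),
    Dependency("op-1", "HTTP", "payments-api", 120.0),
    Dependency("op-2", "SQL", "orders-db", 40.0),
]

if __name__ == "__main__":
    for row in correlate(SAMPLE_REQUESTS, SAMPLE_DEPENDENCIES):
        print(row)
```

With this sample data, the one slow request spends almost all of its time in the SQL call, which is the same conclusion the dependencies view led to: the API is slow because the database call is slow.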