What is our primary use case?
In my organization, particularly in Ericsson's telecom BSS domain, the primary use case of Coralogix is centralized log management and real-time monitoring of telecom applications, such as the BSS software products and infrastructure. We have multiple SDP and AIR components in our telecom stack, including application servers, network elements, microservices, and APIs. Coralogix acts as a single platform where all logs are aggregated, which makes troubleshooting much faster compared to using different systems. The platform also helps with monitoring live traffic and system behavior. We have configured alerts for critical scenarios such as service downtime, API failures, error rates, and spikes.
Using Coralogix has significantly improved the efficiency and structure of my daily work, especially in monitoring and troubleshooting. Previously, we relied on manual checks and customer complaints to identify issues, and we had to log into multiple servers and check logs by hand, which was time-consuming. Now, with real-time alerts in Coralogix, we get notified immediately when something goes wrong, such as API failures or abnormal error spikes. The approach has shifted from reactive troubleshooting to proactive monitoring. Everything is centralized in Coralogix, and with powerful search and filtering, we can pinpoint issues within minutes instead of hours.
One recent incident involved a production issue where one of our SDP application services started showing intermittent API failures during peak traffic hours. Subscribers in Singapore were experiencing delayed responses, and some transactions were failing. Initially, it was difficult to identify the root cause because multiple microservices and backend systems were involved. Using Coralogix, we were able to quickly narrow down the issue through real-time log monitoring, centralized log correlation, advanced filtering and search, and alerting dashboards. The dashboards showed that error rates had suddenly increased for a particular service after a deployment. By filtering logs based on transaction IDs and timestamps, we traced the issue to a backend service timeout caused by a misconfigured connection pool. The biggest advantage was that we did not need to manually log into multiple servers to investigate. Everything was visible from a single platform, which saved considerable troubleshooting time. Because of Coralogix alerts, the operations team was informed immediately, and we identified the root cause and resolved the issue much faster than our earlier processes allowed.
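The transaction-ID tracing described above can be illustrated with a minimal sketch. This is not Coralogix's query language or API; it only shows the underlying idea of grouping log records from several services by transaction ID and ordering them by time to find where the first error occurred. The field names (ts, service, txn_id, level, msg) and the sample records are hypothetical.

```python
import json
from collections import defaultdict
from datetime import datetime

# Hypothetical sample records; field names and values are illustrative only.
SAMPLE_LOGS = [
    '{"ts": "2024-01-01T10:00:02", "service": "backend", "txn_id": "T1", "level": "ERROR", "msg": "connection pool timeout"}',
    '{"ts": "2024-01-01T10:00:00", "service": "api-gateway", "txn_id": "T1", "level": "INFO", "msg": "request received"}',
    '{"ts": "2024-01-01T10:00:01", "service": "api-gateway", "txn_id": "T2", "level": "INFO", "msg": "request received"}',
    '{"ts": "2024-01-01T10:00:03", "service": "api-gateway", "txn_id": "T1", "level": "ERROR", "msg": "upstream timeout"}',
]


def group_by_transaction(lines):
    """Parse JSON log lines and group them by transaction ID, oldest first."""
    grouped = defaultdict(list)
    for line in lines:
        record = json.loads(line)
        grouped[record["txn_id"]].append(record)
    for records in grouped.values():
        records.sort(key=lambda r: datetime.fromisoformat(r["ts"]))
    return dict(grouped)


def first_error(records):
    """Return the earliest ERROR record in a time-ordered list, or None."""
    return next((r for r in records if r["level"] == "ERROR"), None)


if __name__ == "__main__":
    traces = group_by_transaction(SAMPLE_LOGS)
    for txn_id, records in traces.items():
        err = first_error(records)
        if err:
            print(txn_id, err["service"], err["msg"])
```

In this sample, the earliest error for the failing transaction comes from the backend rather than the gateway, which mirrors how we narrowed our incident down to a backend timeout.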
What is most valuable?
In my experience, a few features of Coralogix stand out and make it very effective for day-to-day operations. First, centralized log management puts all logs from different systems, applications, servers, and microservices in one place. This eliminates the need to check multiple servers and makes troubleshooting much faster. Second, the search capability is very strong. We can filter logs based on time range, severity, transaction IDs, or specific keywords, which helps quickly identify issues in complex telecom environments such as our BSS products. Third, real-time alerting allows us to configure alerts for critical events such as error spikes, service failures, and latency issues. This helps in immediate detection and faster response, reducing downtime.
Among all the features, the one that makes the biggest difference for our team is centralized log management combined with powerful search. In our environment, multiple systems are involved, including application servers, APIs, backend systems, and network components, all in production with customers and subscribers using them. Previously, troubleshooting meant logging into those servers and manually checking each log, which was time-consuming and inefficient. With Coralogix, all logs are available in one place. We can quickly search using filters such as transaction ID, timestamp, or error code, and we can correlate logs across multiple services. This has drastically reduced our mean time to resolution. The real impact is faster issue identification and less manual effort.
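To make the idea of an error-spike alert concrete, here is a minimal sketch of the logic behind such a rule: track the error rate over a sliding window of recent requests and alert when it crosses a threshold. This is an assumption about how such a rule can work in general, not how Coralogix implements its alerts. The class name, window size, and threshold are hypothetical.

```python
from collections import deque


class ErrorRateMonitor:
    """Tracks the error rate over the most recent `window` requests."""

    def __init__(self, window=100, threshold=0.05):
        self.outcomes = deque(maxlen=window)  # True means the request failed
        self.threshold = threshold

    def record(self, failed):
        """Record one request outcome; the oldest outcome drops off when full."""
        self.outcomes.append(bool(failed))

    def error_rate(self):
        """Fraction of failed requests in the current window."""
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self):
        # Only alert once the window is full, to avoid noise on startup.
        return (len(self.outcomes) == self.outcomes.maxlen
                and self.error_rate() > self.threshold)
```

Waiting for a full window before alerting is one simple way to keep alerts actionable rather than noisy.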
What needs improvement?
Coralogix works well for our needs, but there are a few areas where improvements can be made. One area is query performance on large data sets. When we are dealing with very high log volumes, some complex queries take time to return results. Improving query speed and optimization would enhance the troubleshooting experience. Another point is the learning curve for advanced features. While basic usage is straightforward, advanced querying and dashboard configuration can take time for the new users we onboard, and we have run into this frequently in our organization. Simpler UI options or guided templates would help new team members get up to speed faster. Additionally, dashboards are useful, but more flexibility in customization would make them even more powerful. Finally, cost is an important point. Since log volume is high in our environment, better visibility into and control over costs would be beneficial.
These are minor improvements overall. Coralogix already provides strong capabilities for centralized logging and monitoring, but better query performance, alert noise reduction, and easier use of advanced features would make it even more efficient for large-scale telecom environments like ours.
For how long have I used the solution?
I have been using Coralogix for more than four years.
What do I think about the stability of the solution?
In our experience, Coralogix has been quite stable and reliable for our day-to-day operations. We use it continuously for monitoring and troubleshooting, and we have not faced any major stability issues that impacted our work significantly. The platform is generally highly available. Log ingestion and processing are consistent, even during high traffic. Our charging system dashboards and alerts work reliably in real time. There have been occasional minor delays or latency during very high log volume, but nothing critical or long-lasting.
What do I think about the scalability of the solution?
Coralogix has been highly scalable and well-suited for a high-volume environment such as our charging products. As our system usage and log volume increased, Coralogix was able to handle the growth without requiring any major changes from our side. Our systems generate a huge amount of logs, especially during peak hours, and Coralogix has been able to ingest and process the data smoothly without major performance issues. Since it is a managed SaaS platform, scaling is handled by Coralogix itself. We do not need to worry about infrastructure, storage, or cluster management. Even as new services or microservices were added, the platform continued to perform reliably in terms of search, monitoring, and alerting.
How are customer service and support?
Whenever we have raised issues or queries, the support team has generally responded in a timely manner and helped us resolve problems effectively. For most cases, especially critical issues, the response time has been quick, which is important in our charging environment in telecom. The support team has good technical knowledge and is able to understand log-related monitoring issues without much back and forth. During initial setup or when configuring alerts and dashboards, their guidance was useful. The support has been reliable and helpful for our day-to-day needs.
Which solution did I use previously and why did I switch?
Before Coralogix, we were using a combination of traditional logging approaches and the ELK stack, Elasticsearch in particular. While ELK is powerful, we faced a few challenges in our environment. It had a high maintenance overhead: managing infrastructure and storage and scaling Elasticsearch clusters required significant effort. As log volume increased day by day, query performance and indexing became slower. Compared to modern observability platforms, it had very limited real-time alerting capabilities. It was also operationally complex and required dedicated effort for tuning and upkeep. We switched to Coralogix because of its SaaS model with no infrastructure to manage, better real-time monitoring and alerting, and faster, more efficient log search that scales without operational burden.
How was the initial setup?
Before adopting Coralogix, our organization evaluated a few other observability and log management tools to find the best fit for our telecom environment. The options included Splunk, Datadog, and the ELK stack. Tools such as Splunk can become quite expensive with high log volume, and Coralogix provided a more cost-effective model for our use cases. In telecom, log ingestion is massive, and Coralogix handled high-volume logs efficiently without requiring us to manage infrastructure. Compared to ELK, Coralogix removed the need to manage clusters, storage, and scaling.
What about the implementation team?
In our case, the procurement of Coralogix was handled at the organizational or management level, and I was not directly involved in the purchasing process. However, it can be purchased either directly from the vendor or through platforms such as AWS Marketplace. From a user perspective, we mainly focus on using the platform for monitoring and troubleshooting rather than on the procurement side.
What was our ROI?
We have definitely seen a positive return on investment from using Coralogix, in both time savings and improved system reliability. The first gain is mean time to resolution: major issues previously took one to two hours to resolve and now take around ten to twenty minutes, representing approximately a sixty to seventy percent reduction in resolution time. The second is reduced service downtime, which protects revenue. Less downtime means fewer service disruptions, which is critical in telecom services, where even a small outage can have a significant business impact. With faster detection and resolution, we have achieved approximately a thirty to forty percent reduction in downtime over these years. In one incident involving an API failure during peak hours, Coralogix helped us quickly identify a backend timeout issue. Because we resolved it faster, we avoided prolonged service impact and potential customer complaints. Finally, engineers spend less time manually checking logs across systems, saving approximately forty to fifty percent of the time spent on troubleshooting activities.
What's my experience with pricing, setup cost, and licensing?
My experience with Coralogix pricing and licensing has been generally positive, especially considering the value it provides for monitoring and troubleshooting. It follows a usage-based pricing model that mainly depends on log volume and data ingestion. This works well for us because it scales, and we can adjust based on our needs. The main challenge is that log volume in our environment is very high, so cost management becomes important. We need to be mindful about which logs we ingest and retain to avoid unnecessary costs, and this sometimes requires fine-tuning log levels and retention policies. Overall, the pricing is reasonable for the value it delivers, but it needs proper optimization to be cost-effective at scale.
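As an illustration of fine-tuning log levels to control ingestion volume, the following sketch uses Python's standard logging module. The logger accepts every level, but the handler that ships logs onward only passes WARNING and above. The ListHandler class is a hypothetical in-memory stand-in for a real log shipper, and the helper name and default level are assumptions. It does not represent any Coralogix SDK.

```python
import logging


class ListHandler(logging.Handler):
    """Stand-in for a log shipper: collects formatted records in memory."""

    def __init__(self, level=logging.NOTSET):
        super().__init__(level)
        self.shipped = []

    def emit(self, record):
        self.shipped.append(self.format(record))


def build_logger(name, ship_level=logging.WARNING):
    """Create a logger whose only handler ships `ship_level` and above."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)   # the logger itself accepts all levels
    logger.propagate = False         # keep records out of the root logger
    handler = ListHandler(level=ship_level)  # the handler decides what is shipped
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.handlers = [handler]
    return logger, handler
```

With the default ship_level, debug and info messages are dropped before they ever count toward ingested volume, while warnings and errors still get through.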
Which other solutions did I evaluate?
As far as I am aware, our organization's relationship with Coralogix is purely that of a customer using their platform for observability and log management. I am not aware of any additional business relationships such as partnerships, reselling, or strategic collaboration beyond that. My involvement is mainly on the technical and usage side, so procurement or partnership details would be handled by management.
What other advice do I have?
While exact numbers can vary across teams, based on our current usage, we have observed clear improvements in a few key areas. The first is mean time to resolution, which I have already discussed. Previously, one to two hours were easily required to identify and resolve issues. Now, it is twenty to thirty minutes in most cases, representing approximately a sixty to seventy percent reduction in resolution time. The second is faster issue detection. Previously, issues were often detected only after user complaints or manual checks. Now, real-time alerts detect issues within minutes, which is approximately seventy to eighty percent faster. Service downtime has also been reduced by thirty to forty percent. Additionally, team productivity has increased by forty to fifty percent because less time is spent on manual log checking across servers.
Based on our experience, Coralogix is a powerful tool, but to get the best value out of it, I would suggest a few things. First, plan your log strategy early and avoid ingesting everything blindly. Define which logs are critical and set proper log levels. This helps with both performance and cost optimization. Second, set up meaningful alerts. Configure them carefully to avoid noise and focus on actionable alerts that really need attention, because too many alerts reduce effectiveness. Third, use structured logging. If logs are well-structured, in JSON format or with proper fields, searching and analysis become much easier and faster. Finally, build dashboards for important KPIs such as error rates, latency, and service health, which helps with quick monitoring and decision-making. I would rate this product an eight out of ten based on my overall experience.
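As an example of the structured logging advice, here is a minimal sketch of a JSON formatter built on Python's standard logging module. The field names (txn_id, service, error_code) are hypothetical and should be replaced with whatever fields matter in your environment. Nothing here is specific to Coralogix.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Formats each log record as a single JSON object."""

    # Hypothetical context fields that callers may pass via `extra=`.
    CONTEXT_FIELDS = ("txn_id", "service", "error_code")

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via `extra=` become attributes on the record.
        for key in self.CONTEXT_FIELDS:
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)
```

Attach the formatter to a handler with handler.setFormatter(JsonFormatter()) and pass context with, for example, logger.error("timeout", extra={"txn_id": "T1"}). Each field then becomes directly searchable instead of being buried in free text.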