I work on a project that runs on Amazon infrastructure with Lambda, and my company uses AWS CloudTrail alongside several other AWS services. We use Amazon EC2 instances for infrastructure, along with S3 buckets, Lambda functions, and Step Functions. We have moved to DynamoDB, which I use together with a few other databases.
For monitoring, we use Dynatrace. Dynatrace is connected to AWS CloudTrail, which sends its notifications to Dynatrace via SNS. Dynatrace, in turn, sends out the notifications that we receive.
AWS CloudTrail is a logging service that records activity for whatever services we are using on Amazon. All logs are retained for 15 years, as we have purchased the service for that duration. The logs are stored by AWS CloudTrail in an S3 bucket.
In AWS CloudTrail, we have enabled CPU, disk, and RAM monitoring. These are the three metrics we monitor, and AWS CloudTrail produces graphs for them. We have separate L1 teams for monitoring; they watch the graphs and share the information. Dynatrace also receives the information, and if any metric goes beyond its threshold limit, AWS CloudTrail creates an alert.
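As a rough illustration of the threshold alerting described above: in AWS, alarms on metrics such as CPU are normally defined through CloudWatch's `put_metric_alarm` call. The sketch below only builds the request parameters. The function names, the 80 percent threshold, the five-minute period, and the SNS topic ARN are all assumptions made for illustration, not the configuration our cloud operations team actually uses.

```python
def build_cpu_alarm(instance_id, sns_topic_arn, threshold_percent=80.0):
    """Build keyword arguments for CloudWatch's put_metric_alarm.

    The threshold, period, and alarm name are illustrative assumptions.
    """
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,              # evaluate the metric in 5-minute windows
        "EvaluationPeriods": 2,     # two consecutive breaches raise the alarm
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # SNS topic notified on breach
    }


def create_alarm(params):
    """Send the alarm definition to CloudWatch (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without AWS access

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Calling `create_alarm(build_cpu_alarm(...))` would register the alarm. Disk and memory usage are not among the default EC2 metrics, so similar alarms for those would typically rely on metrics published by the CloudWatch agent.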
CloudTrail and CloudWatch are sister services, and both need to be configured internally beforehand. We only watch the dashboards, because we can't go and watch each service individually; there are multiple servers running and multiple services configured. If any graph goes beyond its normal limit, we take action.
We watch the graphs. If any of them shows abnormal activity, we immediately dig into AWS CloudWatch and AWS CloudTrail and check the reason by verifying the logs. The graph shows the time period, so we go into AWS CloudWatch, filter the logs for that particular period, identify the cause of the issue, and then troubleshoot.
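The time-window filtering described above can also be done programmatically. The sketch below assumes the logs live in a CloudWatch Logs log group and uses the `filter_log_events` call; the helper names, the log group name, and the "ERROR" filter pattern are hypothetical and only meant to show the shape of such a query.

```python
from datetime import datetime, timedelta, timezone


def build_log_filter(log_group, incident_start, duration_minutes, pattern="ERROR"):
    """Build keyword arguments for CloudWatch Logs' filter_log_events.

    startTime and endTime are expressed in milliseconds since the Unix epoch.
    """
    if incident_start.tzinfo is None:
        # Assumption: naive timestamps read off the dashboard are in UTC.
        incident_start = incident_start.replace(tzinfo=timezone.utc)
    incident_end = incident_start + timedelta(minutes=duration_minutes)
    return {
        "logGroupName": log_group,
        "startTime": int(incident_start.timestamp() * 1000),
        "endTime": int(incident_end.timestamp() * 1000),
        "filterPattern": pattern,
    }


def fetch_events(params):
    """Yield (timestamp, message) pairs; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without AWS access

    paginator = boto3.client("logs").get_paginator("filter_log_events")
    for page in paginator.paginate(**params):
        for event in page["events"]:
            yield event["timestamp"], event["message"]
```

For example, if a graph shows a spike starting at 12:00 UTC, `build_log_filter("/example/app", datetime(2024, 1, 1, 12, 0), 30)` would describe a 30-minute window from that point, which `fetch_events` could then page through.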
The API is the main element that connects AWS CloudWatch to AWS CloudTrail and AWS CloudTrail to Dynatrace. Everything is integrated through the API, which handles the transfer of information from one service to another. We do not work on the API ourselves; the configuration team and the cloud operations team take care of it.