Amazon CloudWatch Metrics
- Metric is a variable to monitor
- belong to namespaces
- Dimension is an attribute of a metric
- Instance Id, environment
- up to 30 dimensions per metric
- have timestamps
- Can create CloudWatch dashboards of metrics
- Can create CloudWatch Custom Metrics
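As a rough illustration of how a custom metric with dimensions fits together, the sketch below builds one entry for the MetricData list that the PutMetricData API accepts. The metric name, namespace, and dimension values are hypothetical, and nothing is sent to AWS; the boto3 call appears only as a comment.

```python
from datetime import datetime, timezone

MAX_DIMENSIONS = 30  # per-metric dimension limit quoted in the notes above


def build_metric_datum(name, value, dimensions, unit="Count", high_resolution=False):
    """Build one entry for the MetricData list of PutMetricData."""
    if len(dimensions) > MAX_DIMENSIONS:
        raise ValueError(f"at most {MAX_DIMENSIONS} dimensions per metric")
    return {
        "MetricName": name,
        # Each dimension is a Name/Value pair, e.g. instance id or environment.
        "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
        "Timestamp": datetime.now(timezone.utc),
        "Value": value,
        "Unit": unit,
        # 1 = high-resolution storage, 60 = standard resolution.
        "StorageResolution": 1 if high_resolution else 60,
    }


# Hypothetical metric and dimension values, for illustration only.
datum = build_metric_datum(
    "PageLoadTime",
    1.25,
    {"InstanceId": "i-0123456789abcdef0", "Environment": "dev"},
    unit="Seconds",
)

# With boto3 installed and credentials configured, the datum could be sent with:
#   boto3.client("cloudwatch").put_metric_data(Namespace="MyApp", MetricData=[datum])
```

The namespace is passed once per call, while dimensions and the timestamp travel with each datum.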
Amazon CloudWatch Metric Streams
- Continually stream CloudWatch metrics to a destination of your choice, with near-real-time delivery and low latency
- Option to filter metrics to only stream a subset of them
Amazon CloudWatch Logs
- Log groups
- arbitrary name
- Log stream
- instances within application / log files / containers
- Can define log expiration policies
- CloudWatch Logs can send logs to :
- Amazon S3, Kinesis Data Streams, Kinesis Data Firehose, AWS Lambda, OpenSearch
- S3 Export
- Log data can take up to 12 hours to become available for export
- The API call is CreateExportTask
- Not near-real time or real-time
- Instead, use Logs Subscriptions
- Get real-time log events from CloudWatch Logs for processing and analysis
- Send to Kinesis Data Streams, Kinesis Data Firehose, or Lambda
- Subscription Filter - filter which log events are delivered to your destination
- Cross-Account Subscription - send log events to resources in a different AWS account
- Logs are encrypted by default
- Can setup KMS-based encryption with your own keys
- Logs for EC2
- By default, no logs from your EC2 machine will go to CloudWatch
- You need to run a CloudWatch agent on EC2 to push the log files you want
- Make sure IAM permissions are correct
- The CloudWatch log agent can be setup on-premises too
- Logs Agent & Unified Agent
- For virtual servers
- CloudWatch Logs Agent
- Old version of the agent
- Can only send to CloudWatch Logs
- CloudWatch Unified Agent
- Collect additional system-level metrics such as RAM, processes, etc
- Collected directly on your Linux server / EC2 instance
- Collect logs to send to CloudWatch Logs
- Centralized configuration using SSM Parameter Store
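Returning to the Logs Subscriptions bullets in this section: when the destination is Lambda, the function receives the log events as base64-encoded, gzip-compressed JSON under awslogs.data. The sketch below shows one way a handler could unpack that payload. The log group, stream, and message in the sample are made-up values used only for a local round trip.

```python
import base64
import gzip
import json


def decode_subscription_event(event):
    """Return the decoded payload of a CloudWatch Logs subscription event."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def handler(event, context):
    """Lambda entry point: collect the message of every delivered log event."""
    payload = decode_subscription_event(event)
    return [log_event["message"] for log_event in payload["logEvents"]]


# Local round trip with a hand-built sample payload (hypothetical values).
sample = {
    "logGroup": "/my-app/web",
    "logStream": "i-0123456789abcdef0",
    "logEvents": [
        {"id": "1", "timestamp": 1713398400000, "message": "ERROR disk full"}
    ],
}
encoded = base64.b64encode(
    gzip.compress(json.dumps(sample).encode("utf-8"))
).decode("ascii")
messages = handler({"awslogs": {"data": encoded}}, None)
```

A real handler would typically forward or index the messages rather than return them; returning them here just keeps the sketch easy to check locally.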
Amazon CloudWatch Alarms
- Alarms are used to trigger notifications for any metric
- Various options (sampling, % ...)
- Alarm States:
- OK
- INSUFFICIENT_DATA
- ALARM
- Period:
- Length of time in seconds to evaluate the metric
- High resolution custom metrics
- 10 sec, 30 sec or multiples of 60 sec
- Targets
- Stop, Terminate, Reboot, or Recover an EC2 Instance
- Trigger Auto Scaling Action
- Send notification to SNS
- Composite Alarms
- CloudWatch Alarms are on a single metric
- Composite Alarms are monitoring the states of multiple other alarms
- AND and OR conditions
- Helpful to reduce "alarm noise" by creating complex composite alarms
- EC2 Instance Recovery
- Status Check
- Instance status = check the EC2 VM
- System status = check the underlying hardware
- Recovery
- Same Private, Public, Elastic IP, metadata, placement group
- Alarms can be created based on CloudWatch Logs Metrics Filter
- To test alarms and notifications, set the alarm state to ALARM using the CLI (aws cloudwatch set-alarm-state)
- Container Insights
- Collect, aggregate, summarize metrics and logs from containers
- Available for containers on :
- Amazon ECS
- Amazon EKS
- Kubernetes platforms on EC2
- Fargate (both for ECS and EKS)
- In Amazon EKS and Kubernetes, CloudWatch Container Insights uses a containerized version of the CloudWatch agent to discover containers
- Lambda Insights
- Monitoring and troubleshooting solution for serverless applications running on AWS Lambda
- Collects, aggregates, and summarizes system-level metrics including CPU time, memory, disk and network
- Collects, aggregates, and summarizes diagnostic information such as cold starts and Lambda worker shutdowns
- Lambda Insights is provided as a Lambda Layer
- Application Insights
- Provides automated dashboards that show potential problems with monitored applications, to help isolate ongoing issues
- Enhanced visibility into your application health to reduce the time it will take you to troubleshoot and repair your applications
- Findings and alerts are sent to Amazon EventBridge and SSM OpsCenter
- Insights and Operational Visibility
- CloudWatch Container Insights
- CloudWatch Lambda Insights
- CloudWatch Contributor Insights
- CloudWatch Application Insights
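Returning to the Composite Alarms bullets earlier in this section, the following is a small local model of how AND / OR conditions over child alarm states reduce alarm noise. It does not call CloudWatch. The alarm names are hypothetical, and the model ignores INSUFFICIENT_DATA handling for simplicity. The rule string follows the ALARM("name") AND/OR form used in composite alarm rules.

```python
def build_alarm_rule(names, operator="AND"):
    """Build a rule expression such as: ALARM("a") AND ALARM("b")."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be AND or OR")
    return f" {operator} ".join(f'ALARM("{name}")' for name in names)


def composite_state(child_states, names, operator="AND"):
    """Simplified local evaluation: ALARM if the combined condition holds, else OK."""
    firing = [child_states.get(name) == "ALARM" for name in names]
    combined = all(firing) if operator == "AND" else any(firing)
    return "ALARM" if combined else "OK"


# Hypothetical child alarms: only CPU is currently in ALARM.
children = ["cpu-high", "iops-high"]
states = {"cpu-high": "ALARM", "iops-high": "OK"}

rule = build_alarm_rule(children, "AND")
and_state = composite_state(states, children, "AND")  # both must be in ALARM
or_state = composite_state(states, children, "OR")    # either one is enough
```

With AND, a notification fires only when CPU and IOPS are both high, which is the noise-reduction case the notes describe.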
Amazon EventBridge
- Schedule : Cron jobs (scheduled scripts)
- Event Pattern : Event rules to react to a service doing something
- Trigger Lambda functions, send SQS/SNS messages
- Event buses can be accessed by other AWS accounts using Resource-based Policies
- You can archive events sent to an event bus
- Ability to replay archived events
- EventBridge can analyze the events in your bus and infer the schema
- The Schema Registry allows you to generate code for your application, that will know in advance how data is structured in the event bus
- Schema can be versioned
- Resource-based Policy
- Manage permissions for a specific Event Bus
- Example : allow/deny events from another AWS account or AWS region
- Use Case : aggregate all events from your AWS Organization in a single AWS account or AWS region
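To make the Event Pattern idea from this section more concrete, here is a deliberately simplified matcher. It assumes a pattern lists the allowed values for each field and that nested objects are matched recursively. Real EventBridge patterns support more operators (prefix, numeric, anything-but, and so on) that this sketch leaves out. The instance ID is a made-up value.

```python
def matches(pattern, event):
    """Return True if every field in the pattern is satisfied by the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the corresponding nested object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of allowed values for this field.
            return False
    return True


# Pattern reacting to EC2 instances being stopped or terminated.
pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}

terminated_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "terminated"},
}
running_event = {**terminated_event, "detail": {"state": "running"}}

is_match = matches(pattern, terminated_event)
is_not_match = matches(pattern, running_event)
```

Fields present in the event but absent from the pattern (such as instance-id here) are ignored, so a pattern only needs to name the fields it cares about.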
AWS CloudTrail
- Provides governance, compliance and audit for your AWS Account
- CloudTrail is enabled by default
- Get a history of events / API calls made within your AWS Account by :
- Console, SDK, CLI, AWS Services
- Can put logs from CloudTrail into CloudWatch Logs or S3
- A trail can be applied to All Regions (default) or a single Region
- If a resource is deleted in AWS, investigate CloudTrail first!
CloudTrail Events
- Management Events:
- Operations that are performed on resources in your AWS account
- By default, trails are configured to log management events
- Can separate Read Events from Write Events
- Data Events:
- By default, data events are not logged (because high volume operations)
- Amazon S3 object-level activity : can separate Read and Write events
- AWS Lambda function execution activity
- CloudTrail Insights
- Enable CloudTrail insights to detect unusual activity in your account:
- inaccurate resource provisioning
- hitting service limits
- bursts of AWS IAM actions
- gaps in periodic maintenance activity
- CloudTrail Insights analyzes normal management events to create a baseline
- And then continuously analyzes write events to detect unusual patterns
- Anomalies appear in the CloudTrail console
- Event is sent to S3
- An EventBridge event is generated
- CloudTrail Events Retention
- Events are stored for 90 days in CloudTrail
- To keep events beyond this period, log them to S3 and use Athena
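As a sketch of the "investigate CloudTrail first" advice for deleted resources, the snippet below scans the contents of a CloudTrail log file for write events whose name starts with Delete. It assumes the usual layout of a JSON document with a top-level Records array; log files delivered to S3 are gzip-compressed, so they would need decompressing first. The sample records, timestamps, and ARN are invented for illustration.

```python
import json


def find_deletions(log_text):
    """Return (time, source, name, caller ARN) for each Delete* write event."""
    records = json.loads(log_text)["Records"]
    return [
        (
            record["eventTime"],
            record["eventSource"],
            record["eventName"],
            record.get("userIdentity", {}).get("arn"),
        )
        for record in records
        if record["eventName"].startswith("Delete")
        and not record.get("readOnly", False)
    ]


# Hypothetical log content with one read event and one delete event.
sample_log = json.dumps({
    "Records": [
        {
            "eventTime": "2024-04-18T09:00:00Z",
            "eventSource": "ec2.amazonaws.com",
            "eventName": "DescribeInstances",
            "readOnly": True,
            "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"},
        },
        {
            "eventTime": "2024-04-18T09:05:00Z",
            "eventSource": "s3.amazonaws.com",
            "eventName": "DeleteBucket",
            "readOnly": False,
            "userIdentity": {"arn": "arn:aws:iam::111122223333:user/bob"},
        },
    ]
})

deletions = find_deletions(sample_log)
```

For larger volumes of logs kept in S3, the same filter would more naturally be expressed as an Athena query, as the notes suggest.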
AWS Config
- Helps with auditing and recording compliance of your AWS resources
- Helps record configurations and changes over time
- Can receive alerts for any changes
- AWS Config is a per-region service
- Can be aggregated across regions and accounts
- Possibility of storing the configuration data into S3 (analyzed by Athena)
Config Rules
- Can use AWS managed config rules
- Can make custom config rules (must be defined in AWS Lambda)
- Rules can be evaluated / triggered :
- For each config change
- And / or : at regular time intervals
- AWS Config Rules does not prevent actions from happening (no deny)
- Remediation (restoration / correction)
- Automate remediation of non-compliant resources using SSM Automation Documents
- Use AWS-Managed Automation Documents or create custom Automation Documents
- You can set Remediation Retries if the resource is still non-compliant after auto-remediation
- Notifications
- Use EventBridge to trigger notifications when AWS resources are non-compliant
- Ability to send configuration changes and compliance state notifications to SNS
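For the custom rules defined in AWS Lambda mentioned in this section, the core of such a function is a compliance decision on the configuration item it receives. The sketch below shows that decision for a hypothetical "EBS volumes must be encrypted" rule. It assumes the volume's configuration carries an encrypted flag, and it only builds the evaluation record locally; reporting it back to AWS Config is shown as a comment. The resource ID, timestamp, and token are placeholders.

```python
import json


def evaluate_volume(configuration_item):
    """Decide compliance for one configuration item (hypothetical rule)."""
    if configuration_item["resourceType"] != "AWS::EC2::Volume":
        return "NOT_APPLICABLE"
    # Assumption: the recorded volume configuration exposes an "encrypted" flag.
    encrypted = configuration_item["configuration"].get("encrypted", False)
    return "COMPLIANT" if encrypted else "NON_COMPLIANT"


def build_evaluation(event):
    """Turn the rule's invoking event into one evaluation record."""
    item = json.loads(event["invokingEvent"])["configurationItem"]
    return {
        "ComplianceResourceType": item["resourceType"],
        "ComplianceResourceId": item["resourceId"],
        "ComplianceType": evaluate_volume(item),
        "OrderingTimestamp": item["configurationItemCaptureTime"],
    }


# Hand-built sample event (placeholder values) for an unencrypted volume.
sample_event = {
    "invokingEvent": json.dumps({
        "configurationItem": {
            "resourceType": "AWS::EC2::Volume",
            "resourceId": "vol-0123456789abcdef0",
            "configurationItemCaptureTime": "2024-04-18T09:00:00.000Z",
            "configuration": {"encrypted": False},
        }
    }),
    "resultToken": "placeholder-token",
}

evaluation = build_evaluation(sample_event)

# Inside a real Lambda (boto3 available), the result could be reported with:
#   boto3.client("config").put_evaluations(
#       Evaluations=[evaluation], ResultToken=event["resultToken"])
```

The rule only reports NON_COMPLIANT; as the notes point out, it does not block the change, so fixing the resource is left to a remediation action.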
CloudWatch vs CloudTrail vs Config
- CloudWatch
- Performance monitoring & dashboards
- Event & Alerting
- Log Aggregation & Analysis
- CloudTrail
- Record API calls made within your Account by everyone
- Can define trails for specific resources
- Global Service
- Config
- Record configuration changes
- Evaluate resources against compliance rules
- Get timeline of changes and compliance
For an Elastic Load Balancer
- CloudWatch
- Monitoring Incoming connections metric
- Visualize error codes as % over time
- Make a dashboard to get an idea of your load balancer performance
- Config
- Track security group rules for the Load Balancer
- Track configuration changes for the Load Balancer
- Ensure an SSL certificate is always assigned to the Load Balancer (compliance)
- CloudTrail
- Track who made any changes to the Load Balancer with API calls