AWS Integration & Messaging

ayleeee 2024. 4. 4. 01:42

Multiple applications inevitably need to communicate with one another

There are two patterns of application communication

  • Synchronous communications
    • application to application
    • can be problematic if there are sudden spikes of traffic
  • Asynchronous / Event based
    • application to queue to application

What if you suddenly need to encode 1000 videos when you usually encode only 10?

-> Better to decouple the application

  • using SQS : queue model
  • using SNS : pub/sub model
  • using Kinesis : real-time streaming model
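
The decoupling idea can be sketched locally with Python's standard `queue` module standing in for a managed queue. This is only an illustration under assumptions: `encode` and `run_burst` are hypothetical names, and an in-process queue has none of the durability of SQS. The point is that the producer can enqueue a burst of 1000 jobs immediately while a fixed pool of workers drains it at its own pace.

```python
import queue
import threading


def encode(video_id: int) -> str:
    """Stand-in for a slow encoding job (hypothetical)."""
    return f"video-{video_id}.mp4"


def run_burst(num_videos: int, num_workers: int) -> list[str]:
    jobs: queue.Queue = queue.Queue()  # buffer between producer and consumers
    results: list[str] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            video_id = jobs.get()
            if video_id is None:  # sentinel: no more work for this worker
                jobs.task_done()
                return
            encoded = encode(video_id)
            with lock:
                results.append(encoded)
            jobs.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # The producer never waits for a worker: it only writes to the queue.
    for video_id in range(num_videos):
        jobs.put(video_id)
    for _ in threads:
        jobs.put(None)

    jobs.join()  # block until every queued item has been processed
    for t in threads:
        t.join()
    return sorted(results)
```

Calling `run_burst(1000, 4)` processes the whole spike with only four workers; adding workers shortens the backlog without changing the producer.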

Amazon SQS

  • Simple Queue Service
  • Provides a secure, durable, and highly available hosted queue
  • Lets you integrate and decouple distributed software systems and components
  • Standard Queue
    • Oldest offering
    • fully managed service, used to decouple applications
    • Attributes
      • Unlimited throughput, unlimited number of messages in queue
      • Default retention of messages : 4 days, maximum of 14 days
      • Low latency
      • Limitation of 256KB per message sent
    • can have duplicate messages
    • can have out of order messages (best effort ordering)
  • Producing Messages
    • Produced to SQS using the SDK (SendMessage API)
    • The message is persisted in SQS until a consumer deletes it
    • Message retention : default 4 days, up to 14 days
    • Example : send an order to be processed
      • Order id
      • Customer id
    • SQS standard : unlimited throughput
  • Consuming Messages
    • Consumers:
      • Poll SQS for messages
      • Process the message
      • Delete the message using the DeleteMessage API
  • Multiple EC2 Instances Consumers
    • Consumers receive and process messages in parallel
    • At least once delivery
    • Best-effort message ordering
    • Consumers delete messages after processing them
    • Can scale consumers horizontally to improve throughput of processing
  • Security
    • Encryption
      • In-flight - HTTPS API
      • At-rest - KMS keys
      • Client-side - if the client wants to perform encryption/decryption itself
    • Access Controls : IAM policies to regulate access to the SQS API
    • SQS Access Policies
      • useful for cross-account access to SQS queues
      • useful for allowing other services to write to an SQS queue
  • Message Visibility Timeout
    • Once a message is polled by a consumer, it becomes invisible to other consumers
    • By default, the message visibility timeout is 30 seconds
      • the consumer has 30 seconds to process the message
    • After the message visibility timeout is over, the message becomes visible in SQS again
    • If a message is not processed (and deleted) within the visibility timeout, it may be received and processed twice
      • a consumer can call the ChangeMessageVisibility API to get more time
    • If visibility timeout is high, and consumer crashes, re-processing will take time
    • If visibility timeout is too low, may get duplicates
  • Long Polling
    • When a consumer requests messages from the queue, can optionally "wait" for messages to arrive if there are none in the queue
      • called Long Polling
    • Long Polling decreases the number of API calls made to SQS while increasing the efficiency and reducing latency of your application
      • wait time can be between 1 and 20 seconds
      • preferable to Short Polling
      • can be enabled at the queue level 
      • can be enabled at the API level using WaitTimeSeconds
  • FIFO Queue
    • First In First Out
    • Limited throughput : 300 msg/s without batching, 3000 msg/s with batching
    • Exactly-once send capability
    • Messages are processed in order by the consumer
  • SQS can also be used as a buffer for database writes
    • If the write load is too big for the database to absorb directly, some transactions may be lost; queueing the writes first avoids this
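
The visibility timeout behaviour described above can be modelled with a small in-memory class. This is a toy sketch, not the SQS API: `ToyQueue` and its methods are hypothetical, time is passed in explicitly as `now` (in seconds), and messages are deleted by id rather than by the receipt handle that real SQS uses.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyQueue:
    """In-memory model of a queue with a visibility timeout (not real SQS)."""

    visibility_timeout: float = 30.0  # default from the notes above
    _next_id: int = 0
    _bodies: dict = field(default_factory=dict)      # message id -> body
    _visible_at: dict = field(default_factory=dict)  # message id -> time it is visible

    def send(self, body: str, now: float) -> int:
        msg_id = self._next_id
        self._next_id += 1
        self._bodies[msg_id] = body
        self._visible_at[msg_id] = now  # visible immediately
        return msg_id

    def receive(self, now: float) -> Optional[tuple]:
        """Return the first visible message and hide it for the timeout."""
        for msg_id, visible_at in self._visible_at.items():
            if visible_at <= now:
                self._visible_at[msg_id] = now + self.visibility_timeout
                return msg_id, self._bodies[msg_id]
        return None  # nothing visible right now

    def change_visibility(self, msg_id: int, timeout: float, now: float) -> None:
        """Give the current consumer more (or less) time."""
        self._visible_at[msg_id] = now + timeout

    def delete(self, msg_id: int) -> None:
        """Called by the consumer once processing has succeeded."""
        self._bodies.pop(msg_id, None)
        self._visible_at.pop(msg_id, None)
```

With the default timeout, a message received at `now=0` is hidden from a second `receive(now=10)`, but a `receive(now=31)` returns it again if it was never deleted. That second delivery is the duplicate processing the notes warn about, and extending the timeout with `change_visibility` or deleting promptly prevents it.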

Amazon SNS

  • What if you want to send one message to many receivers?
  • The "event producer" only sends message to one SNS topic
  • As many "event receivers" (subscriptions) as we want can listen to the SNS topic notifications
  • Each subscriber to the topic will get all the messages
  • Up to 12,500,000 subscriptions per topic
  • 100,000 topics limit
  • Many AWS services can send data directly to SNS for notifications
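  • A managed service that delivers messages from publishers to subscribers
    • Publishers communicate asynchronously with subscribers by sending messages to a topic, which is a logical access point and communication channel
    • Clients can subscribe to an SNS topic and receive published messages using a supported endpoint type, such as Amazon Data Firehose, Amazon SQS, AWS Lambda, HTTP, email, mobile push notifications, and mobile text messages (SMS)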
  • How to Publish
    • Topic Publish (using the SDK, i.e. the Software Development Kit)
      • Create a topic
      • Create a subscription
      • Publish to the topic
    • Direct Publish (for mobile apps SDK)
      • Create a platform application
      • Create a platform endpoint
      • Publish to the platform endpoint
      • Works with Google GCM, Apple APNS, Amazon ADM
  • Security
    • Encryption
      • In-flight - HTTPS API
      • At-rest - KMS keys
      • Client-side - if the client wants to perform encryption/decryption itself
    • Access Control
      • IAM policies to regulate access to the SNS API
    • SNS Access Policies
      • Useful for cross-account access to SNS topics
      • Useful for allowing other services to write to an SNS topic
  • SNS + SQS : Fan Out
    • Push once in SNS, receive in all SQS queues that are subscribers
      • SNS lets applications send time-critical messages to multiple subscribers through a push mechanism, so subscribers do not need to check periodically or poll for updates
      • SQS is a message queue service that distributed applications use to exchange messages through a polling model
      • Used together, messages can be delivered to applications that need immediate notification of an event, and also persisted in an Amazon SQS queue for other applications to process later
    • Fully decoupled, no data loss
    • SQS allows for : data persistence, delayed processing and retries of work
    • Ability to add more SQS subscribers over time
    • Make sure your SQS queue access policy allows for SNS to write
    • Cross-Region Delivery : works with SQS queues in other regions
    • If you want to send the same S3 event to many SQS queues, use fan-out
  • FIFO Topic
    • First In First Out
    • can have SQS Standard and FIFO queues as subscribers
    • limited throughput
  • Message Filtering
    • JSON policy used to filter messages sent to an SNS topic's subscriptions
      • If a subscription does not have a filter policy, it receives every message
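
Fan-out combined with message filtering can be sketched in memory as follows. This is a simplified model, not the SNS API: `ToyTopic`, `matches`, and the queue names are hypothetical, and the filter policy supports only exact string matching on message attributes (real SNS filter policies offer more operators).

```python
from collections import defaultdict


def matches(filter_policy, attributes):
    """True if the attributes satisfy every key of the policy (exact match only)."""
    if not filter_policy:
        return True  # no filter policy: the subscription receives every message
    return all(
        attributes.get(name) in allowed
        for name, allowed in filter_policy.items()
    )


class ToyTopic:
    """In-memory stand-in for an SNS topic with SQS queues as subscribers."""

    def __init__(self):
        self.subscriptions = []          # list of (queue_name, filter_policy)
        self.queues = defaultdict(list)  # queue_name -> delivered messages

    def subscribe(self, queue_name, filter_policy=None):
        self.subscriptions.append((queue_name, filter_policy))

    def publish(self, message, attributes=None):
        attributes = attributes or {}
        # Push once, deliver a copy to every subscription whose filter matches.
        for queue_name, policy in self.subscriptions:
            if matches(policy, attributes):
                self.queues[queue_name].append(message)


topic = ToyTopic()
topic.subscribe("fraud-queue")                            # no filter policy
topic.subscribe("shipping-queue", {"state": ["placed"]})  # only placed orders
topic.publish("order 1", {"state": "placed"})
topic.publish("order 2", {"state": "cancelled"})
```

After these calls, `fraud-queue` holds both orders while `shipping-queue` holds only `order 1`, and the publisher needed no knowledge of either subscriber.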

Kinesis

  • Makes it easy to collect, process and analyze streaming data in real-time
    • A fully managed service for cost-effectively processing and analyzing streaming data at any scale
  • Ingest real-time data : Application logs, Metrics, Website clickstreams, IoT telemetry data
  • Kinesis Data Streams:
    • capture, process and store data streams
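      • IT infrastructure log data, application logs, social media, market data feeds, and web clickstream data
      • Data intake and processing respond in real time
      • Data can be aggregated in real time and the aggregate loaded into a data warehouse or a map-reduce cluster
      • Multiple applications can consume the same stream concurrently and independently
        • e.g. two applications read from the same stream: the first computes running aggregates and updates an Amazon DynamoDB table, the second compresses the data and archives it to a data store such as S3
    • Ensures durability and elasticity
    • The delay between the time a record is put into the stream and the time it can be retrieved is typically less than 1 second
      • i.e. data is available almost immediately
    • As a managed service, it reduces the operational burden of creating and running a data intake pipeline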
    • retention between 1 and 365 days
    • ability to reprocess (replay) data
    • Once data is inserted in Kinesis, it can't be deleted (immutability)
    • Data that shares the same partition key goes to the same shard
    • Producer : AWS SDK, Kinesis Producer Library, Kinesis Agent
    • Consumers
      • Write your own : Kinesis Client Library, AWS SDK
      • Managed : AWS Lambda, Kinesis Data Firehose, Kinesis Data Analytics
    • Capacity Modes
      • Provisioned Mode
        • choose the number of shards provisioned, scale manually or using API
        • each shard gets 1MB/s in
        • each shard gets 2MB/s out
        • pay per shard provisioned per hour
      • On-demand Mode
        • no need to provision or manage the capacity
        • default capacity provisioned
        • scales automatically based on observed throughput peak during the last 30 days
        • pay per stream per hour & data in/out per GB
    • Security
      • Control access / authorization using IAM policies
      • In-flight - HTTPS endpoints
      • At-rest - KMS
      • can implement encryption/decryption of data on client side
      • VPC Endpoints available for Kinesis to access within VPC
      • Monitor API calls using CloudTrail
  • Kinesis Data Firehose:
    • load data streams into AWS data stores
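      • A fully managed service that delivers real-time streaming data to destinations such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, Amazon OpenSearch Service, Amazon OpenSearch Serverless, Splunk, and endpoints owned by supported third-party service providers, including Datadog, Dynatrace, LogicMonitor, MongoDB, New Relic, and Coralogix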
    • fully managed service, no administration, automatic scaling, serverless
    • pay for data going through Firehose
    • near real-time
    • supports many data formats, conversions, transformations, compression
      • Once data producers are configured to send data to Amazon Data Firehose, it automatically delivers the data to the destination you specified
    • supports custom data transformations using AWS Lambda
    • can send failed or all data to a backup S3 bucket
    • Key concepts
      • Firehose stream
        • The underlying entity of Amazon Data Firehose
        • You create a Firehose stream and then send data to it
      • Record
        • The data a producer sends to a Firehose stream
        • Up to 1,000 KB
  • Kinesis Data Streams vs Firehose
    • Kinesis Data Streams
      • Streaming service for ingest at scale
      • Write custom code
      • Real-time
      • Manage scaling (shard splitting / merging)
      • Data storage for 1 to 365 days
      • Supports replay capability
    • Kinesis Data Firehose
      • Load streaming data into S3/Redshift/OpenSearch/3rd Party/custom HTTP
      • Fully managed
      • Near real-time
      • Automatic scaling
      • No data storage
      • Doesn't support replay capability
  • Kinesis Data Analytics:
    • analyze data streams with SQL or Apache Flink
  • Kinesis Video Streams:
    • capture, process, and store video streams
  • Ordering data into Kinesis
    • Use a "Partition Key" that identifies the entity whose records must stay in order (e.g. a device ID)
      • the same key will always go to the same shard
  • Ordering data into SQS
    • For SQS standard, there is no ordering
    • For SQS FIFO, if you don't use a Group ID, messages are consumed in the order they are sent, with only one consumer
    • If you want messages to be grouped when they are related to each other, you can use a Group ID (similar to a Partition Key in Kinesis)
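
The partition key and shard capacity rules above can be sketched in a few lines. Assumptions are labeled: Kinesis maps a partition key to a 128-bit hash key with MD5, but the even split of the hash space across shards is a simplification (real shards own explicit hash key ranges), and `shard_for_key`, `shards_needed`, and the `truck-…` keys are hypothetical. The capacity helper uses only the per-shard figures from the notes (1 MB/s in, 2 MB/s out).

```python
import hashlib
import math

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit hash key


def shard_for_key(partition_key: str, num_shards: int) -> int:
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")  # 0 <= hash_key < 2**128
    return hash_key * num_shards // HASH_SPACE


def shards_needed(mb_in_per_s: float, mb_out_per_s: float) -> int:
    """Shards required in provisioned mode for the given throughput."""
    by_ingest = math.ceil(mb_in_per_s / 1.0)  # 1 MB/s in per shard
    by_read = math.ceil(mb_out_per_s / 2.0)   # 2 MB/s out per shard
    return max(1, by_ingest, by_read)
```

Because the mapping depends only on the key, every record with key `"truck-1"` lands on the same shard and keeps its order there, whereas records with different keys may be spread across shards and have no ordering guarantee relative to each other.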

Amazon MQ

  • SQS, SNS are "cloud-native" services : proprietary protocols from AWS
  • Traditional applications running from on-premises may use open protocols 
    • In this case, instead of re-engineering the application to use SQS and SNS, we can use Amazon MQ
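    • Amazon MQ is a managed message broker service that makes it easy to migrate to a message broker in the cloud
      • Note: a message broker lets software systems, often written in different programming languages on different platforms, communicate and exchange information
      • With Amazon MQ, you can provision a message broker in a few steps and get support for software version upgrades
    • Currently supports the Apache ActiveMQ Classic and RabbitMQ engine types
    • Amazon MQ works with your existing applications and services without the need to manage, operate, or maintain your own messaging system
    • Use Amazon MQ when migrating applications from an existing message broker
    • Use Amazon SQS / Amazon SNS for new applications that can benefit from nearly unlimited scalability and simple APIs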
  • Amazon MQ doesn't scale as much as SQS/SNS
  • Amazon MQ runs on servers, can run in Multi-AZ with failover
  • Amazon MQ has both queue feature and topic features

 
