Makayla Mayne.com

AWS: Simplifying Concepts of Data Transfer

S3 Lifecycle Policy

A lifecycle policy allows you to automate the process of transitioning data from S3 Standard to S3 Glacier or other S3 storage classes based on specific rules, effectively reducing costs without the need for manual intervention or continuous monitoring.
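As a rough illustration of what such a rule can look like, here is a minimal sketch that builds a lifecycle configuration as a plain Python dictionary. The shape follows what boto3's `put_bucket_lifecycle_configuration` accepts, but nothing is sent to AWS here. The helper name `build_lifecycle_configuration`, the `logs/` prefix, and the 30 and 90 day thresholds are assumptions chosen for the example.

```python
import json


def build_lifecycle_configuration(prefix, ia_after_days, glacier_after_days):
    """Build a lifecycle configuration dict for objects under `prefix`.

    Hypothetical helper: objects move to STANDARD_IA first, then to GLACIER.
    """
    if not 0 < ia_after_days < glacier_after_days:
        raise ValueError("expected 0 < ia_after_days < glacier_after_days")
    return {
        "Rules": [
            {
                "ID": f"tier-down-{prefix.strip('/') or 'all'}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


# Example values only: tier down anything under logs/ after 30 and 90 days.
config = build_lifecycle_configuration("logs/", 30, 90)
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, a dict like this could be
# passed as LifecycleConfiguration to s3.put_bucket_lifecycle_configuration.
```

Once a rule like this is attached to a bucket, S3 applies the transitions on its own schedule, which is the "no manual intervention" part of the description above.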

DataSync

DataSync makes your life easier by automating data transfer between your on-premises storage and AWS, which is a good fit for keeping sensitive files and documents in sync on a regular schedule. The service also handles the mechanics of the transfer for you, such as verifying data integrity, so transfers are less likely to fail partway through.

DataSync is an AWS service that streamlines and automates data transfers between on-premises storage and AWS storage services, as well as between different AWS storage services. It efficiently manages large-scale data migrations with minimal manual effort, making it suitable for synchronization, backup, and archiving.

DataSync can’t use Amazon EBS (Elastic Block Store) as a transfer location; it works with file and object storage services such as Amazon S3, Amazon EFS, and the Amazon FSx file systems (including FSx for Windows File Server) instead.

Amazon Kinesis Data Streams

Monitor Data in Real Time

Amazon Kinesis Data Streams ingests and stores streaming data so applications can process it in real time, for example for application monitoring, live leaderboards, and real-time fraud detection. There are two capacity modes: on-demand, where AWS scales the stream's capacity for you, and provisioned, where you choose the number of shards yourself and adjust it as throughput changes. Just know that shards on the exam represent the segregation of data going into a stream. I think of them as separate power cords that all feed the same device. Shards are the fundamental units of throughput in a Kinesis stream; they divide the data into parallel lanes. Kinesis Data Streams also has producers and consumers: producers send data into the stream, while consumers read the data from the stream and process it.
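To make the "parallel lanes" idea concrete, here is a small local sketch, not a call to AWS. Kinesis hashes each record's partition key with MD5 into a 128-bit number and routes the record to the shard whose hash range contains it. The sketch assumes the shards split the hash space evenly, which is a simplification. The second function estimates a provisioned shard count from the per-shard write limits of 1 MB per second or 1,000 records per second. Both function names are made up for this example.

```python
import hashlib
import math

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard a partition key would land on.

    Assumes `shard_count` shards that divide the hash space evenly.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * shard_count // HASH_SPACE


def shards_needed(write_mb_per_sec: float, records_per_sec: float) -> int:
    """Estimate shards for provisioned mode from the per-shard write limits."""
    by_bytes = math.ceil(write_mb_per_sec / 1.0)    # 1 MB/s per shard
    by_records = math.ceil(records_per_sec / 1000)  # 1,000 records/s per shard
    return max(1, by_bytes, by_records)


# The same key always maps to the same shard, so one player's events stay in order.
for key in ["player-1", "player-2", "player-3"]:
    print(key, "-> shard", shard_for_key(key, shard_count=4))

print("shards for 3.5 MB/s and 2,000 records/s:", shards_needed(3.5, 2000))
```

In on-demand mode you would not do the second calculation yourself; in provisioned mode, an estimate like this is the kind of sizing you are responsible for.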

Kinesis Data Firehose

Monitor Data in Near Real Time

Kinesis Data Firehose loads data in near real time into S3, Redshift, OpenSearch, third-party services, and custom HTTP endpoints. Unlike Data Streams, Firehose does not store data within the service itself; instead, it delivers the data to another service or analytics platform. If the question says near real time, you should think “Firehose.” Why do we need Firehose? Firehose is for cases where streaming data, such as clickstreams from a website, needs to be delivered to analytics software with only a short buffering delay. Firehose is fully managed, scales automatically, and is cost effective for users who want to reduce operational overhead.
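The "near" in near real time comes from buffering: Firehose collects incoming records and delivers them as a batch once a size or time threshold is reached. The sketch below imitates that behavior locally and is only a toy model. The class name `BufferedDelivery`, the tiny 10-byte threshold, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are all assumptions for illustration.

```python
class BufferedDelivery:
    """Toy model of size-or-time buffering in front of a destination."""

    def __init__(self, max_bytes, max_seconds, deliver):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.deliver = deliver      # callable that receives a list of records
        self.buffer = []
        self.buffered_bytes = 0
        self.opened_at = None       # time the first buffered record arrived

    def put(self, record: bytes, now: float) -> None:
        """Buffer a record, then flush if a threshold has been reached."""
        if self.opened_at is None:
            self.opened_at = now
        self.buffer.append(record)
        self.buffered_bytes += len(record)
        self._flush_if_due(now)

    def tick(self, now: float) -> None:
        """Call periodically so old data is flushed even with no new records."""
        self._flush_if_due(now)

    def _flush_if_due(self, now: float) -> None:
        if not self.buffer:
            return
        too_big = self.buffered_bytes >= self.max_bytes
        too_old = now - self.opened_at >= self.max_seconds
        if too_big or too_old:
            self.deliver(list(self.buffer))
            self.buffer.clear()
            self.buffered_bytes = 0
            self.opened_at = None


batches = []  # stands in for the destination, e.g. an S3 bucket
firehose = BufferedDelivery(max_bytes=10, max_seconds=60, deliver=batches.append)
firehose.put(b"click1", now=0)   # 6 bytes buffered, nothing delivered yet
firehose.put(b"click2", now=1)   # 12 bytes >= 10, so the batch is delivered
firehose.put(b"click3", now=2)   # starts a new buffer
firehose.tick(now=70)            # buffer is 68 seconds old, so it is delivered
print(batches)
```

Records never sit in the service long term; they only wait in the buffer until the next delivery, which matches the point that Firehose has no storage of its own.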

Amazon MQ

Amazon MQ is a managed message broker service. A message broker helps with decoupling because it lets different applications and platforms, even ones that use different protocols, send and receive messages without talking to each other directly. Producers (senders) put messages on a message queue, and consumers receive them from the queue, which allows the messages to be delivered in their proper order. An active/standby broker deployment in Amazon MQ increases availability: if the active broker fails, the standby broker takes over. The standby broker stays synchronized with the active broker and monitors the active broker’s health.
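The failover idea can be sketched in a few lines of plain Python. This is a conceptual model only, not the Amazon MQ API: the class `BrokerPair`, the broker names, and the single shared in-memory queue are all assumptions made for the example. The shared queue stands in for whatever keeps the standby in step with the active broker.

```python
from collections import deque


class BrokerPair:
    """Toy active/standby pair: two brokers in front of one message store."""

    def __init__(self):
        self.storage = deque()  # stand-in for state both brokers can see
        self.healthy = {"broker-a": True, "broker-b": True}
        self.active = "broker-a"

    def fail(self, name: str) -> None:
        """Simulate a broker outage."""
        self.healthy[name] = False

    def _ensure_active(self) -> None:
        """Promote the standby if the active broker is unhealthy."""
        if self.healthy[self.active]:
            return
        standby = "broker-b" if self.active == "broker-a" else "broker-a"
        if not self.healthy[standby]:
            raise RuntimeError("no healthy broker available")
        self.active = standby

    def send(self, message: str) -> str:
        """Producer side: enqueue a message, return the broker that took it."""
        self._ensure_active()
        self.storage.append(message)
        return self.active

    def receive(self):
        """Consumer side: dequeue the oldest message, or None if empty."""
        self._ensure_active()
        return self.storage.popleft() if self.storage else None


pair = BrokerPair()
pair.send("order-1")             # handled by broker-a
pair.fail("broker-a")            # the active broker goes down
served_by = pair.send("order-2") # the standby is promoted and takes over
received = [pair.receive(), pair.receive()]
print(served_by, received)
```

The producer and consumer never address a specific broker, which is the decoupling the paragraph above describes, and messages sent before the failure are still delivered in order afterwards.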

Questions related to Amazon MQ will look something like this on the exam:

A company has migrated its message‑processing system to AWS.
The current architecture looks like this:
ActiveMQ running on an EC2 instance receives messages.
A consumer application on EC2 processes those messages.
Processed results are written to a MySQL database running on EC2.
The company now wants the entire system to be highly available while also keeping operational complexity low.
Which architecture provides the highest availability?

Answer: Use Amazon MQ with active/standby brokers, an Auto Scaling group for the consumers across multiple Availability Zones, and Amazon RDS for MySQL in a Multi-AZ deployment.