25 AWS S3 Features that you must know about

Even though Amazon S3 is one of the most widely used cloud storage services, many users are not aware of all of its capabilities.
Below are some of the advanced S3 questions that come up in interviews and are most commonly asked in certification exams.

So strap in and follow along as we explore these lesser-known corners of the Amazon S3 ecosystem.


1. What is Amazon S3?

Amazon S3 (Simple Storage Service) is a cloud-based object storage service provided by Amazon Web Services (AWS). It allows you to store and retrieve any amount of data from anywhere over the internet.


2. How does Amazon S3 work? 

S3 stores data as discrete units called objects, which live in containers called buckets. Each object is stored under a unique key (identifier) within its bucket and can be retrieved using a standard web interface or API.
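To make this concrete, here is a minimal sketch using boto3, the AWS SDK for Python. The bucket and key names are placeholders, not real resources:

```python
import boto3

s3 = boto3.client("s3")

# Store an object: the bucket is the container, the key is the unique identifier.
s3.put_object(
    Bucket="my-example-bucket",
    Key="reports/2024/summary.txt",
    Body=b"hello from S3",
)

# Retrieve the same object by bucket + key.
response = s3.get_object(Bucket="my-example-bucket", Key="reports/2024/summary.txt")
print(response["Body"].read())  # b'hello from S3'
```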


3. What are some of the key features of Amazon S3? What are the benefits of using Amazon S3?

Scalability: S3 is designed to scale automatically to handle increasing amounts of data.

Cost-effective: S3 provides a low-cost solution for storing large amounts of data and has a flexible pricing model based on usage.

Security: S3 provides multiple layers of security, including encryption at rest and in transit, access control, and network isolation.

Versatile: S3 supports multiple data types and use cases, including big data analytics, backup and archive, content distribution, and disaster recovery.

Integration: S3 integrates with a variety of other AWS services and can be accessed using a standard web interface, APIs, or the AWS CLI.

Performance: S3 provides low latency and high throughput for data access and retrieval; long-distance transfers can be sped up further with the S3 Transfer Acceleration feature.


4. What are the different types of storage classes in S3, and when should you use them?

S3 offers the following storage classes:

Standard: The default storage class for S3, providing high durability and availability for frequently accessed data.

Intelligent-Tiering: Automatically moves data between two access tiers (frequent and infrequent access) based on changing access patterns, optimizing cost.

S3 Standard-Infrequent Access (Standard-IA): data in Standard-IA is stored on multiple devices across multiple Availability Zones in an AWS Region for durability and high availability. It offers the same low latency and high throughput as S3 Standard, at a lower storage price but with a per-GB retrieval fee.

S3 One Zone-IA: Low-cost option that stores data in a single Availability Zone, with lower resilience guarantees than storage classes that replicate across zones.

S3 Glacier: Low-cost, long-term data archiving solution, with retrieval times ranging from minutes to several hours.

S3 Glacier Deep Archive: The lowest cost storage class for long-term data retention, with retrieval times ranging from 12 hours to several days.

When to use (a short upload sketch follows this list):

Use Standard for frequently accessed data that requires high durability and availability.

Use Intelligent-Tiering for data with unknown or changing access patterns.

Use Standard-IA for infrequently accessed but critical data that still needs fast access when requested.

Use S3 One Zone for infrequently accessed, non-critical data that can be recreated if lost.

Use S3 Glacier for long-term archiving of infrequently accessed data.

Use S3 Glacier Deep Archive for the lowest cost solution for data that can tolerate retrieval times of 12 hours or longer.
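As a minimal illustration, the storage class can be chosen per object at upload time. This boto3 sketch uses placeholder bucket and file names:

```python
import boto3

s3 = boto3.client("s3")

# Upload directly into a cheaper tier by naming the storage class up front.
# Valid values include STANDARD, INTELLIGENT_TIERING, STANDARD_IA,
# ONEZONE_IA, GLACIER, and DEEP_ARCHIVE.
with open("2023-logs.tar.gz", "rb") as f:
    s3.put_object(
        Bucket="my-example-bucket",
        Key="archives/2023-logs.tar.gz",
        Body=f,
        StorageClass="GLACIER",  # long-term archive; retrieval takes minutes to hours
    )
```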


5. How to secure the data stored in an Amazon S3 bucket?

Below are some of the methods to secure data stored in an S3 bucket (a code sketch follows the list):

Access Control: To manage access to your S3 buckets and objects, use AWS Identity and Access Management (IAM). You control who has access to your data and what actions they are allowed to take.

Encryption: Encrypt data at rest using either server-side encryption with AWS Key Management Service (SSE-KMS) controlled keys or server-side encryption with Amazon S3 managed keys (SSE-S3). Utilizing SSL/TLS, data can also be encrypted while in transit.

ACLs (Access Control Lists) and bucket policies: To more precisely manage access to your S3 buckets and objects, use policies and ACLs.

Versioning: Turn on versioning for your S3 buckets to keep track of all changes made to objects (including writes and deletions) while they are stored there.

Object Locks: Prevents accidental or intentional deletion or overwriting of objects for a specified retention period

MFA Delete: Require multi-factor authentication (MFA) to delete objects, adding an extra layer of security.

Logging: Enable S3 server access logging to track requests to your S3 bucket. You can use this information to monitor and audit access to your data.

VPC endpoint: Use an Amazon S3 VPC endpoint to access S3 resources in a VPC, eliminating the need to traverse the public internet.

AWS Firewall Manager: Use Firewall Manager to apply security group rules to your S3 resources and control access to your S3 buckets and objects.
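As one illustration of the policy-based controls above, here is a boto3 sketch that attaches a bucket policy denying any request not made over HTTPS, enforcing encryption in transit. The bucket name is a placeholder:

```python
import json
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket"  # placeholder

# Deny any request that is not made over HTTPS (encryption in transit).
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyInsecureTransport",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```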


6. What are the different Amazon S3 features/Tools available to manage large data transfers?

S3 Transfer Acceleration – This functionality uses the widely dispersed edge locations of Amazon CloudFront to speed up transfers over the open internet. You can upload data to S3 up to 6 times faster with this feature compared with regular uploads.

S3 Transfer Utility – a high-level transfer API available in several AWS SDKs (for example, the .NET and mobile SDKs) that simplifies uploading and downloading files to and from S3.

S3 Transfer Manager – a high-level library (for example, in the AWS SDK for Java) for transferring large files to and from S3. It offers a fast, reliable, and efficient way to move big data sets.

AWS Snowball – a petabyte-scale data transfer service that uses rugged physical appliances to move enormous volumes of data into and out of the AWS cloud.

AWS Snowmobile – an exabyte-scale data transfer service that physically moves data to an AWS location in a shipping container hauled by a truck.

AWS Direct Connect – a dedicated network connection from your on-premises data center to AWS. Large data transfers into and out of S3 over Direct Connect avoid the public internet and offer a more dependable and secure transfer experience.

There are also many third-party tools and protocols (such as FTP or rsync) that can be layered on top of these services.

Factors to consider when choosing among these services include the size of your data, the required transfer speed, and the level of security needed. For SDK-managed transfers, the upload behavior is also tunable, as in the sketch below.
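For example, this boto3 sketch (with placeholder names) raises the multipart threshold and concurrency for a large upload:

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Tune the SDK's managed transfer: files above 64 MB are split into
# 16 MB parts and uploaded on 10 parallel threads.
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,
    multipart_chunksize=16 * 1024 * 1024,
    max_concurrency=10,
    use_threads=True,
)
s3.upload_file("big-dataset.tar", "my-example-bucket", "datasets/big-dataset.tar",
               Config=config)
```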


7. How to trigger an action based on an event in a S3 bucket?
What is Amazon S3 Event Notifications?

Amazon S3 Event Notifications is a feature of Amazon Simple Storage Service (S3) that allows you to receive notifications when certain events occur within your S3 buckets. These events include:

  • Object creation (including overwrites)
  • Object deletion
  • Object restore
  • Bucket replication events
  • Object tagging
  • S3 Lifecycle expiration
  • S3 Lifecycle transition
  • S3 Intelligent-Tiering archival

S3 Event Notifications can be configured to send notifications to a variety of destinations, such as:

  • Amazon Simple Notification Service (SNS) topics, 
  • AWS Lambda functions
  • SQS queues. 



As a result, developers can create custom workflows and automations based on S3 events, such as automatically starting image processing or data analytics whenever new files are added to a bucket.

S3 Event Notifications are especially helpful when you need to:

1. Process data automatically in real time

2. Build custom workflows

3. Track and audit access to S3 objects

4. React automatically to changes in S3 data

You can use the S3 Management Console, the AWS SDKs, or the S3 REST API to configure S3 Event Notifications. To set up notifications, you first identify the events to be tracked, the destination that should receive them, and any key-name filters (prefix or suffix) to narrow the scope.
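For instance, a notification configuration can be applied with boto3; in this sketch the bucket name, Lambda ARN, and filters are placeholders, and the Lambda function must separately grant S3 permission to invoke it:

```python
import boto3

s3 = boto3.client("s3")

# Invoke a Lambda function whenever a .jpg object is created under uploads/.
s3.put_bucket_notification_configuration(
    Bucket="my-example-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [{
            "LambdaFunctionArn": "arn:aws:lambda:us-east-1:111122223333:function:process-image",
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {"Key": {"FilterRules": [
                {"Name": "prefix", "Value": "uploads/"},
                {"Name": "suffix", "Value": ".jpg"},
            ]}},
        }]
    },
)
```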

In conclusion, Amazon S3 Event Notifications is a powerful feature that allows you to automatically respond to changes in your S3 data, build custom workflows and automations, and track and monitor access to S3 objects, making it an essential tool for real-time data processing and automation.


8. What is the role of Amazon S3 during a disaster recovery scenario?

Amazon S3 supports backup and recovery of data, which is crucial during disaster recovery.

Backup – Regularly back up important data to Amazon S3. This provides an extra layer of protection.

Recovery – After the disaster, you can start the recovery process to move the applications and data back to the main AWS region. 

Amazon S3 Cross-Region Replication (CRR) can automatically copy objects to a bucket in a secondary region, so that data can be restored to the primary region after a disaster (see the sketch below).
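A hedged boto3 sketch of a CRR rule follows; the bucket names and IAM role ARN are placeholders, both buckets must have versioning enabled, and the role must allow S3 to read the source and write the destination:

```python
import boto3

s3 = boto3.client("s3")

# Replicate every new object in primary-bucket to a bucket in another region.
s3.put_bucket_replication(
    Bucket="primary-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::111122223333:role/s3-replication-role",
        "Rules": [{
            "ID": "dr-copy",
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {},  # empty filter = replicate the whole bucket
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": "arn:aws:s3:::dr-bucket-us-west-2"},
        }],
    },
)
```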


9. What is the difference between Amazon S3 and Amazon EBS storage? 
Feature | Amazon S3 | Amazon EBS
Type | Object storage | Block-level storage
Purpose / Objective | Store and retrieve large amounts of unstructured data | Store data for EC2 instances
Access Methods | REST API, AWS CLI, AWS SDKs, and S3-specific tools | Mounted as a file system on EC2 instances
Scalability | Highly scalable | Scalable within the constraints of a single EC2 instance
Durability | Designed for 99.999999999% (11 nines) durability | Durable and designed for high availability
Performance | High throughput and low latency for bulk operations | Optimized for I/O-intensive workloads; low-latency, high-throughput block access
Use Cases | Data archiving, backups, big data analytics | Boot volumes and database storage for EC2 instances

10. Explain S3 pricing model and calculation?


The cost of Amazon S3 is based on a combination of the factors below:

Storage: charged per GB per month for the data kept in your S3 bucket.

Requests: charged for every 1,000 API requests, including GET, PUT, and POST requests.

Data transfer: charged for the amount of data that is uploaded to or downloaded from your S3 bucket. Different price levels apply for data transfer within the same AWS region, between regions, or out to the internet, depending on the source and destination of the data.

S3 Transfer Acceleration: uses the widely dispersed edge locations of Amazon CloudFront to speed up data transfers to S3, billed per GB of data transferred through the accelerated endpoint.

Always check the AWS website for the most recent price details because the S3 pricing model is subject to change.

Also note that AWS provides a free S3 tier with a set monthly allocation of free storage, requests, and data transfer.
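As a back-of-the-envelope illustration only, with made-up usage numbers and illustrative rates (not current AWS prices; always check the S3 pricing page):

```python
# Rough monthly estimate for S3 Standard in a US region; every rate below
# is illustrative, not an official AWS price.
storage_gb = 500            # data stored for the month
put_requests = 200_000
get_requests = 1_000_000
egress_gb = 50              # data downloaded to the internet

cost = (
    storage_gb * 0.023                 # ~$0.023 per GB-month (illustrative)
    + (put_requests / 1_000) * 0.005   # ~$0.005 per 1,000 PUTs (illustrative)
    + (get_requests / 1_000) * 0.0004  # ~$0.0004 per 1,000 GETs (illustrative)
    + egress_gb * 0.09                 # ~$0.09 per GB out to the internet (illustrative)
)
print(f"Estimated monthly bill: ${cost:.2f}")  # ≈ $17.40
```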


11. Explain some real-world use case scenarios for Amazon S3?
  1. Backup & Recovery: Store critical data for an organization; during a disaster, quickly retrieve and restore it.
  2. Big Data Analytics: Store large amounts of structured and unstructured data. For example, e-commerce organizations can store purchase data in S3 and use big data analytics services like Amazon EMR to analyze user behavior and improve business operations.
  3. Content Delivery: Storage backing content delivery for websites, mobile applications, and other internet-facing applications.
  4. Entertainment & Media: Store high-quality audio and video for media and entertainment organizations.


12. What is S3 intelligent-Tiering? 

S3 Intelligent-Tiering is a cloud storage class which continuously monitors access patterns to your data and automatically moves the data between the two access tiers as needed.

If a data object is frequently accessed, it will stay in the frequent access tier.

If the access frequency decreases, the object will be moved to the infrequent access tier.

This happens automatically and with no performance impact, ensuring that your data is always stored in the most cost-effective tier and thereby optimizing costs.

S3 Intelligent-Tiering is an ideal storage solution for data with unknown or changing access patterns, such as backups, archives, log files, and big data analytics workloads. 


13. What are some of the AWS S3 supported storage class transitions?

S3 Standard sits at the top of the S3 storage-class hierarchy and can transition into any of the colder classes. Supported lifecycle transitions flow in one direction, from warmer to colder tiers, roughly: Standard → Standard-IA / Intelligent-Tiering → One Zone-IA → Glacier → Glacier Deep Archive. Lifecycle rules cannot move an object back up to a warmer class.
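Transitions are typically driven by lifecycle rules; this boto3 sketch (placeholder bucket and prefix) ages objects down the hierarchy:

```python
import boto3

s3 = boto3.client("s3")

# Walk objects down the storage-class hierarchy as they age:
# Standard -> Standard-IA after 30 days -> Glacier after 90 -> delete at 365.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "age-out-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }],
    },
)
```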


14. How to increase the speed of S3 upload/download ?
How to quickly upload/download objects from S3 bucket?
What is Amazon S3 Transfer Acceleration?

Amazon S3 Transfer Acceleration is a feature of Amazon S3 that enables faster data uploads to S3 over the public Internet. It uses Amazon CloudFront’s globally distributed edge locations to accelerate transfers over the public Internet to Amazon S3.

For long-distance transfer of larger objects, it can speed up transfers by as much as 50–500%.

It effectively shortens the network path to S3 for remote applications and reduces the variability in internet routing, congestion, and speed that can affect transfers.
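Enabling and using acceleration takes two steps, sketched here with boto3 and a placeholder bucket:

```python
import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# One-time: turn on Transfer Acceleration for the bucket.
s3.put_bucket_accelerate_configuration(
    Bucket="my-example-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Then route transfers through the accelerate endpoint
# (bucketname.s3-accelerate.amazonaws.com) instead of the regional one.
fast_s3 = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
fast_s3.upload_file("video.mp4", "my-example-bucket", "media/video.mp4")
```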


15. How to fix file upload failures caused by network issues such as instability, congestion, or low speed?
What is multi-part upload?

Large files can be uploaded quickly and reliably using Amazon S3's Multipart Upload capability.

It splits the upload into multiple parts, each of which can be uploaded independently and in parallel.

Uploading separate chunks concurrently, rather than the entire file in a single request, makes uploads faster and more reliable, especially for large files.

Additionally, it enables resumable uploads in the event of a network outage, since only the parts that did not complete successfully need to be retried.
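Under the hood the flow is create, upload parts, complete. This boto3 sketch (placeholder names, error handling omitted for brevity) shows the three calls:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-example-bucket", "backups/db-dump.bin"
PART_SIZE = 8 * 1024 * 1024  # parts must be at least 5 MB (except the last)

# 1. Start the upload and get an upload ID that ties the parts together.
upload = s3.create_multipart_upload(Bucket=bucket, Key=key)
parts, part_number = [], 1

# 2. Upload each chunk; a failed part can be retried individually.
with open("db-dump.bin", "rb") as f:
    while chunk := f.read(PART_SIZE):
        resp = s3.upload_part(Bucket=bucket, Key=key, PartNumber=part_number,
                              UploadId=upload["UploadId"], Body=chunk)
        parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
        part_number += 1

# 3. Stitch the parts together into the final object.
s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload["UploadId"],
                             MultipartUpload={"Parts": parts})
```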


16. How to provide cross account access for an S3 bucket?
What is a Bucket Policy in S3? 

An S3 bucket policy is a type of resource-based policy that can be used to grant permissions to the principal specified in the policy. Principals can be in the same account as the resource or in other accounts. For cross-account permissions to other AWS accounts or to users in another account, you must use a bucket policy.
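A hedged example of such a cross-account grant, applied with boto3 (the bucket name and account ID are placeholders):

```python
import json
import boto3

s3 = boto3.client("s3")
bucket = "shared-data-bucket"  # placeholder

# Grant read access to a principal in a *different* AWS account.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "CrossAccountRead",
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::999988887777:root"},
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
    }],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```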


17. What is the difference between Multi-part upload & S3 transfer acceleration?

S3 Multipart Upload:

A method of uploading large files to S3 by breaking them into multiple parts and uploading each part concurrently.

Improves upload speed and reliability by uploading parts in parallel.

S3 Transfer Acceleration:

A feature that uses Amazon CloudFront's globally distributed edge locations to accelerate uploads to S3.

Accelerates uploads by routing them through the CloudFront network rather than directly to the S3 bucket's regional endpoint.

Ideal for uploading large files over long distances.


18. What are the best practices to save cost on S3 buckets?
  1. Enable S3 Transfer Acceleration: This feature uses Amazon CloudFront’s globally distributed edge locations to accelerate uploads to S3. Uploading through a nearby edge location can be faster than uploading directly to an S3 bucket (note that acceleration adds a per-GB fee, so weigh the speed gain against the cost).
  2. Use S3 Intelligent-Tiering storage class: This class automatically moves objects between two access tiers (frequent and infrequent access) based on changing access patterns, reducing storage costs and optimizing performance.
  3. Store infrequently accessed data in S3 Glacier: S3 Glacier is a low-cost storage class for data that is infrequently accessed, yet still requires long-term retention. This option can lower costs compared to storing data in other S3 storage classes.
  4. Use S3 Inventory to track changes and identify objects that can be deleted: S3 Inventory provides a list of your S3 objects and metadata, making it easier to track changes and identify objects that can be deleted to reduce storage costs.
  5. Enable S3 object life cycle management: This feature allows you to automatically transition objects to lower-cost storage classes or delete them as they age, reducing storage costs over time.
  6. Use S3 data compression or pre-compress data before uploading: Compressing data before uploading it to S3 reduces the amount of storage space required, leading to lower storage costs.
  7. Store data in smaller, more specific S3 buckets: Breaking data down into smaller, more specific buckets allows for more granular management and cost optimization.
  8. Use S3 Cross-Region Replication for disaster recovery: By replicating data to another region, you can minimize data loss in the event of a disaster and reduce recovery time. However, this option may increase costs compared to other disaster recovery solutions.
  9. Take advantage of S3 pricing tiers based on access frequency and data retrieval costs: S3 offers different pricing tiers based on access frequency and data retrieval costs, allowing you to optimize costs by choosing the right tier for your data. For example, infrequently accessed data can be stored in S3 Glacier, reducing costs compared to other S3 storage classes.

19. A new joiner to the team has accidentally deleted a file in an S3 bucket.
How to recover the deleted S3 file, and how to avoid accidental deletion in the future?

To recover the deleted file in the S3 bucket:

Go to the S3 Management Console.

Select the bucket where the file was deleted.

Turn on the “Show versions” view. If versioning is enabled, a delete only places a delete marker on top of the object, and you can restore the file from a previous version (see the sketch at the end of this answer).

To avoid accidental deletion in future:

Enable versioning for the S3 bucket.

Use AWS Identity and Access Management (IAM) policies to restrict access to the S3 bucket.

Set up S3 Object Lock.

Regularly back up the S3 data to another location.

Monitor the S3 bucket activity using Amazon CloudWatch or AWS CloudTrail.
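The recovery step mentioned above can also be scripted. This boto3 sketch (placeholder bucket and key) removes the delete marker that a versioned delete leaves behind:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-example-bucket", "reports/q4.xlsx"  # placeholders

# In a versioned bucket, a "delete" only adds a delete marker on top of
# the object. Removing that marker brings the object back.
versions = s3.list_object_versions(Bucket=bucket, Prefix=key)
for marker in versions.get("DeleteMarkers", []):
    if marker["Key"] == key and marker["IsLatest"]:
        s3.delete_object(Bucket=bucket, Key=key, VersionId=marker["VersionId"])
        print("Delete marker removed; object restored.")
```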


20. What is S3 Object lock? 

S3 Object Lock is a feature of Amazon S3 that lets you store objects using a write-once-read-many (WORM) model, preventing accidental or intentional deletion or overwriting of objects for a specified retention period. In compliance mode, a locked object version cannot be deleted by anyone until its retention period expires; in governance mode, only users granted special permission can bypass the lock. S3 Object Lock provides an additional layer of protection for your critical data, ensuring it remains intact and unchanged for the specified retention period, even if an attacker gains access to your AWS account.


21. How to set up S3 Object Lock?

1. Go to the S3 Management Console.

2. Create a bucket with Object Lock enabled, or select an existing Object Lock-enabled bucket (Object Lock is normally turned on at bucket creation, which also enables versioning).

3. Make sure versioning is enabled for the bucket.

4. Go to the “Object Lock” settings for the bucket.

5. Choose the desired retention mode: Governance or Compliance.

6. Set the desired default retention period for the objects in the bucket.

7. Confirm the settings and enable Object Lock for the bucket.
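Scripted with boto3, the same setup might look like this sketch (placeholder bucket name; it assumes the us-east-1 region, since create_bucket elsewhere also needs a location constraint):

```python
import boto3

s3 = boto3.client("s3")

# Object Lock is declared when the bucket is created; this also
# enables versioning automatically.
s3.create_bucket(Bucket="compliance-records", ObjectLockEnabledForBucket=True)

# Apply a default retention rule: every new object is locked for 30 days.
s3.put_object_lock_configuration(
    Bucket="compliance-records",
    ObjectLockConfiguration={
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "GOVERNANCE", "Days": 30}},
    },
)
```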


22. What are the pre-requisites for setting up S3 Object lock? 

S3 Object Lock requires versioning to be enabled for the S3 bucket. 

Also, once Object Lock is enabled on a bucket, it cannot be disabled and versioning cannot be suspended, although the default retention settings can be changed.


23. What are the differences between Governance lock types & Compliance lock types in S3? 

In S3 Object Lock, there are two lock types: Governance and Compliance.

Governance Lock: It locks objects in a bucket for a specified period of time, during which they cannot be deleted or overwritten by ordinary users; however, users granted the special s3:BypassGovernanceRetention permission can override or remove the lock.
This type of lock is best for protecting against accidental deletion while still allowing privileged administrators to intervene when necessary.

Compliance Lock: It provides the strictest protection for critical data that must be retained for a specific period. This type of lock is intended to help you meet regulatory or legal requirements, such as SEC 17a-4(f) or FINRA Rule 4511.
The retention period for Compliance-locked objects cannot be shortened, and no one, not even the root user of the account, can delete or overwrite a locked object version until its retention period expires.

Both lock types are intended to provide additional protection for your S3 data, but the specific use case will dictate which lock type is most appropriate.


24. What is a Legal Hold in S3 Object Lock?

A Legal Hold is a feature of Amazon S3 Object Lock that allows you to prevent the deletion or overwriting of an object indefinitely, independent of any retention period. This is useful in cases where an object needs to be preserved due to a legal or regulatory requirement.

To place a legal hold on an object, you issue a request through the S3 Object Lock API or the S3 Management Console. While a legal hold is in place, the object cannot be deleted or overwritten, even after its retention period expires, until the hold is removed. Legal holds are useful for preserving evidence in legal or regulatory cases and ensuring that critical data remains intact.
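Placing and releasing a hold is a single API call per direction; here is a boto3 sketch with placeholder bucket and key names:

```python
import boto3

s3 = boto3.client("s3")

# Place a legal hold on one object (bucket/key are placeholders).
s3.put_object_legal_hold(
    Bucket="compliance-records",
    Key="cases/2024-001/evidence.pdf",
    LegalHold={"Status": "ON"},
)

# ...and release it later, once the matter is resolved.
s3.put_object_legal_hold(
    Bucket="compliance-records",
    Key="cases/2024-001/evidence.pdf",
    LegalHold={"Status": "OFF"},
)
```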

25. What are the different types of Encryption available on Amazon S3?

Amazon S3 (Simple Storage Service) supports several encryption mechanisms that you can use to secure your data; in fact, S3 now applies server-side encryption to all new objects by default. Here are the different types of encryption available with Amazon S3:

  1. Server-Side Encryption (SSE):

    a. Server-Side Encryption with Amazon S3 Managed Keys (SSE-S3): Amazon S3 manages the keys used for encryption. Data is encrypted before being stored on disks and decrypted when accessed.
    b. Server-Side Encryption with AWS Key Management Service (SSE-KMS): AWS Key Management Service (KMS) is used to manage the encryption keys. KMS provides additional control and auditing capabilities for key management.
    c. Server-Side Encryption with Customer-Provided Keys (SSE-C): You can provide your own encryption keys that are used to encrypt and decrypt data stored in Amazon S3. Amazon S3 doesn’t store your encryption keys, but it uses them to encrypt and decrypt data on your behalf.
  2. Client-Side Encryption: With client-side encryption, you encrypt the data on your own before sending it to Amazon S3. Amazon S3 stores the encrypted data as-is and doesn’t have access to the encryption keys or the plaintext data. This gives you full control over the encryption process. You can use any encryption library or tool of your choice to perform client-side encryption.
  3. Transit Encryption: Amazon S3 supports encryption of data during transit, ensuring that data sent between your client and Amazon S3 is encrypted and secure. This is achieved by using SSL/TLS (Secure Sockets Layer/Transport Layer Security) to establish an encrypted connection when uploading or downloading data.

It’s important to note that different encryption options can be combined for multi-layered security. For example, you can use SSE-S3 or SSE-KMS for server-side encryption and also perform client-side encryption before uploading the data to Amazon S3.
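To tie these options together, here is a boto3 sketch (placeholder bucket name and KMS key alias) showing SSE-KMS on a single upload and an SSE-S3 default for the whole bucket:

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket"  # placeholder; the KMS alias below is too

# SSE-KMS on one upload: S3 encrypts the object with the named KMS key.
s3.put_object(Bucket=bucket, Key="secrets/config.json", Body=b"{}",
              ServerSideEncryption="aws:kms",
              SSEKMSKeyId="alias/my-app-key")

# Or set a bucket-wide default so every new object gets SSE-S3 (AES-256).
s3.put_bucket_encryption(
    Bucket=bucket,
    ServerSideEncryptionConfiguration={
        "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
    },
)
```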

Conclusion

A quick read through this list will help you prepare for advanced cloud interviews, and it covers most of the S3 questions that appear in certification exams.

Do let us know your thoughts and comment if you think more can be added to this list.

