S3 (Simple Storage Service)
Amazon S3 (Simple Storage Service) is an object storage service that offers industry-leading scalability, data availability, security, and performance. It’s designed to store and retrieve any amount of data, at any time, from anywhere on the web. Think of it as a highly scalable, durable, and secure hard drive in the cloud.
Key Concepts
- Object Storage: S3 stores data as objects within buckets. Unlike file systems (which have a hierarchical directory structure), object storage is flat. Each object consists of data, metadata (information about the data), and a key (like a filename) that uniquely identifies it within its bucket.
- Buckets: Containers for objects. Bucket names are globally unique across all AWS accounts. You create buckets in specific AWS regions to optimize for latency, minimize costs, or address regulatory requirements.
- Keys: The unique identifier for an object within a bucket. The combination of a bucket name and a key uniquely identifies each object. Keys can include `/` characters to simulate a directory structure, but it's still fundamentally flat storage (e.g., in `my-bucket/images/logo.png`, `my-bucket` is the bucket and `images/logo.png` is the key).
- Regions: S3 buckets are created in specific AWS regions. Choosing a region close to your users can reduce latency.
- Durability and Availability: S3 is designed for 99.999999999% (11 9's) durability, and S3 Standard is designed for 99.99% availability of objects over a given year (availability varies by storage class). This high durability is achieved by storing data redundantly across multiple devices in multiple Availability Zones.
- Storage Classes: S3 offers different storage classes optimized for different access patterns and cost requirements (a sketch of selecting one at upload time follows this list):
- S3 Standard: General-purpose storage for frequently accessed data.
- S3 Intelligent-Tiering: Automatically moves objects between frequent, infrequent, and archive access tiers based on access patterns, optimizing costs.
- S3 Standard-IA (Infrequent Access): For data that is accessed less frequently but requires rapid access when needed. Lower storage cost than S3 Standard, but with a per-GB retrieval charge.
- S3 One Zone-IA: Similar to S3 Standard-IA, but stores data in a single Availability Zone, offering lower cost but less redundancy.
- S3 Glacier Instant Retrieval: Archive storage for rarely accessed data that still needs millisecond retrieval.
- S3 Glacier Flexible Retrieval: Low-cost archive storage with retrieval times ranging from minutes to hours.
- S3 Glacier Deep Archive: Lowest-cost storage class for long-term archiving, with retrieval times of up to 12 hours (standard) or 48 hours (bulk).
- Versioning: S3 can keep multiple versions of an object in the same bucket, which protects against accidental deletion or overwriting (see the enable-versioning sketch after this list).
- Lifecycle Policies: You can define rules to automatically transition objects between storage classes or delete them after a specified period, which helps optimize storage costs (an example policy follows this list).
- Access Control: S3 offers granular access control using:
- IAM Policies: Control access for AWS users and groups.
- Bucket Policies: Control access to an entire bucket.
- Access Control Lists (ACLs): Control access to individual objects (generally, bucket policies are preferred).
- Pre-signed URLs: Generate temporary URLs that grant time-limited access to specific objects, even to users who don't have AWS credentials (see the one-line sketch after this list).
- Static Website Hosting: S3 can host static websites (HTML, CSS, JavaScript, images) directly, without requiring a separate web server (a configuration sketch follows this list).
- Event Notifications: S3 can trigger notifications when certain events occur (e.g., object creation, deletion), which can be used to integrate with other AWS services like Lambda, SQS, and SNS (a Lambda example follows this list).
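To make the storage classes concrete, here is a minimal sketch of selecting one at upload time with the AWS CLI; the bucket and file names are placeholders, and any class name from the list above can be substituted:

```bash
# Upload an object directly into the Standard-IA storage class
aws s3 cp my_file.txt s3://my-unique-bucket-name/ --storage-class STANDARD_IA

# Intelligent-Tiering works the same way; S3 then manages tiering automatically
aws s3 cp my_file.txt s3://my-unique-bucket-name/ --storage-class INTELLIGENT_TIERING
```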
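Versioning is a bucket-level setting. A sketch of enabling and verifying it (the bucket name is a placeholder):

```bash
# Turn on versioning for an existing bucket
aws s3api put-bucket-versioning \
    --bucket my-unique-bucket-name \
    --versioning-configuration Status=Enabled

# Confirm the current setting
aws s3api get-bucket-versioning --bucket my-unique-bucket-name
```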
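A sketch of a lifecycle policy that transitions objects to cheaper classes and eventually deletes them; the rule ID, day counts, and bucket name are illustrative:

```bash
# Move objects to Standard-IA after 30 days and Glacier after 90,
# then expire (delete) them after 365 days
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "archive-then-expire",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
EOF

aws s3api put-bucket-lifecycle-configuration \
    --bucket my-unique-bucket-name \
    --lifecycle-configuration file://lifecycle.json
```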
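Pre-signed URLs can be generated straight from the CLI; a sketch granting one hour of read access to a single object (names are placeholders):

```bash
# Print a URL that lets anyone holding it GET the object for 3600 seconds
aws s3 presign s3://my-unique-bucket-name/my_file.txt --expires-in 3600
```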
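A sketch of turning on static website hosting; note that the bucket must also permit public reads (via a bucket policy and public-access-block settings), which is omitted here:

```bash
# Set the index and error documents for the bucket's website endpoint
aws s3 website s3://my-unique-bucket-name --index-document index.html --error-document error.html
```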
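A sketch of wiring object-creation events to a Lambda function; the function ARN is a placeholder, and the function must already grant S3 permission to invoke it (set up separately with `aws lambda add-permission`):

```bash
# Invoke the Lambda function whenever an object is created in the bucket
cat > notification.json <<'EOF'
{
  "LambdaFunctionConfigurations": [
    {
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:my-handler",
      "Events": ["s3:ObjectCreated:*"]
    }
  ]
}
EOF

aws s3api put-bucket-notification-configuration \
    --bucket my-unique-bucket-name \
    --notification-configuration file://notification.json
```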
Example: Uploading a File to S3 using the AWS CLI
This example demonstrates uploading a file to an S3 bucket using the AWS CLI. You’ll need the AWS CLI installed and configured.
- Create an S3 Bucket (if you don't have one):

  ```bash
  aws s3 mb s3://my-unique-bucket-name --region us-east-1  # Replace with a unique bucket name and your desired region
  ```

  Note: Bucket names must be globally unique.
- Upload a File:

  ```bash
  aws s3 cp my_file.txt s3://my-unique-bucket-name/                       # Upload 'my_file.txt' to the root of the bucket
  aws s3 cp my_file.txt s3://my-unique-bucket-name/my_folder/my_file.txt  # Upload to a "folder" (key prefix)
  ```
- List Objects in the Bucket:

  ```bash
  aws s3 ls s3://my-unique-bucket-name/
  ```
- Download a File:

  ```bash
  aws s3 cp s3://my-unique-bucket-name/my_file.txt .  # Download 'my_file.txt' to the current directory
  ```
- Remove a File:

  ```bash
  aws s3 rm s3://my-unique-bucket-name/my_file.txt
  ```
- Remove the Bucket (it must be empty; a sketch for emptying it first follows these steps):

  ```bash
  aws s3 rb s3://my-unique-bucket-name
  ```
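If the bucket still contains objects, a sketch of emptying it first; note that on a versioned bucket, old versions and delete markers survive `rm --recursive` and must be deleted separately (e.g., with `aws s3api delete-objects`):

```bash
# Delete every object in the bucket, then the bucket itself
aws s3 rm s3://my-unique-bucket-name --recursive
aws s3 rb s3://my-unique-bucket-name

# Or combine both steps
aws s3 rb s3://my-unique-bucket-name --force
```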