S3 (Simple Storage Service)

Amazon S3 (Simple Storage Service) is an object storage service that offers industry-leading scalability, data availability, security, and performance. It’s designed to store and retrieve any amount of data, at any time, from anywhere on the web. Think of it as a highly scalable, durable, and secure hard drive in the cloud.


Key Concepts

  • Object Storage: S3 stores data as objects within buckets. Unlike file systems (which have a hierarchical directory structure), object storage is flat. Each object has a unique key (like a filename) and consists of data, metadata (information about the data), and a globally unique ID.
  • Buckets: Containers for objects. Bucket names are globally unique across all AWS accounts. You create buckets in specific AWS regions to optimize for latency, minimize costs, or address regulatory requirements.
  • Keys: The unique identifier for an object within a bucket. The combination of a bucket name and a key uniquely identifies each object. Keys can include / characters to simulate a directory structure, but it’s still fundamentally flat storage. (e.g., my-bucket/images/logo.png - my-bucket is the bucket, images/logo.png is the key).
  • Regions: S3 buckets are created in specific AWS regions. Choosing a region close to your users can reduce latency.
  • Durability and Availability: S3 is designed for 99.999999999% (11 9’s) durability and 99.99% availability of objects over a given year. This high durability is achieved through data redundancy across multiple facilities and devices.
  • Storage Classes: S3 offers different storage classes optimized for different access patterns and cost requirements:
    • S3 Standard: General-purpose storage for frequently accessed data.
    • S3 Intelligent-Tiering: Automatically moves objects between frequent, infrequent, and archive access tiers based on access patterns, optimizing costs.
    • S3 Standard-IA (Infrequent Access): For data that is accessed less frequently, but requires rapid access when needed. Lower storage cost than S3 Standard, but with a retrieval cost.
    • S3 One Zone-IA: Similar to S3 Standard-IA, but stores data in a single Availability Zone, offering lower cost but less redundancy.
    • S3 Glacier Instant Retrieval: Archive storage for data rarely accessed.
    • S3 Glacier Flexible Retrieval: Low-cost archive storage with retrieval times ranging from minutes to hours.
    • S3 Glacier Deep Archive: Lowest-cost storage class for long-term archiving, with retrieval times of 12 hours or more.
  • Versioning: S3 can keep multiple versions of an object in the same bucket. This protects against accidental deletion or overwriting.
  • Lifecycle Policies: You can define rules to automatically transition objects between storage classes or delete them after a specified period. This helps optimize storage costs.
  • Access Control: S3 offers granular access control using:
    • IAM Policies: Control access for AWS users and groups.
    • Bucket Policies: Control access to an entire bucket.
    • Access Control Lists (ACLs): Control access to individual objects (generally, bucket policies are preferred).
    • Pre-signed URLs: Generate temporary URLs that grant time-limited access to specific objects, even to users who don’t have AWS credentials.
  • Static Website Hosting: S3 can host static websites (HTML, CSS, JavaScript, images) directly, without requiring a separate web server.
  • Event Notifications: S3 can trigger notifications when certain events occur (e.g., object creation, deletion), which can be used to integrate with other AWS services (like Lambda, SQS, SNS).

Example: Uploading a File to S3 using the AWS CLI

This example demonstrates uploading a file to an S3 bucket using the AWS CLI. You’ll need the AWS CLI installed and configured.

  1. Create an S3 Bucket (if you don’t have one):

    aws s3 mb s3://my-unique-bucket-name --region us-east-1  # Replace with a unique bucket name and your desired region
    

    Note: Bucket names must be globally unique.

  2. Upload a File:

    aws s3 cp my_file.txt s3://my-unique-bucket-name/  # Upload 'my_file.txt' to the root of the bucket
    aws s3 cp my_file.txt s3://my-unique-bucket-name/my_folder/my_file.txt  # Upload to a "folder" (key prefix)
    
  3. List Objects in the Bucket:

    aws s3 ls s3://my-unique-bucket-name/
    
  4. Download a File:

    aws s3 cp s3://my-unique-bucket-name/my_file.txt .  # Download 'my_file.txt' to the current directory
    
  5. Remove a file:

     aws s3 rm s3://my-unique-bucket-name/my_file.txt
    
  6. Remove the bucket (the bucket must be empty):

     aws s3 rb s3://my-unique-bucket-name